Hello *,
Recently, for my own uses, I’ve been experimenting with (and somewhat standardizing on) ClickHouse for a lot of my time series/OLAP/logging data: it’s a screaming fast column store for OLAP workloads. A question on the forums about ingesting/reading trace files made it apparent that a lot of people are using one-off tools for logging/analytics, so I thought it might be a good use of ClickHouse to try ingesting and querying the FDB trace logs: they’re “wide” event logs with potentially many columns per event, which is a decent fit for columnar designs.
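To make the “wide events” point concrete, here’s a minimal sketch of what such a table can look like in ClickHouse. It mirrors the `DESCRIBE` output from the demo further down, but it’s illustrative only; the exact schema the tool creates is documented in the README:

```
CREATE TABLE testing.cluster01 (
    -- Fields present on every trace event are non-nullable.
    Time       DateTime,
    Type       String,
    Severity   UInt32,
    Machine    String,
    ID         String,
    -- Most events only fill a handful of the many possible fields,
    -- so everything else is Nullable and simply NULL when absent.
    As         Nullable(String),
    Locality   Nullable(String),
    Transition Nullable(String)
) ENGINE = MergeTree()
ORDER BY (Type, Time);
```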
Here’s the result of this experiment: a tool, contained in a Docker container, that you can launch to watch for log files being rotated, and ingest them whenever `fdbserver` syncs/flushes them and creates a new file. It depends on the (relatively new) `--trace_format json` directive in `fdbserver`.
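The flag itself is real; the invocation below is just a sketch (if you run under `fdbmonitor`, adding the equivalent `trace_format = json` line to the `[fdbserver]` section of `foundationdb.conf` should also work, since `fdbmonitor` passes parameters through as command-line options):

```
# Sketch: start a single fdbserver emitting JSON trace logs instead of
# the default XML. "-p auto:4500" is just a placeholder listen address.
$ fdbserver -p auto:4500 --trace_format json
```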
Additions to the various column types, `CODEC` choices, etc. would be appreciated.
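For reference, per-column codecs are declared inline with the column type, so a contribution might look something like this (a sketch, assuming a ClickHouse version with per-column `CODEC` support; the column name is taken from the schema above):

```
-- Sketch: switch one column's compression to ZSTD, which trades some
-- CPU for a better ratio than the default LZ4 on rarely-read fields.
ALTER TABLE testing.cluster01
    MODIFY COLUMN Transition Nullable(String) CODEC(ZSTD);
```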
In the future, ClickHouse will also (ideally) support reading/scanning/ingesting data from S3 sources directly, without tools like Kafka, which would open up another mechanism for log ingestion: this tool would only have to write cleaned-up logs to S3, while schema management, query construction, etc. could be done elsewhere. That would have the obvious value-add of re-using any existing S3 endpoints in your infrastructure, which you already need for robust and scalable FoundationDB backups.
There are several more details about how things work inside the README. I’d appreciate feedback on anything, but probably the most important question is:
- Is waiting for `close(2)` events on trace files through `inotify` a valid and legitimate way of tracking log file rotations? The whole assumption here is that a single `fdbserver` writes to a single trace file, and once that file is closed, it has been rotated and is never touched again. Provided this guarantee is met, I think everything else is relatively sound, right? (See the sketch just below for the mechanism.)
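If you want to eyeball that behavior yourself, `inotifywait` from inotify-tools subscribes to the same kernel events; this is only a way to observe rotations, not what the tool literally runs:

```
# Print an event each time a file in the trace directory is closed
# after being written, i.e. the moment the tool treats a log as final.
$ inotifywait -m -e close_write /var/log/foundationdb
```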
Very fast TL;DR: you can run these two Docker images to pull up a demo ClickHouse server and a copy of this tool, which will ingest any `.json` trace logs in `/var/log/foundationdb` into ClickHouse:
```
$ docker run -d --rm \
--ulimit nofile=262144:262144 \
-p 8123:8123 \
--name clickhouse-server \
yandex/clickhouse-server
$ docker run -d --rm \
--link clickhouse-server \
-e CLICKHOUSE_ADDR=http://clickhouse-server:8123 \
-e CLICKHOUSE_DB=testing \
-e CLICKHOUSE_TABLE=cluster01 \
-v /var/log/foundationdb:/data/logs \
thoughtpolice/fdblog2clickhouse:latest
```
After a while, you can run `clickhouse-client` and check out the data:
```
$ docker run -it --rm \
--link clickhouse-server \
yandex/clickhouse-client \
--host clickhouse-server
ClickHouse client version 19.5.2.6 (official build).
Connecting to clickhouse-server:9000 as user default.
Connected to ClickHouse server version 19.5.2 revision 54417.
e328593055b3 :) describe table testing.cluster01;
DESCRIBE TABLE testing.cluster01
┌─name───────┬─type─────────────┬─default_type─┬─default_expression─┬─comment─────────┬─codec_expression─┐
│ As │ Nullable(String) │ │ │ Lorem ipsum │ NONE │
│ ID │ String │ │ │ Lorem ipsum │ NONE │
│ Locality │ Nullable(String) │ │ │ Lorem ipsum │ NONE │
│ Machine │ String │ │ │ Lorem ipsum │ NONE │
│ Severity │ UInt32 │ │ │ Event severity │ NONE │
│ Transition │ Nullable(String) │ │ │ Lorem ipsum │ NONE │
│ Time │ DateTime │ │ │ Event timestamp │ NONE │
│ Type │ String │ │ │ Event type │ NONE │
└────────────┴──────────────────┴──────────────┴────────────────────┴─────────────────┴──────────────────┘
8 rows in set. Elapsed: 0.001 sec.
e328593055b3 :) select count(*) from testing.cluster01;
SELECT count(*)
FROM testing.cluster01
┌─count()─┐
│   25831 │
└─────────┘
1 rows in set. Elapsed: 0.002 sec. Processed 25.83 thousand rows, 103.32 KB (10.34 million rows/s., 41.38 MB/s.)
e328593055b3 :)
```
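From there it’s ordinary ClickHouse querying; for example, a quick look at which event types dominate the log volume (column names taken from the schema above):

```
-- Count trace events per Type, largest first.
SELECT Type, count() AS events
FROM testing.cluster01
GROUP BY Type
ORDER BY events DESC
LIMIT 10;
```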