|
|
ClickHouse is a database we are currently investigating as a replacement for the I/O and metadata services for NDS.
|
|
|
|
|
|
## Summary
|
|
|
|
|
|
ClickHouse is an OLAP database which can be used as a time-series database:
|
|
|
|
|
|
* Typically aimed at “write-once, read-many” workflows
|
|
|
* No transaction support; updating or deleting rows is inefficient or unsupported
|
|
|
* Usually tuned for throughput rather than latency
|
|
|
* Also tuned for bulk inserts; performance is poor when inserting single rows
|
|
|
* Usually built-in support for aggregation/roll-ups
|
|
|
|
|
|
#### Interesting/relevant functionality:
|
|
|
|
|
|
* Support for arrays and unsigned integers
|
|
|
* Suitable for online queries (unlike many OLAP databases)
|
|
|
* Ingest data natively via Kafka
|
|
|
* Unified access for online and archival data
|
|
|
* Accepts/returns data in various formats, including Apache Arrow (see the second sketch after this list)
|
|
|
* Listen to updates in tables via “live views”
|
|
|
* Multi-tiered storage support (see the first sketch after this list), e.g.

    * Store the last hour of data in memory

    * Migrate older data (up to the last year of data) to NVMe/SSD

    * Move older data (1+ years) to object storage (S3-compatible)
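
A minimal sketch of how the tiering could look, assuming a storage policy named `tiered` (with an NVMe-backed default volume and an S3-backed `cold` volume) is defined in the server's storage configuration, and assuming a hypothetical table with a `DateTime` column `ts` derived from the GPS timestamp:

```
-- Hypothetical table; the 'tiered' storage policy and its 'cold' (S3) volume
-- are assumed to be configured on the server
CREATE TABLE example_timeseries
(
    ts DateTime,
    channel LowCardinality(String),
    data Array(Float32)
)
ENGINE = MergeTree
ORDER BY (channel, ts)
TTL ts + INTERVAL 1 YEAR TO VOLUME 'cold'
SETTINGS storage_policy = 'tiered'
```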
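
Returning data in Arrow format only requires selecting an output format, e.g. against the hypothetical table above:

```
-- Stream results back in Arrow format (also works over the HTTP interface)
SELECT ts, channel, data
FROM example_timeseries
FORMAT ArrowStream
```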
|
|
|
|
|
|
#### Evaluation:
|
|
|
|
|
|
* ASC and LSC data ingested into ClickHouse via Kafka (see the first sketch after this list)
|
|
|
* Store last ~5 minutes of data in tables
|
|
|
* Client library that returns “wide” rows - 1 column per channel

    * Effectively a “pivot” of the stored table (see the second sketch after this list)
|
|
|
* 75 channels across the ASC / LSC subsystems for 5 mins of data

    * Query time: ~20 s total -> ~4 ms per 16 Hz buffer (5 min of 16 Hz buffers ≈ 4800 buffers)

* Not using an in-memory table; instead querying data from disk
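
A minimal sketch of the Kafka ingestion path, with hypothetical broker, topic, and table names: a Kafka-engine table consumes messages, and a materialized view pushes each consumed block into the backing MergeTree table (here called `asc_float32`):

```
-- Kafka consumer table (broker/topic/group names are hypothetical)
CREATE TABLE asc_queue
(
    time Int64,
    channel String,
    data Array(Float32)
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'H1-ASC',
         kafka_group_name = 'clickhouse-asc',
         kafka_format = 'JSONEachRow';

-- Push consumed rows into the backing storage table
CREATE MATERIALIZED VIEW asc_mv TO asc_float32 AS
SELECT time, channel, data
FROM asc_queue;
```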
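
The “pivot” itself can be expressed with conditional aggregates, one column per requested channel; a sketch with hypothetical channel names, using query parameters for the time range:

```
-- One output column per channel via the -If aggregate combinator
SELECT
    time,
    anyIf(data, channel = 'H1:ASC-CHANNEL_A') AS `H1:ASC-CHANNEL_A`,
    anyIf(data, channel = 'H1:ASC-CHANNEL_B') AS `H1:ASC-CHANNEL_B`
FROM asc_float32
WHERE time BETWEEN {start:Int64} AND {stop:Int64}
GROUP BY time
ORDER BY time
```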
|
|
|
|
|
|
#### Other thoughts:
|
|
|
|
|
|
* Channels may have different data types

    * Columns must be a single type -> one table per subsystem per data type
|
|
|
* Apache Arrow appears to be a game-changer

    * Current ClickHouse Python clients don’t have Arrow support

    * ClickHouse also exposes a gRPC interface which supports the Arrow format
|
|
|
* Currently keeping data and metadata decoupled

    * Planning on ingesting metadata as well
|
|
|
* Live views are an experimental feature and need to be “switched on” (see the first sketch after this list)
|
|
|
* Could potentially support data replays trivially
|
|
|
* Due to the built-in support for rollups/aggregations, trends could also be computed trivially (see the second sketch after this list)
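
A sketch of switching on and using a live view (view and table names are hypothetical):

```
-- Live views are experimental and must be enabled per session or profile
SET allow_experimental_live_view = 1;

-- A live view that updates as new rows land in the source table
CREATE LIVE VIEW asc_latest AS
SELECT channel, max(time) AS latest_time
FROM asc_float32
GROUP BY channel;

-- Clients can then block on the view and receive incremental updates
WATCH asc_latest;
```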
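
For instance, per-second min/max/mean trends could be maintained incrementally with a materialized view; a sketch against the hypothetical `asc_float32` table used above:

```
-- Roll per-buffer arrays up into per-second trend aggregate states
CREATE MATERIALIZED VIEW asc_trend
ENGINE = AggregatingMergeTree
ORDER BY (channel, second)
AS SELECT
    channel,
    intDiv(time, 1000000000) AS second,  -- GPS nanoseconds -> GPS seconds
    minState(arrayMin(data)) AS min_state,
    maxState(arrayMax(data)) AS max_state,
    avgState(arrayAvg(data)) AS mean_state
FROM asc_float32
GROUP BY channel, second;

-- Reading the trend back merges the partial aggregate states
SELECT
    channel,
    second,
    minMerge(min_state) AS min,
    maxMerge(max_state) AS max,
    avgMerge(mean_state) AS mean
FROM asc_trend
GROUP BY channel, second
ORDER BY channel, second;
```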
|
|
|
|
|
|
## Compression
|
|
|
|
|
|
ClickHouse uses LZ4 compression by default, but there are many options available on a per-column basis. See the ClickHouse documentation for more information.
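
Codecs are declared per column in the table definition (or changed later with `ALTER TABLE ... MODIFY COLUMN`); a sketch with a hypothetical table, using the Delta-LZ4 combination discussed below:

```
-- Delta stores differences between consecutive values, which LZ4 then
-- compresses very effectively for near-monotonic timestamps
CREATE TABLE codec_example
(
    time Int64 CODEC(Delta(8), LZ4),
    channel LowCardinality(String),
    data Array(Float32)
)
ENGINE = MergeTree
ORDER BY time
```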
|
|
|
|
|
|
Some investigations into different compression algorithms are shown below. In summary, the default LZ4 compression is quite efficient for the 'data' column, but not for the 'time' column, which stores GPS times in nanoseconds as `int64`. We've found excellent compression with the Delta-LZ4 codec:
|
|
|
|
|
|
```
SELECT
    name,
    type,
    compression_codec AS codec,
    data_uncompressed_bytes AS uncompressed,
    data_compressed_bytes AS compressed,
    data_uncompressed_bytes / data_compressed_bytes AS ratio
FROM system.columns
WHERE table = 'arrakis-H1-ASC-float32'

Query id: e4896783-02c8-428f-a2f3-fc62ca6e8e4d

┌─name────┬─type───────────────────┬─codec────────────────┬─uncompressed─┬─compressed─┬──────────────ratio─┐
│ time    │ Int64                  │ CODEC(Delta(8), LZ4) │   8634392520 │   44538731 │ 193.86256245154357 │
│ channel │ LowCardinality(String) │                      │   2119500127 │   12149936 │ 174.44537378633106 │
│ data    │ Array(Float32)         │                      │  19815535740 │   85380252 │  232.0857022066414 │
└─────────┴────────────────────────┴──────────────────────┴──────────────┴────────────┴────────────────────┘
```
|
|
|
|
|
|
For reference, here's a comparison of compression ratios for the 'time' column using the default compression and the two next-best codecs I tried:
|
|
|
|
|
|
* Default (LZ4): 1.20
|
|
|
* DoubleDelta: 59.89
|
|
|
* Delta-LZ4: 193.86
|
|