Considerations for number of Key-Vals in the cluster


I am planning to model a metric time series layer on top of FDB, and this layer could potentially create a few trillion rows in the system. Is there anything I should consider before starting out?

Most of the data will be cold data and I will try to model the data so that time_bucket is prefixed to the key (in addition to timestamp being suffixed), to try to produce ranges that become immutable once the time_bucket for that range has become old. This is being done to ensure that the data that has become cold does not incur more churn of any kind. Something like:

coarse_time_bucket/series_key/timestamp -> values
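To make the layout above concrete, here is a minimal sketch of how such a key could be packed so that byte order matches time order. It is plain Python and does not use the FDB bindings. The names `make_key`, `bucket_range` and `BUCKET_SECONDS`, the one-day bucket size, and the NUL separator are all assumptions made for illustration, not part of the original plan.

```python
import struct

# Hypothetical bucket width: one coarse bucket per day.
BUCKET_SECONDS = 86_400


def make_key(series_key: str, timestamp: int) -> bytes:
    """Pack coarse_time_bucket/series_key/timestamp into an ordered byte key."""
    bucket = timestamp - timestamp % BUCKET_SECONDS
    name = series_key.encode("utf-8")
    if b"\x00" in name:
        # Assumption: NUL is reserved as the separator after series_key.
        raise ValueError("series_key must not contain a NUL byte")
    # Big-endian fixed-width integers sort bytewise in numeric order.
    return struct.pack(">Q", bucket) + name + b"\x00" + struct.pack(">Q", timestamp)


def bucket_range(bucket_start: int) -> tuple:
    """Return (begin, end) byte keys covering every key in one bucket."""
    begin = struct.pack(">Q", bucket_start)
    end = struct.pack(">Q", bucket_start + BUCKET_SECONDS)
    return begin, end
```

Once a bucket's time window has passed, no new keys fall inside its `bucket_range`, which is what makes the range effectively immutable.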

Is there any limit on the absolute number of KV pairs that can be stored in the system? Assuming there is enough disk storage in the cluster to hold the KV pairs, are there any other resource requirements that grow in proportion to the number of KVs (such as a memory map holding the location of keys)?


Hm, well, a little. The primary metric we usually use when evaluating database size is the number of key and value bytes in the database rather than the number of keys. I suppose there are a few things, like the byte sample (used by data distribution), that perform worse for a cluster of a given size the more keys it holds. There is also some per-key overhead at the BTree level that you will have to pay.

The other thing that might not be obvious is that if your keys share a common prefix (which it sounds like they do), then you might end up “wasting” space on the common prefix (as the storage layer does not do any prefix compression).

Having said that, I would probably think about this mostly in terms of the number of bytes rather than the number of keys, and maybe estimate the overhead factor to be slightly higher than usual. (On a triple-replicated cluster, you would expect something like 4x storage overhead: a factor of 1 for each replica, plus another 1 for the per-key overhead.)
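As a rough back-of-the-envelope version of that estimate, the sketch below multiplies the logical key and value bytes by the 4x factor described above. The function name, parameter names and the example sizes are hypothetical; the real overhead depends on the storage engine and workload.

```python
def estimate_disk_bytes(
    num_pairs: int,
    avg_key_bytes: int,
    avg_value_bytes: int,
    replicas: int = 3,
    per_key_overhead_factor: float = 1.0,
) -> float:
    """Estimate total disk bytes as logical bytes times (replicas + overhead).

    With the defaults this is the 4x rule of thumb: a factor of 1 for each of
    the three replicas plus another 1 for per-key overhead.
    """
    logical_bytes = num_pairs * (avg_key_bytes + avg_value_bytes)
    return logical_bytes * (replicas + per_key_overhead_factor)


# Example with made-up sizes: one trillion pairs, 25-byte keys, 8-byte values.
total = estimate_disk_bytes(10**12, avg_key_bytes=25, avg_value_bytes=8)
print(f"{total / 1e12:.0f} TB")
```

Raising `per_key_overhead_factor` above 1.0 is one way to model the "slightly higher" overhead suggested for a workload with very many small keys.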

Thanks for the reply. I plan to have some level of indirection for mapping series_key to a long id to overcome the repetition overhead.
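A minimal sketch of that indirection, assuming an 8-byte integer id per series. The `SeriesIdAllocator` class and `compact_key` function are hypothetical names, and the mapping here lives in an in-memory dict purely for illustration; a real layer would have to persist the mapping and allocate ids transactionally.

```python
import struct


class SeriesIdAllocator:
    """Illustrative in-memory mapping from series_key to a small integer id."""

    def __init__(self) -> None:
        self._ids = {}

    def id_for(self, series_key: str) -> int:
        # Assign the next id on first sight; reuse it on every later lookup.
        if series_key not in self._ids:
            self._ids[series_key] = len(self._ids) + 1
        return self._ids[series_key]


def compact_key(bucket: int, series_id: int, timestamp: int) -> bytes:
    """Pack bucket/series_id/timestamp as three big-endian 8-byte integers."""
    return struct.pack(">QQQ", bucket, series_id, timestamp)
```

With this layout every key is a fixed 24 bytes regardless of how long the original series_key is, so the repeated prefix costs 16 bytes per row rather than the full series name.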

What I was mainly checking here is whether there are any in-memory data structures whose size grows linearly (and at non-trivial cost) as the number of key-values grows.