FoundationDB 7.1.24 - memory usage after a clean startup of the fdbserver process is too high

Thanks for confirming that the 17 GB is RSS.

I see the problem. You have 49 TB of KV data and 56 processes. With triple replication, each process is responsible for 49 * 3 / 56 ≈ 2.6 TB of KV data.

Storage servers maintain a data structure called the byte sample, which stores a deterministic random sample of keys. This data is persisted on disk in the storage engine and is loaded immediately upon storage server startup. Unfortunately, its size is not tracked or reported, but it grows linearly with KV size, and I suspect yours is somewhere around 4-6 GB based on the memory usage I've seen for smaller storage KV sizes.
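To make the mechanism concrete, here is a minimal sketch in Python of how a deterministic, size-weighted sample like this can work. The constants (`BYTE_SAMPLING_FACTOR`, `BYTE_SAMPLING_OVERHEAD`), function names, and hash choice are illustrative stand-ins, not FDB's actual code; the point is that inclusion depends only on a hash of the key, so every process computes the same sample, and the scaled sampled sizes add up to an estimate of the true KV size.

```python
# Sketch of a deterministic, size-weighted key sample. Constants and names
# are illustrative, not FDB's actual implementation.
import hashlib

BYTE_SAMPLING_FACTOR = 250    # stand-in for knob_byte_sampling_factor's default
BYTE_SAMPLING_OVERHEAD = 100  # illustrative per-key overhead constant


def key_hash_unit(key: bytes) -> float:
    """Deterministically map a key to a float in [0, 1)."""
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def sample(key: bytes, value: bytes) -> tuple[bool, int]:
    """Decide whether one KV pair is in the sample.

    Larger pairs are proportionally more likely to be included, and each
    included pair's size is scaled up by 1/probability, so the sampled
    sizes form an unbiased estimate of the total KV size. Because the
    decision hashes only the key, every process that sees this key makes
    the same decision -- the determinism FDB depends on.
    """
    size = len(key) + len(value)
    probability = min(
        1.0, size / ((len(key) + BYTE_SAMPLING_OVERHEAD) * BYTE_SAMPLING_FACTOR)
    )
    if key_hash_unit(key) < probability:
        return True, round(size / probability)
    return False, 0


if __name__ == "__main__":
    kvs = [(b"key%06d" % i, b"v" * (i % 400)) for i in range(100_000)]
    true_total = sum(len(k) + len(v) for k, v in kvs)
    kept, estimate = 0, 0
    for k, v in kvs:
        in_sample, sampled_size = sample(k, v)
        if in_sample:
            kept += 1
            estimate += sampled_size
    print(f"true bytes: {true_total}, estimate: {estimate}, "
          f"keys kept: {kept} of {len(kvs)}")
```

Running this keeps well under 1% of the keys yet still estimates total KV size closely, which is why the sample is so useful for data distribution, and also why it is retained per key and grows linearly with the data.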

The byte sample's rate is technically configurable to a smaller sample rate, but changing its knob once a cluster has been created is undefined behavior. FDB relies on the byte sample's determinism to know how much logical data is in each shard and on each storage server. Weird things will happen if you change this knob on an existing cluster, and the cluster may become unavailable. I don't think any data loss would occur, but you could easily get into a situation that is hard to get out of.

If you need to reduce memory usage at your disk sizes, you could reduce the cache memory setting.
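For reference, assuming the usual foundationdb.conf layout, that would look something like this (the value shown is just an example, not a recommendation):

```ini
[fdbserver]
# Page cache size per fdbserver process; the default is 2GiB.
cache_memory = 1GiB
```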

If you want to reduce the size of the byte sample, you would have to create a new cluster and migrate your data to it. You would also have to make sure that no storage servers on the new cluster ever start up without the knob override. The option is knob_byte_sampling_factor and the default is 250. Multiply this by N to reduce the byte sample size by a factor of N.
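Assuming the standard knob-override syntax in foundationdb.conf, an override that quadruples the factor (and so shrinks the byte sample to roughly a quarter of its default size) would look like:

```ini
[fdbserver]
# Default is 250; 1000 = 4 x 250, so the byte sample is ~4x smaller.
knob_byte_sampling_factor = 1000
```

Per the caveat above, every storage process on the new cluster would need this override in place from its very first start.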
