Ratekeeper will throttle normal (non-`system_priority_immediate`) transactions all the way down to 0 under the right conditions, such as more than 2 Storage Queues growing beyond something like 1.8GB.
Based on:

> We didn’t see any noticeable change/problem in our metrics for the storage process data or durability lags, or the log queue length.

it sounds like that did not happen here because the cluster was still working at some non-zero transaction rate despite all the StorageServer restarts caused by OOMs. This is not unexpected because Redwood is very fast at recovering from disk and returning to normal write throughput. The return-to-performance time will be especially short if the writes have high key locality.
As for the default memory configuration, it’s probably the case that little attention has been paid to the defaults by large scale FDB users for several years, and StorageServer memory usage outside of the storage engine’s page cache has grown. FDB does not currently limit its memory usage as a reaction to its current usage vs its budget, so it is essentially up to the user to set the `cache-memory` and `memory` limits:
- The `cache-memory` option sets the page cache size for the Redwood and `ssd-2` storage engines. They will reliably stick to this limit aside from an occasional tiny overage when too many page eviction attempts encounter temporarily pinned pages.
- The `memory` option sets the total memory usage (specifically RSS, not virtual memory) for the process. For a Storage class process, this setting must be large enough to accommodate the sum of (see the sketch after this list):
  - `cache-memory`
  - Storage engine memory usage aside from its page cache (such as temporary memory used by reads or pending writes)
  - StorageServer memory other than the storage engine memory listed above. StorageServer memory will vary based on its current user workload, shard movement activity, and logical data size.
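
For reference, here is a minimal foundationdb.conf sketch for a storage-class process with both limits set explicitly. The underscore spellings (`cache_memory`, `memory`), the section name, and the sizes are assumptions/examples, not a recommendation for your hardware:

```
# Hypothetical storage-class process stanza; sizes are examples only.
[fdbserver.4500]
class = storage
# Page cache for the Redwood / ssd-2 storage engine.
cache_memory = 8GiB
# Total RSS budget for the process; must cover cache_memory plus all
# other StorageServer and storage engine allocations.
memory = 16GiB
```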
I don’t think there is any documentation describing how to arrive at what `memory` should be relative to `cache-memory`, and in fact in the fleets I’ve been involved with we have updated these settings occasionally based on memory usage observations. As a general rule, I think setting `memory` to `(1.5 * cache_memory + 4GB)` would be a stable configuration.
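
To make that rule concrete, here is the arithmetic for a few hypothetical `cache-memory` sizes (the 8GiB case matches the sketch above):

```
# memory ≈ 1.5 * cache_memory + 4GB (rule of thumb, not a hard limit)
#   cache_memory =  2GiB  ->  memory ≈ 1.5 * 2  + 4 =  7GiB
#   cache_memory =  8GiB  ->  memory ≈ 1.5 * 8  + 4 = 16GiB
#   cache_memory = 16GiB  ->  memory ≈ 1.5 * 16 + 4 = 28GiB
```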