Ratekeeper - MAX_TL_SS_VERSION_DIFFERENCE disabled?

jacksen · August 12, 2025, 12:25am

In the server knobs, MAX_TL_SS_VERSION_DIFFERENCE is effectively disabled with a value of 1e99 (see fdbclient/ServerKnobs.cpp on the main branch of apple/foundationdb on GitHub).

Doesn’t this prevent Ratekeeper from protecting the cluster when storage servers are unable to keep up with writes?

Is there a reason to not set it to something like 10e6?

Semisol · August 14, 2025, 6:00am

Ratekeeper now uses the storage server and TLog queue sizes. The SS queue represents how much data the storage server still has to apply, and the TLog queue represents how much is held in memory because not all SSes have consumed it yet.

The following knobs set the “soft” limit on how much can be queued:

TARGET_BYTES_PER_STORAGE_SERVER (1GB)
TARGET_BYTES_PER_STORAGE_SERVER_BATCH (750MB)
TARGET_BYTES_PER_TLOG (2.4GB)
TARGET_BYTES_PER_TLOG_BATCH (1.4GB)

The batch limits are separate and will start throttling batch-priority transactions earlier.

Ratekeeper does not start slowing down clients until the queue eats into the “spring” zone, i.e. until less than this much of the soft budget is left:

SPRING_BYTES_STORAGE_SERVER (100MB)
SPRING_BYTES_STORAGE_SERVER_BATCH (100MB)
SPRING_BYTES_TLOG (400MB)
SPRING_BYTES_TLOG_BATCH (300MB)
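
To make the target/spring relationship concrete, here is a rough sketch of the arithmetic. It assumes throttling begins once the queue exceeds the target minus the spring, as described above. The function name, the dictionary, and the decimal units (1GB = 1000e6 bytes) are my own choices for illustration, not FoundationDB code.

```python
# Illustrative arithmetic only; this is not FoundationDB code.
MB = 10**6  # assumption: decimal units, so 1GB = 1000e6 bytes
GB = 10**9

# (target, spring) pairs as quoted in this post; defaults may differ by release.
LIMITS = {
    "storage_server": (1 * GB, 100 * MB),
    "storage_server_batch": (750 * MB, 100 * MB),
    "tlog": (2400 * MB, 400 * MB),
    "tlog_batch": (1400 * MB, 300 * MB),
}


def throttle_start_bytes(target: int, spring: int) -> int:
    """Queue size above which less than `spring` bytes of the soft budget remain."""
    return target - spring


for name, (target, spring) in LIMITS.items():
    start_mb = throttle_start_bytes(target, spring) / MB
    print(f"{name}: throttling starts near {start_mb:.0f} MB queued")
```

With these numbers, a storage server would start to be throttled at roughly 900MB queued and a TLog at roughly 2GB, with the batch variants kicking in earlier.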

There are also some other knobs:

STORAGE_HARD_LIMIT_BYTES (1.5GB): The hard limit on how much a storage server can queue. If this limit is exceeded, the SS stops reading from the TLog until it has made sufficient progress.
TLOG_HARD_LIMIT_BYTES (3GB): The hard limit on how much a TLog can keep in memory. This only kicks in if the spill process is not fast enough, and it blocks all new queue operations until spilling brings the queue back under the limit.
TLOG_SPILL_THRESHOLD (1.5GB): The TLog starts spilling logs to disk once more than this much is queued.
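
As a simplified mental model of how these three knobs partition the queue sizes, something like the sketch below may help. The function names and state labels are hypothetical, the values are the ones quoted above in decimal bytes, and the real servers are of course more involved than a pair of comparisons.

```python
# Simplified model of the thresholds above; names and labels are hypothetical.
GB = 10**9  # assumption: decimal units

STORAGE_HARD_LIMIT_BYTES = 1_500_000_000  # 1.5GB, as quoted in this post
TLOG_SPILL_THRESHOLD = 1_500_000_000      # 1.5GB
TLOG_HARD_LIMIT_BYTES = 3 * GB            # 3GB


def tlog_state(queued_bytes: int) -> str:
    """Classify a TLog by how much it currently has queued."""
    if queued_bytes > TLOG_HARD_LIMIT_BYTES:
        return "blocked"      # new queue operations wait for spilling to catch up
    if queued_bytes > TLOG_SPILL_THRESHOLD:
        return "spilling"     # logs are being moved from memory to disk
    return "in_memory"


def storage_server_state(queued_bytes: int) -> str:
    """Classify a storage server by how much it currently has queued."""
    if queued_bytes > STORAGE_HARD_LIMIT_BYTES:
        return "paused"       # stops reading from the TLog until it catches up
    return "reading"


print(tlog_state(2 * GB), storage_server_state(1 * GB))
```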

TLog spilling is needed because when an SS fails, the TLogs keep queuing any mutations destined for it while the Data Distributor replaces it, in case it comes back online (say, after a restart).
In this case, the amount a TLog has to keep would exceed the short 5s window, and may not fit in memory.

(The on-disk circular log is only used for crash recovery.)

swr · August 14, 2025, 3:12pm

@Semisol isn’t TARGET_BYTES_PER_STORAGE_SERVER actually 1GB? 1000e6 bytes?

Semisol · August 14, 2025, 8:41pm

That was an error, fixed
