Smaller memory allocations for stateless pods?

We’re building out a new FoundationDB cluster, and I’m thinking about resource allocations after an initial load test. We’re using the FDB Kubernetes Operator and previously ran a load test that handled 100% of our write traffic for 24 hours (hooray!). We used the default resource allocations of 1 CPU and 8 GiB of memory for all fdbserver containers.

In that load test, we observed:

  1. Storage processes are memory-hungry, and we’d like to throw some more memory at them to see if that yields any measurable improvements
  2. Log processes never seemed to use more than 4 GiB of memory, even under heavy sustained load
  3. Stateless processes never seemed to use more than 0.5 GiB, even under heavy sustained load

I’ve seen various bits of advice here and in the docs. I regret that I don’t have a comprehensive set of citations, but my summary at this point is that 8 GiB is indeed the recommended minimum these days. More recently, though, I noticed this bit of advice:

As a general rule, I think setting memory to (1.5 * cache_memory + 4GB) would be a stable configuration.
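
If I plug numbers into that rule, it lines up with the 8 GiB figure. A quick sketch (the 2 GiB value is what I believe the default cache_memory is, so treat it as an assumption on my part):

```python
# Plugging numbers into the "memory = 1.5 * cache_memory + 4 GiB" rule of thumb.
def recommended_memory_gib(cache_memory_gib):
    return 1.5 * cache_memory_gib + 4

print(recommended_memory_gib(2.0))  # 7.0 GiB with the (assumed) default 2 GiB cache, close to the 8 GiB minimum
print(recommended_memory_gib(0.0))  # 4.0 GiB for a process that uses no page cache
```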

My question: since stateless processes don’t use cache_memory at all (right?), is it safe to reduce their memory setting to 4 GiB, or does it make more sense to keep them at 8 GiB for reasons that didn’t become evident in our (admittedly short!) load test?

For context, the motivation for reducing the memory allocation for stateless processes would be to make room to increase the allocation for storage (and log) processes. We’re currently using machines with 4 vCPU, 32 GiB of memory, and one SSD. After subtracting a little overhead for background processes/sidecars, I think we can fit three fdbserver processes (we have three uncontested cores and one core that’s divided between those auxiliary processes) and divide ~30 GiB of memory between them. Since there’s just one disk, I think our most common configuration will be one storage process with two stateless processes. The more we can shrink the stateless processes’ memory requirement, the more memory we can add to the storage process.
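
To make the trade-off concrete, here’s the rough arithmetic I’m working from (the 4 GiB stateless figure and the ~30 GiB usable total are my own assumptions, not numbers from the docs):

```python
# Hypothetical memory plan for one 4 vCPU / 32 GiB node running
# 1 storage + 2 stateless fdbserver processes.
node_memory_gib = 30.0                              # ~32 GiB minus OS/sidecar overhead
stateless_gib = 4.0                                 # proposed reduced stateless limit
storage_gib = node_memory_gib - 2 * stateless_gib   # 22.0 GiB left for the storage process

# Working the "memory = 1.5 * cache_memory + 4 GiB" rule backwards suggests
# how much page cache that storage process could be configured with.
storage_cache_gib = (storage_gib - 4) / 1.5         # 12.0 GiB
print(storage_gib, storage_cache_gib)
```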

Thanks!


You are correct that stateless processes do not use cache_memory. However, memory requirements vary by stateless role, and some in-memory structures grow with cluster size, so for the generic stateless process class the recommendation is still to keep the limit at 8 GiB, since those processes can be recruited for any stateless role. To go lower for specific processes, you would need to use more specific process classes to control which roles run where. Be aware, though, that during cluster recovery the recruitment logic operates on a “best fit” basis: if you configure 6 commit proxies but only 3 of your commit_proxy-class fdbserver processes are alive at recruitment time, the remaining 3 instances of that role will be placed on other processes.
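
To illustrate that last point, here is a toy sketch of the best-fit behaviour (my own simplification, not the actual recruitment code): processes whose class matches the role are preferred, and the remaining configured instances spill onto other processes.

```python
# Toy model of "best fit" recruitment: prefer processes whose class matches
# the role, then fall back to other processes for the remaining instances.
def recruit(role_class, count, processes):
    preferred = [p for p in processes if p["class"] == role_class]
    fallback = [p for p in processes if p["class"] != role_class]
    return (preferred + fallback)[:count]

processes = [{"name": f"cp-{i}", "class": "commit_proxy"} for i in range(3)] \
          + [{"name": f"sl-{i}", "class": "stateless"} for i in range(5)]

# 6 commit proxies configured, but only 3 commit_proxy-class processes alive:
# the other 3 instances land on generic stateless processes (with whatever
# memory limit those were given).
print([p["name"] for p in recruit("commit_proxy", 6, processes)])
```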

Thanks! That’s helpful, and I appreciate it!