We are deploying a FoundationDB cluster on AWS. Due to a recent outage we have to recreate our cluster. FDB is used as a store of key-value pairs, with a Redis Cluster in front serving as a cache. Whenever the Redis Cluster reaches a memory utilization limit, we dump some of the older keys into FDB to free up space; we also retrieve keys from FDB when a requested key is missing from the cache.
We had used a mostly default (lazy) setup, with no process classes set explicitly, and it worked well for the normal workload.
However, I'm now facing issues when trying to bulk load the data from a Redis Cluster (I've recreated all the keys in a larger cluster, ~780M keys). The issue I discovered is disk saturation on all servers: with no classes assigned, the tlog and storage roles end up combined in a single process, so they compete for the same disk.
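For reference, here is a sketch of how explicit process classes might be assigned in foundationdb.conf to separate the tlog role onto its own process and disk (the ports and paths below are placeholders, not our actual layout):

```ini
## /etc/foundationdb/foundationdb.conf (sketch; ports and datadirs are examples)
[fdbserver]
command = /usr/sbin/fdbserver
datadir = /var/lib/foundationdb/data/$ID
logdir = /var/log/foundationdb

## tlog on its own process, ideally backed by a dedicated disk
[fdbserver.4500]
class = transaction
datadir = /mnt/tlog-disk/4500

## storage process on the default data disk
[fdbserver.4501]
class = storage

## stateless roles (proxies, resolvers, etc.)
[fdbserver.4502]
class = stateless
```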
Here I'd like to ask for advice on how the setup for an initial bulk load of data should differ from a regular setup (where reads and writes are equally likely and arrive at a much lower rate).
One simple improvement is setting the redundancy mode to 'single' for the duration of the load (with the triple redundancy mode the cluster constantly enters the "Healthy (Repartitioning)" state during the bulk load).
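For completeness, the redundancy mode change is done from fdbcli; a sketch of the session (switching back to triple once the load finishes):

```
## inside an fdbcli session
fdb> configure single
## ... run the bulk load ...
fdb> configure triple
```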
I've read that increasing the number of tlog processes (along with dedicated disks and CPU cores) could help, although I didn't really understand why.
Any other advice you could suggest for the initial write-intensive setup?
My goal is to achieve a write throughput above 20K key-value pairs per second. The pairs are written in batches of 4096 per transaction.
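To put the target in perspective, a quick back-of-the-envelope calculation (using the 4096 pairs per transaction figure above) shows how few commits per second the target actually requires:

```python
import math

target_pairs_per_sec = 20_000   # desired write throughput (pairs/s)
pairs_per_txn = 4096            # batch size per transaction

# Transactions per second needed to sustain the target rate
txn_per_sec = target_pairs_per_sec / pairs_per_txn
print(f"{txn_per_sec:.2f} transactions/s")   # ~4.88 txn/s

# Total transactions to load the full ~780M-key dataset
total_keys = 780_000_000
total_txns = math.ceil(total_keys / pairs_per_txn)
print(f"{total_txns} transactions total")    # ~190,430 transactions
```

So the commit rate itself is tiny; the pressure is in the bytes written per commit, which all flow through the tlog disks first.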