FoundationDB Cluster Performance Issues

I have deployed a FoundationDB cluster using the Kubernetes operator and have been load-testing it by writing time-series data to see how far it can go. After increasing the write load, I hit a plateau beyond which the write rate won't increase.

fdb> status details

Using cluster file `/var/dynamic-conf/fdb.cluster'.

Unable to start batch priority transaction after 5 seconds.

  Redundancy mode        - double
  Storage engine         - ssd-2
  Coordinators           - 3
  Desired Proxies        - 3
  Desired Resolvers      - 1
  Desired Logs           - 3

  FoundationDB processes - 12
  Zones                  - 12
  Machines               - 12
  Memory availability    - 10.6 GB per process on machine with least available
  Retransmissions rate   - 8 Hz
  Fault Tolerance        - 1 machine
  Server time            - 05/06/20 15:40:39

  Replication health     - Healthy (Repartitioning)
  Moving data            - 0.100 GB
  Sum of key-value sizes - 23.638 GB
  Disk space used        - 59.957 GB

Operating space:
  Storage server         - 345.3 GB free on most full server
  Log server             - 345.9 GB free on most full server

  Read rate              - 180 Hz
  Write rate             - 22760 Hz
  Transactions started   - 57 Hz
  Transactions committed - 26 Hz
  Conflict rate          - 1 Hz
  Performance limited by process: Storage server performance (storage queue).
  Most limiting process:

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  (  2% cpu;  6% machine; 0.009 Gbps; 15% disk IO; 1.8 GB / 11.9 GB RAM  )
  ( 25% cpu; 37% machine; 0.005 Gbps;  5% disk IO; 3.8 GB / 11.1 GB RAM  )
  ( 18% cpu; 26% machine; 0.004 Gbps; 11% disk IO; 3.7 GB / 11.4 GB RAM  )
  ( 25% cpu; 25% machine; 0.005 Gbps;  7% disk IO; 3.0 GB / 11.0 GB RAM  )
  (  2% cpu; 13% machine; 0.009 Gbps; 15% disk IO; 1.9 GB / 11.7 GB RAM  )
  (  2% cpu;  5% machine; 0.010 Gbps; 16% disk IO; 1.9 GB / 11.4 GB RAM  )
  ( 25% cpu; 34% machine; 0.005 Gbps; 12% disk IO; 3.3 GB / 11.2 GB RAM  )
  (  8% cpu; 12% machine; 0.046 Gbps;  0% disk IO; 0.3 GB / 11.3 GB RAM  )
  (  5% cpu; 12% machine; 0.020 Gbps;  0% disk IO; 0.3 GB / 11.4 GB RAM  )
  ( 25% cpu; 19% machine; 0.009 Gbps;  9% disk IO; 1.2 GB / 11.5 GB RAM  )
  (  2% cpu; 19% machine; 0.000 Gbps;  0% disk IO; 0.3 GB / 10.6 GB RAM  )
  ( 25% cpu; 18% machine; 0.007 Gbps; 13% disk IO; 3.1 GB / 11.4 GB RAM  )

Coordination servers:  (reachable)  (reachable)  (reachable)

Client time: 05/06/20 15:40:33

fdb> setclass
There are currently 12 processes in the database:
  log (command_line)
  storage (command_line)
  storage (command_line)
  storage (command_line)
  log (command_line)
  log (command_line)
  storage (command_line)
  proxy (set_class)
  stateless (command_line)
  storage (command_line)
  stateless (command_line)
  storage (set_class)

In another forum thread (Storage queue limiting performance when initially loading data), there was a suggestion to increase the number of storage processes per disk. I tried to do that through the Kubernetes operator by increasing the storage process count in the spec, but it tried to create a new pod (with a PVC) instead. Is there an easy way to achieve that, or should I try something else?

No, we don’t currently have a way to run multiple storage processes per disk through the operator.

For what it’s worth, I’d be surprised if you could max out writes with a cluster this small. What write rate are you getting? It might be worth adding more logs and proxies, and maybe resolvers as well.
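As a rough illustration of the suggestion above, here is a minimal Python sketch that builds an `fdbcli` invocation raising the desired proxy, log, and resolver counts. The helper name `build_configure_command` and the counts 5/5/2 are hypothetical; the cluster file path is the one shown in the status output. It assumes a 6.x-era `fdbcli` where `configure proxies=... logs=... resolvers=...` is accepted. When the cluster is managed by the Kubernetes operator, these counts are normally set through the cluster spec, and the operator may revert a manual change.

```python
import shlex


def build_configure_command(cluster_file: str, proxies: int, logs: int, resolvers: int) -> list:
    """Return an argv list for fdbcli that sets the desired role counts.

    Hypothetical helper for illustration; it only builds the command and
    does not run it.
    """
    for name, value in (("proxies", proxies), ("logs", logs), ("resolvers", resolvers)):
        if value < 1:
            raise ValueError(f"{name} must be at least 1, got {value}")
    configure = f"configure proxies={proxies} logs={logs} resolvers={resolvers}"
    return ["fdbcli", "-C", cluster_file, "--exec", configure]


# Example counts are assumptions, not recommendations from the thread.
cmd = build_configure_command("/var/dynamic-conf/fdb.cluster", proxies=5, logs=5, resolvers=2)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` from a machine or pod that has `fdbcli` and the cluster file available.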

Thank you for your response. According to the stats, the write rate was around 22k to 28k Hz. One thing I observed is that the 25% CPU load on the storage processes was partly due to the pod blueprint, in which the CPU limit was set to 250m. I increased that to 500m, and the CPU usage of the storage processes rose to 50% as well. I haven't tried increasing it any further yet, but if my understanding is correct, the usage would not go beyond one CPU core. Is that correct?

I believe that with the current recruitment logic, having one process with the proxy class means that we won’t consider stateless-class processes for proxies, so you’ll likely get 1 proxy instead of 3.

Correct. Storage servers, like the rest of FDB, are single-threaded processes, and thus can’t take advantage of having more than one core available to them.

The usage wouldn’t increase beyond one physical CPU core, but that would correspond to 2000m of CPU in the Kubernetes resource requirements, so I think it’s worth testing it at that level as well.
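The relationship between the Kubernetes CPU limit and the CPU percentages reported earlier in the thread (25% at 250m, 50% at 500m) can be sketched with a little arithmetic. This is a simplified model, assuming 1000m corresponds to one logical CPU and that a single-threaded process can never use more than one logical CPU. It does not capture hyperthreading effects, which is why testing up to 2000m, as suggested above, may still be worthwhile. The function name is hypothetical.

```python
def single_thread_cpu_ceiling(limit_millicores: int) -> float:
    """Upper bound on the fraction of one logical CPU that a
    single-threaded process can use under a Kubernetes CPU limit.

    Simplified model: 1000m == one logical CPU; one thread cannot
    exceed one logical CPU regardless of how high the limit is.
    """
    if limit_millicores <= 0:
        raise ValueError("limit must be positive")
    return min(limit_millicores, 1000) / 1000


for limit in (250, 500, 1000, 2000):
    print(f"{limit}m -> {single_thread_cpu_ceiling(limit):.0%}")
```

Under this model, the 250m and 500m limits cap a storage process at 25% and 50% of a logical CPU, which matches the usage figures reported in the thread.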

After following your advice and raising the CPU requests/limits close to 2000m, the write throughput increased considerably, although the load still seems to be CPU-bound rather than disk-bound. As an experiment, I kept increasing the writes to the cluster to find out at what write rate I would need to add more storage processes. After reaching 40k Hz writes, I added more storage processes (I had started a new cluster with 3 storage processes and 4 proxies, and then added 3 more storage processes). I left it for 2 days to repartition the data, but the original 3 storage processes still had storage queues of around 0.9 GB to 1.4 GB, while the new storage processes had at most 100 MB of data in their storage queues. What do you think is the problem here?
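For comparing storage queues across processes like this, one option is to compute them from the output of `status json`. The sketch below is a minimal, hedged example. It assumes the parsed status has a `cluster.processes` map whose entries carry an `address` and a `roles` list, and that storage roles expose `input_bytes.counter` and `durable_bytes.counter`, with the queue approximated as their difference. Field names can differ between FoundationDB versions, so check them against your own output. The sample data and addresses are made up for illustration.

```python
def storage_queue_bytes(status: dict) -> dict:
    """Map process address -> approximate storage queue size in bytes.

    Assumes the queue is input_bytes minus durable_bytes for each
    role whose "role" field is "storage"; other roles are ignored.
    """
    queues = {}
    processes = status.get("cluster", {}).get("processes", {})
    for proc in processes.values():
        for role in proc.get("roles", []):
            if role.get("role") != "storage":
                continue
            try:
                written = role["input_bytes"]["counter"]
                durable = role["durable_bytes"]["counter"]
            except KeyError:
                continue  # fields missing in this version or not yet reported
            queues[proc.get("address", "unknown")] = written - durable
    return queues


# Made-up sample shaped like the assumed schema, not real cluster output.
sample_status = {
    "cluster": {
        "processes": {
            "p1": {"address": "10.0.0.1:4500", "roles": [
                {"role": "storage",
                 "input_bytes": {"counter": 5_000_000_000},
                 "durable_bytes": {"counter": 3_800_000_000}}]},
            "p2": {"address": "10.0.0.2:4500", "roles": [
                {"role": "storage",
                 "input_bytes": {"counter": 900_000_000},
                 "durable_bytes": {"counter": 800_000_000}}]},
            "p3": {"address": "10.0.0.3:4500", "roles": [{"role": "log"}]},
        }
    }
}

queues = storage_queue_bytes(sample_status)
for address, size in sorted(queues.items(), key=lambda kv: -kv[1]):
    print(f"{address}: {size / 1e9:.2f} GB")
```

Sorting by queue size makes it easy to see whether the older storage processes are still carrying most of the queued writes after new ones are added.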

Is there any in-principle objection to that feature (multiple fdbservers per disk/pod) being added? If not, what’s the way forward? Start a design discussion on GitHub? (I have a different cluster POC I’m working with, but this was a question I had :))

I don’t object to it in principle, and I haven’t heard from anyone who does. I suspect that it’s going to be fairly complex to implement because of the secondary effects on port management, exclusions, bounces, etc. If you’d like to champion this feature, you can file an issue on GitHub and we can start discussing the design there.


Thank you. I have a few things to investigate immediately but I will likely come back to that on GitHub in the near term.