I have deployed a FoundationDB cluster with the help of the Kubernetes operator and have been load testing it by writing time series data to see how far it can go. After ramping up the number of writes, I hit a plateau where the write rate no longer increases.
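For context, the write workload is roughly the following (a simplified sketch, not the exact code; the key layout, batch size, and API version are placeholders):

```python
import random
import time
import fdb

fdb.api_version(620)  # placeholder; match the installed client version
db = fdb.open('/var/dynamic-conf/fdb.cluster')

@fdb.transactional
def write_batch(tr, points):
    # One transaction per batch; keys are (prefix, metric, timestamp) tuples.
    for metric, ts, value in points:
        tr[fdb.tuple.pack(('ts', metric, ts))] = fdb.tuple.pack((value,))

while True:
    batch = [('cpu.user', time.time(), random.random()) for _ in range(500)]
    write_batch(db, batch)
```

Here is the full fdbcli output while the workload is at the plateau: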
fdb> status details
Using cluster file `/var/dynamic-conf/fdb.cluster'.
Unable to start batch priority transaction after 5 seconds.
Configuration:
Redundancy mode - double
Storage engine - ssd-2
Coordinators - 3
Desired Proxies - 3
Desired Resolvers - 1
Desired Logs - 3
Cluster:
FoundationDB processes - 12
Zones - 12
Machines - 12
Memory availability - 10.6 GB per process on machine with least available
Retransmissions rate - 8 Hz
Fault Tolerance - 1 machine
Server time - 05/06/20 15:40:39
Data:
Replication health - Healthy (Repartitioning)
Moving data - 0.100 GB
Sum of key-value sizes - 23.638 GB
Disk space used - 59.957 GB
Operating space:
Storage server - 345.3 GB free on most full server
Log server - 345.9 GB free on most full server
Workload:
Read rate - 180 Hz
Write rate - 22760 Hz
Transactions started - 57 Hz
Transactions committed - 26 Hz
Conflict rate - 1 Hz
Performance limited by process: Storage server performance (storage queue).
Most limiting process: 10.112.15.50:4501
Backup and DR:
Running backups - 0
Running DRs - 0
Process performance details:
10.112.14.23:4501 ( 2% cpu; 6% machine; 0.009 Gbps; 15% disk IO; 1.8 GB / 11.9 GB RAM )
10.112.15.50:4501 ( 25% cpu; 37% machine; 0.005 Gbps; 5% disk IO; 3.8 GB / 11.1 GB RAM )
10.112.17.16:4501 ( 18% cpu; 26% machine; 0.004 Gbps; 11% disk IO; 3.7 GB / 11.4 GB RAM )
10.112.27.11:4501 ( 25% cpu; 25% machine; 0.005 Gbps; 7% disk IO; 3.0 GB / 11.0 GB RAM )
10.112.29.14:4501 ( 2% cpu; 13% machine; 0.009 Gbps; 15% disk IO; 1.9 GB / 11.7 GB RAM )
10.112.30.11:4501 ( 2% cpu; 5% machine; 0.010 Gbps; 16% disk IO; 1.9 GB / 11.4 GB RAM )
10.112.31.18:4501 ( 25% cpu; 34% machine; 0.005 Gbps; 12% disk IO; 3.3 GB / 11.2 GB RAM )
10.112.32.8:4501 ( 8% cpu; 12% machine; 0.046 Gbps; 0% disk IO; 0.3 GB / 11.3 GB RAM )
10.112.32.35:4501 ( 5% cpu; 12% machine; 0.020 Gbps; 0% disk IO; 0.3 GB / 11.4 GB RAM )
10.112.33.13:4501 ( 25% cpu; 19% machine; 0.009 Gbps; 9% disk IO; 1.2 GB / 11.5 GB RAM )
10.112.33.54:4501 ( 2% cpu; 19% machine; 0.000 Gbps; 0% disk IO; 0.3 GB / 10.6 GB RAM )
10.112.34.12:4501 ( 25% cpu; 18% machine; 0.007 Gbps; 13% disk IO; 3.1 GB / 11.4 GB RAM )
Coordination servers:
10.112.27.11:4501 (reachable)
10.112.31.18:4501 (reachable)
10.112.33.13:4501 (reachable)
Client time: 05/06/20 15:40:33
fdb> setclass
There are currently 12 processes in the database:
10.112.14.23:4501: log (command_line)
10.112.15.50:4501: storage (command_line)
10.112.17.16:4501: storage (command_line)
10.112.27.11:4501: storage (command_line)
10.112.29.14:4501: log (command_line)
10.112.30.11:4501: log (command_line)
10.112.31.18:4501: storage (command_line)
10.112.32.8:4501: proxy (set_class)
10.112.32.35:4501: stateless (command_line)
10.112.33.13:4501: storage (command_line)
10.112.33.54:4501: stateless (command_line)
10.112.34.12:4501: storage (set_class)
fdb>
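For what it's worth, the per-process storage queue can also be read from the machine-readable status via the `\xff\xff/status/json` special key, roughly like this (a sketch; the exact field names may vary by version):

```python
import json
import fdb

fdb.api_version(620)
db = fdb.open('/var/dynamic-conf/fdb.cluster')

# The JSON form of `status` is exposed under the special key \xff\xff/status/json.
status = json.loads(db[b'\xff\xff/status/json'])

for proc in status['cluster']['processes'].values():
    for role in proc.get('roles', []):
        if role.get('role') == 'storage':
            # Storage queue ~= bytes accepted by the storage server but not yet durable.
            queue = role['input_bytes']['counter'] - role['durable_bytes']['counter']
            print(proc['address'], queue // (1 << 20), 'MB queued')
```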
Following a suggestion from another thread in the forums (Storage queue limiting performance when initially loading data), I tried to increase the number of storage processes per disk. I attempted this through the Kubernetes operator by changing the spec (increasing the storage process count), but the operator created a new pod with its own PVC instead of adding another storage process on the existing disks (see the sketch below). Is there an easy way to achieve this, or should I try something else?
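For reference, the change I made is roughly equivalent to the patch below (a sketch: the API group, version, plural, and cluster name are from my setup and may differ for other operator versions):

```python
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# Bumping the storage process count in the FoundationDBCluster spec. The
# operator reacts by scheduling an additional storage pod with its own PVC,
# not by running a second storage process against the existing disks.
api.patch_namespaced_custom_object(
    group='apps.foundationdb.org',   # assumption: group used by my operator version
    version='v1beta1',               # assumption: CRD version in my deployment
    namespace='default',
    plural='foundationdbclusters',
    name='sample-cluster',           # placeholder cluster name
    body={'spec': {'processCounts': {'storage': 8}}},
)
```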