The cluster consists of 35 nodes, each with 2 CPU cores and 8 GB of memory, and each node runs a single fdbserver process. Of these, 18 are storage processes, 4 are transaction (log) processes, 1 is a grv_proxy, 3 are commit_proxies, and 3 are coordinators; the remaining roles are assigned 1 process each. The database is initialized with triple replication, the storage engine is ssd-redwood-1, and each node's configuration file was initialized as follows.
[fdbmonitor]
user = foundationdb
group = foundationdb
[general]
restart_delay = 60
cluster_file = /etc/foundationdb/fdb.cluster
[fdbserver]
command = /usr/sbin/fdbserver
public_address = ${SELF_IP}:${FDB_PORT}
listen_address = 0.0.0.0:${FDB_PORT}
datadir = /data0/foundationdb/data/${FDB_PORT}
logdir = /data0/foundationdb/log
memory = 7GiB
cache-memory = 4GiB
logsize = 10MiB
maxlogssize = 100MiB
memory-vsize = 0
locality-zoneid = ${CNP_ZONE}-${SELF_IP}
locality-data-hall = ${CNP_REGION}
io-trust-seconds = 20
[fdbserver.${FDB_PORT}]
class = ${FDB_CLASS_DEFAULT}
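For example, after the placeholders are substituted, a node dedicated to the storage role ends up with a per-port section like the following (the port number here is only illustrative):

[fdbserver.4500]
class = storage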
During the YCSB benchmark, the key size was set to 128 bytes and the value size to 1024 bytes. There were 3 clients, and each client ran the following workload:
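The workload file itself is not reproduced here; purely as an illustration, a CoreWorkload configuration consistent with these parameters would look roughly like the sketch below (the record/operation counts and request distribution are placeholders, not the actual values used, and the workload class name depends on the YCSB version):

# illustrative only; counts and distribution are placeholders
workload=site.ycsb.workloads.CoreWorkload
recordcount=100000000
operationcount=100000000
# a single 1024-byte field per record gives ~1 KB values
fieldcount=1
fieldlength=1024
# read/update mix (the 7:3 split matches the mix mentioned later in the thread)
readproportion=0.7
updateproportion=0.3
# assumption: YCSB's default distribution
requestdistribution=zipfian
threadcount=1000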
The benchmark results are shown below. Why is it that neither adding more storage processes nor adding more grv_proxies achieves the linear scaling shown in the official documentation? How should I configure the system to improve throughput, and which role should I expand for testing?
Can you add a pointer to the ‘official documentation’ you refer to please? I see QPS rises w/ client count (if I read that right). 1000 threads seems like a lot in a single java process. Perhaps try with less (How much heap was the client running with? Are the clients using any CPU?). Any metrics on FDB processes you want to share or observations on how the cluster is doing while under load?
Subsequently, we modified the cluster configuration to 100 storage processes, 10 transaction processes, 10 GRV proxies, 10 commit proxies, and 3 coordinator nodes. We conducted YCSB stress tests with the same 3 clients and 1,000 threads, but the cluster performance did not increase with the addition of more roles—it remained roughly the same as the data shown in the table.
Later, we scaled the number of clients to 30, with each client running 10 YCSB programs. The read-write mix was 7:3, and the highest QPS achieved was about 1.8 million.
Regarding the cluster load, we parsed the output of “status json” and derived a series of cluster metrics. However, there are too many metrics to analyze easily. Could you please advise which metrics we should focus on in this situation?
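For reference, the kind of extraction we are doing looks roughly like this; a minimal Python sketch, assuming the output of status json has been saved to a file (field paths follow the standard status document layout and may need adjusting for your FDB version):

import json

with open("status.json") as f:
    cluster = json.load(f)["cluster"]

# Ratekeeper view: what is currently limiting performance and the allowed TPS
qos = cluster.get("qos", {})
print("performance_limited_by:", qos.get("performance_limited_by", {}).get("name"))
print("transactions_per_second_limit:", qos.get("transactions_per_second_limit"))

# End-to-end latency probe (GRV, read, commit) measured by the cluster itself
print("latency_probe:", cluster.get("latency_probe"))

# Per-process CPU and per-storage-role durability lag
for addr, proc in cluster.get("processes", {}).items():
    cpu = proc.get("cpu", {}).get("usage_cores")
    for role in proc.get("roles", []):
        if role["role"] == "storage":
            lag = role.get("durability_lag", {}).get("seconds")
            print(addr, "cpu_cores:", cpu, "storage durability_lag_s:", lag)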
We scaled the number of clients to 20, with each client running 10 YCSB programs under a 7:3 read-write mix. The peak QPS reached 1.75 million. After further scaling to 30 clients, QPS reached 1.872 million, with a P99 latency of around 250 ms.
Hello, we specified process classes in the .conf file so that each process assumes a fixed role and no single node takes on multiple roles. We then increased the number of specific roles for benchmarking. However, we ran into a problem. After scaling the storage process count to 1,000, the transaction count to 50, the resolution count to 10, the grv_proxy count to 50, the commit_proxy count to 50, and the stateless count to 10, the status displayed the following message for ten hours without changing, while the disk space used kept increasing:
fdb> status
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
  Redundancy mode        - triple
  Storage engine         - ssd-2
  Log engine             - ssd-2
  Encryption at-rest     - disabled
  Coordinators           - 5
  Desired Commit Proxies - 50
  Desired GRV Proxies    - 50
  Desired Resolvers      - 10
  Desired Logs           - 50
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 1170
  Zones                  - 1170
  Machines               - 1170
  Memory availability    - 7.0 GB per process on machine with least available
  Fault Tolerance        - 2 zones
  Server time            - 09/25/25 09:56:14

Data:
  Replication health     - (Re)initializing automatic data distribution
  Moving data            - unknown (initializing)
  Sum of key-value sizes - unknown
  Disk space used        - 1.317 TB
After checking “status json”, I found that the data_distributor role frequently moves between the stateless processes, and its CPU usage is almost always above 90%. What is causing this, and how can I fix it and restore the database to a healthy state?
I’m a little suspicious that since you lowered the memory ceiling from the recommended 8GB to 7GB, your data distributor is probably OOMing itself because all the memory tuning knobs within FDB are set assuming 8GB available per process. I’d suggest sticking with the 8GB, and if you find that you need to feed it more than 8GB for it to stop OOMing that’s worth filing a bug over.
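In foundationdb.conf that just means restoring the default ceiling:

[fdbserver]
memory = 8GiB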
Also note that desired != actual. It’d be good to double check from status json how many of which roles were actually recruited based upon your cluster.
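A quick sketch of that check against a saved status json (role names as they appear in the status document):

import json
from collections import Counter

with open("status.json") as f:
    cluster = json.load(f)["cluster"]

# Desired counts come from the configuration; this counts what was actually recruited
recruited = Counter(
    role["role"]
    for proc in cluster.get("processes", {}).values()
    for role in proc.get("roles", [])
)
print(recruited)  # e.g. storage, log, commit_proxy, grv_proxy, resolver, ...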
I’d also expect that you’d need to run more clients than transaction subsystem processes. Do check your client FDB Network Thread CPU utilization, and either change your YCSB tester to use client_threads_per_version to run multiple FDB network threads (and then round-robin requests across multiple database objects, or partition threads to databases, or something), or just run more YCSB testers that are set to run less QPS per client.
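For the multi-threaded client, the idea looks roughly like this, shown with the Python bindings for brevity and assuming a 7.0+ client; the Java bindings expose the same network option via FDB.options(), though the exact setter name may differ by version:

import fdb

fdb.api_version(710)

# Must be set before the network starts (i.e. before the first open()).
# Values > 1 imply disable_local_client, so an external client library
# (external_client_library / external_client_directory) must be configured.
fdb.options.set_client_threads_per_version(4)

# Each Database handle created afterwards is pinned to one client thread,
# so distribute requests across several handles to keep all threads busy.
db = fdb.open()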
Also be aware that YCSB testing for FoundationDB will be less representative of real workloads than it might be for other key-value stores. FoundationDB has a higher fixed cost for starting a transaction as part of the GetReadVersion call, and YCSB implementations typically do one read or one write per transaction.
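To make that fixed cost concrete, here is an illustrative sketch with the Python bindings: a YCSB-style loop pays one GetReadVersion per operation, whereas batching several reads into one transaction amortizes it:

import fdb

fdb.api_version(710)
db = fdb.open()

# YCSB-style: one read per transaction -> one GetReadVersion round trip per read
@fdb.transactional
def read_one(tr, key):
    return tr[key].wait()

# Several reads share a single read version -> the GRV cost is amortized
@fdb.transactional
def read_many(tr, keys):
    futures = [tr[k] for k in keys]      # issue all reads in parallel
    return [f.wait() for f in futures]   # then collect the results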
I’d make sure that you can first get something that looks like reasonable saturation and throughput on a small cluster, to figure out roughly the right ratio of logs to storage processes, grv/commit proxies, and number of clients for your benchmark, and then scale up at full multiples.
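For example (numbers purely illustrative), storage and client counts scale by adding processes and machines, while the transaction-subsystem counts can be set in fdbcli:

fdb> configure logs=8 commit_proxies=4 grv_proxies=2
(then, once the small layout saturates cleanly, a 4x step keeps the same ratios)
fdb> configure logs=32 commit_proxies=16 grv_proxies=8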
Inserting keys in sequential order would cause the key-value pairs to be written ordered by the key, likely causing all writes to hit the same storage processes. I.e. the write load would not be evenly distributed across the processes.
Yes, we have tried using insertorder=hashed, but the performance was basically the same — we didn’t observe any significant improvement compared to the default setting.