I have a cluster in three_datacenter mode, and each datacenter has 6 nodes.
The three DCs of the cluster are ningbo1, ningbo2, and zhengzhou.
The network latency between ningbo1 and ningbo2 is 0.1 ms, and the latency between either of them and zhengzhou is 20 ms.
The primary DC is ningbo2.
When running the performance test (using go-ycsb) from zhengzhou, the single-write latency gradually rises to about 450 ms:
./go-ycsb load foundationdb -P workloada --threads 1
The single-read latency starts at around 160 ms and gradually drops to about 100 ms.
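For reference, the read numbers come from go-ycsb’s run phase, which mirrors the load command above (a sketch, not necessarily the exact flags):
./go-ycsb run foundationdb -P workloada --threads 1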
Here’s the output of fdbcli:
fdb> status
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
Redundancy mode - three_datacenter
Storage engine - ssd-2
Coordinators - 7
Desired Commit Proxies - 3
Desired GRV Proxies - 1
Desired Resolvers - 1
Desired Logs - 18
Usable Regions - 1
Cluster:
FoundationDB processes - 483
Zones - 483
Machines - 18
Memory availability - 8.0 GB per process on machine with least available
Retransmissions rate - 46 Hz
Fault Tolerance - 3 zones
Server time - 01/21/23 15:11:23
Data:
Replication health - Healthy
Moving data - 0.000 GB
Sum of key-value sizes - 124.486 GB
Disk space used - 1.291 TB
Storage wiggle:
Wiggle server addresses- 10.195.154.7:5513
Wiggle server count - 1
Operating space:
Storage server - 1513.1 GB free on most full server
Log server - 1689.9 GB free on most full server
Workload:
Read rate - 548 Hz
Write rate - 0 Hz
Transactions started - 339 Hz
Transactions committed - 0 Hz
Conflict rate - 0 Hz
Backup and DR:
Running backups - 0
Running DRs - 0
Client time: 01/21/23 15:10:29
fdb> status details
Using cluster file `/etc/foundationdb/fdb.cluster'.
Configuration:
Redundancy mode - three_datacenter
Storage engine - ssd-2
Coordinators - 7
Desired Commit Proxies - 3
Desired GRV Proxies - 1
Desired Resolvers - 1
Desired Logs - 18
Usable Regions - 1
Cluster:
FoundationDB processes - 483
Zones - 483
Machines - 18
Memory availability - 8.0 GB per process on machine with least available
Retransmissions rate - 47 Hz
Fault Tolerance - 3 zones
Server time - 01/21/23 15:11:28
Data:
Replication health - Healthy
Moving data - 0.000 GB
Sum of key-value sizes - 124.486 GB
Disk space used - 1.291 TB
Storage wiggle:
Wiggle server addresses- 10.195.154.7:5513
Wiggle server count - 1
Operating space:
Storage server - 1513.1 GB free on most full server
Log server - 1689.9 GB free on most full server
Workload:
Read rate - 471 Hz
Write rate - 0 Hz
Transactions started - 615 Hz
Transactions committed - 0 Hz
Conflict rate - 0 Hz
Backup and DR:
Running backups - 0
Running DRs - 0
Process performance details:
10.181.159.70:5500 ( 2% cpu; 9% machine; 0.014 Gbps; 0% disk IO; 0.2 GB / 8.0 GB RAM )
10.181.159.70:5501 ( 2% cpu; 9% machine; 0.014 Gbps; 0% disk IO; 3.3 GB / 8.0 GB RAM )
10.181.159.70:5502 ( 2% cpu; 9% machine; 0.014 Gbps; 0% disk IO; 3.3 GB / 8.0 GB RAM )
...
Here’s the status json: https://gist.githubusercontent.com/Rjerk/459aef7b339fd62106f5c3d95789d4c9/raw/ed1fd65dc82f5e02f4c5ce5522df1ed363ed1a5a/cluster.json
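For anyone who wants to slice that json locally, it’s just the output of status json, e.g.:
fdbcli --exec 'status json' > cluster.json
A quick per-DC process count with jq (paths assume the standard status json schema):
jq '[.cluster.processes[].locality.dcid] | group_by(.) | map({dc: .[0], processes: length})' cluster.json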
Here’s the proxy info (a jq sketch for pulling these numbers follows the table):
+--------------------+------------+--------------+---------------+-------------+
| ip | datacenter | role | run_loop_busy | latency_p99 |
+--------------------+------------+--------------+---------------+-------------+
| 10.195.154.11:7500 | ningbo2 | commit_proxy | 2.5% | 22.62 ms |
| 10.195.154.7:7500 | ningbo2 | commit_proxy | 1.7% | 29.37 ms |
| 10.195.154.9:7500 | ningbo2 | grv_proxy | 23.1% | 0.56 ms |
| 10.195.154.9:7501 | ningbo2 | commit_proxy | 0.9% | 25.78 ms |
+--------------------+------------+--------------+---------------+-------------+
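The run_loop_busy and latency_p99 columns correspond to per-role fields in status json; here is a jq sketch that pulls the same numbers out of the gist above (field names assume the 7.x schema, where commit proxies expose commit_latency_statistics and GRV proxies expose grv_latency_statistics; check the gist for the exact keys):
jq -r '.cluster.processes[]
  | .address as $a | .locality.dcid as $dc | .run_loop_busy as $busy
  | .roles[] | select(.role == "commit_proxy" or .role == "grv_proxy")
  | [$a, $dc, .role, $busy, (.commit_latency_statistics.p99 // .grv_latency_statistics.default.p99)]
  | @tsv' cluster.json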
Each machine has one tlog on a single dedicated NVMe; here are the 18 tlogs (extraction sketch after the table):
+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+
| ip | datacenter | disk | input_bytes | durable_bytes | used_size | total_size | disk_busy | core_used | run_loop_busy |
+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+
| 10.195.154.9:6500 | ningbo2 | nvme3n1 | 0 B/s | 99.0 B/s | 6.3 GB | 1.9 TB | 0.0% | 0.1 | 4.7% |
| 10.195.154.8:6500 | ningbo2 | nvme3n1 | 0 B/s | 0 B/s | 8.9 GB | 1.9 TB | 0.0% | 0.1 | 4.4% |
| 10.195.154.7:6500 | ningbo2 | nvme3n1 | 0 B/s | 0 B/s | 8.9 GB | 1.9 TB | 0.0% | 0.1 | 8.4% |
| 10.195.154.12:6500 | ningbo2 | nvme3n1 | 0 B/s | 0 B/s | 9.4 GB | 1.9 TB | 0.0% | 0.1 | 4.3% |
| 10.195.154.11:6500 | ningbo2 | nvme3n1 | 0 B/s | 99.0 B/s | 8.7 GB | 1.9 TB | 0.0% | 0.1 | 4.1% |
| 10.195.154.10:6500 | ningbo2 | nvme3n1 | 0 B/s | 0 B/s | 8.1 GB | 1.9 TB | 0.0% | 0.1 | 4.5% |
| 10.195.152.8:6500 | ningbo1 | nvme3n1 | 0 B/s | 0 B/s | 6.3 GB | 1.9 TB | 0.0% | 0.1 | 4.4% |
| 10.195.152.7:6500 | ningbo1 | nvme3n1 | 0 B/s | 99.0 B/s | 6.9 GB | 1.9 TB | 0.0% | 0.1 | 4.2% |
| 10.195.152.12:6500 | ningbo1 | nvme3n1 | 0 B/s | 99.0 B/s | 7.8 GB | 1.9 TB | 0.0% | 0.1 | 4.0% |
| 10.195.152.11:6500 | ningbo1 | nvme3n1 | 0 B/s | 0 B/s | 8.6 GB | 1.9 TB | 0.0% | 0.1 | 4.3% |
| 10.195.152.10:6500 | ningbo1 | nvme3n1 | 0 B/s | 99.0 B/s | 6.5 GB | 1.9 TB | 0.0% | 0.1 | 4.2% |
| 10.181.159.75:6500 | zhengzhou | nvme3n1 | 0 B/s | 0 B/s | 6.5 GB | 1.9 TB | 0.1% | 0.1 | 9.8% |
| 10.181.159.74:6500 | zhengzhou | nvme3n1 | 0 B/s | 0 B/s | 9.6 GB | 1.9 TB | 0.0% | 0.1 | 4.3% |
| 10.181.159.73:6500 | zhengzhou | nvme3n1 | 0 B/s | 0 B/s | 7.6 GB | 1.9 TB | 0.0% | 0.1 | 9.7% |
| 10.181.159.72:6500 | zhengzhou | nvme3n1 | 0 B/s | 0 B/s | 7.0 GB | 1.9 TB | 0.1% | 0.1 | 9.8% |
| 10.181.159.71:6500 | zhengzhou | nvme3n1 | 0 B/s | 99.0 B/s | 7.6 GB | 1.9 TB | 0.0% | 0.1 | 9.9% |
| 10.181.159.70:6500 | zhengzhou | nvme3n1 | 0 B/s | 0 B/s | 6.7 GB | 1.9 TB | 0.0% | 0.1 | 9.5% |
+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+
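Same idea for the tlog columns: input_bytes and durable_bytes are the per-second rates on each log role, and disk_busy comes from the process-level disk stats (used_size/total_size may map to either the kvstore_* or the queue_disk_* fields; check the gist). A sketch:
jq -r '.cluster.processes[]
  | .address as $a | .locality.dcid as $dc | .disk.busy as $busy
  | .roles[] | select(.role == "log")
  | [$a, $dc, .input_bytes.hz, .durable_bytes.hz, .kvstore_used_bytes, .kvstore_total_bytes, $busy]
  | @tsv' cluster.json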
Each machine has 6 NVMe drives for storage, and each NVMe hosts 4 storage servers.