Single read and write latency in a three-data-center cluster exceeds 160 ms

I have a cluster in three_datacenter mode, and each datacenter has 6 nodes.

The three DCs of the cluster are ningbo1, ningbo2, and zhengzhou.

The network latency between ningbo1 and ningbo2 is 0.1 ms, and the latency between them and zhengzhou is 20 ms.

The primary DC is ningbo2.

When running a performance test (using go-ycsb) from zhengzhou, the single-write latency gradually rises to 450 ms:

./go-ycsb load foundationdb -P workloada --threads 1

The single-read latency starts at around 160 ms and gradually drops to 100 ms.
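To see which phase dominates, it can help to time the GRV, read, and commit phases of one transaction separately. Below is a minimal, hypothetical timing harness; the three callables are stand-ins (with the FDB Python bindings they would wrap something like `tr.get_read_version().wait()`, a single key read, and `tr.commit().wait()` — here they are stubbed with sleeps so the sketch runs on its own):

```python
import time

def time_phases(phases):
    """Run each named callable once and return its wall-clock time in ms."""
    results = {}
    for name, fn in phases.items():
        start = time.perf_counter()
        fn()
        results[name] = (time.perf_counter() - start) * 1000.0
    return results

# Against a real cluster the callables would be, for example:
#   "grv":    lambda: tr.get_read_version().wait()
#   "read":   lambda: tr[b"usertable/some_key"]   # blocks on the value future
#   "commit": lambda: tr.commit().wait()
# The sleeps below are placeholders that mimic plausible cross-DC numbers.
timings = time_phases({
    "grv":    lambda: time.sleep(0.020),  # ~20 ms round trip to the primary DC
    "read":   lambda: time.sleep(0.001),  # read served by a local replica
    "commit": lambda: time.sleep(0.040),  # commit path through the primary DC
})
for name, ms in timings.items():
    print(f"{name}: {ms:.1f} ms")
```

If GRV alone accounts for most of the read latency, the client is paying the zhengzhou-to-ningbo2 round trip on every transaction start even when the read itself is served locally.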

Here’s the output of fdbcli:

fdb> status

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - three_datacenter
  Storage engine         - ssd-2
  Coordinators           - 7
  Desired Commit Proxies - 3
  Desired GRV Proxies    - 1
  Desired Resolvers      - 1
  Desired Logs           - 18
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 483
  Zones                  - 483
  Machines               - 18
  Memory availability    - 8.0 GB per process on machine with least available
  Retransmissions rate   - 46 Hz
  Fault Tolerance        - 3 zones
  Server time            - 01/21/23 15:11:23

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 124.486 GB
  Disk space used        - 1.291 TB

Storage wiggle:
  Wiggle server addresses- 10.195.154.7:5513
  Wiggle server count    - 1

Operating space:
  Storage server         - 1513.1 GB free on most full server
  Log server             - 1689.9 GB free on most full server

Workload:
  Read rate              - 548 Hz
  Write rate             - 0 Hz
  Transactions started   - 339 Hz
  Transactions committed - 0 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Client time: 01/21/23 15:10:29

fdb> status details

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - three_datacenter
  Storage engine         - ssd-2
  Coordinators           - 7
  Desired Commit Proxies - 3
  Desired GRV Proxies    - 1
  Desired Resolvers      - 1
  Desired Logs           - 18
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 483
  Zones                  - 483
  Machines               - 18
  Memory availability    - 8.0 GB per process on machine with least available
  Retransmissions rate   - 47 Hz
  Fault Tolerance        - 3 zones
  Server time            - 01/21/23 15:11:28

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 124.486 GB
  Disk space used        - 1.291 TB

Storage wiggle:
  Wiggle server addresses- 10.195.154.7:5513
  Wiggle server count    - 1

Operating space:
  Storage server         - 1513.1 GB free on most full server
  Log server             - 1689.9 GB free on most full server

Workload:
  Read rate              - 471 Hz
  Write rate             - 0 Hz
  Transactions started   - 615 Hz
  Transactions committed - 0 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  10.181.159.70:5500     (  2% cpu;  9% machine; 0.014 Gbps;  0% disk IO; 0.2 GB / 8.0 GB RAM  )
  10.181.159.70:5501     (  2% cpu;  9% machine; 0.014 Gbps;  0% disk IO; 3.3 GB / 8.0 GB RAM  )
  10.181.159.70:5502     (  2% cpu;  9% machine; 0.014 Gbps;  0% disk IO; 3.3 GB / 8.0 GB RAM  )
...

Here’s status json: https://gist.githubusercontent.com/Rjerk/459aef7b339fd62106f5c3d95789d4c9/raw/ed1fd65dc82f5e02f4c5ce5522df1ed363ed1a5a/cluster.json

Here’s the proxy info:

+--------------------+------------+--------------+---------------+-------------+
|         ip         | datacenter |     role     | run_loop_busy | latency_p99 |
+--------------------+------------+--------------+---------------+-------------+
| 10.195.154.11:7500 |  ningbo2   | commit_proxy |      2.5%     |   22.62 ms  |
| 10.195.154.7:7500  |  ningbo2   | commit_proxy |      1.7%     |   29.37 ms  |
| 10.195.154.9:7500  |  ningbo2   |  grv_proxy   |     23.1%     |   0.56 ms   |
| 10.195.154.9:7501  |  ningbo2   | commit_proxy |      0.9%     |   25.78 ms  |
+--------------------+------------+--------------+---------------+-------------+

Each machine runs one tlog on a dedicated NVMe; here are the 18 tlogs:

+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+
|         ip         | datacenter |   disk  | input_bytes | durable_bytes | used_size | total_size | disk_busy | core_used | run_loop_busy |
+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+
| 10.195.154.9:6500  |  ningbo2   | nvme3n1 |    0 B/s    |    99.0 B/s   |   6.3 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.7%     |
| 10.195.154.8:6500  |  ningbo2   | nvme3n1 |    0 B/s    |     0 B/s     |   8.9 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.4%     |
| 10.195.154.7:6500  |  ningbo2   | nvme3n1 |    0 B/s    |     0 B/s     |   8.9 GB  |   1.9 TB   |    0.0%   |    0.1    |      8.4%     |
| 10.195.154.12:6500 |  ningbo2   | nvme3n1 |    0 B/s    |     0 B/s     |   9.4 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.3%     |
| 10.195.154.11:6500 |  ningbo2   | nvme3n1 |    0 B/s    |    99.0 B/s   |   8.7 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.1%     |
| 10.195.154.10:6500 |  ningbo2   | nvme3n1 |    0 B/s    |     0 B/s     |   8.1 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.5%     |
| 10.195.152.8:6500  |  ningbo1   | nvme3n1 |    0 B/s    |     0 B/s     |   6.3 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.4%     |
| 10.195.152.7:6500  |  ningbo1   | nvme3n1 |    0 B/s    |    99.0 B/s   |   6.9 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.2%     |
| 10.195.152.12:6500 |  ningbo1   | nvme3n1 |    0 B/s    |    99.0 B/s   |   7.8 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.0%     |
| 10.195.152.11:6500 |  ningbo1   | nvme3n1 |    0 B/s    |     0 B/s     |   8.6 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.3%     |
| 10.195.152.10:6500 |  ningbo1   | nvme3n1 |    0 B/s    |    99.0 B/s   |   6.5 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.2%     |
| 10.181.159.75:6500 | zhengzhou  | nvme3n1 |    0 B/s    |     0 B/s     |   6.5 GB  |   1.9 TB   |    0.1%   |    0.1    |      9.8%     |
| 10.181.159.74:6500 | zhengzhou  | nvme3n1 |    0 B/s    |     0 B/s     |   9.6 GB  |   1.9 TB   |    0.0%   |    0.1    |      4.3%     |
| 10.181.159.73:6500 | zhengzhou  | nvme3n1 |    0 B/s    |     0 B/s     |   7.6 GB  |   1.9 TB   |    0.0%   |    0.1    |      9.7%     |
| 10.181.159.72:6500 | zhengzhou  | nvme3n1 |    0 B/s    |     0 B/s     |   7.0 GB  |   1.9 TB   |    0.1%   |    0.1    |      9.8%     |
| 10.181.159.71:6500 | zhengzhou  | nvme3n1 |    0 B/s    |    99.0 B/s   |   7.6 GB  |   1.9 TB   |    0.0%   |    0.1    |      9.9%     |
| 10.181.159.70:6500 | zhengzhou  | nvme3n1 |    0 B/s    |     0 B/s     |   6.7 GB  |   1.9 TB   |    0.0%   |    0.1    |      9.5%     |
+--------------------+------------+---------+-------------+---------------+-----------+------------+-----------+-----------+---------------+

Each machine has 6 NVMe drives for storage, and each NVMe hosts 4 storage servers.

The performance of three_datacenter mode is not as good as we expected.
Is there any optimization we can apply so that read/write latency stays within 80 ms?
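For reference, here is a rough lower bound on write latency from zhengzhou computed from the RTTs alone. This is an assumption-laden estimate, not an authoritative model of the commit path: it assumes one client-to-GRV-proxy round trip, one client-to-commit-proxy round trip (all proxies are in ningbo2 per the table above), and one proxy-to-remote-tlog round trip for synchronous cross-DC replication back to zhengzhou:

```python
# Round-trip times in ms, taken from the numbers above.
RTT_ZZ_NB = 20.0   # zhengzhou <-> ningbo1/ningbo2
RTT_NB_NB = 0.1    # ningbo1 <-> ningbo2

def write_latency_floor():
    grv = RTT_ZZ_NB          # client in zhengzhou -> GRV proxy in ningbo2
    commit_rpc = RTT_ZZ_NB   # client -> commit proxy in ningbo2
    tlog_fanout = RTT_ZZ_NB  # commit proxy waits for zhengzhou tlogs to be durable
    return grv + commit_rpc + tlog_fanout

print(write_latency_floor())  # 60.0 ms, before any batching or queueing delays
```

Even under this optimistic model the floor is around 60 ms, which makes the 80 ms target tight but not obviously impossible; the observed 450 ms suggests queueing or batching somewhere rather than raw network distance.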

@jzhou