I configured the cluster with resolvers=4
and re-ran the same fdbserver test (3 min, 50,000 tps).
fdbtop:
ip port cpu% mem% iops net class roles
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.28.174 4500 60 4 - 92 test
4501 60 3 - 93 test
4502 59 3 - 93 test
4503 59 3 - 93 test
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.32.157 4500 69 7 20194 21 storage storage
4501 74 6 20211 19 storage storage
4502 0 3 - 0 stateless
4503 1 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.32.74 4500 66 13 1542 210 log log
4501 0 4 - 0 stateless
4502 0 3 - 0 stateless
4503 16 4 - 10 stateless cluster_controller
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.33.171 4500 67 15 16682 19 storage storage
4501 70 18 16682 19 storage storage
4502 0 3 - 0 stateless
4503 1 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.33.172 4500 76 21 17100 22 storage storage
4501 72 18 17124 22 storage storage
4502 1 3 - 0 stateless
4503 1 2 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.34.155 4500 50 19 15794 5 storage storage
4501 49 18 15793 5 storage storage
4502 0 3 - 0 stateless
4503 1 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.35.133 4500 89 10 23528 10 storage storage
4501 70 7 23528 7 storage storage
4502 46 3 - 152 proxy proxy
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.36.35 4500 88 9 2529 283 log log
4501 0 4 - 0 stateless
4502 0 3 - 0 stateless
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.37.131 4500 59 20 16143 10 storage storage
4501 52 18 16078 12 storage storage
4502 2 5 - 0 stateless
4503 1 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.37.98 4500 91 9 16941 20 storage storage
4501 80 6 16932 15 storage storage
4502 80 3 - 265 proxy proxy
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.38.195 4500 76 8 19611 21 storage storage
4501 71 6 19624 20 storage storage
4502 0 3 - 0 stateless
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.38.34 4500 94 9 4279 283 log log
4501 55 5 - 53 stateless resolver
4502 0 3 - 0 stateless
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.39.157 4500 78 8 17692 13 storage storage
4501 77 6 17701 13 storage storage
4502 21 3 - 4 stateless master
4503 41 3 - 29 stateless resolver
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.39.184 4500 82 8 19657 21 storage storage
4501 77 6 19785 21 storage storage
4502 0 2 - 0 stateless
4503 0 2 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.39.85 4500 37 10 670 71 log log
4501 1 4 - 0 stateless
4502 1 2 - 0 stateless
4503 1 2 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.40.18 4500 58 8 18546 11 storage storage
4501 73 6 18554 12 storage storage
4502 0 3 - 0 stateless
4503 28 3 - 26 stateless resolver
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.42.96 4500 43 19 16580 4 storage storage
4501 48 20 16589 4 storage storage
4502 77 7 - 271 proxy proxy
4503 1 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.44.149 4500 55 18 17122 10 storage storage
4501 64 20 17122 10 storage storage
4502 69 7 - 227 proxy proxy
4503 1 2 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.46.120 4500 70 16 16762 17 storage storage
4501 82 15 16785 21 storage storage
4502 46 3 - 40 stateless resolver
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.47.158 4500 74 7 17540 15 storage storage
4501 64 6 17535 16 storage storage
4502 0 3 - 0 stateless
4503 0 3 - 0 stateless
--------------- ------ ---- ---- ------- ----- ----------- --------------------
172.31.47.4 4500 61 24 14587 17 storage storage
4501 65 19 14551 17 storage storage
4502 1 3 - 0 stateless
4503 2 12 - 0 stateless
Results:
setting up test (Benchmark)...
running test...
Benchmark complete
checking tests...
fetching metrics...
Metric (0, 0): Measured Duration, 135.000000, 135
Metric (0, 1): Transactions/sec, 12494.688889, 1.25e+04
Metric (0, 2): Operations/sec, 74968.133333, 7.5e+04
Metric (0, 3): A Transactions, 1686783.000000, 1686783
Metric (0, 4): B Transactions, 0.000000, 0
Metric (0, 5): Retries, 70932.000000, 70932
Metric (0, 6): Mean load time (seconds), 0.000000, 0
Metric (0, 7): Read rows, 1686783.000000, 1.69e+06
Metric (0, 8): Write rows, 8433915.000000, 8.43e+06
Metric (0, 9): Mean Latency (ms), 24.433103, 24.4
Metric (0, 10): Median Latency (ms, averaged), 22.470236, 22.5
Metric (0, 11): 90% Latency (ms, averaged), 30.676603, 30.7
Metric (0, 12): 98% Latency (ms, averaged), 53.017616, 53
Metric (0, 13): Max Latency (ms, averaged), 228.917837, 229
Metric (0, 14): Mean Row Read Latency (ms), 5.856853, 5.86
Metric (0, 15): Median Row Read Latency (ms, averaged), 5.380630, 5.38
Metric (0, 16): Max Row Read Latency (ms, averaged), 187.579632, 188
Metric (0, 17): Mean Total Read Latency (ms), 5.803291, 5.8
Metric (0, 18): Median Total Read Latency (ms, averaged), 5.342722, 5.34
Metric (0, 19): Max Total Latency (ms, averaged), 187.579632, 188
Metric (0, 20): Mean GRV Latency (ms), 7.503022, 7.5
Metric (0, 21): Median GRV Latency (ms, averaged), 7.063389, 7.06
Metric (0, 22): Max GRV Latency (ms, averaged), 35.660267, 35.7
Metric (0, 23): Mean Commit Latency (ms), 9.893580, 9.89
Metric (0, 24): Median Commit Latency (ms, averaged), 9.062767, 9.06
Metric (0, 25): Max Commit Latency (ms, averaged), 55.891752, 55.9
Metric (0, 26): Read rows/sec, 12494.688889, 1.25e+04
Metric (0, 27): Write rows/sec, 62473.444444, 6.25e+04
Metric (0, 28): Bytes read/sec, 1399405.155556, 1.4e+06
Metric (0, 29): Bytes written/sec, 6997025.777778, 7e+06
Metric (1, 0): Measured Duration, 135.000000, 135
Metric (1, 1): Transactions/sec, 12491.244444, 1.25e+04
Metric (1, 2): Operations/sec, 74947.466667, 7.49e+04
Metric (1, 3): A Transactions, 1686318.000000, 1686318
Metric (1, 4): B Transactions, 0.000000, 0
Metric (1, 5): Retries, 73017.000000, 73017
Metric (1, 6): Mean load time (seconds), 0.000000, 0
Metric (1, 7): Read rows, 1686318.000000, 1.69e+06
Metric (1, 8): Write rows, 8431590.000000, 8.43e+06
Metric (1, 9): Mean Latency (ms), 25.181469, 25.2
Metric (1, 10): Median Latency (ms, averaged), 23.118734, 23.1
Metric (1, 11): 90% Latency (ms, averaged), 31.497478, 31.5
Metric (1, 12): 98% Latency (ms, averaged), 54.416656, 54.4
Metric (1, 13): Max Latency (ms, averaged), 315.679073, 316
Metric (1, 14): Mean Row Read Latency (ms), 6.087186, 6.09
Metric (1, 15): Median Row Read Latency (ms, averaged), 5.635738, 5.64
Metric (1, 16): Max Row Read Latency (ms, averaged), 101.420164, 101
Metric (1, 17): Mean Total Read Latency (ms), 6.057966, 6.06
Metric (1, 18): Median Total Read Latency (ms, averaged), 5.623817, 5.62
Metric (1, 19): Max Total Latency (ms, averaged), 101.420164, 101
Metric (1, 20): Mean GRV Latency (ms), 7.647929, 7.65
Metric (1, 21): Median GRV Latency (ms, averaged), 7.230759, 7.23
Metric (1, 22): Max GRV Latency (ms, averaged), 33.132792, 33.1
Metric (1, 23): Mean Commit Latency (ms), 10.048114, 10
Metric (1, 24): Median Commit Latency (ms, averaged), 9.285212, 9.29
Metric (1, 25): Max Commit Latency (ms, averaged), 49.364805, 49.4
Metric (1, 26): Read rows/sec, 12491.244444, 1.25e+04
Metric (1, 27): Write rows/sec, 62456.222222, 6.25e+04
Metric (1, 28): Bytes read/sec, 1399019.377778, 1.4e+06
Metric (1, 29): Bytes written/sec, 6995096.888889, 7e+06
Metric (2, 0): Measured Duration, 135.000000, 135
Metric (2, 1): Transactions/sec, 12489.896296, 1.25e+04
Metric (2, 2): Operations/sec, 74939.377778, 7.49e+04
Metric (2, 3): A Transactions, 1686136.000000, 1686136
Metric (2, 4): B Transactions, 0.000000, 0
Metric (2, 5): Retries, 70142.000000, 70142
Metric (2, 6): Mean load time (seconds), 0.000000, 0
Metric (2, 7): Read rows, 1686136.000000, 1.69e+06
Metric (2, 8): Write rows, 8430680.000000, 8.43e+06
Metric (2, 9): Mean Latency (ms), 23.809055, 23.8
Metric (2, 10): Median Latency (ms, averaged), 22.060156, 22.1
Metric (2, 11): 90% Latency (ms, averaged), 29.850006, 29.9
Metric (2, 12): 98% Latency (ms, averaged), 50.099850, 50.1
Metric (2, 13): Max Latency (ms, averaged), 235.465765, 235
Metric (2, 14): Mean Row Read Latency (ms), 5.736473, 5.74
Metric (2, 15): Median Row Read Latency (ms, averaged), 5.322695, 5.32
Metric (2, 16): Max Row Read Latency (ms, averaged), 177.443027, 177
Metric (2, 17): Mean Total Read Latency (ms), 5.724418, 5.72
Metric (2, 18): Median Total Read Latency (ms, averaged), 5.309343, 5.31
Metric (2, 19): Max Total Latency (ms, averaged), 177.443027, 177
Metric (2, 20): Mean GRV Latency (ms), 7.315672, 7.32
Metric (2, 21): Median GRV Latency (ms, averaged), 6.891727, 6.89
Metric (2, 22): Max GRV Latency (ms, averaged), 36.601305, 36.6
Metric (2, 23): Mean Commit Latency (ms), 9.680379, 9.68
Metric (2, 24): Median Commit Latency (ms, averaged), 8.915186, 8.92
Metric (2, 25): Max Commit Latency (ms, averaged), 54.644346, 54.6
Metric (2, 26): Read rows/sec, 12489.896296, 1.25e+04
Metric (2, 27): Write rows/sec, 62449.481481, 6.24e+04
Metric (2, 28): Bytes read/sec, 1398868.385185, 1.4e+06
Metric (2, 29): Bytes written/sec, 6994341.925926, 6.99e+06
Metric (3, 0): Measured Duration, 135.000000, 135
Metric (3, 1): Transactions/sec, 12506.903704, 1.25e+04
Metric (3, 2): Operations/sec, 75041.422222, 7.5e+04
Metric (3, 3): A Transactions, 1688432.000000, 1688432
Metric (3, 4): B Transactions, 0.000000, 0
Metric (3, 5): Retries, 71062.000000, 71062
Metric (3, 6): Mean load time (seconds), 0.000000, 0
Metric (3, 7): Read rows, 1688432.000000, 1.69e+06
Metric (3, 8): Write rows, 8442160.000000, 8.44e+06
Metric (3, 9): Mean Latency (ms), 24.029755, 24
Metric (3, 10): Median Latency (ms, averaged), 22.102594, 22.1
Metric (3, 11): 90% Latency (ms, averaged), 30.331612, 30.3
Metric (3, 12): 98% Latency (ms, averaged), 51.876068, 51.9
Metric (3, 13): Max Latency (ms, averaged), 296.543598, 297
Metric (3, 14): Mean Row Read Latency (ms), 5.882259, 5.88
Metric (3, 15): Median Row Read Latency (ms, averaged), 5.442858, 5.44
Metric (3, 16): Max Row Read Latency (ms, averaged), 228.428364, 228
Metric (3, 17): Mean Total Read Latency (ms), 5.902114, 5.9
Metric (3, 18): Median Total Read Latency (ms, averaged), 5.451441, 5.45
Metric (3, 19): Max Total Latency (ms, averaged), 228.428364, 228
Metric (3, 20): Mean GRV Latency (ms), 7.293286, 7.29
Metric (3, 21): Median GRV Latency (ms, averaged), 6.917238, 6.92
Metric (3, 22): Max GRV Latency (ms, averaged), 34.911633, 34.9
Metric (3, 23): Mean Commit Latency (ms), 9.678787, 9.68
Metric (3, 24): Median Commit Latency (ms, averaged), 8.921623, 8.92
Metric (3, 25): Max Commit Latency (ms, averaged), 54.420233, 54.4
Metric (3, 26): Read rows/sec, 12506.903704, 1.25e+04
Metric (3, 27): Write rows/sec, 62534.518519, 6.25e+04
Metric (3, 28): Bytes read/sec, 1400773.214815, 1.4e+06
Metric (3, 29): Bytes written/sec, 7003866.074074, 7e+06
4 test clients passed; 0 test clients failed
BEAUTY!
tps: 12,500x4=50,000!
commit latency: 10ms!
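For anyone who wants to re-derive these headline numbers from the per-client output, here is a minimal Python sketch. It assumes the `Metric (client, index): name, value, formatted` line layout shown above; the helper names `parse_metrics` and `summarize` are made up for illustration and are not part of any FoundationDB tooling. The retries-to-transactions ratio is only a rough proxy for the conflict rate, since a retry is not necessarily caused by a conflict.

```python
import re
from collections import defaultdict

# Matches e.g. "Metric (0, 1): Transactions/sec, 12494.688889, 1.25e+04".
# The metric name may itself contain commas, so the greedy name group is
# anchored on the last two comma-separated fields of the line.
METRIC_RE = re.compile(r"Metric \((\d+), \d+\): (.+), ([-+0-9.eE]+), \S+$")

def parse_metrics(text):
    """Return {client_index: {metric_name: value}} from pasted test output."""
    clients = defaultdict(dict)
    for line in text.splitlines():
        m = METRIC_RE.search(line.strip())
        if m:
            clients[int(m.group(1))][m.group(2)] = float(m.group(3))
    return dict(clients)

def summarize(clients):
    """Aggregate per-client metrics into cluster-wide figures."""
    total_tps = sum(c["Transactions/sec"] for c in clients.values())
    txns = sum(c["A Transactions"] for c in clients.values())
    retries = sum(c["Retries"] for c in clients.values())
    commit = [c["Mean Commit Latency (ms)"] for c in clients.values()]
    return {
        "total_tps": total_tps,
        # Assumption: retries / transactions approximates the conflict rate.
        "retry_ratio": retries / txns,
        "mean_commit_ms": sum(commit) / len(commit),
    }

# Subset of the metric lines from the run above (four test clients).
SAMPLE = """\
Metric (0, 1): Transactions/sec, 12494.688889, 1.25e+04
Metric (0, 3): A Transactions, 1686783.000000, 1686783
Metric (0, 5): Retries, 70932.000000, 70932
Metric (0, 23): Mean Commit Latency (ms), 9.893580, 9.89
Metric (1, 1): Transactions/sec, 12491.244444, 1.25e+04
Metric (1, 3): A Transactions, 1686318.000000, 1686318
Metric (1, 5): Retries, 73017.000000, 73017
Metric (1, 23): Mean Commit Latency (ms), 10.048114, 10
Metric (2, 1): Transactions/sec, 12489.896296, 1.25e+04
Metric (2, 3): A Transactions, 1686136.000000, 1686136
Metric (2, 5): Retries, 70142.000000, 70142
Metric (2, 23): Mean Commit Latency (ms), 9.680379, 9.68
Metric (3, 1): Transactions/sec, 12506.903704, 1.25e+04
Metric (3, 3): A Transactions, 1688432.000000, 1688432
Metric (3, 5): Retries, 71062.000000, 71062
Metric (3, 23): Mean Commit Latency (ms), 9.678787, 9.68
"""

if __name__ == "__main__":
    summary = summarize(parse_metrics(SAMPLE))
    print(f"total tps:        {summary['total_tps']:.0f}")
    print(f"retry ratio:      {summary['retry_ratio']:.1%}")
    print(f"mean commit (ms): {summary['mean_commit_ms']:.1f}")
```

On the numbers from this run, that works out to roughly 49,983 tps in total, a retry ratio of about 4.2%, and a mean commit latency of about 9.8 ms.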
So it would appear that the resolvers were limiting the transaction flow. I can understand how the resolver can become a bottleneck, but could you confirm that its impact can really be this large?
Link to status json dump (taken after the test was run, so mostly useful as a reference for IPs/IDs): status json
I still got 4% conflicts, so I'll try increasing the resolvers to 8 (double the number of log processes) and see if that improves the results…
It's been fun, thank you!