We have a number of workers, each operating on an individual set of keys. Each worker periodically runs a transaction where it reads and updates its own keys.
We’ve encountered high latencies on these transactions, reaching 2 seconds, when running 1000 workers with a period of 200 ms (5 writes per second per worker). The strange thing is that when we remove the delay between transactions (i.e. write as fast as we can), the latencies drop back to 30-130 ms.
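For reference, the load in the delayed setup can be worked out from the numbers above. This is only a small arithmetic sketch; the function name `expectedTxRate` is made up for illustration and is not part of our app or of node-foundationdb.

```typescript
// Expected aggregate transaction rate if every worker starts one
// transaction per period (ignores retries and transaction latency).
function expectedTxRate(workers: number, periodMs: number): number {
  const perWorkerPerSecond = 1000 / periodMs; // 200 ms period -> 5 per second
  return workers * perWorkerPerSecond;
}

// 1000 workers with a 200 ms period
const delayedRate = expectedTxRate(1000, 200);
console.log(delayedRate); // 5000
```

So the delayed run targets roughly 5000 transactions per second across all workers, before counting retries.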
Below are the metrics when running with delays:
The first panel shows the latency of executing a single transaction including retries.
The second panel shows the overall rate of transactions among all workers per second.
The third panel shows the overall rate of transaction retries per second.
Here’s the rate of conflicts that we got from status json:
When running without delays, we observed stable low latencies, despite the increased overall write rate. We also had ~8 times fewer transaction retries.
At the same time, status json showed ~8 times more transaction conflicts:
We monitored the reasons for the retries, and in all cases, it was error code 1009 (Request for future version).
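To make it clearer what "monitoring the reasons" means, here is a minimal sketch of tallying retries by error code. It is not our actual instrumentation: the `RetryTally` class and the `KNOWN_CODES` table are hypothetical names, and the sketch assumes the error caught on each retry exposes a numeric `code`. The three code names listed are the standard FoundationDB ones.

```typescript
// Names of a few FoundationDB error codes a retry loop commonly sees.
const KNOWN_CODES: Record<number, string> = {
  1007: "transaction_too_old",
  1009: "future_version",
  1020: "not_committed", // commit rejected because of a conflict
};

// Hypothetical helper: counts how often each error code caused a retry.
class RetryTally {
  private counts = new Map<number, number>();

  // Call once per retry with the numeric code of the error that caused it.
  record(code: number): void {
    const previous = this.counts.get(code);
    this.counts.set(code, (previous === undefined ? 0 : previous) + 1);
  }

  // Returns counts keyed by "code (name)" for easy logging.
  summary(): Record<string, number> {
    const out: Record<string, number> = {};
    this.counts.forEach((n, code) => {
      const name = KNOWN_CODES[code] === undefined ? "unknown" : KNOWN_CODES[code];
      out[`${code} (${name})`] = n;
    });
    return out;
  }
}
```

In the retry path one would call `tally.record(err.code)` before retrying and log `tally.summary()` periodically. In our runs every recorded retry fell under 1009.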
What may have caused this? It’s surprising that increased load leads to lower latencies. It’s also strange that there are fewer retries despite more transaction conflicts.
We’re using FoundationDB v6.2.7.
Our app is built on node-foundationdb with FDB client v6.2.7.