Hello,
We are evaluating the throughput and latency of read-only transactions over the C binding, and noticed unexpectedly high latencies for the commit() operation. So we went ahead and implemented what is suggested in the documentation https://apple.github.io/foundationdb/api-c.html:
it is not necessary to commit a read-only transaction – you can simply call fdb_transaction_destroy()
On the small tests that we ran, we see a dramatic improvement when using that suggestion, and not calling commit on read-only transactions.
We didn’t do anything special in the client, just commented out the commit-related instructions, as shown in the snippet below:
// To disable the commit operation, we comment out everything from here...
FDBFuture *f = fdb_transaction_commit(tr);
fdb_error_t e = waitError(f);
fdb_future_destroy(f);
//... to here
The following plot shows the performance difference at the 50th and 99th percentiles of the commit operation as measured with the above code snippet (note the logarithmic scale on the y axis). We use a single client process with an increasing number of threads to increase the amount of work done on the client (we see the same behavior with 16 client processes).
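For anyone who wants to reproduce the 50th/99th percentile numbers from their own latency samples, here is a minimal sketch of a nearest-rank percentile computation in C. This is not the code used for the plot above. The function names `sort_latencies` and `latency_percentile` are hypothetical, and it assumes the latencies have already been collected into an array of doubles (for example, in milliseconds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator: orders doubles ascending. */
int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts the recorded latency samples in place. */
void sort_latencies(double *samples, size_t n) {
    qsort(samples, n, sizeof samples[0], cmp_double);
}

/*
 * Hypothetical helper: nearest-rank percentile over samples that are
 * already sorted ascending. Returns the smallest sample such that at
 * least `pct` percent of all samples are less than or equal to it.
 * `pct` is an integer in the range 1..100.
 */
double latency_percentile(const double *sorted, size_t n, unsigned pct) {
    assert(n > 0 && pct >= 1 && pct <= 100);
    /* rank = ceil(pct * n / 100), computed in integer arithmetic */
    size_t rank = ((size_t)pct * n + 99) / 100;
    return sorted[rank - 1];
}
```

A caller would sort once with `sort_latencies(samples, n)` and then query `latency_percentile(samples, n, 50)` and `latency_percentile(samples, n, 99)`. Nearest-rank is only one of several percentile definitions, so results may differ slightly from tools that interpolate between samples.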
Would you happen to have the code that you used for benchmarking this as something that you could make available for us to use to reproduce this issue?
I haven’t run this test myself yet, but the first thing that I would suspect in this case is that the latency is related to the busyness of the network thread. In addition to checking ProcessMetrics as mentioned at the link for evidence of this, you could also look at NetworkMetrics for SlowTask* fields, which count the number of tasks that take certain large numbers of cycles, blocking other work while they run.
Hi @gabikliot. We did not investigate the issue much further. The cause seems to be that calling the “commit” operation entails an interaction with the network thread. The interaction is lightweight, but on a loaded client it can run into an occasional latency spike due to high CPU utilization.
If you have new findings or are running into similar issues, I’d be glad to hear about them.