We are evaluating FoundationDB for a system with a very high read-to-write ratio (99:1). The nature of the reads is such that we can only have one operation per transaction. In our performance tests we found that starting a new transaction and doing a single read operation adds a lot of latency. Is there any option that could help improve read performance?
You need more concurrency. Even getting the entire transaction down to 1 ms leaves you at 1,000 QPS per client, whereas a single process can probably serve 5-10x that, depending on key-value size, cache hit ratio, etc.
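To make the point concrete, here is a minimal sketch of keeping many single-read transactions in flight at once using a thread pool. The `StubDatabase` class and its `read_single` method are illustrative stand-ins, not the real API; actual code would use the `fdb` bindings (`import fdb; db = fdb.open()`) and issue one read per transaction.

```python
import concurrent.futures

# Stub standing in for a real FoundationDB client; the method name
# `read_single` is an assumption for illustration, not the fdb API.
class StubDatabase:
    def read_single(self, key):
        # A real implementation would start a transaction and
        # perform one get; here we just echo the key.
        return b"value-for-" + key

def run_reads(db, keys, workers=100):
    """Issue one single-read transaction per key, many in flight at once."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(db.read_single, keys))

db = StubDatabase()
keys = [b"k%d" % i for i in range(1000)]
results = run_reads(db, keys)
print(len(results))  # 1000
```

With 100 workers, up to 100 reads overlap, so per-transaction latency stops dictating overall throughput.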
Thanks for the reply. Here are some more details on our tests.
Our tests have 1000 clients executing a test script which does the following:
For 5000 iterations:
- Start a transaction
- Perform range gets with limit 2 (all the clients use the same keys, so there should be a 100% cache hit rate)
This was slow, so we changed the process to cache the created transaction for 1 second to reduce the number of transactions:
For 5000 iterations:
- Check for a valid cached transaction
- If none is found, start a new transaction; otherwise reuse the older one
- Perform range gets with limit 2
This improved the performance by 3x, but I am not sure if this is the right way. Is there any other way to perform fast reads, given that there are few or no commits?
It really depends on your workload requirements. FDB guarantees serializable ACID transactions, which means there is an inherent limit to how fast a single transaction can be. Here’s a short (and incomplete) overview of what happens for a standard transaction that only executes one read:
- The client asks one proxy for a read version
- The proxy will queue this request and wait for other start-transaction requests so it can do batching. How long your transaction will wait depends on your load and on your configuration. With very low load, this will be around a microsecond, but it can go up to 10 ms (assuming you run with default knobs). With your benchmark I would assume this time to be relatively low, but in general, FDB will start trading latency for throughput as the load increases.
- After this queuing time, the proxy will ask all other proxies in the cluster for the newest version they know of.
- The proxy will send the client this read version.
- Your client will check which storage server it needs to send the read request to. For your workload, we can probably assume that this information is cached. Otherwise this will create a new request to a proxy.
- The client will send this read request to a storage server. This request will be tagged with the read version of the transaction.
- The storage server queues that request until it has fetched the requested version from the TLog. Depending on load, this can add a few milliseconds to the request.
- The storage server will process the request and send you an answer.
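To see how these steps add up, here is a rough latency budget for one read in its own transaction. All of the numbers are assumptions chosen for illustration (within the ranges mentioned above), not measurements.

```python
# Illustrative per-step costs, in milliseconds; each value is an
# assumption, not a measurement from a real cluster.
read_path_ms = {
    "proxy batching wait": 0.5,           # near zero at low load, up to ~10 ms
    "proxy-to-proxy version query": 0.3,
    "read version reply to client": 0.2,
    "storage server version wait": 2.0,   # "a few milliseconds"
    "storage read + reply": 0.3,
}
total = sum(read_path_ms.values())
print(round(total, 1))  # 3.3
```

Even with optimistic numbers, the read-version handshake and the storage server's version wait dominate the cost of a single cached read, which is why amortizing them across many reads helps so much.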
Now as you can see, this is quite expensive. If each of your reads needs to see a version that is as close to real time as possible, you have no choice but to pay this cost.
However, in most workloads, you won’t mind reading an older value. In that case you can do what you already do: reuse a transaction. You can run such a transaction for up to 5 seconds, and you can even reuse it for as long as you don’t see a past_version exception. The drawback of this technique is that you will never see your own writes until you start a new transaction (if your transaction is 1 second old and you read x, you will get the value x had one second ago).
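Instead of a fixed time window, the renewal can also be driven by the error itself: reuse the transaction until the server rejects its read version, then start a fresh one and retry. The sketch below uses stub classes; with the real Python bindings you would catch `fdb.FDBError` and check its code (1007, `transaction_too_old`, is the error historically raised as past_version).

```python
# Stubs standing in for the real fdb database, transaction, and error.
class PastVersionError(Exception):
    pass

class StubTransaction:
    def __init__(self, lifetime_reads=3):
        # Stand-in for the ~5 s MVCC window: allow a few reads, then expire.
        self.reads_left = lifetime_reads
    def get(self, key):
        if self.reads_left == 0:
            raise PastVersionError()
        self.reads_left -= 1
        return b"value"

class StubDatabase:
    def create_transaction(self):
        return StubTransaction()

def read_with_reuse(db, state, key):
    """Reuse the cached transaction until its read version is too old."""
    if state.get("tr") is None:
        state["tr"] = db.create_transaction()
    try:
        return state["tr"].get(key)
    except PastVersionError:
        # Read version expired: start a fresh transaction and retry once.
        state["tr"] = db.create_transaction()
        return state["tr"].get(key)

db = StubDatabase()
state = {"tr": None}
values = [read_with_reuse(db, state, b"x") for _ in range(10)]
print(len(values))  # 10
```

This squeezes the maximum lifetime out of each read version at the cost of one failed read per renewal.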
The other question you need to ask yourself: do you want to optimize for latency or for throughput? You can increase throughput by running more transactions simultaneously, and you can reduce the latency of individual reads by keeping transactions alive longer.
Thanks a lot, this was really helpful.