In our project, we have developed a tool that interfaces with the FDB C API to fetch data and store it in our internal client cache or other locations for access. We have also noticed in the official FDB documentation that the FDB client includes a caching mechanism, as mentioned in the following statement: “Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.” However, during testing, we observed that our service does not read hot data from the client cache but directly from the FDB server. Therefore, we would like to understand whether the FDB client cache actually exists, or if there is an issue with our access mechanism, or perhaps there are other reasons involved.
“Direct read from client’s memory: If a key’s value exists in the client’s memory, the client reads it directly from its local memory. This happens when a client updates a key’s value and later reads it. This optimization reduces the amount of unnecessary requests to storage servers.”
This quote is from the "FDB Read and Write Path" page of the FoundationDB 7.1 documentation. It describes read-your-writes behavior within a single transaction: reads of keys that the transaction itself has modified are served locally. Once the transaction commits, none of that data stays cached on the client, i.e., another transaction has to read from the FDB server again.
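To make the distinction concrete, here is a toy model of that behavior. It is not the FDB client or its API: ToyServer and ToyTransaction are hypothetical names, and the model only illustrates that a transaction's own uncommitted writes are readable locally while a later transaction goes back to the server.

```python
class ToyServer:
    """Stands in for the storage servers; counts the reads that reach it."""

    def __init__(self):
        self.data = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.data.get(key)


class ToyTransaction:
    """Toy read-your-writes transaction: uncommitted writes sit in a local buffer."""

    def __init__(self, server):
        self.server = server
        self.writes = {}

    def set(self, key, value):
        self.writes[key] = value  # buffered locally until commit

    def get(self, key):
        if key in self.writes:
            return self.writes[key]  # served from local memory, no server read
        return self.server.get(key)  # everything else goes to the server

    def commit(self):
        self.server.data.update(self.writes)
        self.writes.clear()  # nothing stays cached on the client afterwards


server = ToyServer()

tr1 = ToyTransaction(server)
tr1.set(b"k", b"v")
first = tr1.get(b"k")            # read of a key this transaction wrote
reads_during_tr1 = server.reads  # still 0: the read never left the client
tr1.commit()

tr2 = ToyTransaction(server)
second = tr2.get(b"k")           # a new transaction has no local copy
reads_after_tr2 = server.reads   # 1: this read went to the server
```

In this model the only "cache" is the write buffer of the transaction that is currently open, which matches the explanation above: it is a per-transaction optimization, not a cache for hot data.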
Does the storage server have no cache?
In our work, we encountered performance bottlenecks due to FDB read hotspot issues. Are there any ways to optimize and solve this problem?
The storage server does have a cache; its size is set with a flag, e.g., --cache_memory=6144MiB. How many reads per second are you doing? Is the key frequently updated?
For read hotspot issues, we usually suggest looking at your application code to see if there are opportunities to optimize, e.g., distributing read requests across different keys. On the server side, I remember there was a recent change to split a hot shard even when the shard is already small (but I couldn't find the PR). Another option is a new fdbcli command, redistribute <BeginKey> <EndKey>, which essentially inserts shard split points manually.
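One way to distribute reads across different keys is to store several copies of a hot value and have each reader pick one copy at random. The sketch below only computes the key names and does not talk to FDB. The function names and the one-byte prefix layout are assumptions made for illustration, and whether the copies actually end up on different storage servers depends on how the cluster splits shards.

```python
import random


def replica_keys(hot_key: bytes, n: int) -> list[bytes]:
    """Return the n keys that hold copies of the value stored at hot_key.

    Assumption: a one-byte prefix places the copies far apart in key space,
    which makes it more likely that they fall into different shards.
    """
    if not 1 <= n <= 256:
        raise ValueError("n must fit in a one-byte prefix")
    return [bytes([i]) + hot_key for i in range(n)]


def pick_replica(hot_key: bytes, n: int, rng=random) -> bytes:
    """Pick one copy at random so that reads spread over all n keys."""
    return bytes([rng.randrange(n)]) + hot_key


keys = replica_keys(b"config", 3)
chosen = pick_replica(b"config", 3, random.Random(1))
```

The trade-off is on the write side: a writer has to update all n copies in the same transaction so that readers never see copies that disagree, which makes this more attractive for keys that are read often and updated rarely.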
For subsequent reads of the same key within the same transaction (e.g., think composable interactions, where two functions both read the same key), will that value be cached on the client? Or will the read issue an RPC again?
Based on the thread "Are range reads cached in the C client?", I suspect yes.
Thank you for your response. I will further optimize my code based on your suggestions. However, I would like to know if there is a key cache on the client side.
There is no client-side key cache. If there were one, the client could read stale data, because another transaction (from a different client) may have updated the key on the server side. So the client has to talk to the server to be sure the data is up to date.