We are considering the two-region multi-datacenter FDB cluster configuration, described in the architecture document, https://apple.github.io/foundationdb/configuration.html#configuring-regions.
Assume that we have three DCs, A, B, and C, in two regions, West Coast and East Coast. The West Coast region contains A and B, and the East Coast region contains C. We need to deploy the Client application in both the West Coast region and the East Coast region. Suppose the West Coast region is the active region. A Client hosted in the West Coast region will have its reads handled by the proxy servers and storage servers in the West Coast region. My concern is with the Clients hosted in the East Coast region. According to the architecture document, “Reads can be served from either region, and clients can get data from whichever region is closer. Getting a read version from the east coast region will still require communicating with a west coast datacenter. Clients can cache read versions if they can tolerate reading stale data to avoid waiting on read versions.”
Each read transaction requires a read version. A Client hosted in the East Coast region will therefore see high latency on every transaction, because getting the read version requires a cross-region round trip to the West Coast. The exception is a transaction that bundles many reads, where the cost of getting the read version is amortized across them. Without caching, we would have to live with this constraint.
We want the Client application deployed in the two regions to provide similar latency-related QoS. In our application it is acceptable to read relatively stale data, say 1 second behind the latest update. Following the statement that “Clients can cache read versions if they can tolerate reading stale data”, we would need the following logic in the Client’s program: if the current active region is a remote region, then reuse the read version from the most recent transaction (as long as it was obtained no more than 5 seconds ago).
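The caching logic described above could look roughly like the following. This is only a sketch under my own assumptions, not something taken from the FDB documentation: the class name `ReadVersionCache`, the injected `fetch_read_version` callable, and the `max_age_seconds` parameter are all hypothetical names made up for illustration.

```python
import threading
import time
from typing import Callable, Optional, Tuple


class ReadVersionCache:
    """Reuses one read version until it is older than max_age_seconds.

    fetch_read_version is a hypothetical callable that performs the
    cross-region GetReadVersion call. With the FoundationDB Python
    bindings it could presumably be something like
    lambda: db.create_transaction().get_read_version().wait(),
    and the returned value would be applied with tr.set_read_version(v).
    """

    def __init__(
        self,
        fetch_read_version: Callable[[], int],
        max_age_seconds: float = 1.0,  # must stay well under the 5-second limit
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_read_version
        self._max_age = max_age_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._cached: Optional[Tuple[int, float]] = None  # (version, fetched_at)

    def get(self) -> int:
        # The lock is held during the fetch, so concurrent callers wait
        # for one refresh instead of each paying the cross-region trip.
        with self._lock:
            now = self._clock()
            if self._cached is not None:
                version, fetched_at = self._cached
                if now - fetched_at <= self._max_age:
                    return version
            # Stamp with the time *before* the fetch, so the cached
            # version is never treated as fresher than it really is.
            version = self._fetch()
            self._cached = (version, now)
            return version
```

A transaction in the remote region would call `cache.get()` and set the result as its read version, while a transaction in the active region would skip the cache entirely. How the Client decides which of the two cases it is in is exactly my question (1) below.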
My questions are the following:
(1) What client-side API does FDB provide to check whether the active region is remote from the Client, so that we can turn on the read version caching logic in the Client program?
(2) Due to the 5-second transaction time limit, the Client program cannot reuse a read version forever. Every time the program has to go to the active region (the West Coast region) for a fresh read version, we will see the 95th- or 99th-percentile latency increase. So to maintain the QoS, we would need a background thread that keeps making the GetReadVersion call to the active region, so that the Client always has a recent read version available. Is this a good way to reduce the 95th- or 99th-percentile latency?
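To make question (2) concrete, here is a minimal sketch of the background thread I have in mind. The names `BackgroundReadVersionRefresher`, `fetch_read_version`, and `refresh_interval_seconds` are hypothetical and chosen only for illustration. The fetch callable stands in for the real GetReadVersion call, which with the Python bindings would presumably be `tr.get_read_version().wait()` on a fresh transaction.

```python
import threading
import time
from typing import Callable, Optional


class BackgroundReadVersionRefresher:
    """Keeps a recent read version available without blocking readers."""

    def __init__(
        self,
        fetch_read_version: Callable[[], int],  # hypothetical cross-region call
        refresh_interval_seconds: float = 0.5,
    ) -> None:
        self._fetch = fetch_read_version
        self._interval = refresh_interval_seconds
        self._lock = threading.Lock()
        self._version: Optional[int] = None
        self._fetched_at: Optional[float] = None
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()

    def _run(self) -> None:
        while not self._stop.is_set():
            started = time.monotonic()
            try:
                version = self._fetch()
            except Exception:
                version = None  # keep the previous version, retry next tick
            if version is not None:
                with self._lock:
                    self._version = version
                    # Stamp with the request start time to stay conservative.
                    self._fetched_at = started
            self._stop.wait(self._interval)

    def latest(self, max_age_seconds: float) -> Optional[int]:
        """Return the cached version if it is fresh enough, else None."""
        with self._lock:
            if self._version is None or self._fetched_at is None:
                return None
            if time.monotonic() - self._fetched_at > max_age_seconds:
                return None
            return self._version
```

A foreground transaction would call `latest(1.0)` and, if it gets a version back, set it as the read version. If it gets `None`, it falls back to a normal GetReadVersion call and pays the cross-region latency. The refresh interval has to be shorter than the staleness we tolerate, and both have to stay well below the 5-second limit.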