Java API Future patterns

(gaurav) #1

Hi, while getting started with the Java APIs, I have a few questions about the reasoning behind some APIs returning values synchronously and others asynchronously:


  • All write (except commit) APIs returning a direct value (synchronous): Is this because writes are buffered locally by the client and hence the call is not doing anything expensive, and is therefore considered okay to be made synchronous? Is that the only reason or is there anything more?
  • Seemingly similar APIs return values synchronously in some cases and asynchronously in others:

    – Long getCommittedVersion();
    – CompletableFuture<byte[]> getVersionstamp();
    – CompletableFuture<Long> getReadVersion();

I could not understand the underlying reason for making getCommittedVersion() a synchronous call. Can this call be used from a method to return the committed version? (I believe this method can only be called after commit() succeeds, but before the transaction is closed.)


(Alec Grieser) #2

In general, I think a good rule of thumb is that if it is a blocking call, then it doesn’t take a long time (and almost certainly doesn’t make any network calls).

In particular, the reason that writes (except for commit) are synchronous calls is that the call only has to update an in-memory buffer. All mutations to the database only update a local cache known as the “write cache”, which is sent to the cluster only at commit time.
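As a rough illustration of that buffering pattern, here is a minimal sketch using only the JDK. It is a toy model and not the FoundationDB client: the class BufferedTransactionSketch, its methods, and the round-trip counter are all hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only, NOT the FoundationDB client; every name here is hypothetical.
class BufferedTransactionSketch {
    private final List<String> writeBuffer = new ArrayList<>();
    private final AtomicInteger roundTrips = new AtomicInteger();

    // Synchronous: just appends to an in-memory buffer, so no future is needed.
    void set(String key, String value) {
        writeBuffer.add(key + "=" + value);
    }

    // Asynchronous: the only point where the buffered writes would leave the client.
    CompletableFuture<Integer> commit() {
        List<String> batch = new ArrayList<>(writeBuffer);
        writeBuffer.clear();
        return CompletableFuture.supplyAsync(() -> {
            roundTrips.incrementAndGet(); // stands in for the network call
            return batch.size();          // number of mutations "sent"
        });
    }

    int clusterRoundTrips() {
        return roundTrips.get();
    }

    public static void main(String[] args) {
        BufferedTransactionSketch tr = new BufferedTransactionSketch();
        tr.set("hello", "world");
        tr.set("foo", "bar");
        System.out.println("round trips before commit: " + tr.clusterRoundTrips());
        System.out.println("mutations sent at commit: " + tr.commit().join());
        System.out.println("round trips after commit: " + tr.clusterRoundTrips());
    }
}
```

In this sketch the two set calls never touch the simulated network; only commit does, which is why only commit returns a future.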

The getReadVersion request sometimes requires a network call. In particular, if there haven’t been any reads done within a transaction, then the client has to communicate with the cluster to get a version, hence it returns a future. (Note that if the transaction already has a read version, then the call to getReadVersion is actually very fast as it only has to read the already cached value from memory.)
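The caching behaviour described above can be sketched with a plain CompletableFuture. This is again a toy model under stated assumptions: ReadVersionSketch, the request counter, and the version number 42 are invented for illustration and say nothing about how the real client is implemented.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only, NOT the FoundationDB client; every name here is hypothetical.
class ReadVersionSketch {
    private final AtomicInteger clusterRequests = new AtomicInteger();
    private CompletableFuture<Long> readVersion; // null until first requested

    // Returns a future because the first call may need to ask the cluster.
    // Later calls hand back the same, already known, future.
    synchronized CompletableFuture<Long> getReadVersion() {
        if (readVersion == null) {
            readVersion = CompletableFuture.supplyAsync(() -> {
                clusterRequests.incrementAndGet(); // stands in for the network call
                return 42L;                        // made-up version number
            });
        }
        return readVersion;
    }

    int clusterRequests() {
        return clusterRequests.get();
    }

    public static void main(String[] args) {
        ReadVersionSketch tr = new ReadVersionSketch();
        long first = tr.getReadVersion().join();  // may wait on the simulated cluster
        long second = tr.getReadVersion().join(); // served from the cached future
        System.out.println(first + " " + second + " requests=" + tr.clusterRequests());
    }
}
```

Because the return type has to cover the slow first call, the method is asynchronous even though most later calls complete immediately.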

The two calls, getCommittedVersion and getVersionstamp, are very similar, but they have slightly different semantics. Both the commit version and the versionstamp for a transaction are determined by the cluster as part of the commit process, so after a commit completes, it should be cheap to get that value as the cluster can return that information to the client. That is exactly why getCommittedVersion is synchronous: it just requires looking up the value, not doing any I/O. In theory, getVersionstamp could be similar (and instead be synchronous and just look up the value from an already committed transaction), but instead it lets the user obtain a future that will complete at essentially the same time as commit, which theoretically could be useful. For example, it allows the user to return a future from the retry loop that is essentially guaranteed to be complete when the commit ends. That is something getCommittedVersion makes essentially impossible, because the retry loop calls commit for you in a place you can’t control. In principle, both of these calls could have been structured to work the same way, but instead, one was implemented one way and one the other.
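To make the retry-loop point concrete, here is a small sketch of the two shapes. It is a toy model, not the FoundationDB client: ToyTransaction, ToyDatabase, the -1 placeholder, the version 100, and the 10-byte stamp contents are all assumptions made up for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Toy model only, NOT the FoundationDB client; every name and value is hypothetical.
class ToyTransaction {
    private final CompletableFuture<byte[]> versionstamp = new CompletableFuture<>();
    private long committedVersion = -1L; // placeholder until commit() has run

    // Can be requested before commit; the future completes when commit does.
    CompletableFuture<byte[]> getVersionstamp() {
        return versionstamp;
    }

    // Plain lookup; only meaningful once commit() has run.
    long getCommittedVersion() {
        return committedVersion;
    }

    void commit() {
        committedVersion = 100L; // made-up commit version
        versionstamp.complete(new byte[] {0, 0, 0, 0, 0, 0, 0, 100, 0, 0});
    }
}

class ToyDatabase {
    // Simplified "retry loop": runs the body, then commits on the caller's behalf.
    // The body has no hook that runs after commit().
    <T> T run(Function<ToyTransaction, T> body) {
        ToyTransaction tr = new ToyTransaction();
        T result = body.apply(tr);
        tr.commit();
        return result;
    }

    public static void main(String[] args) {
        ToyDatabase db = new ToyDatabase();
        // The future is created inside the body and is complete once run() returns.
        CompletableFuture<byte[]> stamp = db.run(tr -> tr.getVersionstamp());
        // The synchronous lookup runs inside the body, before commit, so it is too early.
        Long early = db.run(tr -> tr.getCommittedVersion());
        System.out.println("stamp done: " + stamp.isDone() + ", early version: " + early);
    }
}
```

In this sketch the body passed to run can hand out the versionstamp future before the commit happens, while a synchronous lookup inside the same body can only see the pre-commit placeholder.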

(gaurav) #3

Thank you for the detailed explanation!