[Java] why is setReadVersion not part of ReadTransaction/ReadSnapshot

Hi,

Why is the setReadVersion() method not part of ReadTransaction/ReadSnapshot? Isn't it logically correct (and useful) to use this functionality for read-only or snapshot operations?


thanks,
gaurav

I think the idea behind “ReadTransaction” and “ReadSnapshot” was that they were supposed to be read-only views into the transaction’s state. In particular, one should really only be used as, say, the argument to another function that uses the transaction as a context in which to perform reads, but it shouldn’t have too many side effects, if that makes sense.

In other words, the “correct” way to set the read version of a ReadTransaction is to create the transaction, set the read version, and then pass it to some other function that just uses the transaction as a way to perform reads.
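As a minimal sketch of that pattern (the readBalance helper, the key, and the API version are made up for illustration): create a full Transaction, pin its read version, then hand it to code that only needs the ReadTransaction view.

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.ReadTransaction;
import com.apple.foundationdb.Transaction;
import java.util.concurrent.CompletableFuture;

public class ReadAtVersionExample {
    // Hypothetical read-only helper: it only needs the ReadTransaction view.
    static CompletableFuture<byte[]> readBalance(ReadTransaction rtr) {
        return rtr.get("balance".getBytes());
    }

    public static void main(String[] args) {
        Database db = FDB.selectAPIVersion(600).open();
        long pinnedVersion = 12_345_678L; // e.g. a version captured from an earlier transaction
        try (Transaction tr = db.createTransaction()) {
            tr.setReadVersion(pinnedVersion);      // only exposed on Transaction today
            byte[] value = readBalance(tr).join(); // passed along as a ReadTransaction
        }
    }
}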

That being said, I could kind of see adding this function to this class (like, it wouldn’t be too out of place). I might want a good motivating example before I could say I’m definitely for it, though.


Yeah, that was the original intent.

I think one reason we could argue for adding this is that currently it’s not possible to use this function with the Database.read functions, which provide you with a ReadTransaction rather than a full Transaction. For what it’s worth, we do expose options on the ReadTransaction now, which I think may have been done for the same reason.

At first I found it a bit odd that the Database.read callbacks did not provide a way to call setReadVersion() on the read transaction, so I duplicated Database.read with my own read..() methods.
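A hedged sketch of that kind of wrapper (readAtVersion is a made-up name, not part of the bindings). Unlike Database.read it omits the retry loop, but it shows the shape of the workaround:

import com.apple.foundationdb.Database;
import com.apple.foundationdb.ReadTransaction;
import com.apple.foundationdb.Transaction;
import java.util.function.Function;

public class ReadWrappers {
    // Mimics Database.read, but lets the caller pin the read version first.
    // Note: Database.read also retries on retryable errors; this sketch does not.
    static <T> T readAtVersion(Database db, long readVersion,
                               Function<? super ReadTransaction, T> retryable) {
        try (Transaction tr = db.createTransaction()) {
            tr.setReadVersion(readVersion);
            return retryable.apply(tr); // the callback only sees the ReadTransaction interface
        }
    }
}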

Another related thing I found missing was that I could not call snapshot() on the ReadTransaction provided by the Database.read methods.

Not a big issue, and I could easily work around it; but having these methods on ReadTransaction would probably make it play nicer with the Database.read methods.

Calling snapshot() on a read-only transaction should have no effect, except possibly to make reads a bit cheaper. If that’s the case, though, the right thing to do is probably to call snapshot() automatically rather than making it callable on a ReadTransaction.

So do you suggest that we do that using the Database.run methods? Or should another set of methods, something like Database.readSnapshot, be exposed?

(my intent is to make read-only transactions on snapshots - if there is any possibility of this being cheaper)

For now, I would personally just use the read functions unless you need to set the read version manually. If you really want the possible optimization of calling snapshot(), you certainly could do that, though I’d likely try to measure whether it really makes any appreciable difference first.

My suggestion above for changing the implementation in the bindings was to have the existing read functions call this automatically. I don’t think there’s any need for having 2 versions of the read functions.


Understood - and it makes sense. Thanks for the suggestions!

While this is true, I know I’ve wanted in the past to do something like take a read-only view of a transaction when performing some operation that only reads, but where I also want to control which conflict ranges get added. I could have sworn there was a GitHub issue I filed a while ago about improving the ReadTransaction API to include things like snapshot(), isSnapshot(), and something like addReadConflictRangeIfNotSnapshot(), but I can’t find it now.

I guess in theory, this could break code that does something like:

db.read((ReadTransaction rtr) -> {
    // Downcasts the read-only view back to the full Transaction to reach its mutating API.
    Transaction tr = (Transaction) rtr;
    return null;
});

In some sense, this is a total misuse of the read API, so maybe we’re okay with that breaking.

To provide a bit more context for the potential optimization here, I’ll give a bit more detail about what’s happening. Using snapshot reads prevents conflict ranges from being added for reads, which slightly reduces the amount of work the client has to do to handle each read. The read issued to the cluster is unchanged, and if this is a read-only transaction, you never have to send the conflict ranges to the cluster anyway.
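To make the distinction concrete, here is a small sketch (the keys and API version are placeholders): the plain read records a read conflict range on the client, while the snapshot read does not.

import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.Transaction;
import com.apple.foundationdb.tuple.Tuple;

public class SnapshotReadExample {
    public static void main(String[] args) {
        Database db = FDB.selectAPIVersion(600).open();
        try (Transaction tr = db.createTransaction()) {
            byte[] a = tr.get(Tuple.from("a").pack()).join();            // client records a read conflict range
            byte[] b = tr.snapshot().get(Tuple.from("b").pack()).join(); // no conflict range recorded
            // In a read-only transaction the conflict ranges are never sent to the
            // cluster anyway, so the saving is purely client-side bookkeeping.
        }
    }
}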

Based on the above, what you’re likely to see is a negligible change to read latency (except maybe when reading from cache; i.e. reading keys that were already read or written in the same transaction) and a possibly more noticeable change to client read throughput.

For latency, typically the bulk of this cost is in sending the request and waiting for the storage server to answer it. The amount of work done on the client is typically relatively small in comparison, so optimizing that cost will have a relatively small effect. If you are reading from cache, though, then you won’t need to send a request to the cluster and this optimization might help depending on how much it actually reduces the client-side work being performed.

For per-client throughput, usually the bottleneck here is the fact that there is a single thread (which we call the network thread) doing all of the client-side work for each request. You are therefore limited by how much you can get done on a single core. If by skipping the conflict ranges you can noticeably reduce the amount of work done on the network thread, then you can increase the number of reads you do before that thread saturates.

So just to throw out some fake numbers and a simplified example, let’s say each read takes 10 microseconds of work on the network thread and 500 microseconds to make the request to the cluster and get a response. If we can shave 1 microsecond off the network thread work, then our total latency decreases by ~0.2%. For a cached read, if we can shave off the same 1 microsecond, we would save 10% of our (already relatively small) latency.

In terms of throughput, our client could handle at most 100,000 requests per second (1 second / 10 microseconds). If we reduced this time to 9 microseconds, we could handle at most 111,111 requests per second, or an 11% increase in single-client throughput.
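Spelling out the arithmetic behind those fake numbers (the inputs are the same made-up figures as above):

public class BackOfEnvelope {
    public static void main(String[] args) {
        double networkThreadUs = 10.0;  // client-side work per read (fake number)
        double clusterUs = 500.0;       // round trip to the cluster (fake number)
        double savedUs = 1.0;           // hypothetical saving from skipping conflict ranges

        double uncachedGain = savedUs / (networkThreadUs + clusterUs); // ~0.002, i.e. ~0.2% of total latency
        double cachedGain = savedUs / networkThreadUs;                 // 0.10, i.e. 10% of a cached read

        double baseRate = 1_000_000.0 / networkThreadUs;                // 100,000 reads/sec per client
        double fasterRate = 1_000_000.0 / (networkThreadUs - savedUs);  // ~111,111 reads/sec, ~11% more
        System.out.printf("latency gains: %.4f / %.2f; throughput: %.0f -> %.0f%n",
                uncachedGain, cachedGain, baseRate, fasterRate);
    }
}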

All this is to give you an idea of when this optimization might matter, depending of course on what the actual numbers are. If you find that you are saturating your network thread and it’s costly for you to spin up more clients, then it might be worth it to you. Or, if you do lots of cached reads, then you could potentially see a benefit as well. If you do run a test of this, I’d love to hear the details.

All that said, if there is an appreciable benefit to using snapshot reads, then we’re likely to implement the described change directly in the bindings, which would mean that the optimization will only be relevant until that change lands.


I do agree that’s a misuse, but if we wanted to support it then this change should be easy enough to gate behind an API version.

Yeah, that’s interesting. Such a function would only be useful if it is sometimes called with a full Transaction, but where this really makes a difference is if you also want it to be callable from a read-only context. It does of course also limit the API that you can use in a full Transaction scenario, but that’s probably not as big of a win (especially since to change that we’d have to expand the limited API).

I suppose there’s an increased potential for confusion with this change in that it may appear that some reads add conflict ranges when they in fact do not (a problem you alluded to by suggesting things like addReadConflictRangeIfNotSnapshot()).

Yeah, that’s right. The idea is that this would allow the author of a library to do something like create a function whose signature takes a ReadTransaction or ReadTransactionContext. Then if it is given a full transaction, it adds the conflict ranges, but if given a snapshot transaction, it won’t. This is particularly important if one wanted to do something like use snapshot reads within the function. (Something like a “choose one” function that picks a key from a key range by doing a snapshot read and then returning a random element. Then you want your contract to be that only the key returned gets a read conflict range, and only if it is passed a full transaction.)
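A hedged sketch of that “choose one” pattern (chooseOne is a made-up name). As things stand it needs a full Transaction, since only Transaction exposes snapshot() and addReadConflictKey(); with the proposed ReadTransaction additions, the same body could also accept a read-only view and simply skip the conflict key.

import com.apple.foundationdb.KeyValue;
import com.apple.foundationdb.Transaction;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class ChooseOne {
    // Picks a random key from [begin, end) via a snapshot read, then adds a read
    // conflict range for only the key that is actually returned.
    static byte[] chooseOne(Transaction tr, byte[] begin, byte[] end) {
        List<KeyValue> candidates = tr.snapshot().getRange(begin, end).asList().join();
        if (candidates.isEmpty()) {
            return null;
        }
        KeyValue chosen = candidates.get(ThreadLocalRandom.current().nextInt(candidates.size()));
        tr.addReadConflictKey(chosen.getKey()); // conflict only on the key we return
        return chosen.getKey();
    }
}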

Thank you for the details A.J. I concur that the optimization savings are paltry here, and a robustly designed API is more important in this context.