When working on my application, I find that versionstamps are frequently a footgun, specifically when attempting to read a versionstamp that was written in the current transaction.
Why does that limitation exist? I can create an incomplete versionstamp in my code and print it, and it has a byte representation, so why can’t the server return an incomplete versionstamp?
Versionstamps are just a placeholder on the client/app side. The value of a transaction's versionstamp is only determined at commit time, more precisely when the commit proxy replaces the placeholder in each mutation with the actual value after obtaining the commit version from the sequencer. Because of this, it’s impossible for the client/app to read the versionstamp before commit time.
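To make the placeholder idea concrete, here is a minimal pure-Python sketch. It is a toy model, not FDB code: it assumes a 10-byte versionstamp made of an 8-byte big-endian commit version plus a 2-byte batch order, and it assumes the incomplete placeholder is ten 0xff bytes. The function names are hypothetical.

```python
import struct

# Assumption: the client-side placeholder is ten 0xff bytes.
INCOMPLETE = b"\xff" * 10


def complete_versionstamp(commit_version: int, batch_order: int) -> bytes:
    """Build the 10 bytes that replace the placeholder once the commit version is known."""
    return struct.pack(">QH", commit_version, batch_order)


def substitute_at_commit(key: bytes, offset: int, commit_version: int, batch_order: int) -> bytes:
    """Hypothetical stand-in for the substitution done at commit time."""
    if key[offset:offset + 10] != INCOMPLETE:
        raise ValueError("no placeholder at the given offset")
    stamp = complete_versionstamp(commit_version, batch_order)
    return key[:offset] + stamp + key[offset + 10:]


# The client can build and print this key, but it is not the key that gets stored.
pending_key = b"log/" + INCOMPLETE
committed_key = substitute_at_commit(pending_key, 4, commit_version=1000, batch_order=7)
```

The client can hold and print pending_key, but the bytes that end up in the database are committed_key, and those do not exist until the commit version has been assigned.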
I get that the complete versionstamp isn’t filled out until the transaction commits, but in Java it’s completely legal to just create an incomplete versionstamp and read it, serialize it to bytes, etc. Given that, why is it necessary for the API to throw an exception when an incomplete versionstamp is read, rather than returning an incomplete versionstamp like I would get before attempting to commit?
It sounds like you want the behavior enabled by the bypass_unreadable transaction option.
To try to answer the original question of why this is a thing though, let’s first look at the motivation for the read-your-writes cache. If we view transactions as functions that take a database and return a new database (a database being a set of key-value pairs), then we can model the current state of FDB as the composition of some sequence of transactions. FDB’s conflict checking preserves this even if there are concurrent writers.
Now let’s say your application has two transactions t1 and t2, and you want to make a new transaction t3 which is the composition of t1 and t2. The read-your-writes cache makes this straightforward - you simply call t1 and then t2 in the same transaction. t2 sees the effects of t1, and the result commits atomically. This is also why, e.g., in the Python API the @fdb.transactional decorator creates a function that takes a Database or a Transaction. You can commit a function as its own transaction, or you can freely compose many functions into one transaction.
If the read-your-writes cache allowed you to read unresolved versionstamps by default, then this whole paradigm would no longer work. t2 would see something, but it wouldn’t be the exact effect of t1.
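Here is a rough sketch of that argument in pure Python. ToyTransaction and everything in it are hypothetical stand-ins for the real client, and the "versionstamp" is modeled as just the commit version written into a value, which is a simplification.

```python
UNREADABLE = object()  # marks a write whose final value is unknown until commit


class UnreadableKeyError(Exception):
    pass


class ToyTransaction:
    """Toy read-your-writes transaction over a dict snapshot (not the FDB client)."""

    def __init__(self, snapshot):
        self.snapshot = dict(snapshot)
        self.writes = {}

    def set(self, key, value):
        self.writes[key] = value

    def set_versionstamped(self, key):
        # The real value depends on the commit version, which is not known yet.
        self.writes[key] = UNREADABLE

    def get(self, key):
        if key in self.writes:
            if self.writes[key] is UNREADABLE:
                raise UnreadableKeyError(key)
            return self.writes[key]
        return self.snapshot.get(key)

    def commit(self, commit_version):
        result = dict(self.snapshot)
        for key, value in self.writes.items():
            result[key] = commit_version if value is UNREADABLE else value
        return result


def t1(tr):
    tr.set_versionstamped("stamp")


def t2(tr):
    tr.set("copy", tr.get("stamp"))


# Run as two separate transactions: t2 sees the real stamp written by t1.
first = ToyTransaction({})
t1(first)
db_after_t1 = first.commit(commit_version=100)
second = ToyTransaction(db_after_t1)
t2(second)
sequential_result = second.commit(commit_version=101)

# Composed into one transaction: t2 cannot see the exact effect of t1, so it fails.
composed = ToyTransaction({})
t1(composed)
try:
    t2(composed)
    composed_failed = False
except UnreadableKeyError:
    composed_failed = True
```

If get returned the placeholder here instead of raising, the composed transaction would commit a "copy" that is not what t1 actually wrote, and nothing would tell you. Raising is the safe default in this model.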
There are probably other ways for this paradigm to fall short, and in each case FDB should by default choose the safe option of not committing anything. One other way for this to fall short is transaction size limits - it could be that performing t1 and then t2 in one transaction makes the transaction too large to commit.
Something that has been quite surprising to me (and does not seem necessary) is that if you have a range with a series of complete versionstamps and one incomplete one, you can’t do a range read on any part of it, even parts that could never possibly contain whatever that versionstamp will be once it’s completed.
tx1 writes a versionstamped key vs1 and commits.
tx2 writes another one, vs2, and commits.
tx3 writes an incomplete versionstamped key vs3 and, before committing, tries to read the range from vs2. FDB throws “read or wrote unreadable key”.
In this example we know that vs3 must come after the current read version, which must come after vs2, so we aren’t actually reading the incomplete versionstamp. Yet it still is not allowed.
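The ordering argument can be sketched with a toy model. This is only a guess at the shape of the problem, not the client's actual implementation: it assumes the client blocks every key the pending versionstamped key could turn into, and that knowing the read version lets it narrow that blocked range. All names below are hypothetical.

```python
import struct

STAMP_LEN = 10


def stamp(version, batch_order=0):
    """Toy 10-byte versionstamp: 8-byte big-endian version plus 2-byte batch order."""
    return struct.pack(">QH", version, batch_order)


def unreadable_range(prefix, read_version=None):
    """Keys the pending versionstamped key might become, as (begin, end) with end exclusive.

    Assumption: without a read version, any stamp is possible; with one, the
    final stamp must sort after it, because the commit version is larger.
    """
    low = stamp(read_version) if read_version is not None else b"\x00" * STAMP_LEN
    high = b"\xff" * STAMP_LEN
    return prefix + low, prefix + high + b"\x00"


def overlaps(read_begin, read_end, blocked):
    blocked_begin, blocked_end = blocked
    return read_begin < blocked_end and blocked_begin < read_end


prefix = b"log/"
vs1 = prefix + stamp(100)
vs2 = prefix + stamp(200)

# A range read covering vs1 through vs2 inclusive (the end bound is exclusive).
read_begin, read_end = vs1, vs2 + b"\x00"

blocked_without_rv = overlaps(read_begin, read_end, unreadable_range(prefix))
blocked_with_rv = overlaps(read_begin, read_end, unreadable_range(prefix, read_version=250))
```

In this model the read over vs1..vs2 collides with the blocked range only when the read version is unknown at the time the versionstamped key is set. A read that runs past the read version, for example an open-ended read starting at vs2, would still overlap either way.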
I think that’s either a bug, an old version of fdb, or a quirk resulting from the fact that setting an atomic op must complete synchronously in the client API, which can be worked around by getting a read version before performing the versionstamped key op.
What version are you using? Do you still see it if you wait until the transaction has a read version before setting the versionstamped key?
Sidenote: tracking unreadable keys significantly complicates the implementation of the read-your-writes cache. While it’s probably theoretically possible to fix the above quirk, I kind of doubt anybody will invest enough to pull it off, or that it would be worth the maintenance burden even if someone did.
It appears that bypass_unreadable doesn’t work with range scans, is that correct?
Looks like you’re right - it only affects get requests. I find this surprising. My naive reading of the code is that it wouldn’t be too difficult to apply this to get range requests as well, but I’m not 100% sure.
Anyway, I can confirm that you’re right. Current status is “gets only”.
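For illustration, the "gets only" behavior can be modeled like this. It is a toy, not the FDB client: ToyReadYourWrites, its methods, and the ten-0xff placeholder are all assumptions made for the sketch.

```python
INCOMPLETE = b"\xff" * 10  # assumed placeholder bytes


class UnreadableKeyError(Exception):
    pass


class ToyReadYourWrites:
    """Toy write cache with an option that mimics the behavior described above."""

    def __init__(self, bypass_unreadable=False):
        self.bypass_unreadable = bypass_unreadable
        self.writes = {}         # key -> value bytes
        self.unreadable = set()  # keys whose value still holds a placeholder

    def set(self, key, value):
        self.writes[key] = value
        self.unreadable.discard(key)

    def set_versionstamped_value(self, key, prefix=b""):
        self.writes[key] = prefix + INCOMPLETE
        self.unreadable.add(key)

    def get(self, key):
        if key in self.unreadable and not self.bypass_unreadable:
            raise UnreadableKeyError(key)
        # With the bypass, the write is treated like a plain set of the placeholder.
        return self.writes.get(key)

    def get_range(self, begin, end):
        keys = sorted(k for k in self.writes if begin <= k < end)
        # Per the thread, the option does not currently apply to range reads.
        if any(k in self.unreadable for k in keys):
            raise UnreadableKeyError(keys)
        return [(k, self.writes[k]) for k in keys]


tr = ToyReadYourWrites(bypass_unreadable=True)
tr.set(b"a", b"1")
tr.set_versionstamped_value(b"b")
single_read = tr.get(b"b")  # returns the placeholder bytes
try:
    tr.get_range(b"a", b"c")
    range_failed = False
except UnreadableKeyError:
    range_failed = True
```

In the toy, the single-key read hands back the incomplete placeholder, while the range read over the same key still fails.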