I am trying to understand how to use versionstamps and if they can be useful in my layer.
I am trying to build something like Datomic, that is, a versioned database. I rely on optimistic locking: I read and write the latest version of the data, then store the diff between the new data and the previously stored data with the transaction. In this context, it seems to me that a timestamp generated in Python should be enough to get the correct ordering of transactions when it matters. That is, when two transactions read and write disjoint parts of the database, it doesn’t really matter which one comes first, I think.
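To make my intuition concrete, here is a toy model of optimistic concurrency control I wrote in pure Python (hypothetical names, not the fdb API): transactions on disjoint keys commit in either order, while a transaction that read a key written after its read version must retry.

```python
# Toy model of optimistic concurrency control (hypothetical, NOT the fdb API):
# each transaction records the version at which it started reading, plus its
# read and write sets; at commit time it conflicts if any key it read was
# written by a transaction that committed after its read version.

class ToyStore:
    def __init__(self):
        self.data = {}          # key -> value
        self.last_write = {}    # key -> commit version of last write
        self.version = 0        # global commit version

    def begin(self):
        return {"read_version": self.version, "reads": set(), "writes": {}}

    def read(self, tr, key):
        tr["reads"].add(key)
        return self.data.get(key)

    def write(self, tr, key, value):
        tr["writes"][key] = value

    def commit(self, tr):
        # conflict if a key we read was written after our read version
        for key in tr["reads"]:
            if self.last_write.get(key, -1) > tr["read_version"]:
                return False
        self.version += 1
        for key, value in tr["writes"].items():
            self.data[key] = value
            self.last_write[key] = self.version
        return True

store = ToyStore()

# Two transactions touching disjoint keys: both commit, order does not matter.
t1, t2 = store.begin(), store.begin()
store.read(t1, "a"); store.write(t1, "a", 1)
store.read(t2, "b"); store.write(t2, "b", 2)
assert store.commit(t1) and store.commit(t2)

# Overlapping transactions: t4 read "a" before t3's write committed,
# so t4 conflicts and would have to retry.
t3, t4 = store.begin(), store.begin()
store.read(t3, "a"); store.write(t3, "a", 10)
store.read(t4, "a"); store.write(t4, "a", 20)
assert store.commit(t3)
assert not store.commit(t4)
```

This is of course a much simplified model of what fdb does, but it is the behavior I am assuming when I say ordering only matters for overlapping transactions.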
A simpler question might be: in which cases are versionstamps useful? What are the use cases for versionstamps?
Also, I tried to experiment with fdb.tuple.pack_with_versionstamp, here is an example run:
Could you dump the values of v1 and v2 before calling set_versionstamped_xxx?
The encoding of the second range read (b'\x01'...) looks like the correct encoding of a tuple containing a single-byte byte string followed by a 96-bit versionstamp.
The encoding of the first read (b'B'...), though, looks really weird. It is as if the set_versionstamped_value(...) call did not properly find the offset where the stamp is located, and so the db overwrote the first 10 bytes of the value with the stamp, instead of starting at offset 6? When sent by the client, the stamp is initially filled with all \xFF bytes, and we can still see part of these towards the middle…
I know that in API version 51x, the set_versionstamped_value call did not have the ability to specify an offset, and would always overwrite the first 10 bytes. In API version 520+, the method was changed to also accept an offset (as is already possible for keys).
So either the code did not compute the correct offset, or it was executing with API 510, which does not expect an offset?
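To illustrate what I mean, here is a pure-Python simulation of what I believe the server does at commit time (a sketch, not actual fdb code, and the value bytes are illustrative rather than the exact tuple encoding): the client sends the value with \xFF bytes where the 10-byte stamp should go, and the server overwrites 10 bytes, either at the given offset (520+) or always at offset 0 (510).

```python
# Simulation of how the database fills in the versionstamp placeholder
# (hypothetical sketch, not actual fdb code). The client sends the value
# with the stamp position filled with \xFF bytes; at commit time the
# server overwrites 10 bytes with the real commit stamp.

STAMP = bytes(range(10))  # stand-in for the 10-byte commit versionstamp

def apply_stamp(value: bytes, offset: int) -> bytes:
    return value[:offset] + STAMP + value[offset + 10:]

# A value whose stamp placeholder starts at offset 6, preceded by some
# tuple-ish prefix bytes (illustrative, not the exact tuple encoding).
value = b"\x01user\x00" + b"\xff" * 10 + b"\x00\x00"

good = apply_stamp(value, 6)   # API 520+ behavior: stamp at the right offset
bad = apply_stamp(value, 0)    # API 510 behavior: first 10 bytes clobbered

assert good[:6] == b"\x01user\x00" and good[6:16] == STAMP
# With the stamp written at offset 0, the prefix is destroyed and leftover
# \xFF placeholder bytes survive in the middle, like in your first read.
assert bad[:10] == STAMP and b"\xff" in bad
```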
– EDIT: I missed that you were selecting API 510 at the start of the script. Can you try selecting API 520 and see if that changes the result?
For your other question, I use versionstamps for the following properties:
Need a “sequential”-ish id for items, without paying the cost of a centralized counter (or even having to deal with a counter at all).
Need a “happened before”/“happened after” way of sorting data without having to rely on an external time source (and the system clock of the local server is NOT reliable enough in some cases).
Need to generate UUIDs of 80 bits or more that are guaranteed to be globally unique in both time and space (random UUIDs could work but are not sortable).
Need a way to do a “write-only” transaction (no reads!) for latency reasons.
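The “happened before” and sortable-id points above work because the stamp fields are big-endian, so comparing the raw bytes lexicographically is the same as comparing commit order. A quick pure-Python illustration (my own sketch, not fdb code):

```python
import struct

# A versionstamp is 10 bytes: an 8-byte commit version and a 2-byte batch
# order, both big-endian, so byte-wise comparison matches commit order.
# The tuple layer appends a 2-byte user version, giving 96 bits total.
def make_stamp(commit_version: int, batch: int, user: int = 0) -> bytes:
    return struct.pack(">QHH", commit_version, batch, user)

stamps = [
    make_stamp(1000, 0),
    make_stamp(1000, 1),     # same commit version, later in the batch
    make_stamp(1001, 0),
    make_stamp(1000, 0, 5),  # same db stamp, later user version
]

# Lexicographic byte order matches (commit_version, batch, user) order.
assert sorted(stamps) == [stamps[0], stamps[3], stamps[1], stamps[2]]
```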
I think that in most cases you could do the same thing without versionstamps, but it would be slower or cause more conflicts. Versionstamps are a way to optimize your layer, and they also make practical some algorithms that were previously not performant enough?
Yes, and my guess is that it would also work with version 520.
I’m not sure if this is a bug in the python binding or something you have to know, but the behavior of set_versionstamped_value changed from 510 to 520 to allow specifying an offset for the stamp within the value. With API 510, the stamp MUST start at the beginning of the value (essentially making it impossible to use with tuples), while with 520+ it can start anywhere (with the same limitation as for keys: there can be only one stamp).
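If I read the bindings correctly (this is my understanding, worth double-checking), the offset is communicated as a little-endian integer trailer appended to the mutation parameter, 4 bytes wide in API 520+. A pure-Python sketch of that encoding and of the server stripping the trailer and stamping at the given offset:

```python
import struct

# Sketch of the API 520+ parameter encoding as I understand it (assumption,
# not taken from the fdb source): the client appends the placeholder offset
# to the value as a 4-byte little-endian trailer; the server strips the
# trailer and writes the stamp at that offset.

def encode_param_520(data: bytes, offset: int) -> bytes:
    return data + struct.pack("<I", offset)  # 4-byte little-endian trailer

def server_apply(param: bytes, stamp: bytes) -> bytes:
    data, (offset,) = param[:-4], struct.unpack("<I", param[-4:])
    return data[:offset] + stamp + data[offset + len(stamp):]

stamp = b"S" * 10
value = b"\x01user\x00" + b"\xff" * 10 + b"\x00\x00"
param = encode_param_520(value, 6)

stamped = server_apply(param, stamp)
assert stamped[:6] == b"\x01user\x00"  # prefix preserved
assert stamped[6:16] == stamp          # stamp lands at offset 6
```

Under API 510 there is no trailer for values, which is why the stamp can only ever land at offset 0 there.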
Maybe the python binding should throw if it is given a tuple with the stamp at a non-zero offset while the selected API version is < 520?
I don’t think we could do that, because versionstamped keys already supported non-zero offsets in older API versions, and versionstamped tuples were introduced in those earlier versions specifically for keys.