Yes, it’s correct that you can’t have more than one incomplete versionstamp in a single key or value. This limitation is baked into the FDB C API that the Java API is built on top of, so changing it would require something like changing the way the versionstamp atomic operations work.
For reference, the current scheme for a versionstamped key or value is that when using the SET_VERSIONSTAMPED_KEY or SET_VERSIONSTAMPED_VALUE atomic operations, the last 4 bytes of the key or value are stripped from the bytes given and then interpreted as a little-endian integer. This integer is then treated as the (zero-indexed) position within the stripped bytes at which the 10-byte versionstamp is written.
So if you had something like:
Then those last four bytes \x01\x00\x00\x00 get treated as position 1. If the versionstamp is something like \x00\x00\x00\x00\x5c\xa1\xab\x1e\x0f\xdb, then the new key/value becomes:
But as you can see, this serialization format only allows one versionstamp to be serialized into it.
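To make the current scheme concrete, here is a small Python sketch that simulates what happens to the bytes at commit time. It is not FDB code: the function name `apply_versionstamp` and the sample payload are made up for illustration, and it assumes the 10 bytes at the given position are placeholder bytes that get overwritten by the versionstamp.

```python
import struct

VERSIONSTAMP_LEN = 10  # a versionstamp is 10 bytes long


def apply_versionstamp(param: bytes, versionstamp: bytes) -> bytes:
    """Simulate resolving a versionstamped key or value (illustrative only)."""
    if len(versionstamp) != VERSIONSTAMP_LEN:
        raise ValueError("versionstamp must be exactly 10 bytes")
    if len(param) < 4:
        raise ValueError("parameter is too short to hold the offset suffix")
    # Strip the last four bytes and read them as a little-endian integer.
    body, suffix = param[:-4], param[-4:]
    (offset,) = struct.unpack("<I", suffix)
    if offset + VERSIONSTAMP_LEN > len(body):
        raise ValueError("placeholder does not fit in the stripped bytes")
    # Assumption: the 10 placeholder bytes at that position are overwritten.
    return body[:offset] + versionstamp + body[offset + VERSIONSTAMP_LEN:]


# Hypothetical input: one prefix byte, a 10-byte placeholder, a tail,
# and finally the offset suffix \x01\x00\x00\x00 (position 1).
param = b"k" + b"\xff" * 10 + b"tail" + b"\x01\x00\x00\x00"
stamp = b"\x00\x00\x00\x00\x5c\xa1\xab\x1e\x0f\xdb"
result = apply_versionstamp(param, stamp)
```

With this made-up input, `result` is the prefix byte, followed by the versionstamp, followed by the tail, and the four offset bytes are gone.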
To generalize this to more versionstamps, you could imagine that we introduce new atomic operations like SET_MULTI_VERSIONSTAMPED_KEY and SET_MULTI_VERSIONSTAMPED_VALUE, or perhaps a single atomic operation MULTI_VERSIONSTAMPED_SET, where the serialization format is something like:
- Start with a byte prefix that will become the final key/value, with dummy bytes wherever a versionstamp will later be filled in
- After the prefix, encode a four-byte little-endian integer offset for each incomplete versionstamp
- At the end, suffix the byte array with a single byte containing the number of versionstamps
This has some nice properties like:
- Vanilla sets are equivalent to this scheme, but with the value \x00 appended at the end
- Versionstamp operations in the older encoding can be converted to the new encoding by appending \x01 at the end
- The \x00 suffix can be leveraged to allow the same MULTI_VERSIONSTAMPED_SET operation to be used for both versionstamped key and versionstamped value operations
This scheme is limited to 255 versionstamps in a single key/value (the largest count a single byte can hold), which some use cases might find limiting. Of course, that could be increased either by increasing the number of terminating bytes or by using, say, a variable-length encoding scheme.
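As a rough sketch of the proposed format (nothing like this exists in FDB today), the following Python simulates an encoder and the server-side resolution step. The names `encode_multi` and `apply_multi` are hypothetical, and the sketch assumes every incomplete versionstamp in the payload is overwritten with the same 10-byte versionstamp from the committing transaction.

```python
import struct
from typing import Sequence

VERSIONSTAMP_LEN = 10  # a versionstamp is 10 bytes long


def encode_multi(body: bytes, offsets: Sequence[int]) -> bytes:
    """Body, then one 4-byte little-endian offset per stamp, then a count byte."""
    if len(offsets) > 255:
        raise ValueError("a single count byte can describe at most 255 offsets")
    for off in offsets:
        if off < 0 or off + VERSIONSTAMP_LEN > len(body):
            raise ValueError("placeholder does not fit in the body")
    table = b"".join(struct.pack("<I", off) for off in offsets)
    return body + table + bytes([len(offsets)])


def apply_multi(param: bytes, versionstamp: bytes) -> bytes:
    """Simulate resolution: read the count, then the offsets, then fill stamps."""
    if not param:
        raise ValueError("parameter must at least contain the count byte")
    count = param[-1]
    body_end = len(param) - 1 - 4 * count
    if body_end < 0:
        raise ValueError("offset table is longer than the parameter")
    body = bytearray(param[:body_end])
    for i in range(count):
        (off,) = struct.unpack_from("<I", param, body_end + 4 * i)
        if off + VERSIONSTAMP_LEN > len(body):
            raise ValueError("placeholder does not fit in the body")
        # Overwrite the dummy bytes with the transaction's versionstamp.
        body[off:off + VERSIONSTAMP_LEN] = versionstamp
    return bytes(body)


stamp = b"\x00\x00\x00\x00\x5c\xa1\xab\x1e\x0f\xdb"
placeholder = b"\xff" * VERSIONSTAMP_LEN

# Two incomplete versionstamps in one value, at offsets 1 and 12.
two = encode_multi(b"a" + placeholder + b"b" + placeholder, [1, 12])
resolved = apply_multi(two, stamp)

# A vanilla set is just the raw bytes with \x00 appended.
vanilla = encode_multi(b"plain", [])

# An old-style payload becomes a new-style one by appending \x01.
old_style = b"k" + placeholder + b"tail" + b"\x01\x00\x00\x00"
converted = apply_multi(old_style + b"\x01", stamp)
```

The last two cases exercise the properties listed above: an empty offset table degenerates to a plain set, and the old single-offset suffix is already a valid one-entry offset table once the count byte is added.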
Adding a new FDB atomic operation is fairly straightforward, though it would require an FDB API change. The standard procedure would be to introduce it in the list of mutations in fdb.options. This exposes it to all of the bindings, and then code to interpret it needs to be added to the server. You’d probably want to also update the Tuple class so that it supports serializing Tuples with multiple versionstamps using the new scheme.
If you didn’t want to do that, you could potentially modify your application’s serialization scheme so that it needs only one incomplete versionstamp. For example, I’m not sure how you’re encoding your entity IDs, but if it’s something like:
(entity 1 versionstamp, entity 1 suffix, entity 2 versionstamp, entity 2 suffix, entity 3 versionstamp, entity 3 suffix)
Then say that you want entity 1 and entity 2 to have the same incomplete versionstamp, and you want entity 3 to have some other versionstamp (because, say, it was written in a different transaction). You could do something like:
(shared versionstamp, null, entity 1 suffix, null, entity 2 suffix, entity 3 versionstamp, entity 3 suffix)
Here, a null in the entity 1 and entity 2 versionstamp locations indicates that the entity should use the single shared versionstamp, while entity 3 uses the versionstamp given explicitly.
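A reader of that layout has to substitute the shared versionstamp wherever it finds a null. A minimal sketch of that decoding step, using plain Python tuples and byte strings in place of real Tuple and Versionstamp objects (the function name `resolve_entities` and the sample values are hypothetical):

```python
from typing import List, Optional, Sequence, Tuple


def resolve_entities(
    packed: Sequence[Optional[bytes]],
) -> List[Tuple[bytes, bytes]]:
    """Expand (shared, vs1, suffix1, vs2, suffix2, ...) into (vs, suffix) pairs."""
    if not packed or packed[0] is None:
        raise ValueError("the first element must be the shared versionstamp")
    shared, rest = packed[0], packed[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected (versionstamp, suffix) pairs after the shared stamp")
    entities = []
    for i in range(0, len(rest), 2):
        versionstamp, suffix = rest[i], rest[i + 1]
        if suffix is None:
            raise ValueError("entity suffix must not be null")
        # A null versionstamp means "use the single shared versionstamp".
        entities.append((shared if versionstamp is None else versionstamp, suffix))
    return entities


# Made-up values: the shared stamp as read back after commit, and an older
# stamp for entity 3, which was written in a different transaction.
shared_stamp = b"\x00\x00\x00\x00\x5c\xa1\xab\x1e\x0f\xdb"
older_stamp = b"\x00\x00\x00\x00\x5c\xa1\xab\x1d\x00\x01"
packed = (shared_stamp, None, b"e1", None, b"e2", older_stamp, b"e3")
entities = resolve_entities(packed)
```

Only the first slot is ever an incomplete versionstamp at write time, so the existing single-versionstamp operations are enough to store it.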
This only works, though, if you don’t care about the ordering (so it’s probably better for values than keys). If you wanted, say, an index on (entity 1 ID, entity 2 ID), then I don’t think there’s a great way that doesn’t require either this new SET_MULTI_VERSIONSTAMPED_KEY API or an additional commit. For example, you could serialize entity 1 and entity 2 in one transaction to assign them versionstamps, and then update the index in a separate transaction using only complete versionstamps, read back from the database after the first transaction commits.