FoundationDB

Use case of versionstamp and behavior of pack_with_versionstamp

bindings

(Amirouche) #1

I am trying to understand how to use versionstamps and if they can be useful in my layer.

I am trying to build something like Datomic, that is, a versioned database. I rely on optimistic locking: I read and write the latest version of the data, then store the diff between the new data and the previously stored data with the transaction. In this context, it seems to me that a timestamp generated in Python should be enough to order transactions correctly when it matters. That is, when two transactions read and write different parts of the database, it doesn’t really matter which one comes first, I think.

A simpler question might be: in which cases are versionstamps useful? What are the use cases for versionstamps?

Also, I tried to experiment with fdb.tuple.pack_with_versionstamp, here is an example run:

import fdb
fdb.api_version(510)
db = fdb.open()

txn = db.create_transaction()
# Key: tuple (b'\x00', <incomplete versionstamp>); value: b'\x42'
v1 = fdb.tuple.pack_with_versionstamp((b'\x00', fdb.tuple.Versionstamp()))
txn.set_versionstamped_key(v1, b'\x42')
# Key: b'\x42'; value: the same kind of tuple
v2 = fdb.tuple.pack_with_versionstamp((b'\x00', fdb.tuple.Versionstamp()))
txn.set_versionstamped_value(b'\x42', v2)
txn.commit().wait()

And I get a very strange result:

>>> db.get_range_startswith(b'\x42')
[b'B': b'\x00\x00\x00#l\xc5\x04d\x00\x00\xff\xff\xff\xff\xff\x00\x00\x05\x00']

Whereas:

>>> db.get_range_startswith(b'\x01')
[b'\x01\x00\xff\x003\x00\x00\x00#l\xc5\x04d\x00\x00\x00\x00': b'B']

Here the first byte of the key is indeed \x01, which is the type code for bytes values, followed by \x00.

Also, both timestamps are clearly not the same. What is the reason for that?


(gaurav) #2

Could you please check whether the following conversations answer some of your questions?

[VersionStamp vs CommittedVersion]
[VersionStamp uniqueness and monotonicity]
[Implementing VersionStamps in bindings]


(Christophe Chevalier) #3

Could you dump the values of v1 and v2 before calling txn.set_versionstamped_xxx?

The encoding of the second range read (b'\x01'...) looks like the correct encoding of a tuple with a single-byte array followed by a 96-bit versionstamp.
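To make that reading concrete, here is a hand-rolled sketch of the two tuple elements involved. It assumes the tuple layer's type codes \x01 (bytes) and \x33 (versionstamp) and a big-endian 2-byte user version; `encode_bytes` and `encode_versionstamp` are illustrative names, not functions of the binding.

```python
BYTES_CODE = b"\x01"         # tuple-layer type code for a byte string
VERSIONSTAMP_CODE = b"\x33"  # tuple-layer type code for a 96-bit versionstamp


def encode_bytes(data: bytes) -> bytes:
    """Type code, payload with every \\x00 escaped as \\x00\\xff, then a \\x00 terminator."""
    return BYTES_CODE + data.replace(b"\x00", b"\x00\xff") + b"\x00"


def encode_versionstamp(stamp: bytes, user_version: int = 0) -> bytes:
    """Type code, 10 stamp bytes, then a 2-byte big-endian user version."""
    if len(stamp) != 10:
        raise ValueError("the database-assigned part of a stamp is 10 bytes")
    return VERSIONSTAMP_CODE + stamp + user_version.to_bytes(2, "big")


# The 10 stamp bytes that appear in the key read back in post #1.
stamp = b"\x00\x00\x00#l\xc5\x04d\x00\x00"
key = encode_bytes(b"\x00") + encode_versionstamp(stamp)
print(key)
```

The bytes element takes 4 bytes and the versionstamp type code 1 more, so the stamp itself starts at offset 5.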

The encoding of the first read (b'B'...), though, looks really weird. It is as if the set_versionstamped_value(...) call did not properly find the offset where the stamp is located, so the db overwrote the first 10 bytes of the value with the stamp instead of starting at offset 5. When sent by the client, the stamp is initially filled with \xFF bytes, and we can still see part of these towards the middle…

I know that in API version 51x, the set_versionstamped_value call did not have the ability to specify an offset, and would always overwrite the first 10 bytes. In API version 520+, the method was changed so that an offset can be passed as well (as was already possible for keys).

So either the code did not get the correct location of the offset, or it was executing with API 510 which does not expect an offset?


EDIT: missed that you were selecting API 510 at the start of the script. Can you try selecting API 520 and see if this changes the result?
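The two behaviors can be simulated in plain Python without a cluster. This is only a sketch of what is described above, not the server's code: `simulate_set_versionstamped_value` is a made-up name, the byte strings are copied from the first post, and the 2-byte versus 4-byte offset trailer is an assumption based on the bytes visible in the read-back value.

```python
import struct

STAMP_LEN = 10  # bytes filled in by the database at commit time


def simulate_set_versionstamped_value(param: bytes, stamp: bytes, api_version: int) -> bytes:
    """Return the value that would end up stored, per the behavior described above."""
    if len(stamp) != STAMP_LEN:
        raise ValueError("stamp must be 10 bytes")
    if api_version >= 520:
        # 520+: the last 4 bytes are a little-endian offset and are stripped.
        (offset,) = struct.unpack("<I", param[-4:])
        body = param[:-4]
    else:
        # 51x: no offset is read; the stamp always lands on the first 10 bytes
        # and whatever trailer the client appended stays in the value.
        offset = 0
        body = param
    if offset + STAMP_LEN > len(body):
        raise ValueError("stamp does not fit in the value")
    return body[:offset] + stamp + body[offset + STAMP_LEN:]


prefix = b"\x01\x00\xff\x00\x33"            # packed b'\x00' element + stamp type code
placeholder = b"\xff" * 10 + b"\x00\x00"    # incomplete stamp + user version
stamp = b"\x00\x00\x00#l\xc5\x04d\x00\x00"  # stamp seen in post #1

# API 510: the packed tuple carries a 2-byte offset (5) that the value mutation ignores.
broken = simulate_set_versionstamped_value(prefix + placeholder + b"\x05\x00", stamp, 510)
# API 520+: same tuple, now with a 4-byte offset trailer that is honored.
fixed = simulate_set_versionstamped_value(prefix + placeholder + b"\x05\x00\x00\x00", stamp, 520)
print(broken)
print(fixed)
```

The first result reproduces the odd value from post #1, with the leftover \xff bytes and the unread \x05\x00 trailer; the second is a well-formed tuple.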


(Christophe Chevalier) #4

For your other question, I use versionstamps when I need one of the following:

  • A “sequential”-ish id for items, without paying the cost of a centralized counter (or having to deal with a counter at all).
  • A “happened before”/“happened after” way of sorting data without relying on an external time source (the system clock of the local server is NOT reliable enough in some cases).
  • UUIDs of 80 bits or more that are guaranteed to be globally unique both in time and space (random UUIDs could work but are not sortable).
  • A “write-only” transaction (no reads!) for latency reasons.

I think that in most cases you could do the same thing without versionstamps, but it would be slower or cause more conflicts. Versionstamps are a way to optimize your layer, and they also make practical some algorithms that were previously not performant enough.
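As a rough illustration of the “sortable id” property, the sketch below builds stamps by hand and sorts them as raw bytes. It assumes the documented layout of a 96-bit versionstamp (8-byte big-endian commit version, 2-byte order within the commit batch, 2-byte user version); `make_stamp` and the version numbers are made up, since in real use the first 10 bytes are assigned by the database at commit time.

```python
import struct


def make_stamp(commit_version: int, batch_order: int, user_version: int = 0) -> bytes:
    """Build a 12-byte stamp; every field is big-endian so bytes sort like numbers."""
    return struct.pack(">QHH", commit_version, batch_order, user_version)


# Hypothetical writes, inserted out of order: stamp -> description.
log = {
    make_stamp(2000, 0): "only write of a later commit batch",
    make_stamp(1000, 1): "second transaction of batch 1000",
    make_stamp(1000, 0, 1): "first transaction of batch 1000, second key",
    make_stamp(1000, 0, 0): "first transaction of batch 1000, first key",
}

# Plain byte-wise sorting, which is what a range read does, recovers commit order.
for stamp in sorted(log):
    print(stamp.hex(), log[stamp])
```

No counter key is read or written here, which is why such ids fit a write-only transaction.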


(Amirouche) #6

I retried the experiment with FoundationDB 6.0 (API version 600) and the bug is gone:

(py3) amirouche@ubujul18:~/src/python/tmp$ cat test-fdb.py 
import time
import fdb
fdb.api_version(600)
db = fdb.open()

txn = db.create_transaction()
key_stamp = fdb.tuple.pack_with_versionstamp((b'\x00', fdb.tuple.Versionstamp(),))
print('key_stamp', key_stamp)
txn.set_versionstamped_key(key_stamp, b'\x42')
value_stamp = fdb.tuple.pack_with_versionstamp((b'\x00', fdb.tuple.Versionstamp(),))
print('value_stamp', value_stamp)
txn.set_versionstamped_value(b'\x42', value_stamp)
txn.commit()

time.sleep(5)

print('record starting with x01')
for record in db.get_range_startswith(b'\x01'):
    print(record)

print('record starting with x42')
for record in db.get_range_startswith(b'\x42'):
    print(record)
(py3) amirouche@ubujul18:~/src/python/tmp$ fdbcli --version
FoundationDB CLI 6.0 (v6.0.15)
source version 8903c5f6212a0dd927c69c094b3b84d38ef7a62d
protocol fdb00a570010001
(py3) amirouche@ubujul18:~/src/python/tmp$ fdbcli 
Using cluster file `/etc/foundationdb/fdb.cluster'.

The database is available.

Welcome to the fdbcli. For help, type `help'.
fdb> writemode on;
>>> writemode on
fdb> clearrange \x00 \xFF;
>>> clearrange \x00 \xff
Committed (398979788234)
fdb> 

(py3) amirouche@ubujul18:~/src/python/tmp$ python test-fdb.py 
key_stamp b'\x01\x00\xff\x003\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x05\x00\x00\x00'
value_stamp b'\x01\x00\xff\x003\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\x00\x00\x05\x00\x00\x00'
record starting with x01
b'\x01\x00\xff\x003\x00\x00\x00\\\xe5Jj\\\x00\x00\x00\x00': b'B'
record starting with x42
b'B': b'\x01\x00\xff\x003\x00\x00\x00\\\xe5Jj\\\x00\x00\x00\x00'
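
The stamp in this output can be pulled apart as a sanity check. A small sketch, assuming the documented layout of a versionstamp (8-byte big-endian commit version followed by a 2-byte order within the commit batch); `split_stamp` is just an illustrative helper, and the key is copied from the output above.

```python
import struct


def split_stamp(stamp: bytes):
    """Split the 10 database-assigned bytes into (commit_version, batch_order)."""
    return struct.unpack(">QH", stamp)


key = b"\x01\x00\xff\x003\x00\x00\x00\\\xe5Jj\\\x00\x00\x00\x00"
stamp = key[5:15]                               # after b'\x01\x00\xff\x00' and the \x33 type code
user_version = int.from_bytes(key[15:17], "big")

commit_version, batch_order = split_stamp(stamp)
clearrange_version = 398979788234               # "Committed (...)" printed by fdbcli above

print(commit_version, batch_order, user_version)
print(commit_version > clearrange_version)
```

The decoded commit version is larger than the version at which the clearrange committed, as expected for a transaction that ran afterwards.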

(Christophe Chevalier) #7

Yes, and my guess is that it would also work with API version 520.

I’m not sure if this is a bug in the Python binding or just something you have to know, but the behavior of set_versionstamped_value changed from API 510 to 520 to allow specifying an offset for the stamp within the value. With API 510, the stamp MUST start at the beginning of the value (essentially making it impossible to use with tuples), while with 520+ it can start anywhere (with the same limitation as for keys: there can be only one stamp).

Maybe the Python binding should throw if it is given a tuple with the stamp at a non-zero offset while the selected API version is < 520?
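
In the meantime, a caller could run such a check itself before calling set_versionstamped_value. A minimal sketch, assuming that under API < 520 pack_with_versionstamp appends a 2-byte little-endian offset (as the bytes earlier in this thread suggest); `check_value_stamp_offset` is a hypothetical helper, not part of the binding.

```python
import struct


def check_value_stamp_offset(packed: bytes, api_version: int) -> None:
    """Raise if a packed tuple would be mangled when used as a versionstamped value."""
    if api_version >= 520:
        return  # 520+ honors the offset, so any position is fine
    # Assumption: below 520 the packed tuple ends with a 2-byte little-endian offset.
    (offset,) = struct.unpack("<H", packed[-2:])
    if offset != 0:
        raise ValueError(
            "API %d writes the stamp at offset 0, but the tuple expects it at offset %d"
            % (api_version, offset)
        )


# The value packed in post #1 under API 510: stamp placeholder at offset 5.
packed_510 = b"\x01\x00\xff\x00\x33" + b"\xff" * 10 + b"\x00\x00" + b"\x05\x00"
try:
    check_value_stamp_offset(packed_510, 510)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```

The check is only meant for values; keys accept a non-zero offset in older API versions too.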


(A.J. Beamon) #8

I don’t think we could do that, because versionstamped keys supported non-zero offsets in older API versions, and versionstamped tuples were introduced in those earlier versions specifically for keys.