VersionStamp uniqueness and monotonicity

(gaurav) #1


Are version_stamps guaranteed to be unique and monotonic for the life of an FDB cluster? Are there scenarios in which version_stamps may be reset to a value lower than one already generated?

In this discussion Christophe mentioned that when a new cluster is restored from a backup, the version_stamps may be reset to 0 (or, more generally, to some start value that is <= a value already generated in the FDB cluster from which the backup was taken).

By the way: there is no guarantee that version stamps will always go up: if you are restoring from a backup after completely reinstalling a new cluster, it may be possible that the read version starts again from 0… The conditions to make this happen may be improbable, but not impossible!

Is there a way to avoid this? Maybe by including some metadata in the backup that tells the FDB cluster to start version_stamp generation from a given value onwards? Or is there some other approach that clients can take themselves to get around this? Any pointers will be very helpful.

I was thinking of using version_stamps for multiple purposes:

  • as a reference key between two key-values (e.g. data_row, and corresponding index_row).
  • as a prefix key in a log index so that I can “tail” the index from a given bookmark (version_stamp), and cycle…
  • etc.

But if I cannot guarantee the uniqueness and monotonicity of these, then it becomes difficult to use them for the above use cases.


(Ryan Worl) #2

I had never heard of that possibility until now, but that’s quite annoying if true.

My idea for a workaround is to prefix the versionstamp with a counter that represents the version of that installation of the database.

So if you choose to use a 1 byte prefix, you have 255 cluster re-installations if you increment that prefix upon each re-installation as a part of the bootstrap process after restoring the data, but before starting the application.

e.g. LogName-Prefix-VersionStamp => Value
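To make the layout concrete, here is a minimal sketch in plain Python (no FoundationDB client needed) of how such keys would sort. It assumes the usual 10-byte transaction versionstamp (8-byte big-endian commit version plus 2-byte batch order). The names `versionstamp`, `log_key`, and the `b"mylog/"` prefix are made up for illustration, and the commit versions are arbitrary numbers. In a real application the database would fill in the stamp at commit time through a versionstamped-key mutation rather than the client building it by hand.

```python
import struct


def versionstamp(commit_version: int, batch_order: int) -> bytes:
    """Build a 10-byte stamp: 8-byte big-endian version + 2-byte batch order."""
    return struct.pack(">QH", commit_version, batch_order)


def log_key(log_name: bytes, generation: int, stamp: bytes) -> bytes:
    """Hypothetical key layout: LogName-Prefix-VersionStamp.

    `generation` is the 1-byte installation counter, bumped once per
    cluster re-installation before the application starts writing again.
    """
    if not 0 <= generation <= 255:
        raise ValueError("generation must fit in one byte")
    return log_name + bytes([generation]) + stamp


# Entry written on the original cluster (arbitrary, large commit version).
old_stamp = versionstamp(5_000_000_000, 0)
# Entry written after restoring into a freshly installed cluster,
# where versions have started again from a small value.
new_stamp = versionstamp(1_000, 0)

# Without the prefix, the newer entry sorts BEFORE the older one,
# so a reader tailing from a bookmark at old_stamp would never see it.
stamps_out_of_order = new_stamp < old_stamp

# With the generation prefix, the newer entry sorts after the older one.
old_key = log_key(b"mylog/", 0, old_stamp)
new_key = log_key(b"mylog/", 1, new_stamp)
keys_in_order = old_key < new_key
```

A bookmark stored by a reader would then be the generation byte plus the stamp, not the stamp alone, so that tailing resumes correctly across a re-installation.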

(Alec Grieser) #3

Yeah, that sounds like a reasonable enough workaround for this. You could also use a tuple-encoded integer instead of a single byte. It works out to be an extra byte for values between 1 and 255, but it also means that you aren’t limited to 255 restores.
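As a rough illustration of the size and ordering claim, the sketch below hand-rolls the tuple layer's encoding for non-negative integers (type code 0x14 for zero, 0x14 + n followed by n big-endian bytes otherwise) so that it runs without the client library. `encode_generation` is a hypothetical helper name. With the Python bindings, `fdb.tuple.pack((n,))` should give the same bytes for these values, but treat that as something to verify rather than a guarantee.

```python
def encode_generation(n: int) -> bytes:
    """Tuple-layer style encoding of a non-negative integer below 2**64."""
    if n < 0 or n >= 1 << 64:
        raise ValueError("sketch only handles 0 <= n < 2**64")
    if n == 0:
        return b"\x14"  # zero is a bare type code
    size = (n.bit_length() + 7) // 8  # minimal number of bytes
    return bytes([0x14 + size]) + n.to_bytes(size, "big")


# Values 1..255 take two bytes: one more than a raw single-byte prefix.
two_bytes = [len(encode_generation(n)) for n in (1, 100, 255)]

# Past 255 the encoding simply grows, and byte order still matches
# numeric order, so generation 256 sorts after generation 255.
still_ordered = encode_generation(255) < encode_generation(256)
```

The length-dependent type code is what keeps the byte-wise order consistent with numeric order, which is the property the prefix relies on.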

You could also imagine using the prefix if you want to move data between clusters for other reasons. Suppose, for example, that you had multiple queues (maintained using versionstamps) all on the same cluster. Then you might decide you want to start sharding across multiple clusters (because maybe you want to serve some queues from one locale and another set of queues from another locale). Then you can use this prefix to safely copy a queue from one cluster to another (with the prefix essentially being the number of times you’ve copied the queue from one cluster to another).

Also, there is already work that bumps the current database version on a DR switchover so that the same versionstamp log can be used even if you switch from the primary to the secondary. In theory, similar work could be done on a database restore. But this would definitely require a fair amount of core development.

I will also say that restoring from a backup should be a fairly rare occurrence. If you are restoring data into the same cluster because an application did something like delete an extra range accidentally, then you want to restore with the same versionstamps as you did when you inserted the data the first time (or your references won’t match up). If you have to restore from backup because some catastrophe meant that all of your data were lost, then you’re in a somewhat stickier situation, but that should be very rare. At that point, it might be safer to reinsert all of the data in your application using a new version history anyway for data integrity reasons.

(gaurav) #4

Thanks for the suggestions! I can easily incorporate these in my store design.
I think this information regarding VersionStamps, and the potential workarounds, may generally be useful in the main documentation for others trying to model their applications using VersionStamps.