How to use set_versionstamped_key

(Amirouche) #1

I would like to build a versioned quad store with a linear history. Let’s forget for a moment what a quad store is.

The problem we have at work is that bugs sneak into production and mess with the data, and we then need to trace what happened in order to fix the production database. Another use case is that we need to trace when changes happen for domain logic or business purposes. Right now, we rely on ad-hoc solutions. For instance, we add things like modified_at columns, or use tables (or collections in the case of MongoDB) just to keep track of what happened, sometimes recording the previous values like event logs. Yet another example is workflows. We end up with complicated models and multiple ad-hoc solutions, with a lot of effort and duplication, just to solve the problem of logging changes.

I don’t think event sourcing (as in a table that logs events/changes) offers the correct canvas because, AFAIK, you cannot go back in time without storing snapshots (like backups), and the history is limited or costly to recompute. Like I said, we need to keep track of changes not only for debugging but also to implement domain logic, so getting the history of a given model must be easy.

The quad store is a set of 4-tuples. I plan to record changes (diffs), i.e. what was added or deleted, along a linear history. With the diffs, and provided you avoid counters, it is possible to recompute the state of the data at a given revision relatively quickly, and to know the history of a model.
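To make the idea concrete, here is a minimal in-memory sketch of replaying diffs along a linear history. It is only an illustration: the `Change` record, the integer revisions, the `(graph, subject, predicate, object)` field order, and the helper names `state_at` and `history_of` are all assumptions made for this example, not part of any existing library.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple

Quad = Tuple[str, str, str, str]  # assumed order: (graph, subject, predicate, object)


class Change(NamedTuple):
    revision: int  # position in the linear history (hypothetical integer id)
    added: bool    # True if the quad was added, False if it was deleted
    quad: Quad


def state_at(changes: Iterable[Change], revision: int) -> Set[Quad]:
    """Rebuild the set of quads as of `revision` by replaying the diffs in order."""
    state: Set[Quad] = set()
    # sorted() is stable, so changes within one revision keep their recorded order.
    for change in sorted(changes, key=lambda c: c.revision):
        if change.revision > revision:
            break
        if change.added:
            state.add(change.quad)
        else:
            state.discard(change.quad)
    return state


def history_of(changes: Iterable[Change], subject: str) -> List[Change]:
    """Return every recorded change that touches `subject`, oldest first."""
    touching = [c for c in changes if c.quad[1] == subject]
    return sorted(touching, key=lambda c: c.revision)


log = [
    Change(1, True, ("g", "alice", "name", "Alice")),
    Change(2, True, ("g", "alice", "age", "30")),
    Change(3, False, ("g", "alice", "age", "30")),
    Change(3, True, ("g", "alice", "age", "31")),
    Change(4, True, ("g", "bob", "name", "Bob")),
]

at_two = state_at(log, 2)
at_three = state_at(log, 3)
alice_history = history_of(log, "alice")
```

Replaying from the start is linear in the number of changes; storing the latest state separately avoids that cost for the common case of reading current data.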

In addition to the diffs, I want to store the latest version of the data to make reading it super quick. There is no issue with reading the latest version of the data.

My problem is how I should materialize what I call the transaction uid, which must build a linear history. I was thinking about using set_versionstamped_key, but I am not sure how it works.

(Alex Miller) #2

Versionstamped keys are exactly what you want. They record the commit version at which FDB committed your transaction, and the ordering of transactions within the batch of transactions that were all applied at that version. FDB promises that if you take all your transactions and apply them in versionstamp order, then you will get exactly the same data that FDB contains or will present to you. So FDB already provides this linear history, and versionstamps let you easily build on top of that.
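As a rough illustration of why versionstamps give a linear order: a versionstamp is 10 bytes, the 8-byte big-endian commit version followed by a 2-byte big-endian order within the commit batch, so plain byte-wise comparison sorts transactions in commit order. The sketch below builds such values by hand purely to show the ordering. In a real cluster FDB fills them in at commit time, and the helper name `make_versionstamp` and the version numbers are made up for this example.

```python
import struct


def make_versionstamp(commit_version: int, batch_order: int) -> bytes:
    """Pack a 10-byte value laid out like an FDB versionstamp (illustration only)."""
    # ">QH" = big-endian, 8-byte unsigned version + 2-byte unsigned batch order.
    return struct.pack(">QH", commit_version, batch_order)


# Hypothetical transactions, listed out of order on purpose.
transactions = [
    ("tx-c", make_versionstamp(1001, 0)),
    ("tx-a", make_versionstamp(1000, 0)),
    ("tx-b", make_versionstamp(1000, 1)),  # same commit version as tx-a, later in the batch
]

# Sorting by the raw bytes recovers the commit order.
history = [name for name, stamp in sorted(transactions, key=lambda t: t[1])]
```

Because the fields are big-endian, a range read over keys that embed the versionstamp returns the entries in commit order without any extra sorting.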

You should just be able to maintain one subspace that records all the changes done to your database, and another subspace that is the current, up-to-date copy of the database. Make sure your transactions always modify both, and record a versionstamp so you know in what order to roll back operations you did to the database if you need to go backwards in time.
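A minimal sketch of that two-subspace layout follows. It assumes API version 520 or later, where the key passed to `set_versionstamped_key` ends with a 4-byte little-endian offset telling FDB where to write the 10-byte versionstamp. The `log/` and `current/` prefixes, the `+`/`-` value markers, the `seq` suffix, and the helper names are all invented for this example. `tr` is expected to be an FDB transaction (anything exposing `set`, `clear`, and `set_versionstamped_key`). In practice the tuple layer (`fdb.tuple.Versionstamp` with `fdb.tuple.pack_with_versionstamp`) can build such keys for you.

```python
import struct

PLACEHOLDER = b"\xff" * 10  # overwritten by FDB with the real versionstamp at commit


def versionstamped_log_key(prefix: bytes, seq: int) -> bytes:
    """Build a key of the form prefix + <versionstamp> + seq, plus the trailing offset."""
    # `seq` keeps several log entries written by the same transaction distinct,
    # since they all receive the same versionstamp.
    suffix = struct.pack(">H", seq)
    offset = struct.pack("<L", len(prefix))  # where the placeholder starts in the key
    return prefix + PLACEHOLDER + suffix + offset


def add_quad(tr, quad: bytes, seq: int = 0) -> None:
    """Write the quad to the current copy and append an 'added' entry to the log."""
    tr.set(b"current/" + quad, b"")
    tr.set_versionstamped_key(versionstamped_log_key(b"log/", seq), b"+" + quad)


def delete_quad(tr, quad: bytes, seq: int = 0) -> None:
    """Remove the quad from the current copy and append a 'deleted' entry to the log."""
    tr.clear(b"current/" + quad)
    tr.set_versionstamped_key(versionstamped_log_key(b"log/", seq), b"-" + quad)
```

With a real cluster these helpers would be called from inside a function decorated with `@fdb.transactional`, so that both subspaces are modified in the same transaction. Reading the `log/` range then yields the changes in commit order, and reading it in reverse gives the order in which to undo them.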

(Alec Grieser) #3

In addition to what @alexmiller mentioned, you might also want to see this discussion, which covers some pitfalls you might run into if you need to restore a database that uses versionstamp operations, or if you need to copy versionstamped data between FDB clusters.