Versionstamp ranged deletes

I have a use case where I’m using versionstamps to insert a rolling set of data that I want to delete over time – for example, keeping only the last 30 days of data.

My idea was to delete the old data with a ranged delete at regular intervals. However, as best as I can tell, that requires keeping some sort of mapping between time and versionstamps so I can look up which versionstamp most closely corresponds to 30 days ago and delete up to that key.
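Roughly, the pattern I have in mind looks like this. This is just a sketch using the Python bindings; the `events` subspace name, the payload, and the idea that I already have a `cutoff_versionstamp` from some time-to-versionstamp mapping are all illustrative assumptions:

```python
import fdb

fdb.api_version(710)
db = fdb.open()

# Hypothetical subspace holding the rolling data, keyed by versionstamp.
events = fdb.Subspace(("events",))


@fdb.transactional
def append_event(tr, payload):
    # Key layout: ("events", <versionstamp>). The incomplete versionstamp is
    # filled in by the cluster at commit time.
    key = events.pack_with_versionstamp((fdb.tuple.Versionstamp(),))
    tr.set_versionstamped_key(key, payload)


@fdb.transactional
def delete_older_than(tr, cutoff_versionstamp):
    # Ranged delete of everything written before the cutoff versionstamp.
    tr.clear_range(events.range().start, events.pack((cutoff_versionstamp,)))
```

The open question is where `cutoff_versionstamp` comes from, which is what the rest of this post is about.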

Reading Versionstamp as absolute time, I can see there is a built-in way to do that, except it’s not exposed via an official API.

Is there some other mechanism I’m not thinking of to accomplish my goal (besides keeping the mapping in userspace in the database)?


Since the versionstamp can jump by hundreds of millions on each recovery (so an exact time can’t reliably be calculated from it), I would just set up a scheduled job that writes a current-time-to-versionstamp mapping in userspace every minute or so. Alternatively, I would generate the record IDs on the client side and use a TSID that encodes the current time.
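A minimal sketch of that userspace mapping with the Python bindings might look like the following. The `ts_index` subspace name and the (time, versionstamp) key layout are just one way to do it:

```python
import time
import fdb

fdb.api_version(710)
db = fdb.open()

# Hypothetical subspace mapping wall-clock time to versionstamps.
ts_index = fdb.Subspace(("ts_index",))


@fdb.transactional
def record_time_marker(tr):
    # Run from a scheduled job every minute or so: write a key of the form
    # ("ts_index", <unix seconds>, <versionstamp>) so markers sort by time.
    key = ts_index.pack_with_versionstamp((int(time.time()), fdb.tuple.Versionstamp()))
    tr.set_versionstamped_key(key, b"")


@fdb.transactional
def versionstamp_for(tr, unix_seconds):
    # Find the newest marker at or before `unix_seconds` and return its
    # versionstamp, or None if there is no marker that old.
    end = ts_index.pack((unix_seconds + 1,))
    for k, _ in tr.get_range(ts_index.range().start, end, limit=1, reverse=True):
        _t, vs = ts_index.unpack(k)
        return vs
    return None
```

The result of `versionstamp_for(db, int(time.time()) - 30 * 24 * 3600)` could then be used as the cutoff for the ranged delete in the earlier sketch, with the resolution determined by how often the marker job runs.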


The Timekeeper, as mentioned in the other thread, keeps this mapping every 10s, and there is already code to convert a Unix timestamp to an FDB version. Note the error is about 10s. This is probably the easiest way.
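For illustration, a heavily hedged sketch of reading the Timekeeper data directly from Python. This relies on internal, unversioned details: it assumes the map lives under `\xff\x02/timeKeeper/map/` with a tuple-encoded version as the key and a tuple-encoded Unix-seconds value, written roughly every 10 seconds. None of this is a public API, and the layout could change between FDB releases:

```python
import fdb

fdb.api_version(710)
db = fdb.open()

# Assumed internal prefix of the Timekeeper map in the system keyspace.
TIMEKEEPER_PREFIX = b"\xff\x02/timeKeeper/map/"


@fdb.transactional
def version_for_time(tr, unix_seconds):
    # Scan the Timekeeper map backwards for the newest entry whose timestamp
    # is <= unix_seconds; accuracy is roughly the 10s sampling interval.
    # A real implementation would bound the scan rather than walk the whole map.
    tr.options.set_read_system_keys()
    begin = TIMEKEEPER_PREFIX
    end = TIMEKEEPER_PREFIX + b"\xff"
    for k, v in tr.get_range(begin, end, reverse=True):
        version = fdb.tuple.unpack(k[len(TIMEKEEPER_PREFIX):])[0]
        epoch = fdb.tuple.unpack(v)[0]
        if epoch <= unix_seconds:
            return version
    return None
```

Since the first 8 bytes of a versionstamp are the commit version, the returned version can be turned into a cutoff versionstamp for a ranged delete by padding it, e.g. `fdb.tuple.Versionstamp(struct.pack(">Q", version) + b"\x00\x00")`.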

Having Timekeeper lookup functions in the FDB API would be a nice feature.

You did not mention a language, but there is also a Java version of this in the Record Layer.


Does this mapping data survive a backup → restore to an empty cluster? (Is the 0xFF keyspace backed up/restored?)

The mapping is not included in a backup or restored by a restore. Both backup and restore accept a list of target ranges, and while I don’t think anything would block including the Timekeeper range in both operations, I’m not sure exactly what the result would be. The destination cluster’s version timeline is independent of the source cluster’s, it has its own Timekeeper data, and it would also be generating more Timekeeper data during the restore. Depending on how the versions in the backup data and on the destination cluster align, weird things could happen.