Automatically providing transaction idempotency

I have a question about transactions with unknown results and would like to discuss ideas of providing transaction idempotency.

The "Transactions with unknown results" section describes the problem and suggests an approach in which the application generates a unique id for each transaction; the id is checked and written inside the transaction loop, which eliminates double execution. This is definitely a good approach, but the problem is that every application has to write this extra code, which is error-prone.
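To make the pattern concrete, here is a minimal sketch of the id-check-and-write loop. It runs against an in-memory stand-in rather than the real FDB bindings: `FakeStore`, `CommitUnknownResult`, and `run_idempotent` are hypothetical names invented for this illustration, and the fake store simulates a commit that succeeds while its acknowledgement is lost.

```python
import uuid


class CommitUnknownResult(Exception):
    """Hypothetical stand-in for FDB's commit_unknown_result error."""


class FakeStore:
    """In-memory stand-in for a database; this is NOT the FDB API."""

    def __init__(self, lose_first_ack=False):
        self.data = {}
        self.lose_next_ack = lose_first_ack

    def commit(self, writes):
        # Apply all writes atomically.
        self.data.update(writes)
        if self.lose_next_ack:
            # The commit went through, but the client never learns that.
            self.lose_next_ack = False
            raise CommitUnknownResult()


def run_idempotent(store, body):
    """Apply body's writes exactly once, even if a commit result is unknown."""
    # The id is generated once, outside the retry loop.
    tx_id_key = ("tx_ids", uuid.uuid4().hex)
    while True:
        snapshot = dict(store.data)  # the "read" phase of the transaction
        if tx_id_key in snapshot:
            return  # an earlier attempt already committed
        writes = body(snapshot)
        writes[tx_id_key] = b""  # marker committed together with the payload
        try:
            store.commit(writes)
            return
        except CommitUnknownResult:
            continue  # retry; the id check above prevents double execution


def increment(snapshot):
    return {"counter": snapshot.get("counter", 0) + 1}


store = FakeStore(lose_first_ack=True)
run_idempotent(store, increment)
print(store.data["counter"])  # 1, although the first commit's result was unknown
```

Without the id check, the retry after the lost acknowledgement would increment the counter a second time.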

We considered doing it generically in our binding layer, where we would automatically allocate transaction ids, write them to, let's say, a special directory, and check for idempotency automatically. However, this still has the overhead of an additional write, a potentially hot range (the directory for tx ids; this could be solved by splitting it into multiple directories), and the need to clean the ids up periodically. The advantage, of course, is that we can do it completely generically for all application code, and that it does not require FDB code changes.
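A rough sketch of what the key layout and cleanup for such a generic layer might look like, using a plain dict in place of the database. The shard count, the retention period, and all function names are assumptions made up for this example, not part of any FDB binding.

```python
import hashlib

NUM_SHARDS = 16      # assumed number of sub-directories for tx ids
TTL_SECONDS = 3600   # assumed retention period before an id is cleaned up


def tx_id_key(tx_id):
    """Spread ids over NUM_SHARDS sub-directories to avoid one hot range."""
    digest = hashlib.sha256(tx_id.encode()).digest()
    shard = digest[0] % NUM_SHARDS
    return ("tx_ids", shard, tx_id)


def record(id_store, tx_id, now):
    """Remember that tx_id committed, with the time it was written."""
    id_store[tx_id_key(tx_id)] = now


def already_committed(id_store, tx_id):
    return tx_id_key(tx_id) in id_store


def cleanup(id_store, now):
    """Drop ids older than TTL_SECONDS and return how many were removed."""
    expired = [key for key, written_at in id_store.items()
               if now - written_at > TTL_SECONDS]
    for key in expired:
        del id_store[key]
    return len(expired)


id_store = {}  # holds only tx id entries in this sketch
record(id_store, "tx-a", now=0.0)
record(id_store, "tx-b", now=3000.0)
removed = cleanup(id_store, now=4000.0)
print(removed, already_committed(id_store, "tx-a"), already_committed(id_store, "tx-b"))
```

Note the trade-off built into the TTL: a client that retries after its id has been cleaned up is no longer protected against double execution.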

The question here is whether there is a better or more efficient way to achieve that.
Maybe someone in the community came up with a different generic approach?
Maybe there is some extra information in FDB itself that can help solve that better (potential “small” changes to FDB itself could be considered)? Maybe there is a way to query a committed_version even for transactions that failed with commit_unknown_result?
Brainstorming ideas are welcome!


The version doesn’t help here: read and write versions are shared between several transactions. So knowing that something with version x committed (whether x is a read or a write version doesn’t matter) doesn’t tell you whether your own transaction was among them.
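A toy model of the point above, assuming that several transactions are committed together under one version while another transaction sent at the same time is rejected. The names and the batch contents are invented for illustration and do not reflect FDB internals.

```python
from collections import defaultdict

# Toy model: commit version -> names of the transactions committed at it.
committed_at = defaultdict(set)


def commit_batch(version, tx_names):
    committed_at[version].update(tx_names)


def version_committed(version):
    """All a client could learn from the version alone."""
    return version in committed_at


# tx_a and tx_b commit at version 1000; tx_c was sent alongside them
# but rejected (say, because of a conflict).
commit_batch(1000, {"tx_a", "tx_b"})

print(version_committed(1000))        # True, for tx_a, tx_b and tx_c alike
print("tx_c" in committed_at[1000])   # False: the version says nothing about tx_c
```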

We could implement the same trick (writing some transaction ID to storages with every transaction) within FDB. This would give us the benefit that we could potentially save one round-trip at the start of the transaction: we could send the transaction ID to the proxy during the GRV request, and the proxy would read this value and only give back a read version if this transaction ID doesn’t exist.

However, the main problem here is that cleaning up these entries from storage will be quite hard, as clients can potentially wait a very long time before they retry. So I am not convinced that this will be worth the effort…

But generally I strongly agree that finding a better way to solve this problem would be highly valuable!
