Make commit_unknown_results unlikely - PR status

Thanks a lot for mentioning the issue with transaction_timed_out. I completely forgot about it! :slight_smile:

You had also mentioned the following in a previous thread.

Are there any other error codes similar to cluster_version_changed that are retryable while the transaction is still in flight?

In order to correctly deal with in-flight transactions for non-idempotent transactions, would the following mechanism work?

  1. In the event we receive a transaction_timed_out or cluster_version_changed, wait for a specific duration of time, for example 10 seconds.

  2. By this time, the in-flight transaction would either have been committed or have been rejected due to the 5 second limit.

  3. After 10 seconds, attempt to read the sentinel key again. If we are able to read the sentinel key this time, the commit succeeded. A missing sentinel key would mean the previous transaction failed to commit and will not commit in the future.
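
To make the proposal concrete, here is a minimal sketch of the three steps above. It is only an illustration of the question being asked, not a confirmed-correct recovery procedure. All names (`resolve_in_flight_commit`, `read_sentinel`, `AMBIGUOUS_ERRORS`) are hypothetical and not part of any FoundationDB binding; `read_sentinel` stands in for whatever fresh read of the sentinel key the application would perform.

```python
import time
from typing import Callable, Optional

# Hypothetical set of error names this sketch treats as "commit may still be in flight".
AMBIGUOUS_ERRORS = {"transaction_timed_out", "cluster_version_changed"}


def resolve_in_flight_commit(
    error_name: str,
    read_sentinel: Callable[[], Optional[bytes]],
    wait_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bool]:
    """Return True if the sentinel is present after waiting, False if it is
    absent, and None if the error is not one this sketch handles."""
    if error_name not in AMBIGUOUS_ERRORS:
        return None

    # Step 1: wait longer than the 5 second limit so the in-flight
    # commit has either landed or been rejected (step 2).
    sleep(wait_seconds)

    # Step 3: re-read the sentinel key (assumed to be a fresh read in a
    # new transaction). Present means committed, absent means not committed.
    return read_sentinel() is not None
```

The `sleep` parameter is injectable only so the wait can be skipped in tests; a caller would normally leave it at the default.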

That would be an error in the garbage collection algorithm. For the design to work correctly, the garbage collection process must leave a threshold (for example: 72 hours) of sentinel keys untouched.
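
As a rough sketch of that constraint, the garbage collector could select only sentinel keys older than the threshold. The function name, the in-memory dict of write timestamps, and the way timestamps are obtained are all assumptions made for illustration; the 72 hour value is just the example figure from above.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def sentinels_safe_to_collect(
    sentinels: Dict[bytes, datetime],
    now: datetime,
    threshold: timedelta = timedelta(hours=72),
) -> List[bytes]:
    """Return the sentinel keys written strictly before (now - threshold).

    Keys written inside the threshold window are left untouched so that a
    client re-reading its sentinel can still find it.
    """
    cutoff = now - threshold
    return [key for key, written_at in sentinels.items() if written_at < cutoff]
```

For example, with a 72 hour threshold, a sentinel written 100 hours ago would be eligible for collection while one written an hour ago would be kept.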

I’ve updated the document to reflect this (tracking is now enabled, so you can easily find the edit).