Transaction Cancelled Errors

arshia · August 24, 2020, 10:48am

We’ve been facing a lot of Transaction Cancelled errors (Operation aborted because transaction was cancelled: 1025) lately. We have identical, unique transactions of ~500 KB each being transacted by 6 Java clients in a loop. We do not have any timeout options set for these transactions. We were running these with a performance testing setup in mind, and the write rate observed was ~25000 kHz.

Our cluster configuration:

5 machines
triple replication
1 SSD per machine

The process class configuration:

Machine 1: 2 storage + 1 proxy + 1 stateless
Machine 2: 2 storage + 1 proxy + 1 stateless
Machine 3: 2 storage + 1 proxy + 1 stateless + 1 log
Machine 4: 2 storage + 1 log + 1 stateless
Machine 5: 2 storage + 1 log + 1 stateless

Interestingly enough, our logs suggest that some of these transactions had already been committed when this error was encountered.

Any insight into this would be helpful!

ajbeamon · August 24, 2020, 3:32pm

transaction_cancelled can get thrown by an operation for a few reasons:

1. You call cancel on your transaction.
2. The transaction is destroyed while an operation is outstanding. For example, if you start a read and don’t wait for the result, closing the transaction may cause that operation to throw this error.
3. You reset your transaction (I don’t think this is possible in Java).
4. You retry a transaction that has outstanding operations (using onError or the default retry loops).
5. If you happen to be running an API version before 410, then I think commit could put the transaction into a state where any subsequently started operations may be cancelled.

If you have transactions that are successfully committing and then throwing this error in some operation, then I suspect what’s happening is you are hitting #2. In the case that you commit a transaction, though, it will wait for outstanding reads to complete before the commit can succeed. That would mean that in order to trigger this case, you would need to be starting operations after the commit has started, and it would be these operations that would see the error.
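To make that sequence concrete, here is a minimal sketch in plain Java. It is a toy model, not the FoundationDB Java API: ToyTransaction, ToyError, awaitedRead and lateRead are hypothetical names invented for illustration, and the only behaviour modelled is what is described above (commit waits for reads that were already started, and destroying the transaction cancels whatever is still outstanding).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

public class CancelDemo {
    // Error code quoted in the original post.
    static final int TRANSACTION_CANCELLED = 1025;

    /** Hypothetical stand-in for a client error that carries a numeric code. */
    static class ToyError extends RuntimeException {
        final int code;
        ToyError(int code, String message) {
            super(message);
            this.code = code;
        }
    }

    /** Toy transaction (an assumption for illustration, not the real client). */
    static class ToyTransaction {
        private final List<CompletableFuture<String>> reads = new ArrayList<>();

        // Start a read; it stays pending until the caller completes it.
        CompletableFuture<String> get(String key) {
            CompletableFuture<String> read = new CompletableFuture<>();
            reads.add(read);
            return read;
        }

        // Commit waits only for the reads that exist at the time it is called.
        CompletableFuture<Void> commit() {
            return CompletableFuture.allOf(reads.toArray(new CompletableFuture<?>[0]));
        }

        // Destroying the transaction fails every read that is still pending.
        // Reads that already completed are unaffected.
        void close() {
            for (CompletableFuture<String> read : reads) {
                read.completeExceptionally(
                    new ToyError(TRANSACTION_CANCELLED, "transaction was cancelled"));
            }
        }
    }

    // Returns 0 if the future succeeded, otherwise the ToyError code (-1 if unknown).
    static int errorCodeOf(CompletableFuture<?> future) {
        try {
            future.join();
            return 0;
        } catch (CompletionException e) {
            Throwable cause = e.getCause();
            return (cause instanceof ToyError) ? ((ToyError) cause).code : -1;
        }
    }

    // Read is started and answered before commit, then the transaction is closed.
    public static int awaitedRead() {
        ToyTransaction tr = new ToyTransaction();
        CompletableFuture<String> read = tr.get("a");
        read.complete("value");      // simulated reply from the cluster
        tr.commit().join();
        tr.close();
        return errorCodeOf(read);
    }

    // Read is started after commit has begun and is never awaited.
    public static int lateRead() {
        ToyTransaction tr = new ToyTransaction();
        CompletableFuture<String> early = tr.get("a");
        CompletableFuture<Void> commit = tr.commit();
        CompletableFuture<String> late = tr.get("b"); // commit does not wait on this
        early.complete("value");     // simulated reply for the early read
        commit.join();               // commit succeeds
        tr.close();                  // "late" is still outstanding and gets cancelled
        return errorCodeOf(late);
    }

    public static void main(String[] args) {
        System.out.println("awaited read error code: " + awaitedRead());
        System.out.println("late read error code: " + lateRead());
    }
}
```

In this model the awaited read reports 0 (no error), while the read started after commit reports 1025 even though the commit itself succeeded. If the real application shows the same pattern, the place to look would be any read or other operation that is started but not awaited before the transaction goes out of scope.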

Is that something that is plausibly happening in your application?
