Returning conflicting read/write conflict ranges after transaction conflict

alloc · June 28, 2019, 5:24pm

Debugging transaction conflicts is currently fairly difficult. The best way I’ve found so far (other than thinking really hard about what happened) is to (1) enable client trace logs on all operations, (2) enable transaction logging on all transactions, and (3) process the client logs and manually run the conflict algorithm over all of the transactions found (which requires having every transaction available to get it right). This is problematic because, especially in real environments, one might not have access to all conflict ranges from all transactions (that can be a lot of data for reasonable workloads).

One proposed solution, which I’d like to hear people’s thoughts on, is to return conflict information from the resolver back to the client. I believe the heart of the conflict resolution happens here:

github.com/apple/foundationdb/blob/a4f12a19a3a24bd68676bef8e629d6e855c4702c/fdbserver/SkipList.cpp#L825-L828

    if (nextS->length() == start.value.size() && !memcmp(nextS->value(), start.value.begin(), start.value.size()))
        return noConflict();
    else
        return conflict();

At this point, the resolver knows (1) the failing read conflict range from this transaction, (2) the read version, (3) the commit version at which the range was changed, and (4) an upper bound of the mutation range that caused the operation to fail. I think all of those would be useful in debugging transaction conflicts (except maybe the transaction read version, since the client should already know that).
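To make the four pieces of information concrete, here is a minimal sketch of a conflict check that reports them. It is a deliberately simplified linear scan, not the skip list the resolver actually uses, and every name in it (`KeyRange`, `CommittedWrite`, `ConflictInfo`, `findConflict`) is hypothetical rather than taken from the FoundationDB source.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Half-open key range [begin, end). Hypothetical type for illustration.
struct KeyRange {
    std::string begin;
    std::string end;
};

// A write range and the version at which it was committed (hypothetical).
struct CommittedWrite {
    KeyRange range;
    int64_t commitVersion;
};

// The four items listed above, bundled for return to the client (hypothetical).
struct ConflictInfo {
    KeyRange failedReadRange;          // (1) read conflict range that failed
    int64_t readVersion;               // (2) read version of the transaction
    int64_t conflictingCommitVersion;  // (3) version at which the range changed
    KeyRange mutationUpperBound;       // (4) upper bound on the mutated range
};

// Two half-open ranges overlap when each begins before the other ends.
bool overlaps(const KeyRange& a, const KeyRange& b) {
    return a.begin < b.end && b.begin < a.end;
}

// Returns details of the first read range that overlaps a write committed
// after the transaction's read version, or nullopt if there is no conflict.
std::optional<ConflictInfo> findConflict(const std::vector<KeyRange>& readRanges,
                                         int64_t readVersion,
                                         const std::vector<CommittedWrite>& history) {
    for (const auto& read : readRanges) {
        for (const auto& write : history) {
            if (write.commitVersion > readVersion && overlaps(read, write.range)) {
                return ConflictInfo{read, readVersion, write.commitVersion, write.range};
            }
        }
    }
    return std::nullopt;
}
```

A caller would pass the transaction's read conflict ranges and read version; a non-empty result carries everything needed to correlate the failure with other transactions.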

Knowing the failing read conflict range would be a big help by itself. Then the user can use this information to debug what might have happened based on domain knowledge of their data model. For example, in the Record Layer, knowing which range failed might be useful in determining whether it was a single record in the read set that was updated that caused the failure, or if concurrent writes would have caused a uniqueness constraint to be violated, or if the store header was updated due to some meta-data upgrade.

Knowing the commit version and mutation range can then be useful for correlating what happened to other operations going on at the same time. This is especially true if all transactions are being logged, but they have their uses in other instances as well.

The next question would be how to expose this to the user. The quickest way would be to log it in the client trace logs, maybe only if the user has set the “debug ID” option. Then it is up to the user to actually enable that logging. This has the disadvantage that the information must then be fished out of the log, but it would be available somewhere and wouldn’t require any API changes at the FDB client level.

All else being equal, the “best” option would be to make the exception class in each binding beefier to include additional information. For transaction conflicts, you could then include this information as methods in the exception (maybe lazily marshaling from the C client?). This doesn’t quite gel with how those classes are structured at the moment though.
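A rough sketch of what a “beefier” exception might look like, written in C++ only for illustration. The class and field names (`TransactionConflictError`, `ConflictDetails`) are hypothetical and do not correspond to any existing binding; the details are optional so that the exception still works when the resolver returns nothing extra.

```cpp
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical payload describing why the commit was rejected.
struct ConflictDetails {
    std::string rangeBegin;            // start of the failing read conflict range
    std::string rangeEnd;              // end of the failing read conflict range
    int64_t conflictingCommitVersion;  // version at which the range was changed
};

// Hypothetical conflict exception that optionally carries the details.
class TransactionConflictError : public std::runtime_error {
public:
    explicit TransactionConflictError(std::optional<ConflictDetails> details = std::nullopt)
        : std::runtime_error("transaction conflict"), details_(std::move(details)) {}

    // Empty when no conflict information was returned to the client.
    const std::optional<ConflictDetails>& details() const { return details_; }

private:
    std::optional<ConflictDetails> details_;
};
```

Code that does not care about the extra information can keep catching the exception as before, while a higher-level layer can inspect `details()` when it is present.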

You could also imagine methods on the transaction object that allowed the user to query the transaction for what happened (and then do…something when the transaction hasn’t committed yet or the transaction succeeded). A higher-level layer on top of the key-value store bindings could then package up the information from those methods with a transaction conflict exception so that the user could ask the exception for more details about what happened.

zhongyan · June 16, 2022, 9:22am

Hi Alec,
I’m curious whether your proposed solution is implemented in the current FDB release. We are investigating a transaction conflict issue and trying to find out which key range is causing the conflict. Does anyone know an easy way?

zhongyan · June 16, 2022, 9:59am

I got the answer from here: apple/foundationdb Pull Request #2257, “Report conflicting keys” by zjuLcg.

andrew.noyes · June 16, 2022, 4:07pm

The Special Keys page of the FoundationDB 7.1 documentation has some more detail about how to use this. Also see the report_conflicting_keys transaction option. Edit: also make sure the API version is at least 630.
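For anyone post-processing the result, here is a minimal sketch of turning the rows read back from the conflicting-keys special key range into ranges. It assumes, and this should be checked against the Special Keys documentation, that the rows live under the prefix `\xff\xff/transaction/conflicting_keys/` and that a value of `1` marks the start of a conflicting range while `0` marks its end. The function and type names are hypothetical, and the rows are passed in as plain strings, so no client library call is shown.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: one conflicting range [begin, end).
struct ConflictingRange {
    std::string begin;
    std::string end;
};

// Pairs up boundary rows (key, value) into ranges.
// Assumption: value "1" opens a range and value "0" closes it.
std::vector<ConflictingRange> decodeConflictingKeys(
    const std::vector<std::pair<std::string, std::string>>& rows) {
    // Assumed prefix of the special key range; verify against the docs.
    static const std::string prefix = "\xff\xff/transaction/conflicting_keys/";
    std::vector<ConflictingRange> ranges;
    std::string pendingBegin;
    bool open = false;
    for (const auto& [key, value] : rows) {
        if (key.compare(0, prefix.size(), prefix) != 0) {
            continue;  // skip rows outside the expected prefix
        }
        std::string userKey = key.substr(prefix.size());
        if (value == "1") {
            pendingBegin = userKey;
            open = true;
        } else if (value == "0" && open) {
            ranges.push_back({pendingBegin, userKey});
            open = false;
        }
    }
    return ranges;
}
```

The rows would come from a range read issued on the same transaction after its commit fails, with report_conflicting_keys set beforehand.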
