Reading your own write conflict range

I’m thinking about a feature for the special key space where you can read a transaction’s read conflict range and write conflict range for debugging or logging purposes.

I realized this has a weird interaction with the set_versionstamped_key feature. If a transaction sets a versionstamped key, then its write conflict range is unknown until commit time.

My first reaction was to just disallow reading your own write conflict range for transactions that use set_versionstamped_key, but I’m curious what other people think is the right thing to do here. Maybe the read should block until the write conflict range is known?

Isn’t set_versionstamped_key meant to avoid transaction conflicts?

Is it possible for two transactions to conflict on the versionstamped key? (They may conflict on other keys in the transactions.)

If the versionstamped key is not supposed to conflict, I would prefer not to expose the conflict key for that particular set_versionstamped_key call.

The rest of the conflicting keys in the transaction can still be useful. Is it possible to expose the rest of them and have a flag indicating that not all conflict keys (such as the versionstamped key) have been exposed?

You would still have the write conflict from writing the versionstamp. So, for example, if you had a subspace where each key was written with a versionstamp, and you read the entire subspace, then your transaction could fail with a conflict if another transaction concurrently adds a new entry to the subspace.
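To make the scenario above concrete, here is a minimal sketch of the overlap check. It is a toy model rather than the resolver's actual logic: the `log/` subspace, the `ranges_overlap` helper, and the example commit version are all made up for illustration. The only thing taken from FoundationDB itself is the shape of a versionstamp (10 bytes: an 8-byte commit version followed by a 2-byte batch order).

```python
def ranges_overlap(a_begin: bytes, a_end: bytes, b_begin: bytes, b_end: bytes) -> bool:
    """Two half-open byte ranges [begin, end) overlap if each starts before the other ends."""
    return a_begin < b_end and b_begin < a_end


# Hypothetical subspace in which every key is written with a versionstamp.
subspace = b"log/"

# Transaction A reads the entire subspace, so its read conflict range covers it.
read_range = (subspace, subspace + b"\xff")

# Transaction B appends an entry. Once B commits, the versionstamp resolves to
# 10 bytes: an 8-byte commit version plus a 2-byte batch order (values made up).
versionstamp = (1234).to_bytes(8, "big") + (0).to_bytes(2, "big")
written_key = subspace + versionstamp

# A single-key write is modeled as the range [key, key + b"\x00").
write_range = (written_key, written_key + b"\x00")

# B's write lands inside A's read range, so A would fail with a conflict.
conflict = ranges_overlap(*read_range, *write_range)
```

So the versionstamp guarantees that two writers never collide on the same key, but it does not hide the write from readers of the surrounding range.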

As to the original question, I suppose the approach of only completing the read after the conflict ranges are known would work, though it would mean you could only query this index after you commit (and I could see someone accidentally writing code that reads from this range before committing and ending up with a program that hangs). I suppose another approach would be to use the “fake” conflict range that the RYW cache uses to avoid reading a versionstamped key, which would be a range that contains all possible keys the versionstamp might be resolved to. It’s a bit of a blunt tool, though, so I’m not sure it’s better.
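As a rough illustration of the “fake” conflict range idea, the sketch below computes a range wide enough to contain every key a versionstamped key could resolve to. This is a simplified model under stated assumptions, not the RYW cache’s actual code: the function name `possible_versionstamp_range` is hypothetical, and the real implementation may well use tighter bounds (for example, a lower bound derived from a known version).

```python
VERSIONSTAMP_LEN = 10  # 8-byte commit version + 2-byte batch order


def possible_versionstamp_range(key_template: bytes, offset: int) -> tuple[bytes, bytes]:
    """Return a half-open range [begin, end) covering every key the template
    could become once the 10 placeholder bytes at `offset` are filled in.

    Hypothetical helper: the lower bound uses all-zero bytes and the upper
    bound all-0xFF bytes, which is the widest (bluntest) possible choice.
    """
    if offset < 0 or offset + VERSIONSTAMP_LEN > len(key_template):
        raise ValueError("versionstamp does not fit inside the key")
    prefix = key_template[:offset]
    suffix = key_template[offset + VERSIONSTAMP_LEN:]
    begin = prefix + b"\x00" * VERSIONSTAMP_LEN + suffix
    # Append b"\x00" so the largest possible key is still inside the half-open range.
    end = prefix + b"\xff" * VERSIONSTAMP_LEN + suffix + b"\x00"
    return begin, end


# Example: a key of the form b"log/" + <versionstamp> + b"/x".
template = b"log/" + b"\x00" * VERSIONSTAMP_LEN + b"/x"
begin, end = possible_versionstamp_range(template, offset=4)

# Any concrete versionstamp falls inside the range (values made up).
resolved = b"log/" + (1234).to_bytes(8, "big") + b"\x00\x01" + b"/x"
inside = begin <= resolved < end
```

The range is much larger than the single key that will eventually be written, which is the bluntness mentioned above: a read anywhere in it looks like a potential conflict.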


I think this is still a special case that only matters when users want to know their transaction’s conflict key range.

As to the versionstamped key conflict, how about replying versionstamped_key_conflict_unknown to the user?

I think this gives users enough debugging and logging information (correct me if I’m wrong) to figure out the conflict keys:
If the transaction conflicts on non-versionstamped keys, it will receive the exact conflict range;
If the transaction conflicts on versionstamped keys ONLY, the message versionstamped_key_conflict_unknown tells users that the versionstamped keys conflicted.

I’m assuming users usually don’t have a transaction that sets lots of versionstamped keys.

This feels like it would be better to me than omitting it, mainly because it seems like it would be easier to reason about a range that is too big (i.e. an overlapping read might conflict) vs. a range that’s just missing, which would either mislead you or require you to be very aware of what’s going on. It also feels a little better to me than returning an error because even though it might not be fully accurate, it doesn’t seem like it’s inaccurate in a dangerous way and the data will still be useful. In some ways, it may actually be better to see the whole possible affected range of a versionstamp if you are trying to evaluate the impact of a transaction before committing it. Maybe I’m wrong about that though.

The idea of blocking until commit time only if we wrote a versionstamped key feels a little clunky. If there’s a use-case for reading it before commit, then that use-case becomes incompatible with versionstamped keys. If there’s not a use-case, then we could just always block.

I’m hearing that returning an error and omitting it are out, so I’ll use the “fake” conflict range @alloc described if the transaction hasn’t committed yet, and the precise conflict range if it has.
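A minimal sketch of that decision, assuming a simplified model where the committed versionstamp is either unknown (`None`) or a 10-byte value. The function name `reported_write_conflict_range` and its parameters are hypothetical and not part of any FoundationDB API; the wide bounds are the bluntest possible choice and a real implementation may tighten them.

```python
from typing import Optional

VERSIONSTAMP_LEN = 10  # 8-byte commit version + 2-byte batch order


def reported_write_conflict_range(
    key_template: bytes,
    offset: int,
    committed_versionstamp: Optional[bytes] = None,
) -> tuple[bytes, bytes]:
    """Return the half-open write conflict range to report for one versionstamped key.

    Before commit (versionstamp unknown): a wide range covering every possible key.
    After commit: the precise single-key range.
    """
    if offset < 0 or offset + VERSIONSTAMP_LEN > len(key_template):
        raise ValueError("versionstamp does not fit inside the key")
    prefix = key_template[:offset]
    suffix = key_template[offset + VERSIONSTAMP_LEN:]

    if committed_versionstamp is None:
        # Not committed yet: report the wide "fake" range.
        begin = prefix + b"\x00" * VERSIONSTAMP_LEN + suffix
        end = prefix + b"\xff" * VERSIONSTAMP_LEN + suffix + b"\x00"
        return begin, end

    if len(committed_versionstamp) != VERSIONSTAMP_LEN:
        raise ValueError("a versionstamp is exactly 10 bytes")
    # Committed: the key is fully known, so report exactly [key, key + b"\x00").
    key = prefix + committed_versionstamp + suffix
    return key, key + b"\x00"


template = b"log/" + b"\x00" * VERSIONSTAMP_LEN
before = reported_write_conflict_range(template, offset=4)

stamp = (1234).to_bytes(8, "big") + (0).to_bytes(2, "big")  # made-up value
after = reported_write_conflict_range(template, offset=4, committed_versionstamp=stamp)
```

In this model the precise range reported after commit always lies inside the wide range reported before it, so the pre-commit answer can be too big but never misses the key.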