In a transaction, if I update the \xff/metadataVersion key and then attempt to read it back, I get a “Read or wrote an unreadable key” error (1036). That was a bit surprising at first, but it seems to be a side effect of using versionstamps, whose value is not known until after the commit, so I assume this is the expected behavior.
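For reference, here is a minimal repro of the read failure, written against the Python bindings purely for illustration (assuming the usual 14-byte parameter for the versionstamped value: a 10-byte placeholder followed by a 4-byte little-endian offset):

```python
import fdb

fdb.api_version(610)
db = fdb.open()

METADATA_VERSION_KEY = b'\xff/metadataVersion'

@fdb.transactional
def bump_then_read(tr):
    # Bump the global metadata version: the final value is a versionstamp
    # that is only known once the transaction commits.
    tr.set_versionstamped_value(METADATA_VERSION_KEY, b'\x00' * 14)
    # Reading the key back in the same transaction fails with
    # error 1036 (accessed_unreadable), since the value cannot be known yet.
    return tr[METADATA_VERSION_KEY].wait()

bump_then_read(db)  # raises fdb.FDBError with code 1036
```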
Looking at the implementation in the Record Layer, it seems to specifically catch this error and return a null versionstamp instead. I changed my code to do the same thing when the read fails.
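In Python-binding terms, the guarded read looks roughly like this (a sketch of the pattern, continuing the snippet above; this is not the actual Record Layer code):

```python
@fdb.transactional
def bump_then_read_guarded(tr):
    # The key has already been mutated in this transaction...
    tr.set_versionstamped_value(METADATA_VERSION_KEY, b'\x00' * 14)
    # ... so the read is guarded: swallow error 1036 and report "unknown".
    try:
        return tr[METADATA_VERSION_KEY].wait()
    except fdb.FDBError as e:
        if e.code == 1036:  # accessed_unreadable
            return None     # the "null versionstamp"
        raise
```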
My surprise is that if I then attempt to commit the transaction, the commit also fails with the same error … ? Using snapshot isolation or not has no impact. The error is thrown by fdb_transaction_commit, and is rethrown by fdb_transaction_on_error.
Transaction #5 (read/write, 4 operations, '#' = 0.5 ms, started 22:24:26.1274739Z, ended 22:24:26.1584759Z)
┌ oper. ┬─────────────────────────────────────────────────────────────────┬──── start ──── end ── duration ──┬─ sent recv ┐
│ 0 a │ X │ T+ 1.841 ~ 2.196 ( 355 µs) │ 31 │ Atomic_VersionStampedValue <FF>/metadataVersion, <00 00 00 00 00 00 00 00 00 00>
│ 1 mv°│ :###` │ T+ 2.199 ~ 3.746 ( 1,547 µs) │ │ GetMetadataVersion => [AccessedUnreadable] => <null>
│ 2 !Co*│ _______$###################################################X │ T+ 5.270 ~ 30.141 ( 24,871 µs) │ │ Commit => [AccessedUnreadable] Read or wrote an unreadable key
│ 3 !Er°│ ____________________________________________________________=## │ T+ 30.516 ~ 31.707 ( 1,192 µs) │ │ OnError AccessedUnreadable (1036) => [AccessedUnreadable] Read or wrote an unreadable key
└────────┴─────────────────────────────────────────────────────────────────┴──────────────────────────────────┴─────────────┘
note: the timings are skewed because this is a unit test and the JIT has to compile all the code on the fly!
What, then, is the point of catching the error when reading the key, if the same error will inevitably be thrown at commit time? It means that any combination of layers, one that mutates the schema and another that attempts to use a cache, will systematically throw at commit time.
The Record Layer keeps a dirtyMetaDataVersionStamp flag that records whether the key has already been changed in this transaction, so that it does not even attempt to read it (and doom the transaction); the pattern is roughly sketched after the list below. But:
- it only works locally within the Record Layer, not if another layer changes the key separately.
- it seems racy (no locks) if multiple threads use the same transaction.
- if the read still happens somehow and the error is caught, the code sets the flag to true, but the transaction will still fail to commit later, so … ??
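For illustration, here is the dirty-flag pattern as I understand it, transposed to Python (the names and structure are mine, not the actual Record Layer code), with the issues above marked in comments:

```python
import fdb

METADATA_VERSION_KEY = b'\xff/metadataVersion'

class MetadataVersionCache:
    """Sketch of the dirty-flag pattern (illustration only)."""

    def __init__(self):
        # Plain field, no lock: two threads sharing the transaction can race
        # between the check in read() and the write in bump().
        self.dirty_metadata_version_stamp = False

    def bump(self, tr):
        tr.set_versionstamped_value(METADATA_VERSION_KEY, b'\x00' * 14)
        self.dirty_metadata_version_stamp = True

    def read(self, tr):
        if self.dirty_metadata_version_stamp:
            # We already mutated the key: reading it would doom the commit,
            # so report "unknown version" instead.
            return None
        try:
            return tr[METADATA_VERSION_KEY].wait()
        except fdb.FDBError as e:
            if e.code == 1036:
                # Some other layer mutated the key behind our back: setting
                # the flag now is too late, the commit will still fail with 1036.
                self.dirty_metadata_version_stamp = True
                return None
            raise
```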
Either I am missing something critical, or it is impossible to compose two layers that both use the metadataVersion key for caching inside the same read/write transaction?
I could maybe see moving this “dirty versionstamp” protection logic from layer code down into the binding itself, so that no layer can slip past it, but then I’m not sure this would be race-proof when both a read and a write of the metadataVersion key happen in the same transaction.
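Something like the sketch below, where the wrapper type, the names and the locking strategy are pure speculation on my part (and whether holding a lock while issuing the read is actually enough at the fdb_c level is exactly the part I am unsure about):

```python
import threading
import fdb

METADATA_VERSION_KEY = b'\xff/metadataVersion'

class GuardedTransaction:
    """Hypothetical wrapper that centralizes the 'dirty metadataVersion' check
    in the binding, instead of leaving it to each layer."""

    def __init__(self, tr):
        self._tr = tr
        self._lock = threading.Lock()
        self._metadata_version_dirty = False

    def bump_metadata_version(self):
        with self._lock:
            self._tr.set_versionstamped_value(METADATA_VERSION_KEY, b'\x00' * 14)
            self._metadata_version_dirty = True

    def get_metadata_version(self):
        with self._lock:
            if self._metadata_version_dirty:
                return None  # never issue the read that would doom the commit
            # Issue the read while holding the lock, so a concurrent bump cannot
            # slip in between the check and the read request.
            future = self._tr.get(METADATA_VERSION_KEY)
        return future.wait()
```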