According to the documentation:
Transaction size cannot exceed 10,000,000 bytes of affected data. Keys, values, and ranges that you read or write are all included as affected data.
From reading the code and testing it myself, it seems the size of the values you read is not part of the calculation. I wrote 15 MB of data into a range across multiple transactions, then read it all back in a single transaction with one large range read, and did not encounter transaction_too_large as I would have expected.
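For reference, the test was roughly the following (a simplified sketch, not the exact script I ran; it assumes the Python bindings and a running cluster, and the key/value sizes are illustrative):

```python
import fdb

fdb.api_version(630)
db = fdb.open()

VALUE = b"x" * 50_000        # 50 KB per value
KEYS_PER_TXN = 100           # ~5 MB written per transaction, well under the limit

@fdb.transactional
def write_chunk(tr, start):
    for i in range(start, start + KEYS_PER_TXN):
        tr[fdb.tuple.pack(("big", i))] = VALUE

@fdb.transactional
def read_everything(tr):
    # One large range read over all ~15 MB of values.
    r = fdb.tuple.range(("big",))
    return sum(len(kv.value) for kv in tr.get_range(r.start, r.stop))

# 300 keys * 50 KB = ~15 MB total, written across three transactions.
for start in range(0, 300, KEYS_PER_TXN):
    write_chunk(db, start)

print(read_everything(db))   # completed without transaction_too_large (2101)
```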
From reading the code, it seems the only way to hit this limit on reads would be to read so many individual keys or ranges that the keys themselves, not the values, add up to more than 10 MB.
Additionally, from my understanding of the internals, the size of read values should not really be an issue in the way that writes and conflict ranges are. Large writes are expensive for obvious reasons, and many individual conflict ranges make conflict detection take longer. But the data you read comes straight from the storage servers, which don't do much more than return it.
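To make the distinction concrete, here is roughly what I mean (sketch only, over a hypothetical "item" keyspace; this assumes my reading of the code is right, i.e. that it's the accumulated read keys and conflict ranges that count, not the bytes of the values returned):

```python
import fdb

fdb.api_version(630)
db = fdb.open()

@fdb.transactional
def many_point_reads(tr, n):
    # Each individual get adds its own read conflict range, so with enough
    # distinct keys the accumulated key bytes could eventually matter.
    total = 0
    for i in range(n):
        v = tr[fdb.tuple.pack(("item", i))]
        if v.present():
            total += len(v)
    return total

@fdb.transactional
def one_range_read(tr):
    # A single range read adds one conflict range covering the whole span,
    # no matter how many bytes of values it returns.
    r = fdb.tuple.range(("item",))
    return sum(len(kv.value) for kv in tr.get_range(r.start, r.stop))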
Is my test wrong somehow, or could the documentation be more specific about how the calculation is done?
The use case I'm interested in: if the new storage engine enables longer-running transactions through more sophisticated multi-versioning, it would become possible to do snapshot reads over large amounts of data. Making this part of the documentation clearer would make that use case more obvious to other users.
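For example, what I have in mind would look something like this (sketch only; "log" is a hypothetical subspace, and this assumes tr.snapshot behaves as documented, i.e. snapshot reads don't add read conflict ranges):

```python
import fdb

fdb.api_version(630)
db = fdb.open()

@fdb.transactional
def scan_everything(tr):
    r = fdb.tuple.range(("log",))
    # Snapshot reads go through tr.snapshot and do not register read
    # conflict ranges, so a long scan would not bloat the conflict set.
    return sum(len(kv.value) for kv in tr.snapshot.get_range(r.start, r.stop))

total_bytes = scan_everything(db)
```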
Edit: Additionally, can you even hit this limit at all in a read-only transaction, given that read-only transactions never commit?