Goal: be able to detect database property failures in simulation very precisely, even when running highly randomized workloads that may deliberately weaken serializability or simply be nondeterministic in nature. See for example issue 126, which I guess must have slipped through a crack in the existing workloads?
The basic idea is that rather than trying to determine whether an arbitrary result is consistent with some legal behavior of the database, which is a challenging search problem in general, we take advantage of the fact that FoundationDB's implementation actually constructs a (supposedly) valid serialization order, and check the strictly stronger property that the observed results match that order.
Start with an arbitrary concurrent workload, running in simulation and using the API fairly freely (though presumably staying away from system keys and management APIs). The distribution of behavior matters, of course, and there is plenty of room for creativity.
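As a very rough illustration of such a workload client (in Python for brevity; a real implementation would be a C++ simulation workload in fdbserver, and the key prefix and operation mix here are just placeholders):

```python
import os
import random

import fdb

fdb.api_version(710)

KEYSPACE = b'wl/'  # hypothetical prefix, well away from the \xff system keyspace

def random_key():
    return KEYSPACE + bytes([random.randrange(16)])

def apply_random_ops(tr):
    """Issue a random mixture of reads and writes against one transaction."""
    for _ in range(random.randrange(1, 8)):
        op = random.choice(['get', 'range', 'set', 'clear', 'add'])
        if op == 'get':
            tr.get(random_key()).wait()
        elif op == 'range':
            list(tr.get_range(KEYSPACE, KEYSPACE + b'\xff', limit=10))
        elif op == 'set':
            tr.set(random_key(), os.urandom(8))
        elif op == 'clear':
            tr.clear(random_key())
        else:  # atomic mutation
            tr.add(random_key(), (1).to_bytes(8, 'little'))
```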
Have each simulated workload client keep, out of band (safe from simulated failures and above the level of the RYW API), a record of all its read and write transactions, including: true simulation times before and after each attempt; an identifier unique to each transaction attempt (retry); snapshot versions and, where commit() succeeds, commit versions; and the details of its reads and writes. It's probably OK to store only a hash of the results of reads.
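A minimal sketch of what such a record might hold; the type and field names are mine, not an existing API:

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TxnRecord:
    txn_id: bytes                         # unique per transaction attempt (retry)
    start_time: float                     # true simulation time before the attempt
    end_time: Optional[float] = None      # true simulation time after it resolves
    read_version: Optional[int] = None    # snapshot version used for reads
    commit_version: Optional[int] = None  # recorded only when commit() succeeds
    outcome: str = 'unknown'              # 'committed' | 'not_committed' | 'unknown'
    ops: list = field(default_factory=list)  # reads and writes, in issue order

def hash_result(value) -> bytes:
    """Reads need only a digest of their result; tag absent keys so that
    'missing' and 'empty value' hash differently."""
    data = b'\x00' if value is None else b'\x01' + bytes(value)
    return hashlib.sha256(data).digest()
```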
Augment each transaction with a versionstamped operation (either key or value) that includes the unique identifier.
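In the Python bindings this might look like the following sketch; LOG_PREFIX and the log layout are assumptions, not part of any existing workload:

```python
import fdb

fdb.api_version(710)

LOG_PREFIX = b'oplog/'  # hypothetical subspace for the serialization log

def stamp_transaction(tr, txn_id: bytes):
    # The incomplete Versionstamp is filled in at commit time with the commit
    # version and batch order, so the log keys sort in serialization order.
    key = fdb.tuple.pack_with_versionstamp(
        (fdb.tuple.Versionstamp(), txn_id), prefix=LOG_PREFIX)
    tr.set_versionstamped_key(key, b'')
```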
After the workload is complete, the versionstamped operation log tells us the final serialization order of all transactions that were attempted by any client: if a transaction was read-only, it serializes at (the end of?) its snapshot version; if it was read/write and does not appear in the operation log, it did not occur; and if it does appear, it serializes in exactly that order.
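A sketch of reading the order back, continuing the assumptions above:

```python
def committed_order(db):
    """txn_ids of committed read/write transactions, in serialization order."""
    order = []
    for kv in db.get_range_startswith(LOG_PREFIX):  # LOG_PREFIX from above
        stamp, txn_id = fdb.tuple.unpack(kv.key[len(LOG_PREFIX):])
        order.append((stamp, txn_id))
    return order  # versionstamped keys already sort in commit order

def commit_version(stamp):
    # The first 8 bytes of the 10-byte transaction versionstamp are the
    # database commit version; read-only transactions (absent from the log)
    # can be merged in by comparing their snapshot version against these.
    return int.from_bytes(stamp.tr_version[:8], 'big')
```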
Combining this information with the out-of-band logs from each client, we can check, separately:
(1) that the serialization order is compatible with the before and after times recorded for each transaction (external consistency); that every transaction the client saw successfully commit() serialized (durability); and that transactions for which the client received not_committed did not serialize, and
(2) that every read, and the final contents of the database, are exactly what would be predicted by executing each operation sequentially in the serialization order (isolation, atomicity, and many other properties), using MemoryKeyValueStore (and perhaps ideally an independent implementation of the atomic mutations, as in the specific workload for that?). A sketch of both checks appears after this list.
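A sketch of both checks, using the assumed TxnRecord and helpers from the earlier sketches; a plain dict stands in for MemoryKeyValueStore:

```python
def check_history(order, records):
    """order: txn_ids in serialization order (committed transactions from the
    versionstamp log, with read-only transactions merged in by snapshot
    version). records: {txn_id: TxnRecord}, collected out of band."""
    serialized = set(order)

    # (1) durability / not_committed: commit() outcomes must match the log.
    # Attempts that ended in commit_unknown_result may legitimately go either way.
    for rec in records.values():
        if rec.outcome == 'committed':
            assert rec.txn_id in serialized, 'acknowledged commit did not serialize'
        elif rec.outcome == 'not_committed':
            assert rec.txn_id not in serialized, 'not_committed txn serialized'

    # (1) external consistency: a transaction may not serialize before one
    # that finished, in true simulation time, before it even started.
    latest_start = float('-inf')
    for txn_id in order:
        rec = records[txn_id]
        assert rec.end_time >= latest_start, 'order violates external consistency'
        latest_start = max(latest_start, rec.start_time)

    # (2) sequential replay of every operation, in serialization order.
    store = {}
    for txn_id in order:
        for kind, key, payload in records[txn_id].ops:
            if kind == 'get':
                assert hash_result(store.get(key)) == payload, 'read mismatch'
            elif kind == 'set':
                store[key] = payload
            elif kind == 'clear':
                store.pop(key, None)
            # atomic mutations would be replayed here by an independent
            # implementation, per the suggestion above
    return store  # predicted final contents, to compare against the database
```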
Together these properties should more or less precisely validate the database's contract. The only restriction on workloads is that a transaction can't have other writes without also having a versionstamped write, but it is hard for me to imagine a realistic bug that couldn't manifest within that restriction.
It would be interesting to implement this technique as a wrapper around the transaction API, so that it could easily be reused by different workloads.
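One possible shape for such a wrapper, again sketched in Python against the assumed types above; a real wrapper would sit on the RYW transaction and also cover range reads, atomic mutations, and watches:

```python
import time
import uuid

import fdb

fdb.api_version(710)

class CheckedTransaction:
    """Delegates to an fdb transaction while recording a TxnRecord out of band
    (TxnRecord, hash_result, and stamp_transaction from the sketches above)."""

    def __init__(self, tr, log):
        self._tr = tr
        # In simulation this would read the deterministic simulated clock,
        # not wall-clock time.
        self.rec = TxnRecord(txn_id=uuid.uuid4().bytes, start_time=time.time())
        log.append(self.rec)
        stamp_transaction(tr, self.rec.txn_id)

    def get(self, key):
        value = self._tr.get(key).wait()
        self.rec.ops.append(('get', key, hash_result(value)))
        return value

    def set(self, key, value):
        self.rec.ops.append(('set', key, value))
        self._tr.set(key, value)

    def clear(self, key):
        self.rec.ops.append(('clear', key, None))
        self._tr.clear(key)

    def commit(self):
        try:
            self._tr.commit().wait()
            self.rec.outcome = 'committed'
            self.rec.commit_version = self._tr.get_committed_version()
        except fdb.FDBError as e:
            # 1020 is not_committed; anything else leaves the outcome unknown
            self.rec.outcome = 'not_committed' if e.code == 1020 else 'unknown'
            raise
        finally:
            self.rec.end_time = time.time()
```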
Thoughts?