Inducing a Read Conflict

My team and I are working on a distributed clock that ticks up a counter once a second. For redundancy, we wanted to have multiple “time keepers” that update the clock in case one of them crashes. To prevent the clock from being ticked twice, we of course need to cause transaction conflicts. Here’s how we did it:

for {
  db.Transact(func(tr fdb.Transaction) (interface{}, error) {

    // Reset our 1s timer.
    timer.Reset(time.Second)

    // Without getting the read version, the
    // conflict doesn't seem to occur. Why?
    rvp := tr.GetReadVersion()

    // Add the read conflict manually since
    // we aren't actually performing a read.
    tr.AddReadConflictKey(fdb.Key("clock_key"))

    // Wait for the timer to expire.
    <-timer.C

    // Increment the clock's value by 1.
    tr.Add(c.Key, []byte{1,0,0,0,0,0,0,0})

    // Make sure we got a read version
    // before committing.
    rvp.MustGet()

    return nil, nil
  })
}

Error handling has been removed to keep the example clean. My question is: why do I need to get the read version to induce a conflict?

My guess is that the read version is compared to other transactions’ commit versions to decide whether these transactions can conflict. Because I didn’t perform a read, my transaction never got a read version and therefore would never be checked for conflict.

Getting a read version in your code helps to generate conflicts because of how conflicts are detected: a transaction conflicts if any key in its read conflict set was changed between the version at which it read (the read version) and the version of your transaction’s commit.

When you don’t manually get a read version or read a key, a read version will be fetched for you automatically at commit time. Because of the wait occurring in the middle of your transaction, the read version fetched at commit will be much later than one fetched at the beginning. This means that when you aren’t getting the read version manually, your transactions will be much less likely to overlap and conflict.
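A toy model may make the two paragraphs above concrete. This is only a sketch, not FoundationDB's actual resolver: the `resolver` type and its `conflicts` method are hypothetical names, and versions are plain integers. It assumes a transaction conflicts when a key in its read conflict set was committed at a version later than the transaction's read version.

```go
package main

import "fmt"

// resolver is a hypothetical, simplified stand-in for conflict detection.
// lastWrite maps each key to the commit version of its most recent write.
type resolver struct {
	lastWrite map[string]int64
}

// conflicts reports whether any key in readKeys was written at a version
// newer than readVersion, i.e. after the transaction's snapshot was taken.
func (r *resolver) conflicts(readVersion int64, readKeys []string) bool {
	for _, k := range readKeys {
		if v, ok := r.lastWrite[k]; ok && v > readVersion {
			return true
		}
	}
	return false
}

func main() {
	// Another time keeper committed a tick to clock_key at version 100.
	r := &resolver{lastWrite: map[string]int64{"clock_key": 100}}

	// Read version taken before the wait (version 90): the tick is newer.
	fmt.Println(r.conflicts(90, []string{"clock_key"})) // true

	// Read version fetched at commit time (version 105): nothing newer.
	fmt.Println(r.conflicts(105, []string{"clock_key"})) // false
}
```

The second call corresponds to the case where no read version is requested explicitly: the snapshot is taken so late that the other time keeper's tick is already part of it, so there is nothing to conflict with.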

The code above is still potentially prone to this problem. If getting a read version takes a long time (say, 1 second), then the fact that you aren’t waiting for it before starting your timer means that the time between your read version and commit may still be small.

What kind of guarantees are you looking to provide with this clock? Does it need to tick at intervals >= 1 second? If so, then I think it would be necessary for you to wait for a read version before starting the timer, as that would mostly ensure that you didn’t see any changes in the second leading up to your commit (though this may still not be strictly guaranteed).
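The ordering suggested above could look roughly like this. To keep the sketch runnable without a cluster, `clockTxn` is a hypothetical interface covering only the operations the question's code uses (its method names are assumptions, not the fdb binding's exact API), and `fakeTxn` simply records the order of calls.

```go
package main

import (
	"fmt"
	"time"
)

// clockTxn is a hypothetical stand-in for the transaction operations used in
// the question; it is not the real fdb.Transaction type.
type clockTxn interface {
	ReadVersion() (int64, error) // blocks until the read version is known
	AddReadConflictKey(key []byte) error
	Add(key, param []byte)
}

// tick waits for the read version BEFORE starting the wait, so the window
// between read version and commit covers at least the whole interval.
func tick(tr clockTxn, key []byte, wait func()) error {
	if _, err := tr.ReadVersion(); err != nil {
		return err
	}
	if err := tr.AddReadConflictKey(key); err != nil {
		return err
	}
	wait()
	// Little-endian 1, matching the atomic add in the question.
	tr.Add(key, []byte{1, 0, 0, 0, 0, 0, 0, 0})
	return nil
}

// fakeTxn records the order of calls so the sketch runs without a cluster.
type fakeTxn struct{ calls []string }

func (f *fakeTxn) ReadVersion() (int64, error) {
	f.calls = append(f.calls, "read_version")
	return 1, nil
}

func (f *fakeTxn) AddReadConflictKey(key []byte) error {
	f.calls = append(f.calls, "conflict_key")
	return nil
}

func (f *fakeTxn) Add(key, param []byte) {
	f.calls = append(f.calls, "add")
}

func main() {
	f := &fakeTxn{}
	wait := func() {
		f.calls = append(f.calls, "wait")
		time.Sleep(10 * time.Millisecond) // stands in for the 1s timer
	}
	if err := tick(f, []byte("clock_key"), wait); err != nil {
		fmt.Println("tick failed:", err)
		return
	}
	fmt.Println(f.calls) // [read_version conflict_key wait add]
}
```

In the original loop, this amounts to resolving the read version future first and only then resetting the timer, at the cost of each tick taking slightly longer than one second.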

This could potentially lead to your updates coming at intervals consistently longer than 1 second, though. Another alternative would be to have the clients keep some concept of what the time should be (e.g. by measuring the number of seconds since you first read the key or using a clock) and do a read and write of the value, only updating it if the time is increasing (and updating the local value if necessary).
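One way to read the second alternative is sketched below. All names here (`expectedTicks`, `reconcile`, `encode`, `decode`) are hypothetical, and the 8-byte little-endian encoding is an assumption inferred from the atomic add in the question. The idea is that the transaction reads the stored value, calls `reconcile`, and writes only when the local expectation is ahead.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"time"
)

// expectedTicks is what this client thinks the clock should read: the value
// it first saw (base) plus the whole seconds elapsed since then.
func expectedTicks(base uint64, firstRead, now time.Time) uint64 {
	elapsed := now.Sub(firstRead)
	if elapsed < 0 {
		return base
	}
	return base + uint64(elapsed/time.Second)
}

// reconcile compares the stored value with the local expectation. It returns
// the value to keep locally and whether the stored value should be overwritten.
func reconcile(stored, local uint64) (value uint64, write bool) {
	if local > stored {
		return local, true // our clock is ahead: advance the stored value
	}
	return stored, false // someone else is ahead or equal: adopt their value
}

// encode and decode assume the clock is stored as 8 little-endian bytes.
func encode(v uint64) []byte {
	b := make([]byte, 8)
	binary.LittleEndian.PutUint64(b, v)
	return b
}

func decode(b []byte) uint64 { return binary.LittleEndian.Uint64(b) }

func main() {
	first := time.Unix(1000, 0)
	local := expectedTicks(40, first, first.Add(3*time.Second)) // 43

	stored := decode(encode(42)) // value read inside the transaction
	value, write := reconcile(stored, local)
	fmt.Println(value, write) // 43 true

	value, write = reconcile(45, local) // another keeper already got ahead
	fmt.Println(value, write)           // 45 false
}
```

Because the transaction actually reads the key in this variant, the read conflict is added automatically, so two time keepers that try to advance the clock at the same moment would still conflict rather than both succeed.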

(Unrelated to the original question) What would be the use of getting the read version in write-only transactions? From what I’ve understood so far, such transactions cannot conflict. Does the client specifically make a GRV (get read version) call prior to commit for such transactions?

In a write-only transaction with no read conflict ranges, I think the main (only?) reason is that getting a read version is the way that transactions interact with ratekeeper. If we didn’t get a read version for a transaction, then ratekeeper wouldn’t be able to control it.

We will also send the read version with the transaction, but as far as I know it wouldn’t have any impact on the commit or other transactions. Fetching the read version in this case is done using the option CAUSAL_READ_RISKY as an optimization to avoid interacting with the transaction logs.

Well, I guess if for some reason the commit takes a long time to reach the resolvers after the read version was obtained, your transaction could fail because the read version has become too old. I don’t know if it’s strictly necessary that this be the case (we would still need to avoid committing something with too old a commit version), but maybe it’s desirable that we don’t apply a random old commit that’s been floating around for some reason.

Thank you! That makes sense, and is very helpful.