Efficient way to create if absent in FDB

Is there an efficient way to create a key-value pair if it’s absent? A while ago someone suggested this can be added through the atomic mutation framework. I’m trying to figure out how this might work?

1 Like

It’s relatively straightforward to add a new atomic mutation for a function (key, Optional<value>) -> Optional<(key, value)>, as long as that function is total (i.e. doesn’t crash and is well defined for all possible inputs). This function can even use one parameter to customize its behavior. For example the atomic add mutation takes a little-endian integer that encodes what to add to the little-endian value. I think some kind of set_if_absent mutation could make sense and fit that criteria. The semantics would probably be set_if_absent <key> <value> would set the key to value if previously unset and otherwise be a no-op.

That said it’s kind of a lot of work to add it, and it would probably not be available in a release for a while if we do decide to add it. Here’s what’s currently available:

  1. Read key at snapshot isolation
  2. If key is absent, add key to read conflict range manually and set key to value
  3. Commit

This costs a GRV (get read version) to start the transaction, a read, and a commit, so basically three round trips to the server. If you’re combining this with other operations in a transaction then this does have key in the read conflict range (some of the time), and so might cause the whole transaction to need to retry.

If we did have the atomic mutation described above, then this would only cost two server round trips (GRV + commit), and would never have key in the read conflict range.

I’m imagining use cases for this and one might be if you need to register unique usernames at a high throughput or something and you always want the first writer to win. For this use case we could actually do something with versionstamped keys (though it’s not as clean). Write key + <versionstamp> -> value instead of just key -> value, and when you read you do a range read with limit 1 for keys starting with key. I think this gives you the same semantics - the downside is that you use extra storage for keys that weren’t the first writer. You could have some kind of background job to clean these keys up but that adds significant complexity.

Edit: for completeness’s sake I should probably point out that the key input and output must be the same in the (key, Optional<value>) -> Optional<(key, value)> function described above. Maybe it makes more sense to just call it a function Optional<Value> -> Optional<Value>

1 Like

Thank you for the detailed answer. It is really useful.

This is essentially what is required. The only problem here is that there is no way to ensure whether the write actually happened. Is there a possibility to throw an error (i.e. fail the transaction somehow) if the write did not actually happen (which means the key was already present).

This will be really useful. The use case is to optimize large transactions which insert a number of keys. Since these are insert operations, I have to lookup these keys to be 100% certain no-one has created this as part of different transaction. The optimization that is proposed is if I can do an insert natively, the transactin will succeed 99 percent of the time and I will save one read operation for each insert.

Is there any other way to go about this, if the atomic mutation framework cannot be used to fail the transaction? I’m happy to raise a feature request and contribute to the implementation of it seems feasible.

The way mutations work is that you just commit the mutation and it succeeds (even if the mutation didn’t end up having an effect) - you don’t get any feedback about what the previous value actually was. You can always read the previous value in your transaction but that kind of defeats the purpose. There’s some subtle tricks you can play with read conflict ranges - you can add a read conflict on the key without actually reading it. But all that really tells you is that no one modified the key at a version between your transaction’s read version and commit version, exclusive. Sometimes you can infer that the key is empty at a particular read version without actually reading it based on some invariants of your data model - if that’s the case then the read conflict without actually reading approach should work well.

FWIW it turns out we have a use case for something slightly more general so I started writing a short design for it: Proposal: compare_and_set mutation · apple/foundationdb Wiki · GitHub. Still a work in progress.

One idea is to create a new directory for the keys you’re inserting, so that you can just blind write the bulk of the data into the directory’s subspace and rely on seeing whether or not the directory was created to be certain no-one has already created this (be careful with commit_unknown_result - it could be that an earlier attempt in the retry loop actually succeeded, and you’re not conflicting with another transaction)

Another idea (which only works if all the keys you’re inserting are adjacent and belong to a unique subspace) is that you can cheaply check if a key range is empty by doing a range read with limit 1. If you get 0 key value pairs back, then the range is empty. If you get 1, then it’s not empty.

Ok actually it turns out that it’s not as useful as initially thought for us for the same reason (wanting to know whether or not it had an effect without reading)

Yes, I don’t think it will actually help when the key is already written as a part of a previously committed transaction.

I’m not sure if I understand correctly, but what happens if some key in the transaction is present already, then in that case will the directory be created even though the key was already there?

Also, would having a native insert operation besides the set and a get operation follow the general FoundationDB philosophy? Having inserts supported natively will be really useful.

The idea is that if you’re using the directory layer to manage all your subspaces then if you successfully create a directory in a transaction (and check that it wasn’t already present), then you know the subspace associated with that directory you created is empty.

What would be the semantics of the insert operation? If it’s another name for the create if absent operation we’ve been talking about then the basic problem is that the way foundationdb separates transaction commits and reads architecturally makes it so that it’s basically just as efficient to just read on the client even if we did want to implement that logic server-side.