How to minimize transaction conflicts on atomic operations?

Yes. If you were only updating a total counter of items in a table, that would work fine, but if you need the updated value within the same transaction (for uids or primary keys), then it will not be fast, because reading the key back creates read conflicts.

Under the hood, if you don’t read the key and only atomically increment it, the client sends an “increment that key” mutation to the cluster, which will perform the operation later in the pipeline. If you read the key and then increment it, you would conflict anyway whenever the value changes, so the client can convert the operation into a regular read/write. Note that if you atomically increment the same key twice in one transaction, you’ll see that the client only sends a single increment of +2 (or +N; it merges the operations into one).
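To make the difference concrete, here is a minimal sketch using the Python bindings (the `COUNTER` key and both function names are made up for illustration): `bump` stays conflict-free because it never reads the key, while `bump_and_get` reads it back and therefore joins the transaction’s read conflict range.

```python
import struct

import fdb
fdb.api_version(710)
db = fdb.open()

COUNTER = fdb.tuple.pack(('stats', 'item_count'))  # hypothetical key

@fdb.transactional
def bump(tr, n):
    # Blind atomic add: the client only queues an "add n" mutation,
    # so this transaction cannot conflict on COUNTER.
    tr.add(COUNTER, struct.pack('<q', n))

@fdb.transactional
def bump_and_get(tr, n):
    # Reading the key first puts it in the read conflict range: any
    # concurrent writer will now force this transaction to retry.
    raw = tr[COUNTER]
    current = struct.unpack('<q', bytes(raw))[0] if raw.present() else 0
    tr.add(COUNTER, struct.pack('<q', n))
    return current + n
```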

Yes, that’s one of the scenarios atomic operations were designed for: counters that are read by other transactions (before atomic operations existed, back in I think v2.x, you had to use high-contention counters instead).

Versionstamps are designed to help solve (among other things) the problem of uid counters when you need something that increments (with gaps), but at the cost of larger keys. They help a lot with the performance issues of queues (message queues, ordering of events, …).
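For example, a queue push with the Python bindings could look like the sketch below (the `'queue'` subspace is an assumption; the 10-byte placeholder is filled in by the cluster at commit time):

```python
import fdb
fdb.api_version(710)
db = fdb.open()

@fdb.transactional
def enqueue(tr, payload):
    # Incomplete versionstamp: 10 bytes chosen by the cluster at commit,
    # guaranteed to be larger than any previously committed versionstamp.
    key = fdb.tuple.pack_with_versionstamp(('queue', fdb.tuple.Versionstamp()))
    tr.set_versionstamped_key(key, payload)
```

Because every transaction ends up writing a distinct key, concurrent producers never conflict with each other on the tail of the queue.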

In your repo, it looks like the uid will always be the first part of the key, so you may be saved by the future storage engine being worked on (there’s a talk about it: https://www.youtube.com/watch?v=nlus1Z7TVTI&list=PLbzoR-pLrL6q7uYN-94-p_-Q3hyAmpI7o&index=10), which will introduce key prefix compression. Versionstamps are currently implemented as the cluster’s commit version (a large number that increments by ~1,000,000 per second), so the first 4-6 bytes will not change frequently (every day? week?). With prefix compression, the actual on-disk cost of versionstamps should come back down to those 4-6 bytes (so the same as integers). I’m not sure if the prefix compression scheme will also be used for network transmission, but at least you will pay less for storage.
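Concretely, the 10-byte versionstamp is the 8-byte big-endian commit version followed by a 2-byte transaction batch order, so only the low-order bytes churn from one commit to the next. A rough illustration in plain Python, with made-up version numbers:

```python
import struct

def split_versionstamp(vs):
    # 10-byte versionstamp = 8-byte big-endian commit version
    #                      + 2-byte big-endian transaction batch order
    commit_version = struct.unpack('>Q', vs[:8])[0]
    batch_order = struct.unpack('>H', vs[8:10])[0]
    return commit_version, batch_order

# Two versionstamps issued about a minute apart (at ~1,000,000 versions
# per second) still share their high-order bytes, which is exactly the
# redundancy that key prefix compression can exploit.
vs1 = struct.pack('>Q', 123_456_789_000_000) + b'\x00\x00'
vs2 = struct.pack('>Q', 123_456_849_000_000) + b'\x00\x00'
print(split_versionstamp(vs1))  # (123456789000000, 0)
print(split_versionstamp(vs2))  # (123456849000000, 0)
```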

This won’t help much if uids are not the first part of the key, or if they are stored in the value (so basically indexes, or internal pointers to other entities).

Another point: if you have a single location that is updated by every mutating transaction (a counter), you will create a “hot spot” on the storage process that owns that key. You could break the counter into multiple keys far apart in the key space so that they end up on multiple storage nodes, but that can be somewhat difficult to get right (see the sketch below). This impacts global counters as well as voting tallies like an index `(..., 'best_girl_votes', <character_uid>) = [32-bit counter]`: the most popular characters will be mutated more frequently than others and may create hot spots. I think there is also a talk in the same playlist about hot spots and how to avoid them.
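One way to break up such a counter, sketched with the Python bindings (the shard count and subspace layout are made up, and reads get more expensive since they must sum every shard):

```python
import os
import struct

import fdb
fdb.api_version(710)
db = fdb.open()

NUM_SHARDS = 16  # hypothetical: more shards = less contention, costlier reads

@fdb.transactional
def add_to_counter(tr, name, delta):
    # Putting the shard id *before* the counter name spreads the shards
    # across the key space, so they can land on different storage teams.
    shard = os.urandom(1)[0] % NUM_SHARDS
    tr.add(fdb.tuple.pack(('counters', shard, name)),
           struct.pack('<q', delta))

@fdb.transactional
def read_counter(tr, name):
    # Reads become NUM_SHARDS point lookups; issue them all first so the
    # client can pipeline them, then sum the results.
    futures = [tr[fdb.tuple.pack(('counters', s, name))]
               for s in range(NUM_SHARDS)]
    return sum(struct.unpack('<q', bytes(f))[0]
               for f in futures if f.present())
```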

Sooo… maybe versionstamps + hot spot mitigation could still be a solution, at least long term? You’d need to do some calculations and a cost breakdown for this. If you go that route, take a look at VersionStamp uniqueness and monotonicity - #3 by alloc for some caveats.
