Query hotspotting on Directory Layer's metadata subspace

That’s an interesting workaround.

I’ve had many issues over the years with the Directory Layer, related to the sequential read latency required to get a subspace prefix, and to caching (which is unsafe if not done very carefully). I tried multiple approaches to fix this, including breaking changes to the DL API. I ended up rolling back these changes because they made it unsafe to use in combination with other tools (that don’t know about the API changes).

The best compromise solution I found was to defer the reads to the DL until the end of the transaction so as to get rid of the extra latency, and only commit the transaction if the result matches what the cache expected (and if not, retry the transaction). But this still induces a hot spot in the DL’s key subspace.
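A minimal sketch of that deferred check, in plain Python with in-memory stand-ins rather than a real FoundationDB binding. `FakeDirectoryLayer`, `run_transaction`, and the `store` dict are hypothetical names made up for illustration; in a real client the validation read would be issued inside the same transaction, just before commit.

```python
class FakeDirectoryLayer:
    """In-memory stand-in for the DL: maps a directory path to a key prefix."""

    def __init__(self, mapping):
        self.mapping = dict(mapping)

    def resolve(self, path):
        # In a real cluster this is the read that hits the DL's key subspace.
        return self.mapping[path]


def run_transaction(dl, cache, store, path, body, max_attempts=5):
    """Run body(prefix) with a cached prefix and validate it only at the end.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        prefix = cache.get(path)
        if prefix is None:
            # Cold cache: no choice but to pay for the DL read up front.
            prefix = dl.resolve(path)
            cache[path] = prefix

        # Do the actual work against the (possibly stale) cached prefix.
        # Writes are only buffered here, nothing is visible yet.
        pending = body(prefix)

        # Deferred check: read the DL last, off the critical path of the body.
        actual = dl.resolve(path)
        if actual == prefix:
            store.update(pending)  # "commit"
            return attempt

        # The cache was stale: drop the buffered writes, refresh, and retry.
        cache[path] = actual

    raise RuntimeError("directory prefix kept changing, giving up")


dl = FakeDirectoryLayer({("app", "users"): b"\x15\x01"})
cache = {("app", "users"): b"\x15\x09"}  # stale entry from an earlier deploy
store = {}
attempts = run_transaction(
    dl, cache, store, ("app", "users"), lambda p: {p + b"alice": b"1"}
)
```

Note that every attempt still reads the DL once, which is why this removes the latency but not the hot spot.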

Your idea of adding a reverse key into each subspace is technically a breaking change to the DL contract: any external client or tool that creates a directory subspace by itself would not insert this key, and your cache check would fail. Note that the DL API in most bindings usually allows the caller to implicitly create the directory if it is missing, so even if the tool only intended to “read”, it may accidentally cause the subspace to be created if it runs before your application deployment script.

Also, there are some layers that use the subspace prefix key itself to store some sort of metadata, so that would collide with your reverse mapping key:

(PREFIX,) = { layer stores some metadata here }   // <-- spot already occupied!
(PREFIX, (123,)) = SOME DATA
(PREFIX, (456,)) = SOME OTHER DATA
(PREFIX, (789,)) = SOME MORE DATA

Using (PREFIX,) + \xFF could also be an issue because, even though the tuple encoding does not use \xFF as a header byte, other key encodings might, like someone just appending raw UUIDs or using another compact key encoding.
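To make the \xFF concern concrete, here is a small sketch in plain Python (no FoundationDB binding involved). The two-byte `PREFIX`, the `MARKER` key, and the helper names are assumptions chosen for illustration, not anything the Directory Layer defines.

```python
import uuid

PREFIX = b"\x15\x2a"        # hypothetical prefix allocated for a directory
MARKER = PREFIX + b"\xff"   # proposed location of the reverse-mapping key


def raw_uuid_key(prefix, u):
    """Key built by appending the 16 raw UUID bytes, with no tuple encoding."""
    return prefix + u.bytes


def overlaps_marker(key, marker=MARKER):
    """True if the key equals the marker or lives underneath it."""
    return key.startswith(marker)


u_low = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")
u_high = uuid.UUID("ff112233-4455-6677-8899-aabbccddeeff")

# A UUID whose first byte is 0xFF lands underneath the marker key.
low_overlaps = overlaps_marker(raw_uuid_key(PREFIX, u_low))
high_overlaps = overlaps_marker(raw_uuid_key(PREFIX, u_high))

# A compact one-byte id of 255 produces exactly the marker key itself.
one_byte_key = PREFIX + bytes([255])
exact_collision = one_byte_key == MARKER
```

With random UUIDs, roughly one key in 256 would start with 0xFF, so this is not a rare corner case for that kind of layout.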

But if you have complete control over the content of your subspaces, and control or audit all tools/scripts that could touch them, then you should definitely combine your reverse mapping key with deferring the read until the end of the transaction, to reduce the latency even further!