Just wanted to give a quick update on what we ended up doing:
We checked that the initial key of every directory (equivalent to `dirSubspace.Bytes()`) is not being used, which makes sense as we encode everything in tuples. We keep an in-memory cache and, at the initial key of every directory, write the directory path as a reverse mapping. We update the cache and the reverse-mapping value when a directory is moved or deleted, or when the reverse-mapping value doesn't match.
So for example, the read path to get the directory subspace for `my/dir` is:
- Check the cache for `my/dir`; say the value is `\x01\x02`.
- Read key `\x01\x02` and confirm that the value is `my/dir`.
- If the value does not match, get the subspace from the directory layer (say `\x01\x03`), then update the in-memory cache and, if the tx is not read-only, write `my/dir` at key `\x01\x03`.
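
For anyone who wants the steps above in code form, here is a rough Python sketch. It is a simplified illustration under assumptions, not the production code: `CachedDirectoryResolver`, `prefix_for`, and `resolve_via_directory_layer` are made-up names, a plain dict stands in for the transaction, and the callback stands in for the real directory layer lookup. It also assumes a cache miss is handled the same way as a mismatch.

```python
from typing import Callable, Dict, MutableMapping, Tuple

Path = Tuple[str, ...]


class CachedDirectoryResolver:
    """Illustrative only: caches path -> prefix and verifies it via a reverse mapping."""

    def __init__(self, resolve_via_directory_layer: Callable[[Path], bytes]) -> None:
        # Hypothetical callback that performs the real (expensive) directory layer lookup.
        self._resolve = resolve_via_directory_layer
        self._cache: Dict[Path, bytes] = {}

    def prefix_for(
        self,
        kv: MutableMapping[bytes, bytes],  # stand-in for a transaction
        path: Path,
        read_only: bool = False,
    ) -> bytes:
        encoded_path = "/".join(path).encode()

        # Step 1: check the in-memory cache.
        prefix = self._cache.get(path)

        # Step 2: read the directory's initial key and confirm it maps back to this path.
        if prefix is not None and kv.get(prefix) == encoded_path:
            return prefix

        # Step 3: cache miss or stale entry, so fall back to the directory layer.
        prefix = self._resolve(path)
        self._cache[path] = prefix
        if not read_only:
            # Write the reverse mapping so later reads can validate the cache entry.
            kv[prefix] = encoded_path
        return prefix
```

The point of the verification read is that it goes to the directory's own prefix, so the load is spread across whichever shards hold the directories rather than landing on the directory layer's shard every time.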
It is still the same number of reads, but they are now much better distributed, since we no longer hit the directory layer shard on every request. Our CPU hotspotting has improved greatly since we enabled this.
We also came across the Consistent Caching talk (Consistent Caching in FoundationDB - Xin Dong, Apple & Neelam Goyal, Snowflake), which looks like it would totally take care of this. Super exciting stuff! Any idea when this will be released? Is it part of 6.3?