Abstracting hundreds of thousands of collections of items

I want to create an FDB instance holding several hundred thousand collections of items. Each collection has a name and needs to be renamable. I’d appreciate the community’s opinion on the two designs below.

1. Separate Name Index

The first idea is to keep the name in a separate index. The main data entries would look like this, where 1 is the collection's internal ID:

(app_dir, collections_dir, 1, item1) = data1
(app_dir, collections_dir, 1, item2) = data2
...

Then the index would look like this:

(app_dir, names_dir, collection_name) = 1
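
To make the rename path concrete, here is a minimal in-memory sketch of this layout. It models the keyspace as a plain Python dict keyed by tuples and does not call the FDB client; the helper names (create_collection, put_item, get_item, rename_collection) and the string prefixes are hypothetical. In a real FDB application, the two index writes in a rename would presumably be done inside one transaction.

```python
# In-memory model of the "separate name index" layout; this is NOT the FDB API.
# Keys are tuples, mirroring the tuple-style keys shown above.
store = {}

COLLECTIONS = ("app", "collections")  # stands in for (app_dir, collections_dir)
NAMES = ("app", "names")              # stands in for (app_dir, names_dir)


def create_collection(name, collection_id):
    """Register a name -> ID mapping; fail if the name is already taken."""
    key = NAMES + (name,)
    if key in store:
        raise KeyError(f"collection name already in use: {name}")
    store[key] = collection_id


def put_item(name, item, data):
    """Resolve the name through the index, then write under the internal ID."""
    collection_id = store[NAMES + (name,)]
    store[COLLECTIONS + (collection_id, item)] = data


def get_item(name, item):
    """Resolve the name through the index, then read under the internal ID."""
    collection_id = store[NAMES + (name,)]
    return store[COLLECTIONS + (collection_id, item)]


def rename_collection(old_name, new_name):
    """Rename by swapping one index entry; the item keys are untouched."""
    new_key = NAMES + (new_name,)
    if new_key in store:
        raise KeyError(f"collection name already in use: {new_name}")
    store[new_key] = store.pop(NAMES + (old_name,))


create_collection("invoices", 1)
put_item("invoices", "item1", b"data1")
put_item("invoices", "item2", b"data2")
rename_collection("invoices", "bills")
```

The point of the sketch is that a rename costs one delete and one write in the index, regardless of how many items the collection holds.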

2. Collection Directories

The second idea is to use a directory for each collection, since directories are inherently renamable:

(app_dir, collections_dir, collection1_dir, item1) = data1
(app_dir, collections_dir, collection1_dir, item2) = data2
...

I’m leaning towards the first plan because I don’t know of a way to stream a list of subdirectory names. One thing I need to be able to do is list all collection names. Because reading several hundred thousand names may not fit in a single transaction, I want to be able to perform a long, multi-transaction range read over the contents of collections_dir. The directory API doesn’t currently provide this, correct?
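
For reference, the multi-transaction listing described above can be sketched as a paged scan that remembers the last key it saw. This is an in-memory model only: a sorted Python list stands in for the name index, and read_batch (a hypothetical helper) stands in for one short transaction doing a limited range read. The batch size of 1000 is an arbitrary assumption.

```python
import bisect

# Sorted names, standing in for the keys of the name index.
names = sorted(f"collection-{i:06d}" for i in range(2500))


def read_batch(after, limit):
    """Model one short transaction: up to `limit` names strictly after `after`."""
    start = 0 if after is None else bisect.bisect_right(names, after)
    return names[start:start + limit]


def list_all_names(batch_size=1000):
    """Yield every name, using one modeled transaction per batch."""
    last = None
    while True:
        batch = read_batch(last, batch_size)
        if not batch:
            return
        yield from batch
        last = batch[-1]  # resume point for the next transaction


all_names = list(list_all_names())
```

Because each batch is its own transaction, the result is not a single consistent snapshot: a collection created or renamed while the scan is running may be missed, or may show up under both its old and new name.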

If you go the do-it-yourself route and create an index of names to internal IDs, you should check out the metadata version feature. It lets you keep a consistent client-side cache of names to IDs and know when 1) collections are added or removed, and 2) collections are renamed.

This will decrease the latency of your transactions if each one has to map a collection name to an ID before doing anything else.
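
A rough sketch of the caching idea, as an in-memory model rather than real FDB calls: the Store class below is a toy stand-in for the database, with a plain counter playing the role of the metadata version, and NameCache is a hypothetical client-side cache. The assumption is that every writer that changes the name index also bumps the version in the same transaction.

```python
class Store:
    """Toy stand-in for the database: a name index plus a version counter."""

    def __init__(self):
        self.names = {}            # name -> collection ID
        self.metadata_version = 0  # bumped on every change to the index
        self.index_reads = 0       # counts lookups that actually hit the index

    def set_name(self, name, collection_id):
        self.names[name] = collection_id
        self.metadata_version += 1

    def rename(self, old, new):
        self.names[new] = self.names.pop(old)
        self.metadata_version += 1

    def lookup(self, name):
        self.index_reads += 1
        return self.names[name]


class NameCache:
    """Client-side name -> ID cache, dropped whenever the version changes."""

    def __init__(self, store):
        self.store = store
        self.seen_version = None
        self.ids = {}

    def resolve(self, name):
        # Check the version first; any change invalidates everything cached.
        current = self.store.metadata_version
        if current != self.seen_version:
            self.ids.clear()
            self.seen_version = current
        if name not in self.ids:
            self.ids[name] = self.store.lookup(name)
        return self.ids[name]


store = Store()
store.set_name("invoices", 1)
cache = NameCache(store)
cache.resolve("invoices")  # miss: reads the index
cache.resolve("invoices")  # hit: served from the cache
store.rename("invoices", "bills")
```

After the rename, the next resolve call sees a new version, drops the cached entries, and goes back to the index, so a stale name is never served from the cache.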

The directory layer doesn’t currently implement this, so that is a benefit of doing it yourself here.

Without looking at your proposed solutions, my thought would be to use a random ID / high contention allocator to allocate IDs, and then maintain a separate index that maps name to ID. This appears to be your solution (1).
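
As an illustration of the random-ID option only (this is not the directory layer's high contention allocator), allocation can be sketched as "pick a random number, retry on collision". The used_ids set and the allocate_id helper are hypothetical; in a real deployment the collision check would be a read of the ID's key inside the allocating transaction.

```python
import secrets

used_ids = set()  # stands in for the set of IDs already present in the database


def allocate_id(bits=64):
    """Pick a random ID, retrying in the unlikely case it is already taken."""
    while True:
        candidate = secrets.randbits(bits)
        if candidate not in used_ids:
            used_ids.add(candidate)
            return candidate


ids = [allocate_id() for _ in range(1000)]
```

Random IDs avoid having every writer contend on a single counter key, at the cost of longer key prefixes than small sequential IDs would give.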

I vaguely recall some weird caveats about directories being renamed, but it’s left my mental cache… @alloc would probably know.