Getting the number of key/value pairs

I’ll add that

(1) it would be possible for a storage engine to offer O(log N) counting and key selector offsets (this would require extensions to the IKeyValueStore interface. it would be more straightforward if we move to a MVCC interface to storage)

(2) if an application or layer needs these operations (or related ones like picking a key uniform randomly or determining the “rank” of a value in some index) to be fast for some specific data it can make them reasonably efficient via (somewhat tricky) data modeling. There used to be a released python layer RankedSet that demonstrated this; I can’t remember how good the implementation was. Basically the approach is to store, besides, the data, a number of increasingly sparse samples of the data (say, 1/100 of keys, then 1/10000 of keys, etc) and with each of these samples an accurate, atomically incremented counter of the number of keys between it and the next sampled key at the same level. When adding or removing data you update all these (with careful use of snapshot reads and manual key ranges to reduce conflicts), and when you want to count or offset from a particular point. You can “count” whatever metric or metrics you want rather than FoundationDB k/v pairs specifically.

1 Like