I have read many posts on how to count the keys in a key range, including the Java code from 2018 that used offsets to skip through key values. After reading more of the documentation, that does not seem like the right way to do it.
We are using the Rust bindings, foundationdb-rs.
The plan now is to simply set a LIMIT (say 100,000), fetch the first chunk, and record its first and last keys. Then use the last key of each result to start the next chunk. With get_streams we can simply loop through the keys and aggregate the counts. Using a LIMIT keeps each transaction under the 5s timeout, and with threads we can have several chunks being read simultaneously. A sketch of the loop follows below.
We do not use the Record Layer; this is direct FDB access using our own keys.
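Roughly, here is the loop we have in mind. This is only a sketch using the crate's get_ranges streaming read (which is what I meant by get_streams above); the helper name count_range and the chunk size are just illustrative, and it assumes the client network has already been initialized (e.g. via foundationdb::boot()) with an open Database:

```rust
use foundationdb::{Database, FdbError, KeySelector, RangeOption};
use futures::TryStreamExt;

const LIMIT: usize = 100_000;

/// Count the keys in [begin, end) in LIMIT-sized chunks, one transaction
/// per chunk so no single transaction runs anywhere near the 5s limit.
async fn count_range(db: &Database, begin: Vec<u8>, end: Vec<u8>) -> Result<u64, FdbError> {
    let mut total = 0u64;
    let mut cursor = begin;
    loop {
        let trx = db.create_trx()?;
        let mut opt = RangeOption::from((
            KeySelector::first_greater_or_equal(cursor.clone()),
            KeySelector::first_greater_or_equal(end.clone()),
        ));
        opt.limit = Some(LIMIT); // caps the rows returned by this chunk

        // Stream the chunk's batches, counting rows and remembering the
        // last key seen so the next chunk can resume after it.
        let mut seen = 0usize;
        let mut stream = trx.get_ranges(opt, true); // snapshot read
        while let Some(batch) = stream.try_next().await? {
            seen += batch.len();
            if let Some(kv) = batch.last() {
                cursor = kv.key().to_vec();
            }
        }

        total += seen as u64;
        if seen < LIMIT {
            return Ok(total); // range exhausted
        }
        cursor.push(0x00); // next possible key strictly after the last one
    }
}
```

One transaction per chunk keeps every read short, and the snapshot flag avoids taking read-conflict ranges on the whole scan.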
Yes, that will work. Note that you'll be downloading the entire key-value pairs, not only the count. Also note that you may miss KVs if another process is writing to a range you have already scanned.
Since you mentioned parallelism, I would also recommend checking out "get range split points". It will provide you with a list of boundary keys such that each sub-range is roughly the requested size (the chunk size should be over 3-5 MB). For example, if you ask for [A, B), it will give you the following array:
A
A00149
A14934
A35831
A48571
B
Then you can use these to create multiple ranges, such as A-A00149, A00149-A14934, …, A48571-B.
This is especially useful if the keys are not distributed evenly.
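In the Rust bindings that could look roughly like the following. This is a sketch, not a definitive implementation: split_ranges is a made-up helper name, and I'm assuming each returned boundary key dereferences to its raw bytes (check your bindings version for the exact return types):

```rust
use foundationdb::{Database, FdbError};

/// Turn the boundary keys for [begin, end) into a list of sub-ranges
/// of roughly `chunk_size` bytes each.
async fn split_ranges(
    db: &Database,
    begin: &[u8],
    end: &[u8],
    chunk_size: i64,
) -> Result<Vec<(Vec<u8>, Vec<u8>)>, FdbError> {
    let trx = db.create_trx()?;
    // Boundary keys for [begin, end), including both endpoints.
    let boundaries = trx.get_range_split_points(begin, end, chunk_size).await?;
    let keys: Vec<Vec<u8>> = boundaries.iter().map(|k| k.to_vec()).collect();
    // Pair consecutive boundaries: [A, A00149), [A00149, A14934), ...
    Ok(keys
        .windows(2)
        .map(|w| (w[0].clone(), w[1].clone()))
        .collect())
}
```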
get_range_split_points is indeed really useful for getting a list of boundary keys, and it is one of the latest methods I added to the Rust bindings. There is also get_estimated_range_size_bytes, which can be useful.
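For example, a small sketch of the latter (estimate_size is just an illustrative wrapper):

```rust
use foundationdb::{Database, FdbError};

/// Rough on-disk size of [begin, end), e.g. to pick a chunk_size for
/// get_range_split_points. The value comes from storage-server sampling,
/// so treat it as approximate by design.
async fn estimate_size(db: &Database, begin: &[u8], end: &[u8]) -> Result<i64, FdbError> {
    let trx = db.create_trx()?;
    trx.get_estimated_range_size_bytes(begin, end).await
}
```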
In my company, when we know that we will need to count bytes, records, or rows, we wrap some statistics logic in a dedicated subspace. The statistics are maintained through atomic operations, much like the Record Layer does.
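A minimal sketch of the idea (the stats key name is made up): every write to the data subspace also bumps a counter key with an atomic ADD, so the count is maintained without read-modify-write conflicts.

```rust
use foundationdb::{options::MutationType, Database, FdbResult};

async fn insert_with_count(db: &Database, key: &[u8], value: &[u8]) -> FdbResult<()> {
    let trx = db.create_trx()?;
    trx.set(key, value);
    // ADD interprets the operand as a little-endian integer and is
    // conflict-free, unlike reading, incrementing, and writing back.
    trx.atomic_op(b"stats/record_count", &1i64.to_le_bytes(), MutationType::Add);
    trx.commit().await?;
    Ok(())
}
```

Reading the count back is then a single get of the counter key, decoded as a little-endian i64.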
Great to know. I see that I am on version 0.8.0, and it looks like this was added in 0.10.0; if the later version works with our version of FoundationDB, I will upgrade. We need the count for exactly this purpose: to split a set of keys for parallel processing. We will try this.
If your KV pairs are on average the same size (or you are using size as your measure), and you do not have an exact batch-size requirement, I'd recommend using the output of get_range_split_points directly. With small datasets, the small amount of unevenness will not matter; with large datasets, it will average out to roughly equal.
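To tie the thread together, here is a rough sketch of the fan-out, assuming a tokio runtime plus the illustrative split_ranges and count_range helpers sketched earlier (anyhow is used only to keep the error types short):

```rust
use std::sync::Arc;
use foundationdb::Database;

async fn parallel_count(db: Arc<Database>, begin: &[u8], end: &[u8]) -> anyhow::Result<u64> {
    // ~5 MB chunks, per the sizing advice above.
    let ranges = split_ranges(&db, begin, end, 5_000_000).await?;
    let mut tasks = Vec::new();
    for (lo, hi) in ranges {
        let db = db.clone();
        // One task per sub-range; each task runs its own short transactions.
        tasks.push(tokio::spawn(async move { count_range(&db, lo, hi).await }));
    }
    let mut total = 0u64;
    for task in tasks {
        total += task.await??; // first ? is the join error, second the FdbError
    }
    Ok(total)
}
```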