Scalability performance benchmark

nitins64 · March 26, 2019, 3:55pm

Hi there,

I am evaluating FoundationDB for our use case. Could someone point me to any performance benchmarks?
I am interested in learning how much data we can store in FoundationDB (and on how many machines/cores) without impacting performance.

Someone told me that FoundationDB doesn’t scale well beyond 100TB. Is this correct? If possible, could you share what the scaling bottleneck is?

Thanks!
Nitin

ryanworl · March 26, 2019, 4:24pm

I am aware of one user with over 1PB in a single cluster for an analytics workload. I wouldn’t necessarily say that is a good idea, though. You can run more than one cluster.

If you’re expecting many TB of data and have a clean way to partition it such that a failure in one partition cannot impact data in another, run multiple clusters. This isn’t an FDB-specific thing.

For performance specifically, there are a few examples in the documentation that show scalability across multiple cores and machines, as well as how an individual process handles different read/write patterns. These examples mostly match my experience.

One thing not mentioned in the documentation is that you need to understand your workload relative to how many storage processes you use per disk. If you need to store a lot of data that will mostly be cold, you can get away with fewer processes per disk than if you have a workload with a high write rate. From my understanding, fewer storage processes mean less work for failure detection and similar cluster-wide tasks, which are what limit how large a cluster can be.

alexmiller · March 26, 2019, 6:31pm

The currently advertised limit is ~500 fdbserver processes per cluster, as somewhere before 1000 the poor cluster controller doing all the failure monitoring becomes overwhelmed with responses. How this translates into storage volume depends on the size of the disk you have attached to each fdbserver process and on your performance requirements. Running more than one storage server per disk will result in better disk IOPS utilization, but will reduce your maximum data size by a constant factor.

Recovery time will scale with data volume, as FDB is required to load the map of key range to shard before accepting commits again. I suspect that pushing the limits of total data volume in FDB to the 1PB or higher level would probably lead to 10s recoveries. Work slated for FDB 6.2 includes both changing failure monitoring to raise the limit on the number of processes and reducing recovery times for large clusters.

I’d also be interested to hear where you’re getting your FoundationDB scaling limitation rumors from?

ryanworl · March 26, 2019, 7:03pm

I bet it was someone who read this quote from the documentation:

“FoundationDB has been tested with databases up to 100 TB (total size of key-value pairs – required disk space will be significantly higher after replication and overhead).” (Source: Known Limitations, FoundationDB 7.1 documentation)

And it telephoned itself into a limitation, even if this is outdated.

alexmiller · March 26, 2019, 7:04pm

FDB documentation strikes again!

nitins64 · March 27, 2019, 12:19am

So what is the latest information on this? 500TB?

ryanworl · March 27, 2019, 12:44am

There is not a hard cap on data, but today’s guideline of a maximum of about 500 processes per cluster is a useful bound. You should do some small-scale benchmarking for your specific workload while varying the number of storage processes per disk from 1 to e.g. 8, and see what meets your latency (beware coordinated omission!) and throughput needs.
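To illustrate the coordinated omission warning: a load generator that waits for each response before sending the next request will under-report latency after a stall, because the requests it should have sent during the stall are never timed from when they were due. The sketch below is a pure simulation, not an FDB benchmark. The function name, the service times, and the 10 ms send interval are all made up for illustration.

```python
def simulate_single_threaded_load(service_times, interval):
    """Simulate a load generator that intends to send one request every
    `interval` seconds but cannot send until the previous one finishes.

    Returns (naive, corrected) latency lists in seconds:
      naive     = finish - actual send time (what a simple timer records)
      corrected = finish - intended send time (what a client on a fixed
                  schedule would actually have experienced)
    """
    now = 0.0
    naive, corrected = [], []
    for i, service in enumerate(service_times):
        intended_start = i * interval
        # A stalled generator sends late; it never sends early.
        actual_start = max(now, intended_start)
        finish = actual_start + service
        naive.append(finish - actual_start)
        corrected.append(finish - intended_start)
        now = finish
    return naive, corrected


# Hypothetical workload: 1 ms requests with a single 1 s stall in the middle.
service_times = [0.001] * 5 + [1.0] + [0.001] * 5
naive, corrected = simulate_single_threaded_load(service_times, interval=0.01)

slow_naive = sum(1 for x in naive if x > 0.5)
slow_corrected = sum(1 for x in corrected if x > 0.5)
print(f"requests slower than 500 ms (naive timing):     {slow_naive}")
print(f"requests slower than 500 ms (corrected timing): {slow_corrected}")
```

With naive timing only the stalled request looks slow, while timing from the intended send time also counts the requests that queued up behind it. When benchmarking for real, either use a tool that corrects for this or drive load at a fixed rate and measure from the scheduled send time.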

Then you can do the math assuming your hypothetical 500-process cluster exists and see how much disk space you actually get at the number of processes per disk you’ve chosen. This obviously depends on the size of your disks too.
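A rough sketch of that arithmetic follows. Only the ~500 process guideline comes from this thread. The function name, the number of non-storage processes, the 4 TB disk size, triple replication, and the 60% usable fraction are assumptions chosen for illustration, so substitute your own numbers.

```python
def estimate_logical_capacity_tb(
    total_processes=500,       # guideline discussed in this thread
    non_storage_processes=50,  # assumed: logs, proxies, etc.
    processes_per_disk=2,      # the value you settled on after benchmarking
    disk_size_tb=4.0,          # assumed disk size
    replication_factor=3,      # assumed replication
    usable_fraction=0.6,       # assumed headroom for overhead and free space
):
    """Estimate how many TB of key-value data a cluster at the process
    limit could hold. All defaults other than total_processes are
    illustrative assumptions, not FoundationDB recommendations."""
    storage_processes = total_processes - non_storage_processes
    if storage_processes <= 0 or processes_per_disk <= 0:
        raise ValueError("need at least one storage process and one process per disk")
    # More processes per disk means fewer disks under a fixed process budget.
    disks = storage_processes // processes_per_disk
    raw_tb = disks * disk_size_tb
    return raw_tb * usable_fraction / replication_factor


for per_disk in (1, 2, 4, 8):
    capacity = estimate_logical_capacity_tb(processes_per_disk=per_disk)
    print(f"{per_disk} storage process(es) per disk: ~{capacity:.0f} TB logical data")
```

Because the process count is the fixed budget, doubling the processes per disk roughly halves the number of disks, and therefore the capacity. That is the constant-factor trade-off against IOPS utilization mentioned earlier in the thread.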
