Scalability performance benchmark

I’m aware of one user running over 1PB in a single cluster for an analytics workload. I wouldn’t necessarily say that is a good idea, though. You can run more than one cluster. :grinning:

If you’re expecting many TB of data and have a clean way to partition it such that a failure in one partition cannot impact data in another, run multiple clusters. This isn’t an FDB-specific thing.
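
To make that concrete, here’s a minimal sketch of what per-partition clusters can look like from the client side, using the Python bindings. The tenant names, cluster file paths, and keys are all illustrative, not prescriptive:

```python
# Minimal sketch: route each partition (here, a tenant) to its own cluster.
# A failure in one cluster can't touch data in the others.
import fdb

fdb.api_version(710)

# Hypothetical cluster files, one per independent cluster.
CLUSTER_FILES = {
    "tenant_a": "/etc/foundationdb/tenant_a.cluster",
    "tenant_b": "/etc/foundationdb/tenant_b.cluster",
}

# One Database handle per cluster.
databases = {name: fdb.open(path) for name, path in CLUSTER_FILES.items()}

@fdb.transactional
def set_value(tr, key, value):
    tr[key] = value

# Writes for tenant_a land only on tenant_a's cluster.
set_value(databases["tenant_a"], b"some-key", b"some-value")
```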

For performance specifically, there are a few examples in the documentation that show scalability across multiple cores and machines, as well as how an individual process handles different read/write patterns. Those examples mostly match my experience.

One thing not mentioned in the documentation: you need to understand your workload relative to how many storage processes you run per disk. If you’re storing a lot of data that will mostly be cold, you can get away with fewer processes per disk than you can with a high write rate. Fewer storage processes also mean less work spent on failure detection and similar overhead, which, as far as I understand, is one of the things that limits how large a cluster can get.
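
As a rough sketch, this is what running two storage processes against the same disk looks like in foundationdb.conf, assuming the stock packaging layout; the ports and data directories below are illustrative, and the right process count per disk depends on your write rate:

```ini
# Sketch of /etc/foundationdb/foundationdb.conf (illustrative values).
[fdbserver]
# ...shared defaults (command, logdir, etc.) live here in the real file...

# One storage process on this disk: may be enough for mostly-cold data.
[fdbserver.4500]
class = storage
datadir = /mnt/disk1/4500

# A second storage process on the same disk, for write-heavy workloads
# where a single process's CPU core becomes the bottleneck.
[fdbserver.4501]
class = storage
datadir = /mnt/disk1/4501
```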
