Sizing and Pricing

The basic idea is that you need to account for the following factors:

  • Overhead per byte
    • This depends a bit on the size of your keys and values, but should be somewhat less than 2x for sufficiently large key-value pairs. For small pairs, it may be higher. I don’t have an exact formula for you right now, but let’s use 1.7x as an example. You can probably determine a more accurate number for your data empirically, but note that the b-tree tends to be more efficient right after inserting data and uses a bit more space as it settles in and undergoes mutations.
  • Replication
    • 1x for single, 2x for double, 3x for triple
  • Over-provisioning
    • The documentation mentions that some SSDs benefit performance-wise from not being completely full. It’s also advisable to leave some headroom so the cluster can tolerate a machine failing (all of the data from that machine would be re-replicated to other machines). If we wanted to keep the disks less than 2/3 full, for example, then we’d choose 1.5x.

To get the total amount of disk capacity, multiply the individual overheads together. For a single-replicated cluster, the disk capacity factor would be 1.7 * 1 * 1.5 = 2.55x, or 255GB for 100GB of data. For double replication, that would be 1.7 * 2 * 1.5 = 5.1x (510GB), and for triple replication 1.7 * 3 * 1.5 = 7.65x (765GB).
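
Here’s a minimal sketch of that math as a small Python helper. The factor names and the example values (1.7x storage overhead, 1.5x over-provisioning, 100GB of data) are just the assumptions from this post, not exact numbers for any particular cluster; plug in what you measure for your own data.

```python
def disk_capacity_factor(storage_overhead: float,
                         replication: int,
                         over_provisioning: float) -> float:
    """Total capacity factor is just the product of the individual overheads."""
    return storage_overhead * replication * over_provisioning


if __name__ == "__main__":
    logical_gb = 100  # amount of key-value data you actually want to store
    for replicas in (1, 2, 3):
        factor = disk_capacity_factor(storage_overhead=1.7,   # example b-tree overhead
                                      replication=replicas,    # single/double/triple
                                      over_provisioning=1.5)   # keep disks < 2/3 full
        print(f"{replicas}x replication: factor {factor:.2f}x, "
              f"{logical_gb * factor:.0f}GB of raw disk for {logical_gb}GB of data")
    # Prints factors of 2.55x, 5.10x, and 7.65x (255GB, 510GB, 765GB).
```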
