Ideal setup for Fault Tolerance = 3 in Triple mode

alexmiller · July 15, 2019, 9:23am

There was a thread that discussed some misinterpretable aspects of that table:

That discussion also led to some edits to the table's wording.

Fault Tolerance will always be, at best, Total Number of Copies - 1. So Fault Tolerance = 2 is the best that you will get with triple replication.
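To make that rule concrete, here's a minimal Python sketch. The function name and the mode table are made up for illustration (they are not an FDB API), and the copy counts assume the usual single/double/triple redundancy modes within a single region.

```python
# Illustrative only: nothing here is part of any FoundationDB API.
# Assumed copy counts for the common single-region redundancy modes.
COPIES_PER_MODE = {"single": 1, "double": 2, "triple": 3}


def best_case_fault_tolerance(total_copies: int) -> int:
    """Upper bound on machine failures survivable without data loss: copies - 1."""
    if total_copies < 1:
        raise ValueError("need at least one copy of the data")
    return total_copies - 1


for mode, copies in COPIES_PER_MODE.items():
    print(mode, best_case_fault_tolerance(copies))
```

For triple this prints 2, which is the ceiling mentioned above; getting to 3 would need a fourth copy somewhere.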

Also see "Optimal configuration for more than 3 DCs" and a linked thread therein for a longer discussion of running FDB across more than two regions.

Note that FDB is not a quorum-based system. The “a majority of nodes is required” logic only applies to coordinators, which do form a quorum-based system. Otherwise, the number of processes involved in holding a shard of data is explicitly set as part of the configuration, as the table attempted to describe.

I think a 2-region triple configuration will give you the highest effective fault tolerance right now. If all copies are lost in one region, data distribution in multi-region understands how to copy the data back across from the other region. You could theoretically lose 5 machines and FDB would still be able to recover, as long as you run 11 coordinators. I recall the status message wording being a bit weird with multi-region currently, so I think it will still show 2, but will eventually tell you something like "(N) without data loss" if you start killing enough things.
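The arithmetic behind "5 machines" and "11 coordinators" can be sketched roughly as below. This is only my reading of the numbers: it assumes each of the two regions holds a full triple-replicated copy (6 copies total) and that coordinators need a strict majority to stay available. The helper names are hypothetical, not anything FDB exposes.

```python
# Illustrative arithmetic only; these helpers are not part of FoundationDB.

def coordinator_failures_tolerated(num_coordinators: int) -> int:
    """A majority quorum of n members survives losing floor((n - 1) / 2) of them."""
    if num_coordinators < 1:
        raise ValueError("need at least one coordinator")
    return (num_coordinators - 1) // 2


def coordinators_needed(failures_to_survive: int) -> int:
    """Smallest quorum size that still has a majority after f failures: 2f + 1."""
    if failures_to_survive < 0:
        raise ValueError("failures_to_survive must be non-negative")
    return 2 * failures_to_survive + 1


# Assumption: two regions, each holding a full triple-replicated copy of the data.
regions = 2
copies_per_region = 3
total_copies = regions * copies_per_region  # 6 copies in total
data_fault_tolerance = total_copies - 1     # copies - 1 rule

print(data_fault_tolerance)                       # 5
print(coordinators_needed(data_fault_tolerance))  # 11
print(coordinator_failures_tolerated(11))         # 5
```

The point is that the data copies and the coordinators are counted separately: the copies follow the "N - 1" rule, while the coordinators follow majority-quorum math, so both have to be sized for the number of failures you want to survive.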

I’ve typically seen people plan for 1 planned and 1 unplanned outage, so triple replication and a fault tolerance of 2 has generally been fine for most folk. Any reason you’re extra concerned about copies of your data suddenly going missing or corrupt?
