The way we (Snowflake) choose the coordinators is by running
`coordinators auto` in `fdbcli` after the cluster has been set up. But you can definitely choose them manually.
To understand how to choose coordinators (and how many to choose), you need to understand the following:
- Your coordinators run a Paxos-like algorithm to store a small amount of state that is needed to bring up your database. Think of them as FDB's own ZooKeeper (AFAIK the earliest versions of FDB used ZooKeeper for this).
- If you lose a majority of your coordinators, you lose your database, including all your data. (It would be possible to restore the data, but I don’t think we have any tools that do that for you.)
- More coordinators will generally make you safer, but coordination will be slower.
- More than one coordinator on the same machine is typically pointless, since losing that machine takes out all of them at once.
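
To make the trade-off in the list above concrete, here is a small sketch of the arithmetic. It assumes a plain majority quorum (more than half of the coordinators must stay reachable); the function names are mine and are not part of FDB.

```python
def majority(n: int) -> int:
    """Smallest number of coordinators that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """Coordinators that can be lost while a majority is still reachable."""
    return n - majority(n)


for n in (1, 2, 3, 4, 5, 7):
    print(
        f"{n} coordinators: majority={majority(n)}, "
        f"tolerates {tolerated_failures(n)} failure(s)"
    )
```

Under this assumption, going from 3 to 4 coordinators does not buy any extra failure tolerance, while going from 3 to 5 raises it from one lost coordinator to two.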
So typically you want to have 5 coordinators and you want to have them as far away from each other as possible (in the sense of your network topology).
So what we typically do is have at least one coordinator per data center (or availability zone in AWS). If we have fewer than 5 data centers, we make sure that all coordinators run on different physical machines. Additionally, we chose to store the coordinator data on a network disk (EBS), which would allow us to restore the data even if we lost a majority of all physical machines. As the coordinated state is small, network-attached storage is not problematic for this task.
So, as you can see, we have 15 copies of this data (5 coordinators, 3 copies per coordinator). But so far we have never lost a majority of our coordinators.
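
If you want to pick coordinators manually along the lines described above (spread over data centers, never two on the same physical machine), the selection could be sketched roughly like this. This is only an illustration: it is not how `coordinators auto` decides, and the `Process` type, the addresses, and the machine and zone names are all made up for the example.

```python
from collections import defaultdict
from itertools import zip_longest
from typing import NamedTuple


class Process(NamedTuple):
    address: str     # "ip:port" of an fdbserver process (hypothetical values)
    machine: str     # physical machine identifier (hypothetical)
    datacenter: str  # data center or availability zone (hypothetical)


def pick_coordinators(processes, count=5):
    """Pick up to `count` processes, one per machine, cycling over data centers."""
    # Keep only the first process seen on each machine, grouped by data center.
    by_dc = defaultdict(list)
    seen_machines = set()
    for p in processes:
        if p.machine not in seen_machines:
            seen_machines.add(p.machine)
            by_dc[p.datacenter].append(p)

    # Round-robin over data centers so each one is used before any is reused.
    picked = []
    for round_ in zip_longest(*(by_dc[dc] for dc in sorted(by_dc))):
        for p in round_:
            if p is not None and len(picked) < count:
                picked.append(p)
    return picked


processes = [
    Process("10.0.1.1:4500", "m1", "az-a"),
    Process("10.0.1.1:4501", "m1", "az-a"),  # same machine as above, skipped
    Process("10.0.1.2:4500", "m2", "az-a"),
    Process("10.0.2.1:4500", "m3", "az-b"),
    Process("10.0.2.2:4500", "m4", "az-b"),
    Process("10.0.3.1:4500", "m5", "az-c"),
    Process("10.0.3.2:4500", "m6", "az-c"),
]

chosen = pick_coordinators(processes)
# Print the chosen addresses in the space-separated form that the
# fdbcli `coordinators` command takes when you set coordinators manually.
print("coordinators " + " ".join(p.address for p in chosen))
```

With three zones and five coordinators, every zone gets at least one coordinator and no machine hosts more than one, which matches the rule of thumb above.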