I’m investigating FDB as a database engine for apps that will have a relatively small number of users and relatively small datasets, but which have the following characteristics:
- Users may be spread around the world, yet they need low read latencies
- Servers will run on relatively cheap clouds or dedicated machines
I’m wondering what the best configuration is: despite reading the docs a few times and watching Evan’s 2018 multi-region tech talk, I’m still not entirely sure. All I want is (say) 3 replicas of the dataset, separated by a WAN, such that the loss of one isn’t fatal and, if the database grows, a machine can be added to each replica to increase storage capacity.
- Multi-region sounds good, but only two regions are supported. Some projects may need at least three (Europe, USA, Asia), and quite possibly more, to keep app servers close to users. So it seems this is out.
- `three_data_hall` mode doesn’t seem right, because then you need at least 4 machines to make progress. But I want only 3 to make progress, with 1 lost (they’ll probably be VMs so I expect “loss” to be a rare and user-driven event).
- None of the single datacenter modes seem right. E.g. `triple` replication is good but you need 5 machines to make progress.
- `three_datacenter` mode therefore seems the best, at least for apps that want low latencies in Asia as well as Europe/USA, or want to split the US into two coastal regions. But it says data is replicated six times, which seems wasteful. Why isn’t it replicated three times? And what if I wanted four datacenters e.g. east-us/west-us/europe/taiwan but am OK with the same level of fault tolerance as 3, i.e. the 4th doesn’t take part in leader election so having an even number isn’t fatal?
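
To put rough numbers on the storage concern, here is a minimal sketch. It assumes a hypothetical 50 GB dataset and takes the copy counts at face value (3 for `triple`, 6 for `three_datacenter`, as I read the docs). The helper name `raw_storage_gb` is made up for illustration, and storage-engine overhead is ignored.

```python
def raw_storage_gb(logical_gb: float, copies: int) -> float:
    """Total disk consumed across the cluster for a given number of full copies.

    Ignores storage-engine overhead, logs, and free-space headroom.
    """
    return logical_gb * copies


# Hypothetical dataset size, chosen only for illustration.
LOGICAL_GB = 50

# Copy counts as I understand them from the docs (an assumption, not verified).
COPIES_PER_MODE = {"triple": 3, "three_datacenter": 6}

footprint = {
    mode: raw_storage_gb(LOGICAL_GB, copies)
    for mode, copies in COPIES_PER_MODE.items()
}

for mode, total_gb in footprint.items():
    print(f"{mode}: {total_gb:.0f} GB raw for {LOGICAL_GB} GB of data")
```

For a small dataset the absolute difference is modest, so the doubling matters mostly for how many machines (and how much disk per machine) each site needs.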
Maybe FDB isn’t really designed/optimised for many replicas of a small dataset?