We are planning to build a 7.1.x cluster with a replication factor (RF) of 2 on AWS. If we were to lose two storage process nodes (ephemeral storage), what would be the impact on cluster health and operability? While I expect there would be some data loss, would the cluster remain operational with the remaining available data? cc: @amehta @samitsawant
Would really appreciate any feedback/recommendations here.
There is some chance of losing data, i.e., when both replicas of a key range happen to be on the failed process nodes. If some critical data, e.g., \xff\serverList, is lost, then the database will become unavailable and hard to recover. If the lost data is not critical, e.g., user data, then the database can still function and status will report shard loss.
It’s recommended to run the triple redundancy configuration if you expect to lose 2 storage nodes.
Thank you @jzhou for the reply. We will most likely end up with RF=3. However, in the worst case, if we lose 3 nodes and critical data like \xff\serverList ends up on those servers, how can we recover? Is recovery even possible?
If \xff\serverList is lost, we don’t have a tool to recover from that. Your data is still on the available storage servers, and the best chance is to use fdbserver -r kvfiledump to dump the data from the remaining storage servers.
I haven’t used fdbserver -r kvfiledump before. Does this mean that we’ll need to develop applications to read the data dump and restore it to a new cluster, or are there alternative mechanisms for recovering data from the available dump?
The -r kvfiledump option was contributed by the community. So yes, you would need to develop an application to restore the data to a new cluster. Alternatively, you could consider the backup and restore solutions, which are already documented. Snowflake uses a snapshot backup that takes advantage of AWS’s disk snapshot capability, which you could also ask around about.
Some more info about data loss probability - FDB stores data on groups of ReplicationFactor storage servers, called “teams.” Teams are created such that each storage server is a member of many teams, but an arbitrary set of ReplicationFactor storage servers most likely does not constitute a team. The number of teams that exist is a very low percentage of the number of possible teams. This is why with RF=2, if you lose 2 storage servers, you have a possibility of data loss but it’s not certain.
Thank you, @SteavedHams, for providing additional information. I’m not entirely sure I grasp this concept fully. Where can I read more about this? How will RF=2 vs RF=3 affect this?
I’m not sure if there’s a comment block or any docs about team selection, but the concept can be explained as:
If you have N storage hosts, there are (N choose RF) possible groups of hosts that could be represented in a storage team of size RF. For N=30 and RF=3, N choose RF would be 4,060, but FDB will only choose a small percentage of those possibilities as storage teams. If you lose 3 hosts that comprise one of those teams, you would lose data, but if you lose 3 random hosts, you have a low chance of there being a team that includes all of them.
If I recall correctly the team usage percentage is something like 5% for large clusters with RF=3, but I don’t remember what it would be for RF=2.
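To make the concept above concrete, here is a small sketch that builds a simplified random set of teams and estimates the chance that losing RF random hosts destroys a team. It is not FDB’s actual team builder (which also considers fault domains and load balance); the teams_per_server value is an assumption loosely inspired by FDB’s DESIRED_TEAMS_PER_SERVER knob, chosen only for illustration.

```python
import math
import random

def data_loss_probability(n_hosts, rf, teams_per_server=5, trials=100_000, seed=1):
    """Monte Carlo estimate of the chance that losing `rf` random hosts
    wipes out at least one storage team.

    NOTE: this is a toy model. Real FDB team selection is constrained by
    fault domains and balance; teams_per_server=5 is an assumption here.
    """
    rng = random.Random(seed)
    hosts = range(n_hosts)

    # Build a simplified team set: random rf-sized groups of hosts,
    # sized so each host belongs to roughly teams_per_server teams.
    n_teams = n_hosts * teams_per_server // rf
    teams = set()
    while len(teams) < n_teams:
        teams.add(frozenset(rng.sample(hosts, rf)))

    # Probability that rf simultaneously failed hosts happen to form a team.
    losses = sum(frozenset(rng.sample(hosts, rf)) in teams
                 for _ in range(trials))
    return len(teams), math.comb(n_hosts, rf), losses / trials

teams, possible, p = data_loss_probability(30, 3)
print(f"{teams} teams out of {possible} possible; "
      f"estimated loss probability per 3-host failure ~ {p:.3%}")
```

With N=30 and RF=3 there are 4,060 possible teams; with only ~50 actual teams, losing 3 random hosts destroys a team only about 1% of the time in this toy model, which is the intuition behind the low-but-nonzero data loss probability described above.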