I have an FDB cluster and lost all transaction nodes and 10% of the storage nodes by accidentally removing the data directory. Is there any way to bring it back to work? It is acceptable to lose some data.
The FDB version is 6.0.15.
Thanks in advance.
Did you have backup set up? How much data does the cluster hold?
The first choice is to restore from backup, assuming the cluster is not configured with HA (see "FDB HA Write Path: How a mutation travels in FDB HA" in the FoundationDB 6.3 documentation).
I can't be held liable for the outcome, but I would try the fdbcli command force_recovery_with_data_loss,
which is designed to get a cluster back after it loses its tLogs.
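For illustration only, here is a minimal Python sketch of how the command could be issued non-interactively through fdbcli's `-C` and `--exec` options. The cluster file path and the `dc1` datacenter ID are placeholders I made up, and I have not verified that the command behaves this way on 6.0.15 or on a cluster without region settings. The script only prints the command lines; it does not run anything.

```python
import shlex


def build_fdbcli_argv(cluster_file, command, fdbcli="fdbcli"):
    """Return the argv list for running a single fdbcli command non-interactively."""
    return [fdbcli, "-C", cluster_file, "--exec", command]


# Placeholder values (assumptions, not taken from the original post).
CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"
DCID = "dc1"  # hypothetical datacenter ID to recover into

# Check the cluster state first, then attempt the forced recovery.
status_argv = build_fdbcli_argv(CLUSTER_FILE, "status details")
recovery_argv = build_fdbcli_argv(
    CLUSTER_FILE, f"force_recovery_with_data_loss {DCID}"
)

# Print the commands for review instead of executing them.
for argv in (status_argv, recovery_argv):
    print(shlex.join(argv))
```

Printing rather than executing is deliberate: this command discards data, so it is worth reading the exact command line, and taking a copy of whatever data directories remain, before running it by hand.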
Your situation can be more complex:
If losing 10% of the storage nodes also means losing all replicas of some shard (especially the metadata shard), it will be really hard to recover with 6.0.
We do have a work-in-progress feature (see apple/foundationdb pull request #5713, "Reproduced user data loss incident, and tested the improved exclude tool", by liquid-helium) to bring the cluster back in this situation. It was tested in a test environment on the master branch (version 7.1) and worked. We have not tested it on older versions.
If this unfortunate event happened in your prod environment, I would review your deployment model in the PIR (post-incident review). Losing so many storage nodes simultaneously would be a disaster for every database I know of, and calls for a disaster recovery solution (e.g., restore from backup or DC failover).
Will this work for a cluster without region settings?