Working High-Availability Solutions with Two Datacenters

Hello!

I’m trying to create a two-datacenter configuration, each datacenter with three FDB nodes.

My requirements:

  1. No more than two datacenters are available.
  2. Active/passive: under normal conditions the first datacenter serves the data and the second keeps a replica.
  3. When the whole second datacenter fails, the first datacenter should continue working without any downtime. Some performance penalty is acceptable.
  4. When the whole first datacenter fails, there should be a way to activate the second datacenter to serve the data. Some downtime, a small data loss, and a manual reconfiguration are acceptable.
  5. The ability to switch the roles of the two datacenters for maintenance without any data loss. A small downtime and a manual reconfiguration are acceptable.

My first approach was to build a DR cluster. This solution satisfies all five requirements, but there are two problems:

  1. This solution is declared obsolete in “Design and Implementation of a Performant Restore System in FDB”.
  2. The DR solution has a performance penalty because all mutations also need to be written to the system keyspace, which doubles the write volume.
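For reference, the DR setup I used looks roughly like this. This is only a sketch: the cluster-file paths are placeholders for my deployment, and flag spellings may differ between FDB versions, so check `fdbdr --help` before running anything.

```shell
# Sketch of a two-cluster DR setup. The two cluster-file paths below are
# hypothetical; substitute the real files for your primary and replica clusters.

# Start asynchronous replication from the primary into the replica
fdbdr start -s /etc/foundationdb/primary.cluster -d /etc/foundationdb/replica.cluster

# Watch until replication reaches differential (continuous) mode
fdbdr status -s /etc/foundationdb/primary.cluster -d /etc/foundationdb/replica.cluster

# For planned maintenance (requirement 5): swap the roles of the two clusters
fdbdr switch -s /etc/foundationdb/primary.cluster -d /etc/foundationdb/replica.cluster

# If the primary is lost (requirement 4): abort the DR from the destination
# side only, unlocking the replica and accepting a small window of data loss.
# (The --dstonly flag is my understanding of the tool; verify on your version.)
fdbdr abort --dstonly -d /etc/foundationdb/replica.cluster
```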

I tried the suggested multi-region configuration with two regions, each having a single datacenter. I used six coordinator processes: three in each datacenter. But this configuration didn’t satisfy requirements 3 and 4: when either datacenter failed, three coordinator processes were not enough to continue working. It seems the multi-region configuration only becomes useful with three or more datacenters, which contradicts requirement 1.
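For completeness, the two-region configuration I tried looked approximately like this. This is a sketch: the datacenter IDs `dc1`/`dc2` are made up, and each fdbserver process must be started with the matching datacenter ID (via its locality settings in foundationdb.conf) for the regions to take effect.

```shell
# regions.json describes one datacenter per region; dc1 is the primary
# (higher priority wins under normal conditions). The IDs are hypothetical.
cat > /tmp/regions.json <<'EOF'
{
  "regions": [
    { "datacenters": [ { "id": "dc1", "priority": 1 } ] },
    { "datacenters": [ { "id": "dc2", "priority": 0 } ] }
  ]
}
EOF

# Apply the region layout and keep a full replica in the second region
fdbcli --exec 'fileconfigure /tmp/regions.json'
fdbcli --exec 'configure usable_regions=2'
```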

Any asymmetric configuration (e.g. 4 + 3 coordinators) does not survive when the datacenter with the most coordinators fails.

There is a sentence in the documentation at https://apple.github.io/foundationdb/configuration.html#choosing-coordinators:

> This is because if an entire region fails, it is still possible to recover to the other region if you are willing to accept a small amount of data loss. However, if you have lost a majority of coordinators, this becomes much more difficult.

But I can’t find any step-by-step information on how to recover an FDB cluster when the majority of coordinators is not available. Is it even feasible?

This is theoretically feasible, but not implemented. It’s been discussed as being on the roadmap before, and I’d suggest @markus.pilman as perhaps a good person to ask about when it would be implemented.

Assuming you cannot hide a coordinator in some third region, then for now the “correct” way to do this would be to use two clusters and DR, precisely as you outlined above. I’d say it’s more “deprecated” than “obsoleted”, as I haven’t heard of any planned removal yet, but @mengxu is welcome to correct me if I’m wrong. The double-write penalty is just something you’d have to live with until the better solution becomes available.

You are right. I’m unaware of any plan to remove DR anytime soon (like in at least a year).
[cc. @ajbeamon @Evan ]

Thanks for the replies, @mengxu @alexmiller.

I also found a discussion in “Two datacenters with double redundancy in each?”, but there it is recommended to use a separate (third) datacenter for the coordinators.

I’ll use and recommend the DR solution as a high-availability solution for exactly two datacenters.

> This is theoretically feasible, but not implemented. It’s been discussed as on the roadmap before, and I’d suggest @markus.pilman as perhaps a good person to talk about when that would be implemented.

It seems https://github.com/apple/foundationdb/issues/2022 is about recovering from the loss of a majority of coordinators.

> how to recover a fdb cluster when majority of coordinators are not available. Is it ever feasible?
> This is theoretically feasible, but not implemented.

I’ve managed to recover my FDB cluster after the majority of coordinators was lost.

  1. Initial state: two datacenters, a primary and a remote, in two regions. Four coordinators: three in the primary and one in the remote.
  2. The entire primary datacenter failed.

Steps to recover:

  1. Stop FoundationDB in the secondary datacenter.
  2. Modify the cluster file to list three coordinators in the secondary datacenter.
  3. Copy the coordination-* files from the original coordinator in the secondary datacenter to the two new ones.
  4. Start FoundationDB in the secondary datacenter.
  5. Run force_recovery_with_data_loss in fdbcli.
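The steps above, sketched as shell commands. All addresses, paths, and the `dc2` datacenter ID are placeholders for my setup; the stop/copy/start steps must run on the appropriate secondary-DC hosts.

```shell
# 1. Stop FoundationDB on every secondary-DC host
sudo systemctl stop foundationdb

# 2. Rewrite the cluster file so all three coordinators live in the secondary DC
#    (format: description:id@ip1:port,ip2:port,ip3:port -- values are made up)
echo 'mydb:abc123@10.2.0.1:4500,10.2.0.2:4500,10.2.0.3:4500' \
  | sudo tee /etc/foundationdb/fdb.cluster

# 3. Copy the coordinated state from the surviving coordinator (run on 10.2.0.1)
#    to the two new coordinators; the data directory path is a placeholder
for host in 10.2.0.2 10.2.0.3; do
  scp /var/lib/foundationdb/data/4500/coordination-* \
      "$host":/var/lib/foundationdb/data/4500/
done

# 4. Start FoundationDB again on every secondary-DC host
sudo systemctl start foundationdb

# 5. Force recovery to the secondary region (its dcid assumed to be dc2)
fdbcli --exec 'force_recovery_with_data_loss dc2'
```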

That will work, but it’s possible that it won’t recover; if it does, you’ll lose an unbounded amount of data, and theoretically open yourself up to database corruption.

A recovery (or more than one) could have happened, and written the new coordinated state to only the three coordinators in the primary. Your coordinated state in the secondary is thus stale, and doesn’t know that it is stale. When you copy it to more coordinators to get back to having a quorum, you’re restoring a stale coordinated state. It’s possible that it points only to transaction log instances that no longer exist, and thus recovery will block forever. It’s possible that it points to a subset of the older transaction log instances that do exist, and then you’ll lose all data written in the newer generations of transaction logs (but it will still be a consistent snapshot).

It’s also possible that the primary half of the database comes back online unaware of your manual coordinator changes, and then you’d have two FDB clusters both trying to use the same transaction logs, which will probably produce very strange behavior.

So it will work, but there are a lot of caveats, which is why #2022 exists: to provide a safe(r) way of doing such an operation.


Yes, this scenario is not safe. For safety, I’d add a step:
6. Prevent the primary datacenter from starting up

But sometimes splitting a cluster into two independent parts is a desired goal, for example when I want to create a full copy of the data from a working cluster for testing.

Earlier I used a DR cluster for cloning, but it seems doing this with a multi-region configuration is also possible.
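For the DR-based cloning I mention, the rough sequence is as follows. This is a sketch: the cluster-file paths are placeholders, and you should verify the exact fdbdr behavior on your version.

```shell
# Replicate the production cluster into the clone cluster
fdbdr start -s /etc/foundationdb/prod.cluster -d /etc/foundationdb/clone.cluster

# Wait for fdbdr status to report differential mode...
fdbdr status -s /etc/foundationdb/prod.cluster -d /etc/foundationdb/clone.cluster

# ...then detach the clone: abort stops replication and unlocks the
# destination, leaving it as an independent, writable copy of the data
fdbdr abort -s /etc/foundationdb/prod.cluster -d /etc/foundationdb/clone.cluster
```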