How to avoid database partitioning while changing redundancy mode from single/1 machine to double/3 machines

damienleroux · June 9, 2022, 12:57pm

Hello

Context:

I plan to change the redundancy mode of an in-production FDB 6.3.23 database from single on one machine to double on 3 machines.

I did some tests locally with Docker containers, and everything has worked fine so far.

Still, I have concerns about changing the redundancy mode.

Problem:

First, I have to add the additional machines to the cluster before changing the redundancy mode.

Indeed, running `fdbcli --exec "configure double"` without first adding 2 machines raises the error "Not enough processes exist to support the specified configuration".

As a result, I must add these machines first. But the consequence of adding machines while the database is in single redundancy mode is that FDB should start partitioning the data across them:

single mode will work with clusters of two or more computers and will partition data for increased performance but the cluster will not tolerate the loss of any machines.

In my opinion, this partitioning is unnecessary because I intend to change the redundancy mode to double right away.

Also, I don’t know how FDB reacts to `fdbcli --exec "configure double"` while it is already partitioning a large database.

Questions:

1/ Is there a way to avoid the partitioning of the database between the addition of extra machines and the redundancy configuration change?

2/ What are the risks of changing the redundancy configuration when FDB is partitioning the data through the just added extra machines?

Thanks to all of you who take the time to help.

sfc-gh-xwang · June 10, 2022, 9:25pm

I think the solution is to use a hidden command to disable the data distributor: `datadistribution off`. Then enable data distribution again after you add the 2 machines.

Code reference: fdbcli.actor.cpp - apple/foundationdb - Sourcegraph

sfc-gh-xwang · June 10, 2022, 9:38pm

And for the second question, “What are the risks of changing the redundancy configuration when FDB is partitioning the data through the just added extra machines?”.
I think the risk is that you waste some read/write bandwidth on the source and destination machines. The data distributor itself has a limited quota for data movement. If you can afford an extra 600Mb/s of data movement for a while (>50*100M, where 50 is the parallelism limit and 100M is the max shard size), I think it’s fine.
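
As a rough sanity check on those figures, here is a small sketch of the arithmetic. It assumes the "100M" shard size means megabytes, and the `estimate_seconds` helper and its inputs are hypothetical names introduced only for illustration, not anything FDB provides.

```shell
#!/usr/bin/env bash
set -euo pipefail

PARALLEL_MOVES=50   # parallelism limit quoted above
MAX_SHARD_MB=100    # max shard size quoted above, read as megabytes (assumption)

# Upper bound on the data being relocated at any one moment.
in_flight_mb=$(( PARALLEL_MOVES * MAX_SHARD_MB ))

# Rough time to move a data set at a sustained rate.
# Arguments: data size in GB, rate in MB/s. Result is whole seconds, rounded down.
estimate_seconds() {
  local data_gb="$1" rate_mb_per_s="$2"
  echo $(( data_gb * 1000 / rate_mb_per_s ))
}

echo "In flight at most: ${in_flight_mb} MB"
```

For instance, `estimate_seconds 100 600` models a hypothetical 100 GB data set moved at a sustained 600 MB/s. Both inputs are placeholders to replace with your own measurements.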

damienleroux · June 21, 2022, 4:46pm

@sfc-gh-xwang Thank you for your detailed answer.

I finally had the time to give it a try, and indeed `fdbcli --exec "datadistribution off"` works like a charm.

I run it before the 2 machines join my cluster. Then, as soon as I have updated the redundancy mode and included the additional coordinators, I run `fdbcli --exec "datadistribution on"`.
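
For anyone following along, the sequence above could be sketched as a small two-phase script. This is only a sketch under assumptions: the function names and the `DRY_RUN` switch are made up for illustration, and `coordinators auto` is one possible way to update the coordinators (setting explicit addresses is the other).

```shell
#!/usr/bin/env bash
# Sketch only: with DRY_RUN=1 (the default) the fdbcli commands are printed, not run.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"

# Run one fdbcli command, or just print it in dry-run mode.
run_fdbcli() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'fdbcli --exec "%s"\n' "$1"
  else
    fdbcli --exec "$1"
  fi
}

# Phase 1: run BEFORE the new machines join the cluster.
before_adding_machines() {
  run_fdbcli "datadistribution off"
}

# Phase 2: run AFTER the new machines have joined the cluster.
after_adding_machines() {
  run_fdbcli "configure double"
  # Assumption: let FDB choose the coordinators; explicit addresses also work.
  run_fdbcli "coordinators auto"
  run_fdbcli "datadistribution on"
}

# Hypothetical entry point: pass "before" or "after" as the first argument.
case "${1:-}" in
  before) before_adding_machines ;;
  after)  after_adding_machines ;;
esac
```

Saved under a name of your choice, it would be called once with `before`, then again with `after` once the new processes show up in `fdbcli --exec "status"`. Set `DRY_RUN=0` only after checking the printed commands.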

Thank you for your second comment too! Knowing that I’ll add the nodes when traffic on our platform is low, and considering that we have less than 100 GB of data, I think the impact could indeed be negligible.
