If you remove Replication Factor machines or more from a cluster without excluding them first and waiting for the exclude to finish, then you're going to break your cluster, because there's data (including system metadata) that will be permanently missing.
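For reference, a safe removal goes roughly like this (the address is just a placeholder for whichever process you want to retire):

fdbcli> exclude 10.0.0.1:4500
fdbcli> status

As I understand it, exclude waits until all data on that process has been moved elsewhere; only once it returns, and status reports the cluster healthy, is it safe to actually stop the process.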
I didn't remove any machines, I removed processes! Looking at the docs, I don't see how "1 machine, 1 process" suddenly stops being a workable state just because I temporarily increase the number of processes and then remove them again.
In my understanding I don't have any replication here to begin with. If I am violating something, it is invisible to me and the docs don't mention it.
I don't see any indication that I'm violating something called a 'Replication Factor' when reading this: Configuration — FoundationDB 7.1
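For reference, the redundancy modes that page lists are set with configure (if 'single' really means only one copy of the data, that isn't obvious to me from the wording there):

fdbcli> configure single ssd
fdbcli> configure double ssd
fdbcli> configure triple ssd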
I’m confused though that
fdbcli> configure single ssd
shouldn't bring you back to a working cluster. Running

fdbcli> configure new single ssd

and thus throwing away the previous database might? Did you happen to elide the new by accident when posting, or should I go think harder?
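To spell out the difference as I understand it:

fdbcli> configure single ssd      (reconfigures the database you already have)
fdbcli> configure new single ssd  (creates a brand-new, empty database, discarding the old one)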
Well, today I tried it again and now it seems completely arbitrary to me. Sometimes it locks up even if I just remove 1 of 4 processes, sometimes only when I remove 2. It definitely always happens when I remove 3 at the same time.
I made two videos to show the behavior (at first it seems fine, the processes just say "no metrics available", but seconds later the whole thing goes into error mode):
https://webm.red/Fyct
In the second video I can't believe that it's working again at first, but then I try removing 2 at the same time and it goes into error mode again:
https://webm.red/u80p
I can't replicate the behavior of the 'configure single ssd' thing again, but I do believe that it happened just like I said.
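Next time it locks up I'll try to capture the cluster state before and after removing a process, something like:

fdbcli> status details

which, as far as I know, lists every process along with any errors it is reporting, so the videos aren't the only evidence.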