Redundancy mode: three_data_hall

(Ochan) #1

Hi all,
I’ve been testing a FoundationDB cluster on AWS and I have one question concerning redundancy modes.
My testing environment is illustrated in the figure above. I wanted to test machine-failure scenarios when the redundancy mode is set to three_data_hall. The coordinator server was placed on machine #1, and the master/cluster_controller ran on machine #2.

case 1) Killed machine #5 and machine #6
Reading and writing key/values worked well.

case 2) Killed machine #3 and machine #5.
The fdb cluster was down. When I ran the "status" command in fdbcli, I got the message below.

Recruiting new transaction servers.

Need at least 4 log servers, 1 proxies and 1 resolvers.

Have 180 processes on 4 machines.

Timed out trying to retrieve storage servers.

As far as I understand (link), case 2 should be OK, but in case 1 writing key/values should fail. However, the test results were the opposite of my expectation. Am I missing something?

(Ochan) #2

I found the answer myself while looking into the source code (link):

tLogPolicy = IRepPolicyRef(new PolicyAcross(2, "data_hall", IRepPolicyRef(new PolicyAcross(2, "zoneid", IRepPolicyRef(new PolicyOne())))));

The log replica count is set to 4, and the logs must be placed across 2 data halls with 2 zones per data hall. So case 2 failed.
But I still wonder why case 1 is OK. There were only two data halls available, so three data replicas were not possible.
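For anyone else following along, here is a toy sketch of how that nested policy evaluates against the surviving machines in the two failure cases. The machine-to-hall/zone mapping below is an assumption based on the figure (6 machines, 2 per data hall); `policy_satisfied` is a hypothetical helper, not the actual FDB policy engine.

```python
# Mimics PolicyAcross(2, "data_hall", PolicyAcross(2, "zoneid", PolicyOne())):
# we need at least 2 data halls that each still have 2 distinct zones alive.
from collections import defaultdict

# Hypothetical layout from the figure: machine -> (data_hall, zoneid)
machines = {
    "m1": ("dh1", "z1"), "m2": ("dh1", "z2"),
    "m3": ("dh2", "z3"), "m4": ("dh2", "z4"),
    "m5": ("dh3", "z5"), "m6": ("dh3", "z6"),
}

def policy_satisfied(alive):
    """True if logs can be placed across 2 data halls, 2 zones per hall."""
    zones_per_hall = defaultdict(set)
    for m in alive:
        hall, zone = machines[m]
        zones_per_hall[hall].add(zone)
    return sum(1 for zones in zones_per_hall.values() if len(zones) >= 2) >= 2

# Case 1: machines #5 and #6 killed -> dh1 and dh2 each keep 2 zones.
print(policy_satisfied({"m1", "m2", "m3", "m4"}))  # True

# Case 2: machines #3 and #5 killed -> only dh1 still has 2 zones.
print(policy_satisfied({"m1", "m2", "m4", "m6"}))  # False
```

This matches the observed results: case 1 can still recruit a valid log set, case 2 cannot.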

(Alex Miller) #3

Log replicas being set to 4 means we’ll have 4 total logs being recruited. And you’ve interpreted the policy correctly; we’re looking to recruit logs in two data halls, and two machines within each data hall. And in (1), that’s exactly what we have left. You have two data halls (DH#1 and DH#2), and two machines left within each data hall.

Just in case it’s the cause of your confusion, three_data_hall mode is three data halls, and not triple replicated within data halls.

(Ochan) #4

Alexmiller, thanks for your fast reply!
Your reply helped me understand fdb operation, and I have an additional question concerning case 1.
As far as I understand, case 1 satisfies the log replica requirement (2 data halls, 2 machines per data hall), so writing and reading key/values are available. Is that right?
Then that log data needs to be moved into storage nodes (three replicas), but there are only two data halls available.
So in this situation I can imagine two scenarios. First, two replicas are moved into storage nodes now, and the third replica is written later, when data hall #3 recovers. Second, all the replicas are moved into storage nodes only after data hall #3 recovers. Which one is right?

(Meng Xu) #5

In my understanding (correct me if I’m wrong @alexmiller), it’s scenario 1: data in the tLog is moved into the two available storage nodes as two replicas. Note that a mutation in the tLog won’t be removed until it has been replicated to three storage servers.
So even though your case 1 can still serve data, your tLog server’s queue will keep building up until the third data hall is recovered.
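To make that behavior concrete, here is a toy model of the queue build-up, not the actual FDB internals: a mutation can only be popped from the tLog once all three storage replicas have applied it, so with only two reachable replicas every mutation stays queued.

```python
# Toy model: mutations are popped from the tLog only after being applied
# by `replication_factor` storage replicas; otherwise they are retained.
def tlog_queue_depth(num_mutations, replicas_reachable, replication_factor=3):
    """Return how many mutations remain queued in the tLog."""
    queue = 0
    for _ in range(num_mutations):
        if replicas_reachable >= replication_factor:
            pass  # fully replicated: mutation is popped immediately
        else:
            queue += 1  # third replica unreachable: mutation is retained
    return queue

# All three data halls up: the queue drains as fast as it fills.
print(tlog_queue_depth(1000, replicas_reachable=3))  # 0

# Data hall #3 down (case 1): every mutation accumulates in the tLog queue.
print(tlog_queue_depth(1000, replicas_reachable=2))  # 1000
```

In other words, case 1 keeps serving reads and writes, but the tLog disk usage grows in proportion to the write traffic until the third replica catches up.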