Hello,
My cluster is deployed across 3 DCs in 2 regions. The FDB database version is 6.3.23.
region.json
{
  "regions": [{
    "datacenters": [{
      "id": "dc1",
      "priority": 1
    }, {
      "id": "dc2",
      "priority": 0,
      "satellite": 1,
      "satellite_logs": 2
    }],
    "satellite_redundancy_mode": "one_satellite_double"
  }, {
    "datacenters": [{
      "id": "dc3",
      "priority": 0
    }]
  }]
}
DC1 is the primary DC, DC2 is the satellite DC, and DC3 is the one in a different region.
fdbcli --exec status
I simulated a DC1 failure by stopping all processes deployed in the primary datacenter DC1, and then the status of the cluster changed.
It seems the primary datacenter has changed to DC3, but fault tolerance drops to -1, and replication health stays at "initializing automatic data distribution" for a long time. However, the cluster still accepts read and write requests normally. fdbcli --exec 'status json' shows DC3 hasn't fully recovered yet, and a stateless process deployed in DC3 shows the recovery_state is still accepting_commits.

Question:
- Are there any problems with my cluster configuration? The "cluster hasn't fully recovered yet" status of the remote datacenter is not the behavior I expected. Should the recovery_state of the cluster become fully_recovered? What can I do to handle this problem?
- What should I do to recover the remote DC after the failure of the primary datacenter, or after the failure of the satellite?
- I plan to simulate the loss of all data in DC1. The next step would be to change regions.json so that the priority of DC1 is -1, and then configure usable_regions=1. Is this a safe step in this situation?
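
To make the questions above concrete, here is a rough Python sketch of the two things described: reading the recovery state out of status json, and producing the edited region configuration with the DC1 priority set to -1. It is only an illustration under assumptions. The helper names (read_status, recovery_summary, with_dc_priority) are hypothetical, and the field paths (cluster.recovery_state.name, cluster.data.state.name, cluster.fault_tolerance.max_zone_failures_without_losing_data) reflect my reading of the status json layout, so they should be checked against the output of the actual cluster.

```python
import copy
import json
import subprocess


def read_status():
    """Run fdbcli against the default cluster file and parse `status json`.

    Assumes fdbcli is on PATH and prints only the JSON document.
    """
    out = subprocess.run(
        ["fdbcli", "--exec", "status json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def recovery_summary(status):
    """Pick out the fields discussed in this post from a parsed status document.

    Missing fields come back as None instead of raising, since the exact
    layout is an assumption here.
    """
    cluster = status.get("cluster", {})
    tolerance = cluster.get("fault_tolerance", {})
    return {
        "recovery_state": cluster.get("recovery_state", {}).get("name"),
        "data_state": cluster.get("data", {}).get("state", {}).get("name"),
        "max_zone_failures_without_losing_data":
            tolerance.get("max_zone_failures_without_losing_data"),
    }


def with_dc_priority(config, dc_id, priority):
    """Return a copy of a region.json-style dict with one datacenter's
    priority changed; the input dict is left untouched."""
    updated = copy.deepcopy(config)
    for region in updated.get("regions", []):
        for dc in region.get("datacenters", []):
            if dc.get("id") == dc_id:
                dc["priority"] = priority
    return updated
```

With these, recovery_summary(read_status()) could be polled until recovery_state reads fully_recovered, and with_dc_priority(config, "dc1", -1) yields the document that would then be written to a file and applied with fdbcli's fileconfigure command before running configure usable_regions=1.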
Thanks in advance!