Coordinators unavailable when 1 node out of 3 is down in 'single' redundancy mode?

pdeva · September 20, 2023, 10:55pm

We created a 3-node cluster of FDB coordinators using single redundancy mode:

fdb> coordinators 172.17.0.2:4500 172.17.0.3:4500 172.17.0.4:4500
Coordination state changed
fdb> 
fdb> 
fdb> status details

Using cluster file `/etc/foundationdb/fdb.cluster'.

Configuration:
  Redundancy mode        - single
  Storage engine         - memory-2
  Coordinators           - 3
  Usable Regions         - 1

Cluster:
  FoundationDB processes - 3
  Zones                  - 3
  Machines               - 3
  Memory availability    - 8.0 GB per process on machine with least available
  Fault Tolerance        - 0 machines
  Server time            - 09/20/23 22:47:52

Data:
  Replication health     - Healthy
  Moving data            - 0.000 GB
  Sum of key-value sizes - 0 MB
  Disk space used        - 315 MB

Operating space:
  Storage server         - 1.0 GB free on most full server
  Log server             - 40.1 GB free on most full server

Workload:
  Read rate              - 19 Hz
  Write rate             - 3 Hz
  Transactions started   - 7 Hz
  Transactions committed - 1 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  172.17.0.2:4500        (  1% cpu;  2% machine; 0.000 Gbps;  0% disk IO; 0.5 GB / 8.0 GB RAM  )
  172.17.0.3:4500        (  2% cpu;  2% machine; 0.000 Gbps;  0% disk IO; 0.5 GB / 8.0 GB RAM  )
  172.17.0.4:4500        (  1% cpu;  2% machine; 0.000 Gbps;  0% disk IO; 0.5 GB / 8.0 GB RAM  )

Coordination servers:
  172.17.0.2:4500  (reachable)
  172.17.0.3:4500  (reachable)
  172.17.0.4:4500  (reachable)

Client time: 09/20/23 22:47:52

fdb>

Now when we kill just 1 of the 3 nodes, the cluster becomes completely unavailable.

vagrant@vagrant-1:~$ fdbcli
Using cluster file `/etc/foundationdb/fdb.cluster'.

The database is unavailable; type `status' for more information.

Welcome to the fdbcli. For help, type `help'.
fdb>
fdb> status

Using cluster file `/etc/foundationdb/fdb.cluster'.

Could not communicate with all of the coordination servers.
  The database will remain operational as long as we
  can connect to a quorum of servers, however the fault
  tolerance of the system is reduced as long as the
  servers remain disconnected.

  172.17.0.2:4500  (reachable)
  172.17.0.3:4500  (reachable)
  172.17.0.4:4500  (unreachable)

Locking coordination state. Verify that a majority of coordination server
processes are active.

Why is this? It's single redundancy mode, after all. Why is the cluster unavailable when 2 of the 3 nodes are still up?

pdeva · September 20, 2023, 11:07pm

Oh, I just noticed that single mode doesn't tolerate the loss of any node.
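
To spell out the arithmetic behind that: the coordinators and the data have separate failure budgets, and the cluster only survives what both can survive. The sketch below is a simplified model, not FoundationDB's actual logic. The function names and the replica table are made up for illustration, and it assumes each redundancy mode tolerates one fewer failure than its number of data copies, ignoring log placement and machine-count requirements.

```python
# Simplified model of fault tolerance (an assumption for illustration,
# not FoundationDB's real implementation).

# Copies of each piece of data kept per redundancy mode.
REPLICAS = {"single": 1, "double": 2, "triple": 3}


def coordinator_tolerance(coordinators: int) -> int:
    """Coordinator failures that still leave a strict majority reachable."""
    return (coordinators - 1) // 2


def data_tolerance(mode: str) -> int:
    """Machine failures that still leave at least one copy of the data."""
    return REPLICAS[mode] - 1


def cluster_tolerance(mode: str, coordinators: int) -> int:
    """The cluster survives only what both layers survive."""
    return min(coordinator_tolerance(coordinators), data_tolerance(mode))


if __name__ == "__main__":
    # 3 coordinators can lose 1, but single mode can lose 0 machines.
    print(cluster_tolerance("single", 3))
```

Under this model, 3 coordinators keep a quorum with 1 node down, which matches the two `(reachable)` lines in the status output, but `single` keeps only one copy of the data, so the overall tolerance is the `0 machines` that `status details` reported. Switching the redundancy mode with `configure double` in fdbcli would be the usual way to raise it, though I have not tried that on this cluster.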
