Locking coordination state. Verify that a majority of coordination server processes are active. Single machine

Hi, we were trying to run multiple instances of FoundationDB on the same development machine and are having trouble reverting. How can I get this instance back up and running? I have attached the status json below and am happy to provide more information if needed. Thanks. The error I am getting is shown below:

Using cluster file `/etc/foundationdb/fdb.cluster'.

The database is unavailable; type `status' for more information.

Welcome to the fdbcli. For help, type `help'.
fdb> status

Using cluster file `/etc/foundationdb/fdb.cluster'.

Locking coordination state. Verify that a majority of coordination server
processes are active.

  127.0.0.1:4500  (reachable)

Unable to locate the data distributor worker.

Unable to locate the ratekeeper worker.

status json

{
    "client" : {
        "cluster_file" : {
            "path" : "/etc/foundationdb/fdb.cluster",
            "up_to_date" : true
        },
        "coordinators" : {
            "coordinators" : [
                {
                    "address" : "127.0.0.1:4500",
                    "reachable" : true
                }
            ],
            "quorum_reachable" : true
        },
        "database_status" : {
            "available" : false,
            "healthy" : false
        },
        "messages" : [
        ],
        "timestamp" : 1613517380
    },
    "cluster" : {
        "clients" : {
            "count" : 2,
            "supported_versions" : [
                {
                    "client_version" : "6.1.8",
                    "connected_clients" : [
                        {
                            "address" : "127.0.0.1:56846",
                            "connected_coordinators" : 1,
                            "log_group" : "default"
                        },
                        {
                            "address" : "127.0.0.1:56850",
                            "connected_coordinators" : 1,
                            "log_group" : "default"
                        }
                    ],
                    "count" : 2,
                    "protocol_version" : "fdb00b061060001",
                    "source_version" : "bd6b10cbcee08910667194e6388733acd3b80549"
                }
            ]
        },
        "cluster_controller_timestamp" : 1613517385,
        "connection_string" : "7q70LYEi:AtXa8OQn@127.0.0.1:4500",
        "datacenter_version_difference" : 0,
        "degraded_processes" : 0,
        "incompatible_connections" : [
        ],
        "layers" : {
            "_error" : "configurationMissing",
            "_valid" : false
        },
        "machines" : {
            "938899b9dec5bd69aa94e8585bd561c2" : {
                "address" : "127.0.0.1",
                "contributing_workers" : 2,
                "cpu" : {
                    "logical_core_utilization" : 0.0045028100000000003
                },
                "excluded" : false,
                "locality" : {
                    "machineid" : "938899b9dec5bd69aa94e8585bd561c2",
                    "processid" : "2a17b16577ccc421ff1db5f2070a3545",
                    "zoneid" : "938899b9dec5bd69aa94e8585bd561c2"
                },
                "machine_id" : "938899b9dec5bd69aa94e8585bd561c2",
                "memory" : {
                    "committed_bytes" : 1810952192,
                    "free_bytes" : 65637105664,
                    "total_bytes" : 67448057856
                },
                "network" : {
                    "megabits_received" : {
                        "hz" : 0.24741000000000002
                    },
                    "megabits_sent" : {
                        "hz" : 0.24741000000000002
                    },
                    "tcp_segments_retransmitted" : {
                        "hz" : 0
                    }
                }
            }
        },
        "messages" : [
            {
                "description" : "Unable to locate the data distributor worker.",
                "name" : "unreachable_dataDistributor_worker"
            },
            {
                "description" : "Unable to locate the ratekeeper worker.",
                "name" : "unreachable_ratekeeper_worker"
            },
            {
                "description" : "Unable to read database configuration.",
                "name" : "unreadable_configuration"
            }
        ],
        "processes" : {
            "2a17b16577ccc421ff1db5f2070a3545" : {
                "address" : "127.0.0.1:4500",
                "class_source" : "command_line",
                "class_type" : "unset",
                "command_line" : "/usr/sbin/fdbserver --cluster_file=/etc/foundationdb/fdb.cluster --datadir=/var/lib/foundationdb/data/4500 --listen_address=public --logdir=/var/log/foundationdb --public_address=auto:4500",
                "cpu" : {
                    "usage_cores" : 0.0093905700000000009
                },
                "disk" : {
                    "busy" : 0,
                    "free_bytes" : 1703289110528,
                    "reads" : {
                        "counter" : 24811660,
                        "hz" : 0,
                        "sectors" : 0
                    },
                    "total_bytes" : 1889167495168,
                    "writes" : {
                        "counter" : 224496123,
                        "hz" : 0.79996299999999998,
                        "sectors" : 160
                    }
                },
                "fault_domain" : "938899b9dec5bd69aa94e8585bd561c2",
                "locality" : {
                    "machineid" : "938899b9dec5bd69aa94e8585bd561c2",
                    "processid" : "2a17b16577ccc421ff1db5f2070a3545",
                    "zoneid" : "938899b9dec5bd69aa94e8585bd561c2"
                },
                "machine_id" : "938899b9dec5bd69aa94e8585bd561c2",
                "memory" : {
                    "available_bytes" : 33222342656,
                    "limit_bytes" : 8589934592,
                    "unused_allocated_memory" : 30277632,
                    "used_bytes" : 606818304
                },
                "messages" : [
                ],
                "network" : {
                    "connection_errors" : {
                        "hz" : 0
                    },
                    "connections_closed" : {
                        "hz" : 0
                    },
                    "connections_established" : {
                        "hz" : 0
                    },
                    "current_connections" : 6,
                    "megabits_received" : {
                        "hz" : 0.021686999999999998
                    },
                    "megabits_sent" : {
                        "hz" : 0.014207399999999999
                    }
                },
                "roles" : [
                ],
                "uptime_seconds" : 207.88999999999999,
                "version" : "6.1.8"
            },
            "5bc0d29964d256936c2e40e7dbfaa9b9" : {
                "address" : "127.0.0.1:4501",
                "class_source" : "command_line",
                "class_type" : "unset",
                "command_line" : "/usr/sbin/fdbserver --cluster_file=/etc/foundationdb/fdb.cluster --datadir=/var/lib/foundationdb/data/4501 --listen_address=public --logdir=/var/log/foundationdb --public_address=auto:4501",
                "cpu" : {
                    "usage_cores" : 0.021741999999999997
                },
                "disk" : {
                    "busy" : 0,
                    "free_bytes" : 1703289106432,
                    "reads" : {
                        "counter" : 24811660,
                        "hz" : 0,
                        "sectors" : 0
                    },
                    "total_bytes" : 1889167495168,
                    "writes" : {
                        "counter" : 224496125,
                        "hz" : 0.79997000000000007,
                        "sectors" : 176
                    }
                },
                "fault_domain" : "938899b9dec5bd69aa94e8585bd561c2",
                "locality" : {
                    "machineid" : "938899b9dec5bd69aa94e8585bd561c2",
                    "processid" : "5bc0d29964d256936c2e40e7dbfaa9b9",
                    "zoneid" : "938899b9dec5bd69aa94e8585bd561c2"
                },
                "machine_id" : "938899b9dec5bd69aa94e8585bd561c2",
                "memory" : {
                    "available_bytes" : 33222346752,
                    "limit_bytes" : 8589934592,
                    "unused_allocated_memory" : 0,
                    "used_bytes" : 200761344
                },
                "messages" : [
                ],
                "network" : {
                    "connection_errors" : {
                        "hz" : 0
                    },
                    "connections_closed" : {
                        "hz" : 0
                    },
                    "connections_established" : {
                        "hz" : 0
                    },
                    "current_connections" : 6,
                    "megabits_received" : {
                        "hz" : 0.046954999999999997
                    },
                    "megabits_sent" : {
                        "hz" : 0.047167799999999996
                    }
                },
                "roles" : [
                    {
                        "id" : "38f368d1770d6c77",
                        "role" : "master"
                    },
                    {
                        "id" : "e74f98a67996c615",
                        "role" : "cluster_controller"
                    }
                ],
                "uptime_seconds" : 210.006,
                "version" : "6.1.8"
            }
        },
        "protocol_version" : "fdb00b061060001",
        "recovery_state" : {
            "description" : "Locking coordination state. Verify that a majority of coordination server processes are active.",
            "name" : "locking_coordinated_state"
        }
    }
}
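
A status json this long is easier to read once it is reduced to the few fields that describe a stuck recovery. The following is a minimal Python sketch, not an official tool: the function name summarize_status is hypothetical, and the embedded sample is a hand-trimmed copy of the output above (the process keys are shortened placeholders).

```python
import json


def summarize_status(status):
    """Pull out the fields of `status json` that matter for a stuck recovery."""
    client = status.get("client", {})
    cluster = status.get("cluster", {})
    # Map each process address to the list of roles it currently holds.
    roles = {
        proc.get("address"): [r.get("role") for r in proc.get("roles", [])]
        for proc in cluster.get("processes", {}).values()
    }
    return {
        "available": client.get("database_status", {}).get("available"),
        "quorum_reachable": client.get("coordinators", {}).get("quorum_reachable"),
        "recovery_state": cluster.get("recovery_state", {}).get("name"),
        "messages": [m.get("name") for m in cluster.get("messages", [])],
        "roles_by_address": roles,
    }


# Trimmed copy of the status json posted above (keys "p4500"/"p4501" are placeholders).
sample = {
    "client": {
        "coordinators": {"quorum_reachable": True},
        "database_status": {"available": False, "healthy": False},
    },
    "cluster": {
        "messages": [
            {"name": "unreachable_dataDistributor_worker"},
            {"name": "unreachable_ratekeeper_worker"},
            {"name": "unreadable_configuration"},
        ],
        "processes": {
            "p4500": {"address": "127.0.0.1:4500", "roles": []},
            "p4501": {
                "address": "127.0.0.1:4501",
                "roles": [{"role": "master"}, {"role": "cluster_controller"}],
            },
        },
        "recovery_state": {"name": "locking_coordinated_state"},
    },
}

print(json.dumps(summarize_status(sample), indent=2))
```

Read this way, the output shows that the coordinator quorum is reachable, recovery is stuck in locking_coordinated_state, and neither remaining process holds a log or storage role.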

Here is the FoundationDB configuration file as well:

## foundationdb.conf
##
## Configuration file for FoundationDB server processes
## Full documentation is available at
## https://apple.github.io/foundationdb/configuration.html#the-configuration-file

[fdbmonitor]
user = foundationdb
group = foundationdb

[general]
restart_delay = 60
## by default, restart_backoff = restart_delay_reset_interval = restart_delay
# initial_restart_delay = 0
# restart_backoff = 60
# restart_delay_reset_interval = 60
cluster_file = /etc/foundationdb/fdb.cluster
# delete_envvars =
# kill_on_configuration_change = true

## Default parameters for individual fdbserver processes
[fdbserver]
command = /usr/sbin/fdbserver
public_address = auto:$ID
listen_address = public
datadir = /var/lib/foundationdb/data/$ID
logdir = /var/log/foundationdb
# logsize = 10MiB
# maxlogssize = 100MiB
# machine_id =
# datacenter_id =
# class =
# memory = 8GiB
# storage_memory = 1GiB
# metrics_cluster =
# metrics_prefix =

## An individual fdbserver process with id 4500
## Parameters set here override defaults from the [fdbserver] section
[fdbserver.4500]

[fdbserver.4501]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster
#
# [fdbserver.4502]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster
#
# [fdbserver.4503]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster

[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent
logdir = /var/log/foundationdb

[backup_agent.1]

Hey @jds, can you elaborate on what you mean by "reverting" and "back up and running"?

You can see the commented-out lines in the FoundationDB configuration file, which are the ones we tried to add:

[fdbserver.4501]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster
#
# [fdbserver.4502]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster
#
# [fdbserver.4503]
# cluster_file = /etc/foundationdb/fdb-$ID.cluster

I want to get fdbcli working again. As you can see, it currently reports:

Unable to locate the data distributor worker.

Unable to locate the ratekeeper worker.

I’m not 100% sure I’m following, but it looks like you added processes to a single ssd cluster, and the database became unavailable after you removed those processes. This makes sense: single redundancy mode keeps only one copy of each piece of state, so if you lose any stateful process you lose availability. What you can try instead is the exclude command in fdbcli, which safely migrates state away from processes before you remove them. See Administration — FoundationDB 6.2 for more details.
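
The exclude-then-remove order can be sketched as a small Python helper that only builds the fdbcli invocations and prints them for review. This is an illustration under assumptions: the helper name exclude_commands is hypothetical, the addresses 127.0.0.1:4502 and 127.0.0.1:4503 are taken from the commented-out sections of the configuration file and may not match the processes that were actually added, and whether exclude waits for data movement to finish depends on the FoundationDB version.

```python
import shlex


def exclude_commands(addresses, cluster_file="/etc/foundationdb/fdb.cluster"):
    """Return fdbcli invocations for excluding processes before removing them."""
    if not addresses:
        raise ValueError("need at least one address to exclude")
    base = ["fdbcli", "-C", cluster_file, "--exec"]
    return [
        # Ask the cluster to move state off the listed processes.
        base + ["exclude " + " ".join(addresses)],
        # Check afterwards that the database is still healthy.
        base + ["status details"],
    ]


# Illustrative addresses only; substitute the processes you intend to remove.
for cmd in exclude_commands(["127.0.0.1:4502", "127.0.0.1:4503"]):
    print(shlex.join(cmd))
```

Only once the exclusion has completed and status shows the database as healthy would the corresponding [fdbserver.ID] sections be removed from foundationdb.conf. The helper prints the commands rather than running them so that nothing changes on the cluster until the list has been checked by hand.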