Doubts regarding fdb.cluster file

alloc · February 7, 2019, 12:23am

Just so I don’t have to keep going back and forth between here and our docs, here’s the cluster file format:

description:ID@ip1:port1,ip2:port2,ip3:port3
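
To make the pieces concrete, here is a minimal Python sketch that splits a line in the format above into its three parts. The names `ClusterFile` and `parse_cluster_string` are made up for illustration and are not part of any FoundationDB API. The sketch assumes plain `ip:port` addresses only, so it does not try to handle any other address syntax.

```python
from typing import List, NamedTuple, Tuple


class ClusterFile(NamedTuple):
    """Hypothetical container for the three parts of a cluster file line."""
    description: str
    cluster_id: str
    coordinators: List[Tuple[str, int]]


def parse_cluster_string(line: str) -> ClusterFile:
    """Split 'description:ID@ip1:port1,ip2:port2' into its parts."""
    # Everything before '@' is 'description:ID'; everything after is the
    # comma-separated coordinator list.
    prefix, sep, addresses = line.strip().partition("@")
    if not sep:
        raise ValueError("missing '@' between ID and coordinator list")

    description, sep, cluster_id = prefix.partition(":")
    if not sep or not description or not cluster_id:
        raise ValueError("expected 'description:ID' before '@'")

    coordinators = []
    for addr in addresses.split(","):
        # Split on the last ':' so the port is always the final field.
        host, sep, port = addr.strip().rpartition(":")
        if not sep or not host:
            raise ValueError(f"expected 'ip:port', got {addr!r}")
        coordinators.append((host, int(port)))

    return ClusterFile(description, cluster_id, coordinators)
```

For example, `parse_cluster_string("mydb:abc123@10.0.0.1:4500,10.0.0.2:4500")` yields the description `mydb`, the ID `abc123`, and two coordinator addresses.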

To connect to an FDB cluster, you must have a matching description. In some sense, the description should be used to track the progress of a single cluster through time as machines are added and removed.*

Every time the coordinators are changed, the ID should be updated as the ID uniquely identifies the coordinator set. However, if you try to connect to the cluster using an older ID (after it has been changed), the server will give you the newer ID and then the client updates its cluster file. This can be done by serving the updated file from one of the old coordinators even if they are no longer in the cluster.

The reason for this is to allow for changing the coordinators midstream without downtime. When the coordinator change happens, the update is propagated to every connected client, and each one updates its file. Any dormant client will also pick up the change when it wakes back up. The problem case happens when all of the coordinators are changed and removed from the cluster. (This might happen if, say, the cluster is moved to an entirely different set of hardware.) In that case, any client that doesn’t connect to the database between the cluster file being changed and the old coordinators being removed from the cluster will be permanently unable to connect to the cluster unless it can get the updated file from someone else.

But it’s a little more stringent than just letting the client connect regardless of the ID. For example, if you take your cluster file copy (i.e., fdb_client.cluster) and just randomly change the ID, I believe you’ll find that you can’t connect. Likewise, if you randomly change one of the coordinators in the file, you shouldn’t be able to connect even if that process is in the cluster.

So, I believe the minimal requirements are:

1. The description must exactly match the description in the servers’ cluster files.
2. The ID must match either the current or a previous ID used by the cluster (assuming at least one coordinator from when that ID was the current ID is still in the cluster).
3. The coordinator set should match the coordinator set associated with the ID.

Or something along those lines.
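
As a way of pinning down those three requirements, here is a toy Python model of them. It is only a restatement of the rules as I believe them to be, not how FoundationDB actually decides anything. The function `can_connect`, the `id_history` mapping (ID to the coordinator set that went with it), and the `live_processes` set are all hypothetical names invented for this sketch.

```python
from typing import Dict, FrozenSet, Set


def can_connect(
    client_description: str,
    client_id: str,
    client_coordinators: Set[str],
    server_description: str,
    id_history: Dict[str, FrozenSet[str]],
    live_processes: Set[str],
) -> bool:
    """Toy check of the three requirements listed in the post.

    id_history maps every ID the cluster has used (current and previous)
    to the coordinator addresses that went with that ID. live_processes
    is the set of addresses still in the cluster. Both are assumptions
    made for this model.
    """
    # Requirement 1: the description must match exactly.
    if client_description != server_description:
        return False

    # Requirement 2: the ID must be a current or previous ID...
    known_coordinators = id_history.get(client_id)
    if known_coordinators is None:
        return False
    # ...and at least one coordinator from that era must still be around.
    if not (known_coordinators & live_processes):
        return False

    # Requirement 3: the client's coordinator set must be the one
    # associated with that ID.
    return frozenset(client_coordinators) == known_coordinators
```

Under this model, a client holding an older ID still connects as long as one coordinator from that era is alive, while a client with a randomly edited ID or coordinator list is rejected, which matches the behavior described above.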

* But if you change all of the machines, is it really the same cluster?
