Hi, I am having a bit of confusion understanding the semantics of a “legal” fdb.cluster file:
Reading the documents here, I assumed that for a client to connect to an fdb cluster, it must have the same fdb.cluster file as the one used by the servers in the cluster. However, this does not seem to be correct. I ran a toy experiment, and here are the findings:
1. Created an FDB cluster with two processes on a single node, and made one of the processes the coordinator (say `127.0.0.1:4500`).
2. Copied the cluster file from `/etc/foundationdb/fdb.cluster` to a temp location and named it `fdb_client.cluster`.
3. Started fdbcli on the server node without any special cluster file parameter (so that it uses the file at the default location, `/etc/foundationdb/fdb.cluster`), then changed the coordinator to `127.0.0.1:4501` and exited fdbcli.
4. Observed that the `ID` and the coordinator process (ip:port) of the cluster changed in the updated `/etc/foundationdb/fdb.cluster` (due to changing the coordinator).
5. Started fdbcli using `-C fdb_client.cluster`, i.e. the copy made earlier in the temp location. `fdbcli` was able to join the cluster successfully (note that this file still had the old coordinator ip:port and the old `ID` prior to this step). After fdbcli joined the cluster, it updated `fdb_client.cluster` (in the temp location) to match the contents of `/etc/foundationdb/fdb.cluster`.
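For reference, a cluster file holds a single line of the form `description:ID@ip:port,ip:port,...`. Below is a minimal Python sketch for splitting that line into its parts, which makes it easier to compare the two files before and after the coordinator change. The name `parse_cluster_text` and the sample values are my own, for illustration only, and I am assuming that lines starting with `#` are comments and can be skipped.

```python
from typing import NamedTuple, Tuple


class ClusterFile(NamedTuple):
    description: str
    cluster_id: str
    coordinators: Tuple[str, ...]


def parse_cluster_text(text: str) -> ClusterFile:
    """Parse cluster-file text of the form 'description:ID@ip:port,ip:port'."""
    # Assumption: blank lines and lines starting with '#' carry no data.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(lines) != 1:
        raise ValueError("expected exactly one connection line")

    prefix, sep, addresses = lines[0].partition("@")
    if not sep:
        raise ValueError("missing '@' in connection line")

    description, sep, cluster_id = prefix.partition(":")
    if not sep or not description or not cluster_id:
        raise ValueError("expected 'description:ID' before '@'")

    coordinators = tuple(a.strip() for a in addresses.split(",") if a.strip())
    if not coordinators:
        raise ValueError("no coordinator addresses found")
    return ClusterFile(description, cluster_id, coordinators)


# Hypothetical sample values, not taken from a real cluster.
old = parse_cluster_text("mydb:abc123@127.0.0.1:4500\n")
new = parse_cluster_text("# a comment line\nmydb:xyz789@127.0.0.1:4501\n")
print(old.description == new.description)  # True
print(old.cluster_id == new.cluster_id)    # False
```

In my experiment, the stale client file and the updated server file differed in exactly this way: same description, different `ID` and coordinator address.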
I also tried repeating the above steps (fresh start, with the same initial conditions), but this time, prior to step (5), I manually edited the `description` in `fdb_client.cluster` to something random. Now, in step (5), fdbcli was not able to connect to the cluster.
So I am unsure what constitutes a valid fdb.cluster file for a client to join an existing cluster. From the above experiments, I observed that (a) if the fdb.cluster file the client is using points to “some” alive process in the cluster (not necessarily a coordinator), the client is able to join the cluster and then updates its cluster file; and (b) even if the `ID` in the client’s cluster file does not match that of the fdb cluster, the client is able to join, as long as the `description` matches.
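To state the two observations more precisely, here is a small Python sketch that reports which parts of two cluster-file lines differ. It only encodes what I saw in the experiment, not FoundationDB's actual connection rules; `diff_cluster_lines` is a hypothetical helper and the sample lines are made up.

```python
def diff_cluster_lines(client_line: str, server_line: str) -> dict:
    """Report which fields of two 'description:ID@addr,addr' lines agree."""

    def split(line: str):
        prefix, _, addresses = line.strip().partition("@")
        description, _, cluster_id = prefix.partition(":")
        return description, cluster_id, {a for a in addresses.split(",") if a}

    c_desc, c_id, c_addrs = split(client_line)
    s_desc, s_id, s_addrs = split(server_line)
    return {
        "description_matches": c_desc == s_desc,
        "id_matches": c_id == s_id,
        "shared_coordinators": sorted(c_addrs & s_addrs),
    }


server = "mydb:xyz789@127.0.0.1:4501"

# First run: stale ID and stale address, same description (client connected).
print(diff_cluster_lines("mydb:abc123@127.0.0.1:4500", server))

# Second run: description edited to something random (client did not connect).
print(diff_cluster_lines("random:abc123@127.0.0.1:4500", server))
```

The only field that differs between the run that connected and the run that did not is the description, which is what prompts the question below.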
Could someone please clarify what the minimum requirement is for a client’s fdb.cluster file to be considered legal in order to join a running cluster?
–thanks