Which FoundationDB server versions are compatible with which operator versions?

Dear experts,
We are trying to upgrade the fdb-kubernetes-operator to v1.4.1, and the Dockerfile mentions:

ARG FDB_VERSION=7.1.5

Does that mean operator 1.4.1 can only work with FoundationDB version 7.1.5?
We wrap the fdb operator, and when I tried operator 1.4.1 with FoundationDB 6.2.29, I saw this error in one of our testing scenarios:

{"level":"info","ts":1656557697.521365,"logger":"fdbclient","msg":"Command completed","namespace":"testoperator1","cluster":"mdm-foundationdb-ibm","output":"The database is unav..."}
{"level":"error","ts":1656557697.5215714,"logger":"controller","msg":"Error in reconciliation","namespace":"testoperator1","cluster":"mdm-foundationdb-ibm","subReconciler":"controllers.changeCoordinators","requeueAfter":0,"error":"unable to fetch connection string: The database is unavailable; type `status' for more information.\n","stacktrace":"github.com/FoundationDB/fdb-kubernetes-operator/controllers.(*FoundationDBClusterReconciler).Reconcile\n\t/workspace/controllers/cluster_controller.go:183\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214"}
{"level":"error","ts":1656557697.5217397,"logger":"controller-runtime.manager.controller.foundationdbcluster","msg":"Reconciler error","reconciler group":"apps.foundationdb.org","reconciler kind":"FoundationDBCluster","name":"mdm-foundationdb-ibm","namespace":"testoperator1","error":"unable to fetch connection string: The database is unavailable; type `status' for more information.\n","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214"}

I checked the pod with the cluster_controller role and didn’t find the fdb.cluster file it is trying to read from.
Is this expected?
What is the compatibility between operator versions and FoundationDB versions? Thanks.

This doc contains all the information about the supported versions: fdb-kubernetes-operator/compatibility.md at main · FoundationDB/fdb-kubernetes-operator · GitHub. The version you referenced is only the version used for compiling the operator, not the supported version.

At what point did you see this error message? During the creation of the cluster?

@johscheuer thank you. While compiling the operator I used FoundationDB v6.2.29. Will that cause issues, or do I have to use 7.1.5?
The error showed up in a recovery scenario: if I kill 3 storage pods, that triggers a coordinator change, and then I observed this database-unavailable issue. Any suggestions on how to solve it? Thanks.
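
For reference, this scenario can be reproduced with something like the following (pod and namespace names taken from this thread):

kubectl -n testoperator1 delete pod mdm-foundationdb-ibm-storage-1 mdm-foundationdb-ibm-storage-2 mdm-foundationdb-ibm-storage-3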

I checked one controller pod:

1000710+       9       8  0 13:06 ?        00:00:09 /usr/bin/fdbserver --class cluster_controller --cluster_file /var/fdb/data/fdb.cluster --datadir /var/
1000710+     474       0  0 13:34 pts/0    00:00:00 sh -i -c TERM=xterm sh
1000710+     480     474  0 13:34 pts/0    00:00:00 sh
1000710+     481     480  0 13:34 pts/0    00:00:00 ps -ef
$ cat /var/fdb/data/fdb.cluster
# DO NOT EDIT!
# This file is auto-generated, it is not to be edited by hand
mdm_foundationdb_ibm:SZ7NOf9vA1aDvRJBv3niEqrZTbLXkeBw@10.254.15.3:4500:tls,10.254.16.228:4500:tls,10.254.20.222:4500:tls

The IPs in fdb.cluster are stale. Checking the storage pods:

mdm-foundationdb-ibm-storage-1                                2/2     Running     0          25m   10.254.16.235   worker0.fdbtest3.cp.fyre.ibm.com   <none>           <none>
mdm-foundationdb-ibm-storage-2                                2/2     Running     0          25m   10.254.15.5     worker2.fdbtest3.cp.fyre.ibm.com   <none>           <none>
mdm-foundationdb-ibm-storage-3                                2/2     Running     0          25m   10.254.20.223   worker1.fdbtest3.cp.fyre.ibm.com   <none>           <none>
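
One way to spot the mismatch is to compare the coordinator addresses in the cluster file with the current pod IPs. A hedged example, reusing the names from this thread (the container name and the pod label are assumptions based on the operator's defaults):

kubectl -n testoperator1 exec mdm-foundationdb-ibm-storage-1 -c foundationdb -- cat /var/fdb/data/fdb.cluster
kubectl -n testoperator1 get pods -l foundationdb.org/fdb-cluster-name=mdm-foundationdb-ibm -o wide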

In the latest Dockerfile we actually use 6.2.29 for building (fdb-kubernetes-operator/Dockerfile at main · FoundationDB/fdb-kubernetes-operator · GitHub), so that the Go bindings and the version we compile against support the minimum compatible version (6.2). There is no need to compile the operator with a different FDB version; you can simply inject the required libraries and binaries with init containers, see: fdb-kubernetes-operator/manager.yaml at main · FoundationDB/fdb-kubernetes-operator · GitHub, and the sketch below.
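
A rough sketch of that init-container pattern, following the manager.yaml example (the image tag, container name, and volume name here are placeholders to adapt; the --copy-library/--copy-binary/--output-dir/--init-mode flags come from the foundationdb-kubernetes-sidecar image):

initContainers:
  # Copies the 7.1 client library plus the fdbcli/fdbbackup/fdbrestore
  # binaries into a shared volume, then exits (--init-mode).
  - name: foundationdb-kubernetes-init-7-1
    image: foundationdb/foundationdb-kubernetes-sidecar:7.1.5-1
    args:
      - --copy-library
      - "7.1"
      - --copy-binary
      - fdbcli
      - --copy-binary
      - fdbbackup
      - --copy-binary
      - fdbrestore
      - --output-dir
      - /var/output-files/7.1.5
      - --init-mode
    volumeMounts:
      - name: fdb-binaries
        mountPath: /var/output-files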

If you kill too many coordinators at once you can run into this: fdb-kubernetes-operator/warnings.md at main · FoundationDB/fdb-kubernetes-operator · GitHub. This can be fixed manually with the kubectl fdb plugin: fdb-kubernetes-operator/debugging.md at main · FoundationDB/fdb-kubernetes-operator · GitHub.
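
A minimal usage sketch for that plugin command, reusing the namespace and cluster name from this thread (check kubectl fdb fix-coordinator-ips --help for the exact flags of your plugin version):

kubectl fdb -n testoperator1 fix-coordinator-ips -c mdm-foundationdb-ibm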

Thank you @johscheuer. I used kubectl fdb fix-coordinator-ips to correct the coordinator IPs, and it works. Just out of curiosity, is there a reason we don’t automate this procedure so that end users don’t need to run this command manually?

I think the honest answer is that nobody has implemented that logic in the operator yet (there should be an issue on GitHub where we discussed this), and with 7.1, DNS support for the cluster file is coming, which means we won’t have to do this hack anymore and the operator can use a headless service. Just as a warning: if you want to try out the DNS support, there might be some issues, so you should test it in a test cluster first. Once the DNS support is stable and used in production, we will announce it.
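
If you want to experiment with it anyway, the opt-in lives in the FoundationDBCluster spec. A minimal sketch, assuming the v1beta2 field routing.useDNSInClusterFile and an FDB version of 7.1 or later (verify the field against the CRD of your operator release):

apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  name: mdm-foundationdb-ibm
spec:
  version: 7.1.5
  routing:
    # Use headless-service DNS names instead of pod IPs in fdb.cluster.
    useDNSInClusterFile: true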