Couldn't get correct "status json" output when skip: true is set

We set skip: true and verified it in the Apple operator log:

{"level":"info","ts":1673235228.380058,"logger":"controller","msg":"Skipping cluster with skip value true","namespace":"testoperator1","cluster":"mdm-foundationdb-ibm","skip":true}
2023-01-09T03:33:18.508Z        INFO    Found skip true info from apple operator        {"kind":"controllers.fdbcontroller.FdbCluster"}
2023-01-09T03:33:18.508Z        INFO    Running command {"kind":"controllers.fdbcontroller.FdbCluster","fdbcluster":"testoperator1/mdm-foundationdb-ibm","command":"/usr/bin/fdbcli","fdbcluster":"testoperator1/mdm-foundationdb-ibm","args":"[/usr/bin/fdbcli --exec status json -C /tmp/2764938153]"}
2023-01-09T03:33:23.557Z        INFO    liuyan  {"kind":"controllers.fdbcontroller.FdbCluster","fdbcluster":"testoperator1/mdm-foundationdb-ibm","fdbcli status json ouput":"{
    "client" : {
        "cluster_file" : {
            "path" : "/tmp/2764938153",
            "up_to_date" : true
        },
        "coordinators" : {
            "coordinators" : [
                {
                    "address" : "172.30.195.80:4500:tls",
                    "reachable" : false
                },
                {
                    "address" : "172.30.6.93:4500:tls",
                    "reachable" : false
                },
                {
                    "address" : "172.30.127.11:4500:tls",
                    "reachable" : false
                }
            ],
            "quorum_reachable" : false
        },
        "database_status" : {
            "available" : false,
            "healthy" : false
        },
        "messages" : [
            {
                "description" : "Unable to reach a quorum of coordinators.",
                "name" : "quorum_not_reachable"
            }
        ],
        "timestamp" : 1673235201
    },
    "cluster" : {
        "layers" : {
            "_valid" : false
        }
    }
}
"}

In fact, the coordinators are reachable now:

Coordination servers:
  172.30.250.62:4500:tls  (reachable)
  172.30.197.24:4500:tls  (reachable)
  172.30.121.7:4500:tls  (reachable)

Client time: 01/09/23 03:57:48

fdb>

Is this expected?
Does skip: true mean there is nothing we can do, such as getting the fdbbackup status or locking the database?

Could you add some more information, such as the operator version you are using and what you are trying to achieve? I'm not sure I understand the questions you are asking. If the skip setting is set to true, the operator won't perform any operations and won't invoke any commands. After the FoundationDBCluster is fetched from the Kubernetes API, it immediately stops doing any work for this cluster: fdb-kubernetes-operator/cluster_controller.go at main · FoundationDB/fdb-kubernetes-operator · GitHub. So I'm not sure where the fdbcli call you pasted comes from (I assume from your own operator built around the open source version?).

Assuming the underlying FDB cluster is running and configured, you should be able to query the FDB status with the admin client without any issues. Since the cluster is running with TLS, could you verify that the TLS setup is correct?
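
When checking the status by hand, it can help to pull the coordinator section out of the `status json` output programmatically. The following is only a minimal sketch in Python: it assumes the JSON has the same shape as the output pasted earlier in this thread (`client.coordinators.coordinators[].reachable` and `client.coordinators.quorum_reachable`), and the helper names are made up for illustration, not part of the operator or of fdbcli.

```python
import json
from typing import List


def unreachable_coordinators(status_json: str) -> List[str]:
    """Return the addresses of coordinators the client reported as unreachable."""
    status = json.loads(status_json)
    coordinators = status["client"]["coordinators"]["coordinators"]
    return [c["address"] for c in coordinators if not c["reachable"]]


def quorum_reachable(status_json: str) -> bool:
    """Return the client's view of whether a coordinator quorum is reachable."""
    status = json.loads(status_json)
    return status["client"]["coordinators"]["quorum_reachable"]


# Trimmed-down copy of the output pasted above (only the fields used here).
SAMPLE_STATUS = """
{
    "client" : {
        "coordinators" : {
            "coordinators" : [
                { "address" : "172.30.195.80:4500:tls", "reachable" : false },
                { "address" : "172.30.6.93:4500:tls", "reachable" : false },
                { "address" : "172.30.127.11:4500:tls", "reachable" : false }
            ],
            "quorum_reachable" : false
        }
    }
}
"""

print("quorum reachable:", quorum_reachable(SAMPLE_STATUS))
for address in unreachable_coordinators(SAMPLE_STATUS):
    print("unreachable:", address)
```

If every coordinator is listed as unreachable while an interactive fdbcli session elsewhere shows them as reachable, it is worth comparing the addresses in the two outputs before digging into the TLS configuration.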


@johscheuer Thank you so much for your explanation; it helped me find the root cause. TLS is enabled in my environment. The cause is that this is a post-restore scenario: before skip: true was set, the old FoundationDB cluster was deleted and recreated, but my application was still using the old IPs to connect. That was the culprit. Thanks again for your help.
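
For anyone hitting the same problem, a stale cluster file can be spotted by comparing its coordinator addresses with the ones the running cluster reports. This is a rough Python sketch, assuming the usual `description:id@address,address,...` connection string format; the description and ID in the sample are placeholders, and the function names are hypothetical. The addresses are the ones from the two outputs in the first post.

```python
from typing import List


def coordinators_from_cluster_file(contents: str) -> List[str]:
    """Extract coordinator addresses from a connection string.

    Assumes the form description:id@addr1,addr2,... on a single line.
    """
    line = contents.strip()
    _, _, addresses = line.partition("@")
    return [a.strip() for a in addresses.split(",") if a.strip()]


def stale_coordinators(cluster_file_contents: str, current: List[str]) -> List[str]:
    """Return cluster-file addresses that the running cluster no longer uses."""
    current_set = set(current)
    return [
        address
        for address in coordinators_from_cluster_file(cluster_file_contents)
        if address not in current_set
    ]


# Placeholder description and ID; addresses taken from the "status json" output.
OLD_CLUSTER_FILE = (
    "example_description:exampleid@"
    "172.30.195.80:4500:tls,172.30.6.93:4500:tls,172.30.127.11:4500:tls\n"
)

# Addresses reported as reachable in the interactive fdbcli session.
CURRENT_COORDINATORS = [
    "172.30.250.62:4500:tls",
    "172.30.197.24:4500:tls",
    "172.30.121.7:4500:tls",
]

for address in stale_coordinators(OLD_CLUSTER_FILE, CURRENT_COORDINATORS):
    print("stale coordinator address:", address)
```

Here all three addresses in the old file are stale, which matches what happened after the cluster was deleted and recreated.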