Hi
I am using fdb-operator to set up a very basic FDB cluster on EKS with EBS as storage, and I am running into multiple issues with the cluster setup and upgrade:
- The cluster never becomes available, and the kubectl fdb plugin reports the errors below even though all of the pods are in the Running state:
kubectl get pod
NAME                                READY   STATUS    RESTARTS   AGE
astradot-fdb-cluster-controller-1   2/2     Running   0          9m58s
astradot-fdb-log-1                  2/2     Running   0          9m58s
astradot-fdb-log-2                  2/2     Running   0          9m58s
astradot-fdb-log-3                  2/2     Running   0          9m58s
astradot-fdb-log-4                  2/2     Running   0          9m58s
astradot-fdb-storage-1              2/2     Running   0          9m58s
astradot-fdb-storage-2              2/2     Running   0          9m58s
astradot-fdb-storage-3              2/2     Running   0          9m58s
kubectl fdb analyze --all-clusters
Checking cluster: astradot/astradot-fdb
✖ Cluster is not available
✖ Cluster is not fully replicated
✖ Cluster is not reconciled
✖ ProcessGroup: cluster_controller-1 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: log-1 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: log-2 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: log-3 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: log-4 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: storage-1 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: storage-2 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✖ ProcessGroup: storage-3 has the following condition: MissingProcesses since 2023-07-21 04:23:29 +0530 IST
✔ Pods are all running and available
Checking cluster: astradot/astradot-fdb with auto-fix: false
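In case it helps with debugging, this is roughly how I have also been checking the state from inside the pods and from the operator. The pod and container names come from the listing above; the operator deployment name is the default from the operator's manifests, so treat it as an assumption if your install differs:

# Run fdbcli inside one of the storage pods (assumes the container's
# FDB_CLUSTER_FILE env var already points at the right cluster file):
kubectl exec -n astradot astradot-fdb-storage-1 -c foundationdb -- fdbcli --exec "status"

# Operator logs (deployment name assumed; adjust to your install):
kubectl logs -n astradot deployment/fdb-kubernetes-operator-controller-manager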
This is my FoundationDBCluster manifest:
---
apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  name: astradot-fdb
  namespace: astradot
  annotations:
    argocd.argoproj.io/sync-wave: "1502"
spec:
  version: 7.1.27
  automationOptions:
    replacements:
      enabled: true
  minimumUptimeSecondsForBounce: 60
  processCounts:
    cluster_controller: 1
    stateless: -1
  processes:
    general:
      podTemplate:
        spec:
          affinity:
            nodeAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                nodeSelectorTerms:
                  - matchExpressions:
                      - key: karpenter.sh/provisioner-name
                        operator: Exists
                      - key: kubernetes.io/arch
                        operator: In
                        values: ["amd64"]
          containers:
            - name: foundationdb
              resources:
                limits:
                  cpu: 100m
                  memory: 128Mi
                requests:
                  cpu: 100m
                  memory: 128Mi
            - name: foundationdb-kubernetes-sidecar
              resources:
                limits:
                  cpu: 100m
                  memory: 128Mi
                requests:
                  cpu: 100m
                  memory: 128Mi
          initContainers:
            - name: foundationdb-kubernetes-init
              resources:
                limits:
                  cpu: 100m
                  memory: 128Mi
                requests:
                  cpu: 100m
                  memory: 128Mi
      volumeClaimTemplate:
        spec:
          # storageClassName: local-storage-disk-rancher
          resources:
            requests:
              storage: 20G
  routing:
    defineDNSLocalityFields: true
  sidecarContainer:
    enableLivenessProbe: true
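The manifest is applied through ArgoCD (hence the sync-wave annotation). To rule out anything ArgoCD-specific, I also inspect the object the operator actually sees with plain kubectl:

# Show the cluster resource and the operator's reported status/conditions:
kubectl get foundationdbcluster astradot-fdb -n astradot
kubectl describe foundationdbcluster astradot-fdb -n astradot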
- When I change any parameter in the FoundationDBCluster object, the operator does not delete the old running pods. For example, here is what happened when I updated the version of the FDB cluster (the exact change I made is shown after the listing):
kubectl get pods -w -owide
NAME                                READY   STATUS    RESTARTS   AGE     IP            NODE                         NOMINATED NODE   READINESS GATES
astradot-fdb-cluster-controller-1   2/2     Running   0          19m     10.0.73.82    ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-cluster-controller-2   2/2     Running   0          3m23s   10.0.93.96    ip-10-0-83-61.ec2.internal   <none>           <none>
astradot-fdb-log-1                  2/2     Running   0          19m     10.0.65.81    ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-2                  2/2     Running   0          19m     10.0.75.192   ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-3                  2/2     Running   0          19m     10.0.78.50    ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-4                  2/2     Running   0          19m     10.0.64.93    ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-5                  2/2     Running   0          3m24s   10.0.66.148   ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-6                  2/2     Running   0          3m23s   10.0.71.155   ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-7                  2/2     Running   0          3m23s   10.0.79.13    ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-log-8                  2/2     Running   0          3m23s   10.0.93.100   ip-10-0-83-61.ec2.internal   <none>           <none>
astradot-fdb-storage-1              2/2     Running   0          19m     10.0.69.122   ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-storage-2              2/2     Running   0          19m     10.0.72.135   ip-10-0-72-98.ec2.internal   <none>           <none>
astradot-fdb-storage-3              2/2     Running   0          19m     10.0.77.15    ip-10-0-72-98.ec2.internal   <none>           <none>
As you can see, we had 4 astradot-fdb-log pods, and after the version upgrade there are 8. Ideally there should still be only 4 log pods, just running the upgraded FDB version.
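For reference, the only field I changed for the upgrade was spec.version. I edit it in the manifest and let ArgoCD sync, but the change is equivalent to this patch (the target version below is only an illustrative example, not the version I actually moved to):

# Bump the FDB version on the cluster object (7.1.33 is illustrative):
kubectl patch foundationdbcluster astradot-fdb -n astradot --type merge \
  -p '{"spec":{"version":"7.1.33"}}'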
Let me know if more details are required and I'll provide them.
Thanks