We use a self-signed CA to secure our FDB traffic in staging (but not in production), and we keep seeing these odd error messages:
DateTime: 2025-08-15T23:04:39Z
ID: 0000000000000000
Machine: 10.241.1.88:4502
Reason: preverification failed
Roles: SS
Severity: 20
Type: TLSPolicyFailure
VerifyError: self signed certificate in certificate chain
__InvalidSuppression__
But from what I can tell, everything seems to be working. How can this happen?
So it turns out that the reason for the issue is fairly simple:
In the same Kubernetes cluster I run two FDB clusters, each with its own CA. It sometimes happens that a pod in cluster fdb-1 tries to talk to a pod in cluster fdb-2, because that pod's IP address used to belong to a pod in the first cluster.
This happens because we use DNS names rather than cluster IPs (due to constraints from our cloud provider), so when a pod restarts its IP will most likely change, and old processes in the other cluster will keep trying to connect to the old IP.
Since nodes are updated regularly for security patches and pods get restarted as a result, we actually run into this situation quite often.
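For anyone who wants to see the mechanism in isolation: here is a minimal Python sketch (not FDB's actual TLS plumbing, and all names and paths are illustrative) that reproduces the same verify error. It creates two independent self-signed CAs, one per cluster, issues a peer certificate from the fdb-2 CA, and then performs a TLS handshake against a client that trusts only the fdb-1 CA. It requires the `cryptography` package.

```python
"""Sketch: a process trusting only cluster fdb-1's CA handshaking with a peer
whose certificate chain was issued by cluster fdb-2's (different) self-signed CA."""

import datetime
import ssl
import tempfile

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.x509.oid import NameOID


def make_cert(common_name, issuer_cert=None, issuer_key=None):
    """Create an EC key + certificate. With no issuer it is a self-signed CA."""
    key = ec.generate_private_key(ec.SECP256R1())
    subject = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, common_name)])
    issuer = issuer_cert.subject if issuer_cert else subject
    now = datetime.datetime.now(datetime.timezone.utc)
    cert = (
        x509.CertificateBuilder()
        .subject_name(subject)
        .issuer_name(issuer)
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now - datetime.timedelta(days=1))
        .not_valid_after(now + datetime.timedelta(days=365))
        .add_extension(x509.BasicConstraints(ca=issuer_cert is None, path_length=None),
                       critical=True)
        .sign(issuer_key or key, hashes.SHA256())
    )
    return key, cert


def pem(cert):
    return cert.public_bytes(serialization.Encoding.PEM)


# Two independent self-signed CAs, one per FDB cluster.
_, ca1 = make_cert("fdb-1 CA")                         # what "our" process trusts
ca2_key, ca2 = make_cert("fdb-2 CA")                   # the other cluster's CA
peer_key, peer_cert = make_cert("fdb-2-storage", ca2, ca2_key)

# The "wrong" peer presents its leaf plus its self-signed root,
# similar to a certificate file that contains the full chain.
with tempfile.NamedTemporaryFile("wb", suffix=".pem", delete=False) as chain_file:
    chain_file.write(pem(peer_cert) + pem(ca2))
with tempfile.NamedTemporaryFile("wb", suffix=".pem", delete=False) as key_file:
    key_file.write(peer_key.private_bytes(
        serialization.Encoding.PEM,
        serialization.PrivateFormat.PKCS8,
        serialization.NoEncryption(),
    ))

server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.load_cert_chain(chain_file.name, key_file.name)

client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_ctx.check_hostname = False                       # simplification for the sketch
client_ctx.load_verify_locations(cadata=pem(ca1).decode())  # trust fdb-1's CA only

# In-memory handshake; no sockets needed to show the verification failure.
c_in, c_out, s_in, s_out = (ssl.MemoryBIO() for _ in range(4))
client = client_ctx.wrap_bio(c_in, c_out)
server = server_ctx.wrap_bio(s_in, s_out, server_side=True)


def step(side, out_bio, peer_in_bio):
    """Advance one side of the handshake, forwarding its output to the peer."""
    try:
        side.do_handshake()
    except ssl.SSLWantReadError:
        pass
    data = out_bio.read()
    if data:
        peer_in_bio.write(data)


try:
    for _ in range(5):
        step(client, c_out, s_in)
        step(server, s_out, c_in)
except ssl.SSLCertVerificationError as exc:
    # Prints the same verify error as the trace event
    # (exact wording varies with the OpenSSL version).
    print("Handshake rejected:", exc.verify_message)
```

Running it prints something like "Handshake rejected: self-signed certificate in certificate chain", which is exactly the VerifyError in the TLSPolicyFailure event above, so the error on its own doesn't tell you the connection came from the wrong cluster.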
That makes sense, thanks for sharing your findings. The stale-connection issue is causing quite a few problems; I know some people are working on it, but it seems to be a bit more complex.