mason
(mason)
July 17, 2021, 1:03pm
1
Hi all,
I tried to deploy an FDB cluster with TLS enabled, following the documentation at https://github.com/FoundationDB/fdb-kubernetes-operator/blob/master/docs/manual/tls.md
If I do not set DISABLE_SIDECAR_TLS_CHECK=1, the operator always logs this error:
{
"level": "error",
"ts": 1626526429.2545986,
"logger": "controller-runtime.manager.controller.foundationdbcluster",
"msg": "Reconciler error",
"reconciler group": "apps.foundationdb.org",
"reconciler kind": "FoundationDBCluster",
"name": "fdb-cluster",
"namespace": "default",
"error": "GET https://10.42.2.14:8080/substitutions giving up after 11 attempt(s): Get \"https://10.42.2.14:8080/substitutions\": x509: cannot validate certificate for 10.42.2.14 because it doesn't contain any IP SANs",
"stacktrace": "github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:302\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/internal/controller/controller.go:216\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.JitterUntilWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:185\nk8s.io/apimachinery/pkg/util/wait.UntilWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.20.4/pkg/util/wait/wait.go:99"
}
Could you help me solve this?
johscheuer
(Johannes Scheuermann)
July 19, 2021, 6:56am
2
"https://10.42.2.14:8080/substitutions: x509: cannot validate certificate for 10.42.2.14 because it doesn't contain any IP SANs"
tells you the issue. If you want to run a TLS cluster without DISABLE_SIDECAR_TLS_CHECK, you have to provide valid TLS certificates that match the IP address (currently the operator uses the IP address to connect to the sidecar). That's also mentioned in the linked manual: fdb-kubernetes-operator/tls.md at master · FoundationDB/fdb-kubernetes-operator · GitHub. This is currently a limitation of how TLS is handled. We have a planned feature to make this more flexible: Custom validation for TLS connections to the sidecar · Issue #756 · FoundationDB/fdb-kubernetes-operator · GitHub. If you need this feature, feel free to work on it and file a PR.
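To make the failure mode concrete: a client that connects by IP address only accepts the certificate if that exact IP is listed as an IP SAN, and a wildcard DNS SAN does not help. Below is a minimal Python sketch of that rule, not the operator's actual Go code. The helper name ip_covered_by_sans is made up for illustration, and the SAN tuples follow the shape that Python's ssl.SSLSocket.getpeercert() returns under the "subjectAltName" key.

```python
import ipaddress


def ip_covered_by_sans(target_ip, subject_alt_names):
    """Return True if target_ip appears as an IP SAN in subject_alt_names.

    subject_alt_names is a sequence of (type, value) pairs such as
    ("DNS", "example.org") or ("IP Address", "10.0.0.1").
    Hypothetical helper: it only illustrates the matching rule.
    """
    target = ipaddress.ip_address(target_ip)
    for san_type, value in subject_alt_names:
        # DNS entries are ignored on purpose: they never match an IP target.
        if san_type == "IP Address" and ipaddress.ip_address(value) == target:
            return True
    return False


# A certificate with only a wildcard DNS SAN, like the one in this thread.
sans_dns_only = (("DNS", "*.fdb-cluster.default.svc.cluster.local"),)
# The same certificate with the pod IP added as an IP SAN.
sans_with_ip = sans_dns_only + (("IP Address", "10.42.2.14"),)

print(ip_covered_by_sans("10.42.2.14", sans_dns_only))  # False
print(ip_covered_by_sans("10.42.2.14", sans_with_ip))   # True
```

The first result corresponds to the "doesn't contain any IP SANs" error above; the second is what a certificate needs to look like if the sidecar TLS check stays enabled.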
mason
(mason)
July 19, 2021, 11:44am
3
Hi @johscheuer
Thanks for your answer. Can you give me an example of peer verification?
I tried a few rules, but peer verification is not working. Here is my config:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: fdb-cluster-certificate
  namespace: default
spec:
  dnsNames:
    - "*.fdb-cluster.default.svc.cluster.local"
  secretName: fdb-cluster-certificate
  issuerRef:
    name: fdb-cluster-issuer
    kind: Issuer

routing:
  publicIPSource: "service"
mainContainer:
  enableTls: true
  peerVerificationRules: "subjectAltName.DNS>=fdb-cluster.default.svc.cluster.local"
sidecarContainer:
  enableTls: true
  peerVerificationRules: "subjectAltName.DNS>=fdb-cluster.default.svc.cluster.local"
johscheuer
(Johannes Scheuermann)
July 22, 2021, 6:08am
4
The official docs have some examples on the "Transport Layer Security" page of the FoundationDB 6.3 documentation. If anything is missing, please let me know or update the docs with the missing pieces. You can also look at the trace logs in the FoundationDB containers to see what the issue with the verification rule is (or rather, why the certificate is not matching). Depending on how cert-manager generates the certificates, they might be seen as self-signed rather than "official" certificates.
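As a rough illustration of how a subjectAltName requirement can be evaluated, here is a simplified Python sketch. It assumes that `=` is an exact match, `>=` a prefix match and `<=` a suffix match, and that the value has the form type:pattern (for example DNS:.example.org); please check these assumptions against the TLS documentation. The function san_rule_matches is hypothetical and handles a single subjectAltName requirement only, so it is not FoundationDB's implementation.

```python
import re

# One requirement: field, operator, value. "<=" and ">=" are listed before "=".
_REQUIREMENT = re.compile(r"^(?P<field>[A-Za-z.]+)(?P<op><=|>=|=)(?P<value>.+)$")


def san_rule_matches(rule, sans):
    """Evaluate one subjectAltName requirement against (type, value) SAN pairs."""
    parsed = _REQUIREMENT.match(rule)
    if parsed is None or not parsed.group("field").endswith("subjectAltName"):
        raise ValueError(f"unsupported requirement in this sketch: {rule!r}")
    # Split "DNS:.example.org" into the SAN type and the pattern to compare.
    san_type, _, pattern = parsed.group("value").partition(":")
    compare = {
        "=": lambda value: value == pattern,             # exact match (assumed)
        ">=": lambda value: value.startswith(pattern),   # prefix match (assumed)
        "<=": lambda value: value.endswith(pattern),     # suffix match (assumed)
    }[parsed.group("op")]
    return any(kind == san_type and compare(value) for kind, value in sans)


# SAN list of a wildcard certificate like the one requested from cert-manager.
sans = [("DNS", "*.fdb-cluster.default.svc.cluster.local")]

print(san_rule_matches("S.subjectAltName<=DNS:.fdb-cluster.default.svc.cluster.local", sans))
print(san_rule_matches("S.subjectAltName>=DNS:fdb-cluster.default.svc.cluster.local", sans))
```

Under these assumptions the suffix rule matches the wildcard SAN, while the prefix rule does not, because the SAN value starts with "*." rather than with the cluster name.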
mason
(mason)
July 22, 2021, 10:05am
5
Hi @johscheuer
I followed the official documentation. The peerVerificationRules setting now works for the sidecar, but not for the main container.
Here is my config:
mainContainer:
  enableTls: true
  peerVerificationRules: "S.subjectAltName<=DNS:.fdb-cluster.default.svc.cluster.local"
sidecarContainer:
  enableTls: true
  peerVerificationRules: "S.subjectAltName<=DNS:.fdb-cluster.default.svc.cluster.local"
johscheuer
(Johannes Scheuermann)
July 23, 2021, 7:39am
6
Could you take a look in the FDB trace logs for any TLS issues? I assume that's an issue with self-signed certificates. If your root CA is a self-signed cert, it must contain an AuthorityKeyId that points to itself; otherwise FDB will reject the self-signed certificate as a root cert.
Hi,
I tried the rules as well as adding the AuthorityKeyID in the root certificate. In both cases, I get this error:
{"level":"error","ts":1645030087.361136,"logger":"controller","msg":"Error deserializing pod substitutions","responseBody":"\n\n \n <meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8\">\n Error response\n \n \n Error response \n Error code: 401 \n Message: Client certificate was not approved. \n Error code explanation: 401 - No permission -- see authorization schemes. \n \n\n","error":"invalid character '<' looking for beginning of value","stacktrace":"github.com/FoundationDB/fdb-kubernetes-operator/internal.(*realFdbPodClient).GetVariableSubstitutions\n\t/workspace/internal/pod_client.go:216\ngithub.com ).Reconcile\n\t/workspace/controllers/cluster_controller.go:155\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:298\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:253\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.9.6/pkg/internal/controller/controller.go:214"}
Any idea what that is about?
johscheuer
(Johannes Scheuermann)
February 21, 2022, 9:37am
8
tangerine:
tried the rules as well as adding the AuthorityKeyID in the root certificate. In both cases, I get this error:
{"level":"error","ts":1645030087.361136,"logger":"controller","msg":"Error deserializing pod substitutions","responseBody":"\n\n \n <meta http-equiv=\"Content-Type\" content=\"text/html;charset=utf-8\">\n Error response\n \n \n
This error comes from one of the sidecars and means that the certificate presented by the operator doesn't match the peerVerificationRules configured for the sidecarContainer.
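For anyone puzzled by the "invalid character '<'" part of the message: the sidecar answered with an HTML 401 page, and that page was then fed to a JSON decoder. The Python sketch below reproduces the same effect and shows one way to fail with a clearer message by checking the status code first. It is only an illustration, not the operator's Go code; decode_substitutions, the HTML body and the EXAMPLE_KEY entry are all made up.

```python
import json


def decode_substitutions(status_code, body):
    """Decode a substitutions response, rejecting non-200 replies up front."""
    if status_code != 200:
        raise RuntimeError(
            f"sidecar answered HTTP {status_code}, not a substitutions document"
        )
    return json.loads(body)


# Hypothetical stand-in for the HTML error page returned with the 401.
html_401 = "<html><body>Error code: 401</body></html>"

# Decoding the HTML directly fails on the very first character, "<".
try:
    json.loads(html_401)
except json.JSONDecodeError as err:
    print("naive decode failed at offset", err.pos)

# Checking the status code first reports the real problem instead.
try:
    decode_substitutions(401, html_401)
except RuntimeError as err:
    print(err)

# A successful reply decodes as usual (the key is a placeholder).
print(decode_substitutions(200, '{"EXAMPLE_KEY": "value"}'))
```

In other words, the JSON error is only a symptom; the cause is the rejected client certificate reported in the response body.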
The rule I used is peerVerificationRules: "Check.Valid=1". Wouldn't that match anything?