I am able to bring up non-TLS clusters with FDB v6.2, but am having difficulty creating a TLS v6.2 cluster.
Following the instructions at https://apple.github.io/foundationdb/tls.html, we tried two approaches:
- Setting Up FoundationDB to use TLS
- Converting an existing cluster to use TLS (since v6.1).
Neither succeeded. Here are some details.
Our env:
FDB v6.2.10
Ubuntu 18.04
Using Docker containers with Kubernetes.
Each Storage K8s pod has two processes at 4500 and 4501.
- “Setting Up FoundationDB to use TLS” approach
When using the “-t” flag with make_public.py (at Docker container startup), the coordinator in the fdb.cluster file gets the “:tls” suffix, e.g.,
4QzbRd0m:ItkkDDfp@10.104.193.12:4500:tls
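To double-check which coordinators are actually marked as TLS, a small Python sketch like the one below could parse the connection string. It assumes the `description:id@host:port[:tls],...` layout seen above and IPv4 or hostname coordinators; the function name `parse_cluster_string` is hypothetical and not part of FoundationDB.

```python
def parse_cluster_string(line):
    """Split 'description:id@host:port[:tls],...' into its parts.

    Assumption: the layout matches the fdb.cluster line shown above.
    """
    prefix, _, addresses = line.strip().partition("@")
    description, _, cluster_id = prefix.partition(":")
    coordinators = []
    for addr in addresses.split(","):
        parts = addr.strip().split(":")
        # A trailing "tls" field marks a TLS coordinator.
        tls = parts[-1] == "tls"
        if tls:
            parts = parts[:-1]
        host, port = ":".join(parts[:-1]), int(parts[-1])
        coordinators.append({"host": host, "port": port, "tls": tls})
    return description, cluster_id, coordinators
```

Calling it on the line above should report a single coordinator at 10.104.193.12, port 4500, with the TLS flag set.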
However, the processes in the container look abnormal. There are 3 fdbmonitor processes but only 1 fdbserver process (there should be 2, on ports 4500 and 4501), as follows:
root@tls62-storage-01-76cd5f5b99-trqhh:~# ps aux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 6 0.0 0.0 12644 1052 ? S 07:26 0:00 /usr/lib/foundationdb/fdbmonitor
root 20 0.0 0.0 21076 3016 ? Ss 07:26 0:00 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize
foundat+ 34 0.1 0.0 331732 19420 ? Sl 07:26 0:00 /usr/sbin/fdbserver --class storage --cluster_file /var/lib/foundationdb/fdb.cluster --datacenter_id dc1 --datadir /var/lib/foundat
root 40 0.0 0.0 34448 12200 ? Sl 07:26 0:00 fdbcli --no-status --exec configure new single memory
root 47 0.0 0.0 21076 248 ? S 07:30 0:00 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize
Also, there is a trace file under /var/log/foundationdb associated with 4501, but none for 4500.
root@tls62-storage-01-76cd5f5b99-trqhh:/var/log# ll foundationdb/
total 1604
-rw-r--r-- 1 foundationdb foundationdb 743 Jan 23 07:26 trace.10.104.193.12.22.1579764417.yVrDKA.0.1.xml
-rw-r--r-- 1 foundationdb foundationdb 1457778 Jan 23 07:42 trace.10.104.193.12.4501.1579764417.KoUk1T.0.1.xml
The resulting db is unavailable.
- “Converting an existing cluster to use TLS (since v6.1)” approach
After removing the “-t” flag from make_public.py, we first created the whole cluster without TLS and then tried to convert it to TLS, following the instructions in the doc.
Restarting fdbserver gave me this error:
root@tls62-storage-01-76cd5f5b99-c9w22:~# fdbserver -C /var/lib/foundationdb/fdb.cluster -p 10.104.193.12:4500 -p 10.104.193.12:4501:tls
Error initializing networking with public address 10.104.193.12:4500 and listen address 10.104.193.12:4500 (Local address in use)
Try `fdbserver --help' for more information.
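The "Local address in use" message suggests that something is already bound to 10.104.193.12:4500, possibly the fdbserver started earlier by fdbmonitor, though that is an assumption. A minimal Python sketch for checking whether a port can still be bound is below; `port_in_use` is a hypothetical helper, and it only covers IPv4 TCP.

```python
import errno
import socket


def port_in_use(host, port):
    """Return True if binding host:port fails with EADDRINUSE.

    Any other bind error (for example, the host address is not
    local to this machine) is re-raised rather than hidden.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False
```

Running `port_in_use("10.104.193.12", 4500)` inside the container before starting fdbserver by hand would show whether another process still holds the port.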
Please help. Thank you.
Leo