At Wavefront we effectively have three FDB tiers. One is statically defined as a KV store and tends not to grow.
The other two are dynamic and grow/contract based on how we consume telemetry. This is elastic - but not automatically elastic - and we've built tooling to support that, primarily with Terraform & Ansible.
Hosts are launched via Terraform using pre-built AMIs (built with Packer/Ansible). These AMIs are "instance type" based and derive their disk layout and `/etc/foundationdb/foundationdb.conf` largely as a function of the instance type. An `i3.16xlarge` vs. an `i3.4xlarge` will have a different set of ssd/mem processes as a function of the memory/CPU footprint. Disks are likewise built as a function of the instance type (specifically, the number of NVMe instance stores) plus data volume size.
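As a rough sketch of what "conf as a function of instance type" means, a small shell helper could map the instance type to a process count and emit the matching `foundationdb.conf` stanzas. The counts and port numbering below are hypothetical, not our actual sizing:

```shell
#!/bin/sh
# Sketch: derive per-class process counts from the EC2 instance type.
# The counts here are illustrative placeholders, not Wavefront's sizing.

ssd_process_count() {
  case "$1" in
    i3.16xlarge) echo 8 ;;   # big memory/CPU footprint: more ssd processes
    i3.4xlarge)  echo 2 ;;   # smaller footprint, fewer processes
    *)           echo 1 ;;   # conservative default
  esac
}

# Emit [fdbserver.<port>] stanzas for the ssd (storage) tier.
emit_ssd_stanzas() {
  n=$(ssd_process_count "$1")
  port=4500
  i=0
  while [ "$i" -lt "$n" ]; do
    printf '[fdbserver.%s]\nclass = storage\n\n' "$port"
    port=$((port + 1))
    i=$((i + 1))
  done
}
```

An image-build step would then do something like `emit_ssd_stanzas "$INSTANCE_TYPE" >> /etc/foundationdb/foundationdb.conf` (again, a sketch of the idea rather than our exact tooling).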
These AMIs are stateless. Terraform plumbs in instance tags and, on instance launch, two separate processes define state and join the new host to the cluster:
- An `@reboot` cronjob: telegraf, central logging, and volume tagging are handled here. This is mostly per-host configuration that we do with Ansible.
- An `init.d` script that runs on boot and, among other things, adds this host into Wavefront (for monitoring) and pulls the cluster file (`/etc/foundationdb/fdb.cluster`) down from S3. We also run a memory version on the same host, so the memory cluster file is pulled down too.
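The boot-time pull amounts to a small S3 fetch per cluster file; a minimal sketch (the bucket name and ownership handling are assumptions, not our actual setup):

```shell
#!/bin/sh
# Sketch of the boot-time pull: fetch cluster files from S3.
# CLUSTER_BUCKET is a hypothetical name; substitute your own.
CLUSTER_BUCKET="${CLUSTER_BUCKET:-s3://example-fdb-state}"

pull_cluster_file() {
  # $1 = object key in the bucket, $2 = local destination path
  aws s3 cp "$CLUSTER_BUCKET/$1" "$2" || return 1
  chown foundationdb:foundationdb "$2"
}

# The init.d script would call something like:
#   pull_cluster_file fdb.cluster    /etc/foundationdb/fdb.cluster
#   pull_cluster_file memory.cluster /etc/foundationdb/memory.cluster
```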
There are secondary processes that push changes to cluster files (`fdb.cluster` and, in our case, `memory.cluster`) back up to S3, so any coordinator changes are more or less automatically kept in sync. This is true for all three FDB tiers, including the statically defined KV store (so we can handle hard instance replacements somewhat automatically).
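The push side can be sketched as a checksum-and-copy loop: if the local cluster file changed since the last push (e.g. after a coordinator change), upload it. The bucket name and state paths here are hypothetical:

```shell
#!/bin/sh
# Sketch of the secondary push process: upload a cluster file to S3
# only when its checksum differs from the last-pushed one.
CLUSTER_BUCKET="${CLUSTER_BUCKET:-s3://example-fdb-state}"

push_if_changed() {
  # $1 = local cluster file, $2 = file holding the last-pushed checksum
  new=$(md5sum "$1" | awk '{print $1}')
  old=$(cat "$2" 2>/dev/null)
  if [ "$new" != "$old" ]; then
    aws s3 cp "$1" "$CLUSTER_BUCKET/$(basename "$1")" || return 1
    echo "$new" > "$2"
  fi
}

# A cron entry or loop would invoke this for both files:
#   push_if_changed /etc/foundationdb/fdb.cluster    /var/run/fdb.cluster.md5
#   push_if_changed /etc/foundationdb/memory.cluster /var/run/memory.cluster.md5
```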
We actually do a post-launch conformance pass, so while FDB could start on launch - and used to - the AMIs now have FDB installed but disabled on boot. The Ansible conformance job does many things, but one thing it does is start `foundationdb` and enable it on boot.
At our scale, it's sometimes disruptive to auto-join on a large cluster, so we have Terraform drop all the new instances, conform them, and then join.
We contract (or replace all instances with larger/smaller ones) as SOP. This has been reduced to four Ansible commands that:
- Tag all the existing instances with a specific EC2 tag
- Using that EC2 tag, re-coordinate and exclude (we do this twice: once for the ssd tier, once for memory)
- Disable instance termination protection, stop the "old" instances, validate we aren't hard down (!), and eventually terminate the excluded nodes and issue an `include all` to the cluster.
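The fdbcli side of the steps above can be sketched roughly as follows; the addresses are placeholders for whatever the EC2 tag lookup returns, and the instance stop/terminate happens out of band between the exclude and the include:

```shell
#!/bin/sh
# Sketch of the drain-and-retire flow via fdbcli.
# OLD_NODES would be derived from the EC2 tag; these are placeholders.
OLD_NODES="10.0.0.1:4500 10.0.0.2:4500"

fdb() { fdbcli --exec "$1"; }

retire_old_nodes() {
  # Move coordinators off the outgoing hosts first...
  fdb "coordinators auto"
  # ...then drain them; fdbcli's exclude blocks until data has moved off.
  fdb "exclude $OLD_NODES"
  # (stop and terminate the old instances out of band here)
  # Finally clear the exclusion list so future launches aren't blocked.
  fdb "include all"
}
```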