I know that coordinators need static IPs, because the coordinator list only changes when someone manually runs the coordinators command in fdbcli
(this assumes we’re not using the K8s operator; we’re deployed directly onto EC2 instances).
Our current pattern is that ephemeral instances come online and run an on-boot process that searches the current availability zone for an unattached volume with the correct tags (same cluster name and so forth) and mounts it. The same process uses other tags on the instance to decide things like how many volumes to look for, how many processes to create, whether to run backup and/or DR agents, and what class the processes should be. It then builds the FDB config file from that before telling fdbmonitor to start.
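To make the config-building step concrete, here is a minimal sketch of how tags could be turned into a config file. It is an illustration, not our actual boot script: the tag names (`fdb:process_count`, `fdb:class`, `fdb:backup_agent`), the function name and the binary/log paths are all assumptions, and the keys follow the hyphenated style of the stock foundationdb.conf.

```python
from typing import Mapping

BASE_PORT = 4500  # conventional first fdbserver port; adjust to taste


def render_fdb_conf(tags: Mapping[str, str], data_root: str) -> str:
    """Render a foundationdb.conf body from instance tags.

    The tag names used here are hypothetical; substitute whatever
    your instances actually carry.
    """
    process_count = int(tags.get("fdb:process_count", "1"))
    process_class = tags.get("fdb:class", "")
    run_backup = tags.get("fdb:backup_agent", "false").lower() == "true"

    lines = [
        "[general]",
        "cluster-file = /etc/foundationdb/fdb.cluster",  # assumed path
        "",
        "[fdbserver]",
        "command = /usr/sbin/fdbserver",  # assumed path
        "public-address = auto:$ID",
        "listen-address = public",
        f"datadir = {data_root}/$ID",  # data lives on the mounted volume
        "logdir = /var/log/foundationdb",
    ]
    if process_class:
        lines.append(f"class = {process_class}")

    # One [fdbserver.<port>] section per process to start.
    for i in range(process_count):
        lines += ["", f"[fdbserver.{BASE_PORT + i}]"]

    if run_backup:
        lines += [
            "",
            "[backup_agent]",
            "command = /usr/lib/foundationdb/backup_agent/backup_agent",  # assumed path
            "logdir = /var/log/foundationdb",
            "",
            "[backup_agent.1]",
        ]
    return "\n".join(lines) + "\n"
```

Calling `render_fdb_conf({"fdb:process_count": "2", "fdb:class": "storage"}, "/mnt/fdb0")` would produce two storage-class process sections with their data directories under the mounted volume; the result would be written out before fdbmonitor is started.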
If an instance is going to be a coordinator, tags on the volume tell it which network interface with a static IP to go and fetch, namely the one that ‘matches’ the volume, so the same data always has the same IP. For non-coordinators, we don’t see any performance or reliability problems when an instance X goes offline on IP A and a new instance Y on a new IP B mounts the same volume, is configured as the same FDB class, and picks up work from where X left off.
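The volume search and the coordinator interface match can be sketched roughly as below. This is only an illustration under assumptions: the tag names (`fdb:cluster`, `fdb:coordinator_eni`) and function names are made up, and the volume dicts are assumed to have the shape that the EC2 DescribeVolumes API returns (`State`, `AvailabilityZone`, `Tags` as a list of `Key`/`Value` pairs). The actual API calls and the attach/mount steps are left out.

```python
from typing import Optional


def tag_map(resource: dict) -> dict:
    """Flatten an EC2-style Tags list into a plain dict."""
    return {t["Key"]: t["Value"] for t in resource.get("Tags", [])}


def pick_volume(volumes: list, az: str, cluster: str) -> Optional[dict]:
    """Return the first unattached volume in this AZ for this cluster.

    'available' is the EC2 state of a volume that is not attached.
    The 'fdb:cluster' tag name is hypothetical.
    """
    for vol in volumes:
        if (
            vol.get("State") == "available"
            and vol.get("AvailabilityZone") == az
            and tag_map(vol).get("fdb:cluster") == cluster
        ):
            return vol
    return None


def coordinator_eni(volume: dict) -> Optional[str]:
    """Return the network interface ID paired with this volume, if any.

    Only coordinator volumes carry the (hypothetical)
    'fdb:coordinator_eni' tag; for everything else this is None and
    the instance simply keeps its ephemeral IP.
    """
    return tag_map(volume).get("fdb:coordinator_eni")
```

If `coordinator_eni` returns an ID, the boot process would attach that interface before starting FDB, so the coordinator data keeps its static IP; otherwise the process comes up on whatever IP the new instance has.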
This all works absolutely fine, but we see an awful lot of log spam at warning level about being unable to communicate with IPs that appear to belong to instances that went offline days ago and were replaced by new ones, which mounted the same data and joined the cluster fine but on a new IP.