Is there an approach to prevent FoundationDB from ever assigning a storage role to log processes?
According to the Roles / Classes matrix and Locality.cpp, the storage role is defined as WorstFit for Log processes. For our case, it would be preferable to define it as NeverAssign, as assigning a storage role to a log process is likely to render the cluster unavailable.
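For reference, this is roughly the shape of that mapping in ProcessClass::machineClassFitness (fdbrpc/Locality.cpp). It is a trimmed paraphrase from memory, not a verbatim excerpt, so please check the linked source for the exact cases:

```cpp
// Trimmed paraphrase of ProcessClass::machineClassFitness in fdbrpc/Locality.cpp.
// Not a verbatim excerpt; the exact set of cases may differ between versions.
ProcessClass::Fitness ProcessClass::machineClassFitness(ClusterRole role) const {
	switch (role) {
	case ProcessClass::Storage:
		switch (_class) {
		case ProcessClass::StorageClass:
			return ProcessClass::BestFit;
		case ProcessClass::UnsetClass:
			return ProcessClass::UnsetFit;
		case ProcessClass::TransactionClass: // i.e. log/transaction-class processes
		case ProcessClass::LogClass:
			return ProcessClass::WorstFit; // eligible, but only as a last resort
		default:
			return ProcessClass::NeverAssign;
		}
	// ... other roles elided ...
	}
}
```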
Context:
Twice recently, we have seen log processes assigned the storage role while excluding storage processes. The log processes run on much smaller disks than the storage processes (64 GB vs 1 TB). Both times the FoundationDB cluster became effectively unavailable, as the log processes' disks filled up during data rebalancing.
In each case the cluster was healthy and fully replicated (triple redundancy), and one or more exclusion(s) of storage processes, either due to replacement or shrinking, caused the FoundationDB cluster to assign storage roles to log processes.
Do you know why those log processes have the storage role assigned? And is the storage role removed after the excluded processes are removed? What fault domains does your setup use and how were the processes distributed at the time? Just wondering why FDB was choosing to use some log processes as storage processes.
This seems to be caused by hasHealthyTeam (healthyTeamCount != 0) being false [source]. Theory: without digging further into the code, I would assume a team is considered unhealthy when one of its processes is excluded. → If every team contains at least one excluded process, FoundationDB starts critical recruitment, which adds storage roles to log processes.
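To make the theory concrete, the flow I have in mind looks roughly like the sketch below. It is an illustrative, self-contained sketch with made-up, simplified names (the real logic lives in data distribution and the cluster controller, and the flag I believe corresponds to this is called criticalRecruitment), not actual FDB source:

```cpp
// Illustrative sketch of the theory above; stand-in types and names, not FDB source.
enum class Fitness { BestFit, GoodFit, UnsetFit, WorstFit, NeverAssign };

struct RecruitStorageRequest {
	bool criticalRecruitment = false;
};

// Data distribution side: if no team is healthy (healthyTeamCount == 0),
// storage recruitment is marked as critical.
RecruitStorageRequest makeRecruitRequest(int healthyTeamCount) {
	RecruitStorageRequest req;
	req.criticalRecruitment = (healthyTeamCount == 0);
	return req;
}

// Cluster controller side: a WorstFit process (e.g. a log-class process being
// considered for the storage role) is normally skipped, but a critical
// recruitment will accept it.
bool eligibleForStorage(Fitness fitness, bool criticalRecruitment) {
	if (fitness == Fitness::NeverAssign)
		return false;
	if (fitness == Fitness::WorstFit && !criticalRecruitment)
		return false;
	return true;
}
```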
What fault domains does your setup use and how were the processes distributed at the time?
In both cases we ran with a single pod (process) per machine (k8s node), with each machine being a fault domain. Triple redundancy.
First case:
3 storage processes. Replacing one storage process. One exclusion + one new storage process (new pod and machine) joining the cluster.
Second case:
30 storage processes. Scale down to 6 storage processes (through k8s operator). 24 exclusions.
And is the storage role removed after the excluded processes are removed?
We didn’t observe this, as most of the excluded processes never managed to empty their disks before the log disks became full, i.e. they were never removed.
Same issue here. We’re running directly on VMs rather than using the K8s operator, but if we get a critical failure on too many of our storage processes, FDB will try to co-opt the log processes into the storage role to keep things running.
This does manage to keep the cluster up for a short time, but the log processes fill their disks pretty rapidly and then it’s much more of a mess to unpick. We’d much rather it failed earlier, when the resolution would likely be simpler (remove the source of very high cluster load, etc.).
We have alerts for disk usage, but the log disks fill so rapidly when the problem occurs that we don’t really get time to do anything about it. We’re also looking at configuring an alert for whenever a log-class process is detected running a storage role (something like the sketch below), but just preventing that from happening in the first place would be better.
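For what it’s worth, the kind of check we have in mind looks something like this. It is only a sketch: it assumes nlohmann/json is available, and that the status json fields are cluster.processes.<id>.class_type and roles[].role, with a class_type of "log" or "transaction" for log processes, which is how I recall them; verify against the status json of your FDB version before relying on it.

```cpp
// Sketch of an alert check: flag log/transaction-class processes that are
// currently running a storage role. Field names are as we recall them from
// "status json"; verify against your FDB version.
// Build (assumes nlohmann/json): g++ -std=c++17 check_roles.cpp -o check_roles
#include <cstdio>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <nlohmann/json.hpp>

// Run a shell command and capture its stdout.
static std::string run(const std::string& cmd) {
	std::unique_ptr<FILE, decltype(&pclose)> pipe(popen(cmd.c_str(), "r"), pclose);
	if (!pipe)
		throw std::runtime_error("popen failed");
	std::string out;
	char buf[4096];
	while (size_t n = fread(buf, 1, sizeof(buf), pipe.get()))
		out.append(buf, n);
	return out;
}

int main() {
	auto status = nlohmann::json::parse(run("fdbcli --exec 'status json'"));
	int alerts = 0;
	for (auto& [id, proc] : status["cluster"]["processes"].items()) {
		const std::string classType = proc.value("class_type", "unset");
		if (classType != "log" && classType != "transaction")
			continue;
		for (auto& role : proc["roles"]) {
			if (role.value("role", "") == "storage") {
				std::cout << "ALERT: process " << id << " (class " << classType
				          << ") is running a storage role\n";
				++alerts;
			}
		}
	}
	return alerts == 0 ? 0 : 1; // non-zero exit for the alerting system to act on
}
```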
I believe this is currently not possible, given how the fitness is defined for log-class processes in machineClassFitness: foundationdb/fdbrpc/Locality.cpp at main · apple/foundationdb · GitHub. In theory this method could be extended with an additional knob to return NeverAssign instead of WorstFit. I don’t know your deployment, but how many additional log processes are running in your cluster?
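To illustrate the idea (entirely hypothetical; neither this knob nor the patch exists upstream, and the knob would still need to be plumbed through to where the fitness is computed), the change could look something like:

```cpp
// Hypothetical sketch of the Storage branch in ProcessClass::machineClassFitness.
// NEVER_RECRUIT_STORAGE_ON_LOG_CLASS is a made-up knob name for illustration; it
// does not exist in the codebase, and wiring a server knob into fdbrpc (or doing
// the check at recruitment time instead) is left open.
case ProcessClass::Storage:
	switch (_class) {
	case ProcessClass::TransactionClass:
	case ProcessClass::LogClass:
		// Today this is WorstFit; a knob could make log-class processes
		// completely ineligible for the storage role instead.
		return SERVER_KNOBS->NEVER_RECRUIT_STORAGE_ON_LOG_CLASS
		           ? ProcessClass::NeverAssign
		           : ProcessClass::WorstFit;
	// ... remaining cases unchanged ...
	}
```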