The specific scenario I encountered this status in is:
- FDB cluster is in double redundancy mode, with 3 machines
- A single machine’s fdbmonitor process is restarted via systemd (restarting that machine’s running fdbserver processes)
The database remains available to serve requests, but the “Replication health” status enters the state “(Re)initializing automatic data distribution” for a few seconds. I’d like to understand exactly what data is being copied here.
I understand that if the machine were lost entirely and a new one were added, data left with a single copy would have to be re-replicated to the third machine to restore double redundancy. I’m not sure why this happens during a restart, though, where the data that machine is responsible for is still present after it comes back.
Is the data being replicated one or more of these?
- Any transactions that started before the machine restarted, but failed to commit their data before the machine restarted?
- Transactions that occurred while the machine was offline that should have been stored there but were stored on the other machines instead?
Assuming these are correct, are these the only two types of data that would be replicated in this scenario? Or are there other types of data that might be copied while this status is present?
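To check how much data is actually moving while this status is shown, one option is to poll the machine-readable status during the restart. Below is a minimal Python sketch, not a definitive tool. It assumes that fdbcli is on the PATH with a usable default cluster file, and that the status document exposes cluster.data.state (name, description) and cluster.data.moving_data (in_flight_bytes, in_queue_bytes), as I understand the status json schema to do. The function names are my own.

```python
import json
import subprocess


def summarize_data_state(status: dict) -> dict:
    """Extract the data-distribution state and pending byte counts.

    Field names are assumptions based on the `status json` schema;
    missing fields fall back to None or 0 rather than raising.
    """
    data = status.get("cluster", {}).get("data", {})
    state = data.get("state", {})
    moving = data.get("moving_data", {})
    return {
        "name": state.get("name"),
        "description": state.get("description"),
        "in_flight_bytes": moving.get("in_flight_bytes", 0),
        "in_queue_bytes": moving.get("in_queue_bytes", 0),
    }


def fetch_status() -> dict:
    """Run `fdbcli --exec "status json"` and parse its output.

    Assumes fdbcli is installed and can reach the cluster.
    """
    result = subprocess.run(
        ["fdbcli", "--exec", "status json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling summarize_data_state(fetch_status()) in a loop while restarting fdbmonitor should show whether any bytes are in flight or queued during the few seconds the status is displayed, or whether the state changes without any data being moved.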