This file has been truncated. show original
# FDB Recovery Internals
FDB uses recovery to handle various failures, such as hardware and network failures. When the current transaction system no longer works properly due to failures, recovery is automatically triggered to create a new generation of the transaction system.
This document explains at the high level how the recovery works in a single cluster. The audience of this document includes both FDB developers who want to have a basic understanding of the recovery process and database administrators who need to understand why a cluster fails to recover. This document does not discuss the complexity introduced to the recovery process by the multi-region configuration.
## `ServerDBInfo` data structure
This data structure contains transient information which is broadcast to all workers for a database, permitting them to communicate with each other. It contains, for example, the interfaces for cluster controller (CC), master, ratekeeper, and resolver, and holds the log system's configuration. Only part of the data structure, such as `ClientDBInfo` that contains the list of GRV proxies and commit proxies, is available to the client.
Whenever a field of the `ServerDBInfo`is changed, the new value of the field, say new master's interface, will be sent to the CC and CC will propagate the new `ServerDBInfo` to all workers in the cluster.
## When will recovery happen?
Failure of certain roles in FDB can cause recovery. Those roles are cluster controller, master, GRV proxy, commit proxy, transaction logs (tLog), resolvers, log router, and backup workers.
Network partition or failures can make CC unable to reach some roles, treating those roles as dead and causing recovery. If CC cannot connect to a majority of coordinators, it will be treated as dead by coordinators and recovery will happen.
Better master exists event can trigger recoveries. Better master exists event is the cluster changes such that there is a better location for some already recruited processes (say master role).