After a storage node comes back from downtime, how does FDB decide whether to reuse its existing disk content or repopulate it?

lehu · December 6, 2020, 12:29am

After a storage node has been down for some time and comes back up, what criteria does FDB use to decide whether to bring its existing data up to date and keep using it, or to discard that data and repopulate it from other nodes?

Is it the time elapsed while the node is down? Say, if it is down for 10 minutes or more.
Is it the amount of data changed during the downtime? Say, if more than N MB of transaction data has been written.
Is it related to the TX log size, that is, whether the node can be brought up to date from the data kept in the TX log?
Something else?
Or a combination of these factors?

Are there config parameters we can use to tune the threshold?

We find that nodes (Kubernetes pods) in our fdb clusters have downtime quite often.
Sometimes it’s involuntary, like the underlying physical host has hardware problems.
Other times it’s voluntary, e.g., an fdb version upgrade, or a mandatory quarterly host OS update in a secured environment, where the update and reboot can take 5 to 10 minutes. Understanding the exact criteria will help our fdb operations significantly.

Thank you.

ajbeamon · December 7, 2020, 5:44pm

I believe that a storage server that is down will continue to be responsible for any data it has until that data has been re-replicated elsewhere. That means if it comes back before data movement is done, it will keep its responsibility regardless of how much time has elapsed or how much activity has occurred (though obviously the longer it is down, the more likely it is that movement has finished). This movement isn’t an all-or-nothing proposition, though. It could be the case that only some of the shards assigned to the storage server have been fully moved, in which case those shards will no longer be assigned to the affected storage server, while the rest remain its responsibility.
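
To make the per-shard behavior concrete, here is a toy model in Python. It is only a sketch of the idea described above, not FDB code: the StorageServer class, the reassign helper, and the shard names are all hypothetical, and real data movement copies data from the surviving replicas rather than from the server that is down.

```python
from dataclasses import dataclass, field


@dataclass
class StorageServer:
    """Hypothetical stand-in for a storage server and the shards assigned to it."""
    name: str
    shards: set = field(default_factory=set)
    up: bool = True


def reassign(shard, old_owner, new_owner):
    """Model one shard whose re-replication has completed.

    Assumption: once a full copy exists on new_owner, old_owner is no
    longer responsible for that shard. Other shards are unaffected.
    """
    new_owner.shards.add(shard)
    old_owner.shards.discard(shard)


ss1 = StorageServer("ss1", {"a", "b", "c"})
ss2 = StorageServer("ss2")

ss1.up = False             # ss1 goes down; it is still assigned a, b and c
reassign("a", ss1, ss2)    # only shard "a" finishes moving during the outage
ss1.up = True              # ss1 comes back before "b" and "c" were moved

still_owned = sorted(ss1.shards)
print(still_owned)         # ['b', 'c']
```

In this model neither elapsed time nor write volume appears anywhere: the only thing that removes a shard from the returning server is a completed move of that shard.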

In the upcoming 6.3, there is a new fdbcli command, exclude failed, that allows you to specify that a storage server should be forgotten even before its data has been re-replicated. Doing this means that any data residing on the failed storage server would not be easily reusable in the case that you lost the other copies.
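
For scripted operations, the command could be issued through fdbcli's --exec option. The Python sketch below only builds the command line and prints it; the helper name exclude_failed_command and the address 10.0.0.5:4500 are made up for illustration, and it assumes a 6.3 or later fdbcli on the PATH. Check the fdbcli help output for your version before running anything like this against a real cluster.

```python
import shlex


def exclude_failed_command(addresses, cluster_file=None):
    """Build an fdbcli invocation that runs `exclude failed` for the given addresses.

    Hypothetical helper: it only constructs the argument list, it does not
    run fdbcli. Addresses are IP or IP:PORT strings.
    """
    if not addresses:
        raise ValueError("at least one address is required")
    cmd = ["fdbcli"]
    if cluster_file is not None:
        cmd += ["-C", cluster_file]   # optional path to fdb.cluster
    cmd += ["--exec", "exclude failed " + " ".join(addresses)]
    return cmd


cmd = exclude_failed_command(["10.0.0.5:4500"])
print(shlex.join(cmd))  # show the command as it would be typed in a shell
```

To actually run it, the list can be passed to subprocess.run(cmd, check=True). Because the data on the excluded server is not easily reusable afterwards, this is better kept as an explicit operator action than automated on every pod restart.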
