Detecting the number of recoveries from the client API

ryanworl · April 16, 2020, 8:47pm

Is it possible for the client to cheaply detect the number of recoveries the database has gone through?

I have a component which requires leader election and I would like candidates to avoid removing the current leader if they can detect the database has gone through more recoveries than they expected.

The candidates would wait some additional amount of time after they detect a recovery may have happened before attempting to become leader.

The timescale upon which a failure needs to be detected is tens of seconds.

alloc · April 16, 2020, 11:37pm

I’m not sure if this is quite “cheap” enough, but the generation is in the machine-readable status, which clients can read from \xff\xff/status/json. The generation increases by roughly 2 every time there’s a recovery, so you might be able to use that. However, getting the machine-readable status has historically been somewhat expensive, though I think that’s improved recently.

It’s the “generation” field mentioned here: https://apple.github.io/foundationdb/mr-status.html
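
As a rough sketch of how a candidate might use that field: the snippet below assumes you have already read the raw bytes of \xff\xff/status/json with whatever binding you use, and that the field sits at cluster.generation in the JSON. The function names, the step of 2, and the grace-period logic are all illustrative assumptions, not part of any FoundationDB API.

```python
import json


def generation_from_status(raw):
    """Return cluster.generation from status JSON bytes, or None if absent.

    `raw` is assumed to be the value read from \\xff\\xff/status/json.
    """
    status = json.loads(raw)
    return status.get("cluster", {}).get("generation")


def estimated_recoveries(prev_generation, current_generation, step=2):
    """Heuristic: the generation grows by roughly `step` per recovery."""
    return max(0, (current_generation - prev_generation) // step)


def takeover_delay(base_delay, prev_generation, current_generation, grace):
    """Seconds a candidate waits before trying to replace the leader.

    If the generation is unknown or has changed since the last check,
    a recovery may have happened, so add `grace` seconds (hypothetical
    policy) to give the current leader time to catch up.
    """
    if current_generation is None or current_generation != prev_generation:
        return base_delay + grace
    return base_delay
```

A candidate would remember the generation it saw on its previous check and pass both values to `takeover_delay`. Since the increase per recovery is only approximate, comparing for "changed at all" is probably safer than relying on the exact count.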

Somewhere, there’s also work going on (mostly spearheaded by the fine folks at Snowflake) to add APIs for clients to extract things like single fields from the status, so you could do something like ask for \xff\xff/status/json/generation and just get the generation, though we’re not quite there yet.
