Detecting the number of recoveries from the client API

Is it possible for the client to cheaply detect the number of recoveries the database has gone through?

I have a component which requires leader election and I would like candidates to avoid removing the current leader if they can detect the database has gone through more recoveries than they expected.

The candidates would wait some additional amount of time after they detect a recovery may have happened before attempting to become leader.

The timescale upon which a failure needs to be detected is tens of seconds.


I’m not sure if this is quite “cheap” enough, but the generation is in the machine-readable status, which clients can read from \xff\xff/status/json. The generation increases by roughly 2 every time there’s a recovery, so you might be able to use that. However, getting the machine-readable status has historically been somewhat expensive, though I think that’s improved recently.

It’s the “generation” field mentioned here: https://apple.github.io/foundationdb/mr-status.html
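
As a rough sketch of what that could look like from the Python bindings: the helper names below are made up for illustration, and I'm assuming the field sits at cluster.generation in the status document and that reading the special key through db[...] returns the JSON as bytes. Check both against your version before relying on it.

```python
import json

# Special key that returns the machine-readable status as a JSON document.
STATUS_JSON_KEY = b"\xff\xff/status/json"


def read_status_json(db):
    """Fetch the raw status document.

    Assumption: `db` is an fdb.Database and indexing it with the special
    key performs a read that returns the JSON as bytes.
    """
    return db[STATUS_JSON_KEY]


def extract_generation(raw_status):
    """Return the generation from a raw status document, or None if absent.

    Assumption: the field is located at cluster.generation.
    """
    status = json.loads(raw_status)
    return status.get("cluster", {}).get("generation")


def recovery_suspected(last_seen_generation, current_generation):
    """True if the generation changed since the candidate last looked.

    The generation moves by roughly 2 per recovery, so this only compares
    for inequality rather than trying to count recoveries exactly.
    """
    return current_generation != last_seen_generation


# Hypothetical usage (needs the fdb package and a reachable cluster):
#   import fdb
#   fdb.api_version(620)  # assumption: use the version your install supports
#   db = fdb.open()
#   current = extract_generation(read_status_json(db))
#   if recovery_suspected(last_seen, current):
#       ...  # wait the extra grace period before trying to take over
```

A candidate would remember the generation it saw when it last observed a healthy leader, then compare before attempting a takeover. Since each check pulls the whole status document, it's probably best to poll it no more often than the tens-of-seconds timescale needs.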

Somewhere, there’s also work going on (mostly spearheaded by the fine folks at Snowflake) to add APIs for clients to extract things like single fields from the status, so you could do something like ask for \xff\xff/status/json/generation and just get the generation, though we’re not quite there yet.