How to expose a metric for the fdbmonitor count of restarts?

Is there a better way to count the number of restarts which fdbmonitor performs, other than parsing its logged output?

At my company, we export metrics in Prometheus format. status json reports uptime_seconds per process, so a reset of that value means that fdbserver has been restarted. This has been pretty useful for us to detect OOMs from misconfigured StorageServers.
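As a rough illustration of the idea (not the code of either exporter mentioned in this thread), here is a minimal Python sketch that reads uptime_seconds per process from status json output and counts how often the value dropped between consecutive samples. The cluster.processes path is an assumption about the status json layout, and the function names are hypothetical.

```python
import json


def uptimes_by_process(status_json_text):
    """Map process id -> uptime_seconds from the text of `status json`.

    Assumes processes are listed under cluster.processes (an assumption
    about the status layout); processes without the field are skipped.
    """
    status = json.loads(status_json_text)
    processes = status.get("cluster", {}).get("processes", {})
    return {
        pid: info["uptime_seconds"]
        for pid, info in processes.items()
        if "uptime_seconds" in info
    }


def count_restarts(uptime_samples):
    """Count how many times uptime decreased between consecutive samples.

    Each decrease is treated as one restart of the process. Restarts that
    happen and recover between two samples cannot be seen this way.
    """
    restarts = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            restarts += 1
    return restarts
```

If the uptime is already exported to Prometheus, the PromQL resets() function applied to that series over a time range does a similar count on the server side; the metric name depends on the exporter in use.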

Thanks, that is a great idea!

I am actually using a fork of your older Go version for the metrics: PierreZ/fdb-prometheus-exporter on GitHub (A FoundationDB Prometheus metrics exporter)

Aha, I wrote this a long time ago, when I was discovering FDB. It is not really maintained anymore, whereas the CleverCloud one is used internally and maintained :)

This one? CleverCloud/foundationdb-exporter on GitHub (A FoundationDB metrics Exporter with Prometheus compatibility)

Yes, it is maintained by @AlexandreBrg, who has most (if not all) of the commits.

FYI, when there is a crash or termination (via SIGKILL) a new metric will be generated. In this case Prometheus will not count it as a reset.

Hello,
In my opinion, to count the number of restarts performed by fdbmonitor more efficiently than by parsing its logged output, consider using the FoundationDB status API, which can fetch and monitor restart metrics directly, providing real-time data without log parsing. Another option is to integrate fdbmonitor with external monitoring tools like Prometheus or Grafana, which can be configured to track restart events and display them on a dashboard for easier monitoring.

Which status API are you referring to? The JSON status? It does not contain a count of restarts; can you please provide an example of the information you are referring to?