Data integrity resiliency on single node deployments

gaurav · August 22, 2018, 2:08pm

Hi,

We have a non-standard deployment model where we want to use FDB in place of Postgres, but without any data replication (i.e. a single node).

I wanted to check how reliable FDB itself is in such a setting with respect to maintaining the integrity of data on disk.

Assuming that the underlying storage itself is reliable and it does not corrupt bits once fsync’d, is it reasonable to assume that FDB will be resilient to data corruption (at a level comparable to Postgres)? It may not be uncommon to have abrupt machine reboots (one source of data corruption that I can think of).

It would be great to know some details of the data write path and the checks/methods implemented to survive events like abrupt process kills, machine reboots, etc.

Also, if for some reason the storage files get corrupted (I do not know what kinds of errors are possible with the FDB storage files), are there any troubleshooting steps/tools to salvage data (to the extent possible) and get FDB back to a healthy state?

–
thanks,
gaurav

bbc · August 22, 2018, 9:27pm

Process (or machine) kills are handled in all cases in FoundationDB to avoid data loss or corruption. You can take a look at some of the documentation about testing to get a sense of how we make sure this is the case. To your question: this safety is the same for a single machine as for a cluster of machines. There should be no difference between the crash safety here and that of any non-distributed database, such as Postgres.

There aren’t currently any tools for recovering data that was corrupted by the disk hardware. One thing you could do is run a continuous backup of this DB: if there is a hardware fault, you can restore to a consistent copy of your data. If the backup is run in streaming mode, the delta between the live data and the data stored in the backup can be kept to a minimum.
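
For illustration only, here is a rough sketch of how the command line for such a continuous backup might be assembled. It assumes the `fdbbackup start` subcommand with `-d` (destination URL), `-z` (continuous mode), `-C` (cluster file) and `-t` (tag) as described in the FoundationDB backup documentation; please check the flags against your installed version. The helper name `build_backup_command` and the example destination path are hypothetical, and the snippet only builds the argument list without running anything. A `backup_agent` process also needs to be running for a backup to make progress.

```python
import shlex


def build_backup_command(backup_url, cluster_file=None, tag=None):
    """Build the argv for starting a continuous backup (hypothetical helper).

    Nothing is executed here; the caller decides how to run the command.
    """
    # "-z" requests continuous mode (assumption: flag as documented for fdbbackup).
    cmd = ["fdbbackup", "start", "-d", backup_url, "-z"]
    if cluster_file is not None:
        # Point the tool at a specific cluster file instead of the default.
        cmd += ["-C", cluster_file]
    if tag is not None:
        # Optional backup tag, useful when more than one backup is configured.
        cmd += ["-t", tag]
    return cmd


if __name__ == "__main__":
    # Example destination is a made-up local directory URL.
    print(shlex.join(build_backup_command("file:///var/backups/fdb")))
```

The resulting list could be passed to `subprocess.run` once the destination and cluster file are confirmed for your environment.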

gaurav · August 23, 2018, 4:57am

Thank you Ben! This is very helpful and it gives a lot of confidence. I will keep the community updated should we notice any errors in this mode of deployment.
