Disk snapshot consistency

Hi,

I’ve read about the upcoming disk snapshotting capability on this page and I have a question regarding the consistency of those backups.

The reference point I have is how snapshots work in HBase. When a snapshot is taken in HBase, the memory store may or may not be flushed, and the hfiles of the various regions are marked as part of the snapshot. Since those hfiles are immutable, there is the certainty that none of them will change after the snapshot. HBase will simply retain those hfiles which are part of a snapshot when performing compactions.

In the case of FoundationDB, the snapshotting script given as an example performs a cp of the files in the datadir of each process with persistent data. Does this mean that the files in the datadir are flushed and closed during the snapshot but may be reopened and updated once the snapshot script returns, or does it mean that those files are flushed, closed and will not change afterwards? The latter case would create an opportunity for faster snapshotting by creating hard links to those files and then handling the transfer asynchronously, making snapshots efficient on file systems/disks with no built-in snapshot capabilities.
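To make the hard-link idea concrete, here is a rough sketch of what I have in mind. It is not taken from the FoundationDB example script; the function name and the two directory arguments are hypothetical, and it assumes the snapshot directory sits outside the datadir but on the same file system (hard links cannot cross file systems).

```python
import os
from pathlib import Path


def hardlink_snapshot(datadir: str, snapshot_dir: str) -> list[str]:
    """Hard-link every regular file under datadir into snapshot_dir.

    Hypothetical helper for illustration only. snapshot_dir must be
    outside datadir and on the same file system.
    """
    src_root = Path(datadir)
    dst_root = Path(snapshot_dir)
    linked = []
    # sorted() materializes the listing before any link is created.
    for src in sorted(src_root.rglob("*")):
        if src.is_symlink() or not src.is_file():
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        os.link(src, dst)  # new directory entry, same inode, no data copied
        linked.append(str(dst))
    return linked
```

The linked files could then be copied elsewhere in the background. Note that a hard link shares the inode with the original, so it only protects against deletion or replacement of the file; if the process reopens a file and writes to it in place, the "snapshot" changes too, which is exactly why the question above matters.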

Is there a more in-depth description of how the various persistent files are managed during and after a snapshot?
