Differentiating between primary cluster and cluster restored from snapshot

If cluster A is periodically taking snapshots, at some point in time we may want to restore one of these snapshots to cluster B. By the design of the snapshot feature, the restored cluster’s keyspace looks exactly like that of the original cluster. Unlike with traditional backups, even the system keyspace is identical on cluster B. This means we cannot use any system keys to differentiate between a cluster being restored from a snapshot and a cluster that is going through recovery while being snapshotted. There are several reasons we’d like to distinguish between these cases:

  • The system keyspace used by backup agents is no longer relevant to cluster B, and should be cleared before backup agents start reading this keyspace (Issue 3873)
  • We need to do additional setup (e.g. applying an incremental backup) to cluster B before it is ready to handle client workload. Ideally we could lock cluster B in the initial recovery transaction, but we also need to avoid locking cluster A if it happens to go through a recovery while snapshotting.

To get around the above two issues, our current approach is to solve these problems at the operational level rather than natively in FoundationDB. However, we were wondering if anyone has ideas for handling this differentiation natively in FoundationDB? Thank you.

I assume that when an operator wants to restore the snapshot backup to cluster B (the restore destination cluster), they need to copy the backup files (i.e., the SS, tLog, and coordinator files) to cluster B.

If so, could the operator also create a dummy file on each host of cluster B? When an fdbserver worker starts, it scans the files on disk. If it sees such a file, it knows the data came from a backup.
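A minimal sketch of that marker-file check, in Python for illustration (the marker filename and the helper function are assumptions for this example; the real check would live in the fdbserver worker's startup path):

```python
import os

# Hypothetical marker filename; assumed to be created by the operator
# alongside the copied backup files on every host of cluster B.
RESTORE_MARKER = "RESTORED_FROM_SNAPSHOT"

def is_restored_from_snapshot(data_dir: str) -> bool:
    """Return True if this worker's data directory contains the
    operator-created marker file indicating a snapshot restore."""
    return os.path.exists(os.path.join(data_dir, RESTORE_MARKER))
```

On startup, a worker that finds the marker could then clear the backup agents' system keyspace and lock the cluster before accepting client traffic, while a primary cluster going through an ordinary recovery would never see the file.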

Following the same idea, you could also add a prefix to the backup files to distinguish them.
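The prefix variant could be sketched the same way (the prefix string and the helper are assumptions for this example, not an existing convention):

```python
import os

# Hypothetical prefix the operator would prepend to every file
# when copying a snapshot backup onto the hosts of cluster B.
BACKUP_PREFIX = "restored-"

def files_are_from_backup(data_dir: str) -> bool:
    """Return True if the data directory is non-empty and every file
    in it carries the restore prefix, i.e. the data was copied in
    from a snapshot backup rather than written by a live cluster."""
    files = os.listdir(data_dir)
    return bool(files) and all(f.startswith(BACKUP_PREFIX) for f in files)
```

One trade-off versus the dummy file: the worker would also need to strip or ignore the prefix when opening its data files, so the marker-file approach is probably simpler operationally.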

That is one solution that could work. Unfortunately, I think there’s no way around doing something at the operational level.

> there’s no way around doing something at the operational level.

Did you mean “there’s no way around without doing something at the operational level.” ?

If you need to copy backup files from one place (say S3) to your destination cluster, that is an operation you cannot avoid anyway. Adding one more dummy file per host does not seem like too much operational overhead on top of that.

Yes, I meant there’s no way to avoid doing something at the operational level. I agree, this should not be too much overhead, since there’s already a lot of operational effort involved in snapshot-restore.