Increased system keyspace size after backups finish

We are running weekly backups of our FDB clusters, and in our largest cluster the system keyspace [\xFF, \xFF\xFF) seems to be growing week by week. It grows during the weekly backup, and when the backup completes it shrinks again, but not back to the level it was at before the backup started.

Looking at the distribution of data in the system keyspace, we see that most of it is under the \xff\x02/backup-agent/uid->config/ prefix, and most of the keys have the form \xff\x02/backup-agent/uid->config/[uid]snapshotRangeFileMap..., where [uid] appears to be a 16-byte UID.
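The key layout described above can be sketched in Python. This is only an illustration of the observed structure: the prefix is taken from the post, while the sample UID bytes and the `extract_uid` helper are invented for the example.

```python
# Sketch of the observed key layout under the backup-agent config subspace.
# The prefix matches what we see in the cluster; the sample UID is made up.
PREFIX = b"\xff\x02/backup-agent/uid->config/"

def extract_uid(key: bytes) -> bytes:
    """Return the 16-byte backup UID embedded in a uid->config key."""
    assert key.startswith(PREFIX)
    return key[len(PREFIX):len(PREFIX) + 16]

sample_uid = bytes.fromhex("00112233445566778899aabbccddeeff")
sample_key = PREFIX + sample_uid + b"snapshotRangeFileMap..."
print(extract_uid(sample_key).hex())
```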

Is this expected, and is there any way to clean it up?

Since writing this post, I have discovered that the uid is indeed the UID of a backup, and that the corresponding tag can be found by looking up the entries under the \xff\x02/backup-agent/tag->uid/ prefix.
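The reverse lookup can be sketched as follows. This is a hypothetical illustration only: it simulates the tag->uid entries with an in-memory dict and assumes the value stored under each tag key is the raw UID bytes, which may not match the actual on-disk encoding.

```python
# Hypothetical sketch: find the tag for a given backup UID by scanning the
# tag->uid entries (simulated here as a dict of key bytes -> value bytes).
TAG_PREFIX = b"\xff\x02/backup-agent/tag->uid/"

def tag_for_uid(tag_to_uid: dict, uid: bytes):
    """Return the tag whose tag->uid entry maps to `uid`, or None."""
    for key, value in tag_to_uid.items():
        if key.startswith(TAG_PREFIX) and value == uid:
            return key[len(TAG_PREFIX):]
    return None
```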

The files for many of the tags have already been deleted, and even if they hadn't been, I don't see why we need to keep several GB of snapshotRangeFileMap entries. The backups must be restorable without this data, since we need to be able to restore from the backup files alone.

For a given backup tag, is it safe to delete the entries under \xff\x02/backup-agent/tag->uid/ and \xff\x02/backup-agent/uid->config/, or is there some other invariant that needs to be maintained?

If you run fdbbackup status, does it show anything? You could also run fdbcli status details and see whether it reports any running backup or DR tags.

The fdbbackup tool has a cleanup command that can show and help remove some stale backup data. I would try that first rather than deleting the keys manually. There can be some trickiness involved in cleaning up the backup state directly, though it is probably easier if you have no backups or DRs running at all.

Typical output from status json (from .cluster.layers.backup.tags.<tag>):

        "current_container": "<blob store address>",
        "current_status": "has been completed",
        "last_restorable_seconds_behind": 4656009.350732,
        "last_restorable_version": 4199961082790,
        "mutation_log_bytes_written": 10418747,
        "mutation_stream_id": "<id>",
        "range_bytes_written": 28418394700,
        "running_backup": false,
        "running_backup_is_restorable": false

From status details:

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

From fdbbackup status I get:
No previous backups found.

fdbbackup cleanup does not seem to do anything; it produces no output.

Ok, I see. That doesn’t sound like the expected behavior, but I don’t know the inner workings of backup well enough to be able to offer an explanation. @SteavedHams do you have any insights?

Yes, once a backup is complete (completed or aborted) it is safe to delete these things. Also,

This subspace, specifically, is used for tracking progress of the current active snapshot being written during a backup. It is not used by restore.

The backup configuration for a UID was initially left behind on purpose because it was small and might be useful later if there is no external record of backup actions. Unfortunately, as you've discovered, it is no longer small, largely due to the snapshot range file map, which was added some time later during backup's development. At the very least, this section of the config should be cleared after a snapshot is completely flushed to the backup destination.
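As a sketch of what such a cleanup would clear, the begin/end keys covering one backup's snapshotRangeFileMap entries can be computed with FDB's usual prefix-range convention. This is pure Python for illustration: the key layout follows the posts above, and the `strinc` helper mirrors the behavior of the one in the FDB bindings.

```python
# Sketch: compute the clear range covering the snapshotRangeFileMap entries
# for one backup UID, using FDB's prefix-range ("strinc") convention.
CONFIG_PREFIX = b"\xff\x02/backup-agent/uid->config/"

def strinc(key: bytes) -> bytes:
    """First key ordered after every key with prefix `key` (as in the fdb bindings)."""
    key = key.rstrip(b"\xff")
    if not key:
        raise ValueError("no key orders after a pure \\xff prefix")
    return key[:-1] + bytes([key[-1] + 1])

def snapshot_map_range(uid: bytes):
    """Return the (begin, end) keys covering uid's snapshotRangeFileMap entries."""
    prefix = CONFIG_PREFIX + uid + b"snapshotRangeFileMap"
    return prefix, strinc(prefix)
```

A clear of [begin, end) over this range would drop the bulky map while leaving the rest of the (small) per-UID configuration intact.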

Thanks for finding this, I’ll try to get this into FDB 7.2. It’s a very small fix.


Thank you for your detailed reply!

I will go ahead with manually deleting these ranges for now, and look forward to a fix when we upgrade to 7.2 (we are currently on 6.3).
