Sum of key-value sizes seems incorrect

I am trying to figure out where all the disk space has gone, so I built a simple CLI app to measure the size of the keys in each directory. It is very naive: it simply range-reads all keys in all directories, and the totals do not match the fdbcli output at all. I also tried to check whether there is anything else in the database by measuring the size while simply iterating over all keys.

Currently fdbcli reports 940.289 GB, but my CLI reports only 4.73 GB of keys and 57.52 GB of values overall.

Is there any magic in the fdbcli calculation? Any possible delayed cleanups? We are not writing very much.

My code is very naive: I simply range-read keys until the transaction times out, then continue from the last key read until it is done. Also, this anomaly is seen only in the production database.
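For reference, the resume-from-last-key scan described above can be sketched roughly like this. It is only an illustration, not the actual tool: `read_batch` is a hypothetical callable standing in for one short transaction that returns up to `limit` key-value pairs in `[begin, end)`, and `make_fake_reader` is an in-memory stand-in so the sketch runs without a cluster.

```python
from typing import Callable, Dict, List, Tuple

KV = Tuple[bytes, bytes]


def scan_sizes(read_batch: Callable[[bytes, bytes, int], List[KV]],
               begin: bytes, end: bytes,
               batch_size: int = 1000) -> Tuple[int, int, int]:
    """Return (key_bytes, value_bytes, pair_count) for keys in [begin, end)."""
    key_bytes = value_bytes = count = 0
    cursor = begin
    while True:
        # Each call is assumed to run in its own short transaction.
        batch = read_batch(cursor, end, batch_size)
        if not batch:
            break
        for key, value in batch:
            key_bytes += len(key)
            value_bytes += len(value)
        count += len(batch)
        # Resume strictly after the last key read: appending a zero byte
        # gives the smallest key that sorts after it.
        cursor = batch[-1][0] + b"\x00"
    return key_bytes, value_bytes, count


def make_fake_reader(data: Dict[bytes, bytes]) -> Callable[[bytes, bytes, int], List[KV]]:
    """Hypothetical in-memory reader, used only to exercise scan_sizes."""
    items = sorted(data.items())

    def read_batch(begin: bytes, end: bytes, limit: int) -> List[KV]:
        return [(k, v) for k, v in items if begin <= k < end][:limit]

    return read_batch
```

Note that a scan ending at `b"\xff"` covers only the normal key-space, so anything stored under keys starting with `\xff` is not counted by it.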

The number in fdbcli comes from a sample of the key-space, but it should be reasonably accurate. I believe it is fairly responsive to changes in the data, so I wouldn’t expect cleanup to be a major factor.

One thing it does that you may not be doing is include the system key-space. Usually the system key-space doesn’t get particularly large, but it sometimes can as a consequence of backup and/or DR being enabled but not actually making progress or falling behind. You could check this by running fdbbackup status or fdbdr status.

Also, if you are using directories to measure space, then it’s possible that there exists data outside of the current directory prefixes. You may have accounted for that, but if not that could explain it.

One option if you have a lot of unaccounted-for space is to look at the cluster shard map. Since shards are generally divided by space, you can get a rough sense of where space is being used in your cluster by looking at these shards. One easy way to do this is to use the locality API, such as this call in Python:

import fdb
fdb.api_version(630)  # assumption: replace with the API version your client supports
db = fdb.open()
boundaries = list(fdb.locality.get_boundary_keys(db, b'', b'\xff\xff\xff'))

If you see some prefix show up a lot in there, then that might give you a clue where to look.
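A rough way to spot a dominant prefix is to truncate each boundary key and count the results. This is just a sketch: the function name and the default prefix length of 8 bytes are arbitrary choices for illustration, and it works on any list of byte strings, such as the `boundaries` list above.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_boundary_prefixes(boundaries: Iterable[bytes],
                            prefix_len: int = 8) -> List[Tuple[bytes, int]]:
    """Count shard boundary keys by their first prefix_len bytes, most common first."""
    counts = Counter(bytes(key[:prefix_len]) for key in boundaries)
    return counts.most_common()
```

Printing the first few entries, for example `for prefix, n in count_boundary_prefixes(boundaries)[:10]: print(n, prefix)`, shows which prefixes account for the most shards. Trying a few different values of `prefix_len` helps when the interesting prefixes have different lengths.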

You were right, something is odd in system space!

I have thousands of such lines:

\xff\x02/blog/I\xd0\xa7\xb9T\x01h\x8fo\x86d]\xc1\xa4\x19\x87\x00\x00\x00H

What could it be? There is no DR or backup active.

How do I delete this?

blog is backup data. My first suggestion would be to do a status check with fdbbackup and fdbdr, for whichever of those you’ve ever used. If it shows any entries, you can abort them. If there’s nothing to abort, or abort doesn’t help, there is also the cleanup command, which can be used to get rid of orphaned data.
