We have a huge database for testing purposes and now we want to re-test it, so we need to clear the whole database.
I’ve searched a lot and everything points to the same answer: clearrange \x00 \xff.
But after running this command, there is still a lot of data in the database:
Sum of key-value sizes - 2.030 TB
Disk space used - 10.812 TB
So I have two questions:
Is this the correct way to clear the whole database?
I’m assuming you meant to type clearrange \x00 \xff and not \ff.
Past that, deletes do not take effect immediately on disk. There is a background process which periodically frees old disk pages. It was relatively recently changed to run more often in this PR: https://github.com/apple/foundationdb/pull/1485/files
That PR also lists some knobs you can fiddle with if your workload involves a lot of deletes and you want to dedicate more I/O and CPU to this cleanup so it runs more frequently and does more work per run.
Definitely test these out on a non-prod cluster first!
There hasn’t been any progress at all during that day? What did you start off with?
The only thing that command wouldn’t delete AFAIK is data in the system keyspace, and there wouldn’t be 2TB of data in there unless someone misconfigured their application to write into it.
I actually use go-ycsb to benchmark the cluster. It is possible that go-ycsb’s driver implementation is misconfigured, but since it only goes through the client, would the client write data into the system keyspace?
And yes, there hasn’t been any progress at all during the day; it is just a test environment.
I am almost sure that there can be keys before \x00 too.
In the Java bindings, I usually create a range like new byte[]{}, new byte[]{(byte) 0xFF} to capture the entire key space for deletion etc.
Is it possible that you have keys written in the keyspace that sort before \x00? I do not know off-hand how to list or delete those from the CLI; can you write a small code snippet to test whether there is any data in the range I mentioned above?
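In the bindings, the check I have in mind would look roughly like this (sketched here in Python rather than Java, and untested; adjust the API version to your client):

import fdb

fdb.api_version(630)
db = fdb.open()

# Read the range [b'', b'\x00') to see whether any keys sort before \x00.
@fdb.transactional
def keys_before_null(tr):
    return [(kv.key, kv.value) for kv in tr.get_range(b'', b'\x00')]

for key, value in keys_before_null(db):
    print(repr(key), repr(value))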
Also, I’ve observed that FDB instantly reduces the key-value sizes when keys are cleared, but Disk space used gradually comes down as the background vacuuming progresses. In your case, it seems like there are a lot of live keys in the DB.
Yes, I observed the same behavior you describe: the key-value sizes go down instantly while the disk usage goes down slowly. In my case, I think there are still a lot of live keys.
I will try your advice, thanks a lot. And I hope we get a way to check all keys, including hidden ones, through the CLI.
There is only one key before \x00, which is the empty key. This is why I typically recommend using the following command in fdbcli to clear the whole database (note you must use double quotes " instead of single quotes '):
clearrange "" \xff
When you run a clear range like this, you should see the sum of key value sizes drop quickly, but the disk space may take a while to recover. The empty key won’t hold 2 TB, though, so that wouldn’t explain your issue.
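If you would rather do this from the bindings than from fdbcli, the equivalent in Python is roughly this (a sketch; adjust the API version to your client):

import fdb

fdb.api_version(630)
db = fdb.open()

# Clears the entire normal keyspace, including the empty key.
# Database.clear_range wraps the clear in its own transaction.
db.clear_range(b'', b'\xff')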
What happens if you run a range read on the main key space? It should return no keys:
getrange "" \xff
If that is empty, the next thing to check would be the \xff key space. The most likely way of accumulating data there would be if you turned on backup or DR but didn’t have any agents to do the associated work.
An effective way to check this is to use the locality API to get shard boundaries. If you really have a lot of data somewhere, there will be a lot of shard boundaries indicating where it is. In the Python bindings, for example, you would use the locality API's boundary-keys function.
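A rough sketch of that (assuming the standard fdb.locality.get_boundary_keys helper; adjust the API version to your client):

import fdb

fdb.api_version(630)
db = fdb.open()

# Boundary keys mark the edges of shards. If many boundaries cluster inside
# some range (for example under \xff), that is where the data lives.
boundaries = list(fdb.locality.get_boundary_keys(db, b'', b'\xff\xff'))
print('%d shard boundaries' % len(boundaries))
for key in boundaries:
    print(repr(key))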
Thanks for your reply! As I have to keep the test going, I eventually reinitialized the whole cluster. But I think it will happen again during the rest of the test. I will try your advice later.
Some additional information: because it is a test, I frequently change the settings of the cluster, so that might cause some problems… or not?
That typically shouldn’t cause any issues. However, I am not sure how well we test those scenarios.
If you want to be on the super safe side (might not be the case here - but just in case you want to do this in production), it is usually better to first exclude the process and then change its class.
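In fdbcli that sequence looks roughly like this (a sketch; the address and the target class are just examples):

exclude 10.0.0.5:4500
setclass 10.0.0.5:4500 storage
include 10.0.0.5:4500

By default, exclude waits until the data has been moved off the process, so the class change and the re-include only happen once the process is drained.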
I’m tuning an application that is running against fdb.
The problem is that after clearing a large range, FoundationDB starts performing some background disk I/O activity that influences the application's performance.
Are there any knobs to control how aggressive this I/O activity is?
Is there any way to temporarily enable/disable this background activity?
I think the knobs are optimized for this use case already. I don’t believe this will ever get better with the SQLite storage engine (and I would expect it to get much worse with RocksDB).
I would recommend you try Redwood in FDB 7.1, which should at this point be pretty stable. We don’t yet recommend using Redwood for production, but if you can reproduce your issue in a testing environment you could verify whether Redwood solves this problem for you. We will hopefully declare Redwood stable very soon.
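If you want to try it, the storage engine is switched with configure in fdbcli; in 7.1 the Redwood engine is selected roughly like this (a sketch; the engine name may differ between versions, and changing engines migrates all existing data to the new engine):

configure ssd-redwood-1-experimental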
There are a number of knobs that relate to this cleaning function that you can tune:
One knob in particular, SPRING_CLEANING_LAZY_DELETE_INTERVAL, used to be set at 1.0 and was reduced to 0.1 to cause lazy deletion to run 10x more frequently. You could try increasing it to slow it back down, but one potential consequence of this is that a cluster may not be able to reclaim space from data movement, etc. very effectively.
You could also try increasing SPRING_CLEANING_VACUUM_INTERVAL. Vacuuming is already fairly slow, and it is not usually necessary since FDB can reuse unvacuumed space. If you run multiple processes on the same disk or share it with non-FDB things, though, vacuuming may still be helpful to return unused space to the OS.
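If you want to experiment with these, knobs are passed to fdbserver, for example via the [fdbserver] section of foundationdb.conf (a sketch; the values here are only placeholders, not recommendations):

[fdbserver]
knob_spring_cleaning_lazy_delete_interval = 1.0
knob_spring_cleaning_vacuum_interval = 120.0

The processes need to be restarted for knob changes to take effect, and it is worth testing this on a non-production cluster first.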