Cleared all data, but disk space utilization is still the same

We are using FDB 6.2. We deleted all data using a clearrange \x00 \xff command. Status output is shown below, and disk space utilization is still high / the same (155 GB used out of 184 GB on each node).

Is this expected? I was expecting disk space utilization to go down after clearing the data on 6.2. If the disk space does not go down but is still available for the application to write into, then I am OK with it, but I am checking in case we really do need to reclaim the space. Is there any way to do that?

Status shows about 9.5 TB of disk space used, which is close to the total capacity of 184 GB per node * 55 nodes = ~10 TB.
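For reference, the capacity math above can be checked with plain shell arithmetic (nothing FDB-specific):

```shell
# Sanity check: per-node capacity times node count, as quoted above.
per_node_gb=184
nodes=55
total_gb=$((per_node_gb * nodes))
echo "total capacity: ${total_gb} GB"   # 10120 GB, i.e. roughly 10 TB
```

So the 9.5 TB reported by status is indeed close to filling the whole cluster.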

Filesystem Size Used Avail Use% Mounted on
/dev/nvme1n1 184G 155G 20G 89% /fdb-mnt

Welcome to the fdbcli. For help, type `help'.
fdb> status

WARNING: Long delay (Ctrl-C to interrupt)

Using cluster file `/etc/foundationdb/fdb.cluster'.

Unable to start default priority transaction after 5 seconds.

Unable to start batch priority transaction after 5 seconds.

Unable to retrieve all status information.

Configuration:
Redundancy mode - three_datacenter
Storage engine - ssd-2
Coordinators - 5
Desired Proxies - 5
Desired Resolvers - 7
Desired Logs - 10

Cluster:
FoundationDB processes - 220
Zones - 55
Machines - 55
Memory availability - 3.4 GB per process on machine with least available
>>>>> (WARNING: 4.0 GB recommended) <<<<<
Fault Tolerance - 2 machines
Server time - 02/09/23 23:08:32

Data:
Replication health - unknown
Moving data - unknown
Sum of key-value sizes - unknown
Disk space used - 9.504 TB

Operating space:
Storage server - 22.2 GB free on most full server
Log server - 0.0 GB free on most full server

Workload:
Read rate - 5 Hz
Write rate - 0 Hz
Transactions started - 2 Hz
Transactions committed - 0 Hz
Conflict rate - 0 Hz
Performance limited by process: Log server MVCC memory.
Most limiting process: 10.49.90.236:4502

Backup and DR:
Running backups - 0
Running DRs - 0

I am also getting the warning below, and commands are generally running slowly.

WARNING: Long delay (Ctrl-C to interrupt)
The database is available, but has issues (type 'status' for more information).

I was just told that the clearrange command hung and is still running, i.e. we never got the prompt back.

If we are not able to run clearrange, can we stop FDB on all servers, clear the directories (data, log, etc.) under the mount point used by FDB, and then start the fdb service so it comes up with empty folders? We don't care about the data anyway in this case.

The ssd engine will shrink data files slowly over time to something closer to the logical KV bytes footprint (+overhead), but this shrinking is not required for re-use of the cleared space. Clearing a large range generates internal reusable free space within the file fairly quickly, and this space can be reused immediately as it is generated. The shrinking process is intentionally slow so as to not affect application performance since it is not necessary for reuse of the space.
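That reuse-before-shrink behavior can be sketched with a toy model (an illustration only, assuming nothing about the real ssd-2 internals beyond what is described above: cleared pages go onto an internal free list, and new writes consume that list before the file grows):

```shell
file_pages=1000   # pages in the on-disk data file
free_pages=0      # internal free list

# A large clearrange frees 900 pages of key-value data; the file on
# disk stays the same size, but those pages are immediately reusable.
free_pages=$((free_pages + 900))

# 300 pages of new writes arrive; they are served from the free list
# first, so the file does not grow.
new_writes=300
if [ "$new_writes" -lt "$free_pages" ]; then
  from_free=$new_writes
else
  from_free=$free_pages
fi
free_pages=$((free_pages - from_free))
file_pages=$((file_pages + new_writes - from_free))
echo "file pages: $file_pages, free pages: $free_pages"   # 1000 and 600
```

The file size only drops later, as the slow background shrinking runs; in the meantime new writes keep landing in the already-freed space.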

That said, your cluster is somehow unhealthy and I don’t think it has anything to do with the clear. Committing a clear range, or any transaction of any size, is not dependent on the changes being made on disk in the storage servers. The commit should return instantly unless the log system is not accepting writes, which appears to be the case in your cluster.

Thanks Steve,

Today when I checked the status, it showed disk space being reclaimed (1 TB out of 9 TB so far), and the reclaim is still in progress. Even though the clearrange command hung, it looks like it was working in the background. Also, I no longer get the slowness warning.

Thanks for the quick reply. One question, though, for a future event like this one: when clearing something like 9 TB, should we use clearrange, or is it OK to stop all the servers, clear the mount points, and then start the servers again? Or should we just stick with clearrange?

You can do a stop/clear/start, but you will have to configure new ... the database after that, as it will be completely gone.
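For the record, a sketch of that stop/clear/start procedure (a dry-run by default, since it destroys the database; the service name, the directory layout under /fdb-mnt, and the exact configure new arguments are assumptions to adjust for your install):

```shell
# DANGEROUS: this wipes the database. Flip DRY_RUN=0 only when you mean it.
DRY_RUN=1
run() { if [ "$DRY_RUN" -eq 1 ]; then echo "would run: $*"; else "$@"; fi; }

# On every machine in the cluster:
run sudo service foundationdb stop                 # assumed service name
run sudo rm -rf /fdb-mnt/data/* /fdb-mnt/log/*     # assumed directory layout
run sudo service foundationdb start

# Then, once, from any machine: the database is gone, so re-create it
# with the redundancy mode and storage engine you want, e.g.:
run fdbcli --exec 'configure new three_datacenter ssd-2'
```

The final configure new step is the part that is easy to forget: after wiping the data directories the cluster comes up with no database at all until you run it.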

Clearing all data via a clear range is meant to be safe to do, though it is true that the ssd engine struggles a bit with the deferred work involved in large clear ranges, particularly if there is other work happening on the cluster after the clear.

The Redwood storage engine is much better at clears and many other things. Its first production-ready version is in FDB 7.1, so you would have to upgrade to use it.

Note that Redwood does not shrink its data files (though this may be added later), it just reuses disk space internally.

Thanks Steve for the help.