Big increase and sudden drop in key-value size and disk usage over one week

lehu · August 26, 2022, 6:06am

Over the last week (7 days), our FDB cluster experienced a large increase in key-value size and disk usage. We have 85 storage pods with 340GB disks. During those 7 days, our developers loaded about 300GB of data. However, the key-value size grew by 1.1TB (from 4.9TB to 6.0TB), much more than the 300GB loaded.
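To make the gap concrete, here is the arithmetic on the figures above as a small sketch. It assumes 1TB = 1000GB for simplicity and ignores replication, which is not stated here.

```python
# Figures quoted in the post, in GB (assuming 1 TB = 1000 GB).
storage_pods = 85
disk_per_pod_gb = 340
loaded_gb = 300
kv_before_gb = 4900
kv_after_gb = 6000

raw_disk_gb = storage_pods * disk_per_pod_gb   # total raw disk across storage pods
kv_growth_gb = kv_after_gb - kv_before_gb      # observed key-value size growth
unexplained_gb = kv_growth_gb - loaded_gb      # growth not accounted for by the load

print(f"raw disk across storage pods: {raw_disk_gb} GB")
print(f"key-value size growth:        {kv_growth_gb} GB")
print(f"growth beyond loaded data:    {unexplained_gb} GB")
```

So roughly 800GB of the growth is not explained by the loaded data alone.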

The loading stopped on August 24 at 13:00. Then on August 25 at 05:30, the key-value size dropped suddenly to below 5.0TB.

We have a daily backup job for the cluster that has been running for a long time. Last week's data load was a batch, and it was much bigger than usual.

We scanned the trace logs and saw a lot of events like this:

<Event Severity="20" Time="1661403748.485664" Type="TLogQueueCommitSlow" ID="f2acefd5229f05de" LateProcessCount="6" LoggingDelay="\ 1" Machine="10.104.220.130:4000" LogGroup="default" Roles="TL" />

The event TLogQueueCommitSlow shows up in the log often, but it has a severity of 20, so it is not an error. Another event that occurs often is SlowSSLoopx100.

We still have logs; if we want to look for the root cause, what events should we look into?
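As a starting point for narrowing that down, one option is to tally the trace events by type and severity and look at what dominates around the time of the increase and the drop. The following is a minimal sketch, assuming each event sits on its own line as in the example above. The function name `tally_events` and the second sample line are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tally_events(lines, min_severity=20):
    """Count trace events by (Type, Severity), keeping those at or above min_severity."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("<Event "):
            continue  # skip wrapper tags, blank lines, and anything that is not an event
        try:
            event = ET.fromstring(line)
        except ET.ParseError:
            continue  # skip truncated or malformed lines
        severity = int(event.get("Severity", "0"))
        if severity >= min_severity:
            counts[(event.get("Type"), severity)] += 1
    return counts


sample = [
    # Shortened version of the event quoted in the post.
    '<Event Severity="20" Time="1661403748.485664" Type="TLogQueueCommitSlow" '
    'ID="f2acefd5229f05de" Machine="10.104.220.130:4000" Roles="TL" />',
    # Hypothetical low-severity event, only here to show the filter.
    '<Event Severity="10" Time="1661403749.0" Type="ExampleInfoEvent" />',
]

for (event_type, severity), count in tally_events(sample).most_common():
    print(event_type, severity, count)
```

To run it over a real trace file, pass an open file object instead of `sample`, since the function only needs an iterable of lines.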

Also, the rise in disk usage seems to be even more dramatic than the rise in key-value size, but its drop is very slow (unlike the sudden drop in key-value size).

We don't understand why KV size and disk space usage increased so much faster than the data was loaded, or why one dropped suddenly and the other slowly. Any insight?

SteavedHams · August 30, 2022, 3:03am

The extra KV Bytes could be from backup. If a backup was running during the data load, then a copy of the mutation logs for that entire period including all data loaded would be stored in the database as KV data until it is flushed to the backup destination and then deleted from the database. It would also be deleted if the backup was aborted. The deletion would cause a very fast drop in KV size.
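The shape this produces can be illustrated with a toy model: KV size is the base data plus the loaded data plus whatever mutation-log copy has not yet been flushed. This is only a sketch of the reasoning, not a measurement. The function name, the step granularity, `log_ratio` (size of the log copy relative to the data loaded), and the assumption that nothing is flushed until `flush_step` are all hypothetical.

```python
def kv_size_timeline(base_gb, load_per_step_gb, load_steps, total_steps,
                     log_ratio, flush_step):
    """Toy model: KV size = base + loaded data + mutation-log copy not yet flushed."""
    sizes = []
    loaded = 0
    pending_log = 0
    for step in range(total_steps):
        if step < load_steps:
            loaded += load_per_step_gb
            pending_log += load_per_step_gb * log_ratio  # log copy of the same writes
        if step == flush_step:
            pending_log = 0  # log flushed to the backup destination and cleared
        sizes.append(base_gb + loaded + pending_log)
    return sizes


# Illustrative numbers only: 4900 GB base, 300 GB loaded over 3 steps,
# log copy assumed equal in size to the data (log_ratio=1), flushed at step 4.
timeline = kv_size_timeline(4900, 100, 3, 6, 1, 4)
print(timeline)
```

In this model the size climbs faster than the load alone would explain, stays flat once loading stops, and then falls in a single step when the log is cleared, leaving only the base plus the loaded data.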

lehu · August 30, 2022, 4:49pm

That sounds like a good explanation for what happened. Thank you, Steve.
