Shards are not splitted into smaller ones

There are three different steps happening when you clear data from the B-tree.

The first step, which I described as detaching the data from the B-tree, is performed when the mutation is applied. I’m not super familiar with the code for this step, but I believe the idea is that it tries to grab the minimal number of pages that are needed to delete the data from the tree and store them elsewhere for later processing. In other words, a node whose entire sub-tree is going to be deleted only needs to touch the root node and not any of the children. You can see the code for this here: https://github.com/apple/foundationdb/blob/f418a85390734373b74f022ae6a61bed23eb92ee/fdbserver/sqlite/btree.c#L7201.

The remaining two steps are what I referred to when I mentioned deferred cleanup. The first of these is something we call lazy deletion, where we traverse the sub-trees removed in the first step and properly free each page. Among other things, this puts each page onto the free list and makes it available for reuse when new pages need to be allocated. See https://github.com/apple/foundationdb/blob/f418a85390734373b74f022ae6a61bed23eb92ee/fdbserver/sqlite/btree.c#L7073.

The last step is what we call vacuuming, and it’s doing what you describe; namely it fills in the holes in the file with pages from the end and then truncates off the empty pages. See https://github.com/apple/foundationdb/blob/f418a85390734373b74f022ae6a61bed23eb92ee/fdbserver/sqlite/btree.c#L3042.

1 Like