When and how to use please_reboot_delete

mengranewo · March 26, 2019, 8:47pm

We’re working on an issue related to recovery optimization for the memory storage engine: if the cluster is healthy, a worker could selectively abort or skip recovery and then join the cluster as a new worker.

My questions are:

1. How do I reboot and delete data? I found the code that handles rebootRequest inside worker.cpp, but please_reboot_delete() is only thrown when g_network->isSimulated() == true.
2. Where is the logic that handles the please_reboot/please_reboot_delete exceptions?

Thanks!

alexmiller · March 27, 2019, 11:10pm

The please_reboot/please_reboot_delete error is thrown out of workerServer, falls through fdbd without being handled, and ends up being caught in simulatedFDBDRebooter in fdbserver/SimulatedCluster.actor.cpp. Look at the code surrounding the SimulatedFDBDRebootAndDelete trace event.
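To make that propagation path concrete, here is a minimal sketch in plain C++ rather than Flow. Everything in it is a hypothetical stand-in: the real errors are Flow error codes rather than C++ classes, the three functions only borrow the names of workerServer, fdbd and simulatedFDBDRebooter, and the logged strings (including the assumption that the data folder is wiped before the restart) are illustrative, not copied from SimulatedCluster.actor.cpp.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for the two Flow errors.
struct PleaseReboot : std::runtime_error {
    PleaseReboot() : std::runtime_error("please_reboot") {}
};
struct PleaseRebootDelete : std::runtime_error {
    PleaseRebootDelete() : std::runtime_error("please_reboot_delete") {}
};

enum class RebootKind { None, Reboot, RebootAndDelete };

// Stand-in for workerServer: it raises the request instead of handling it.
void workerServerSketch(RebootKind request) {
    if (request == RebootKind::Reboot) throw PleaseReboot();
    if (request == RebootKind::RebootAndDelete) throw PleaseRebootDelete();
    assert(request == RebootKind::None);  // nothing to do otherwise
}

// Stand-in for fdbd: there is no try/catch here, so the error passes through.
void fdbdSketch(RebootKind request) {
    workerServerSketch(request);
}

// Stand-in for simulatedFDBDRebooter: the only place that catches.
// Returns the actions it would take, in order (illustrative strings).
std::vector<std::string> rebooterSketch(RebootKind request) {
    std::vector<std::string> actions;
    try {
        fdbdSketch(request);
        actions.push_back("exited normally");
    } catch (const PleaseRebootDelete&) {
        actions.push_back("delete data folder");  // assumed behavior
        actions.push_back("restart process");
    } catch (const PleaseReboot&) {
        actions.push_back("restart process");
    }
    return actions;
}
```

The point of the sketch is only that the handler lives in the outermost caller, which is why searching worker code for a catch site turns up nothing.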

I’ll confess that figuring this out also took me much longer than I expected…

EDIT: Sorry, I forgot to actually answer your first question.

As for how to reboot and delete, I don’t think you’d actually need to reboot. I think you’d need to change it so that in worker.actor.cpp when it’s scanning DiskStores that exist on disk and re-creating the corresponding storage and log instances, if it sees that the cluster is healthy, it creates the storage server and then immediately calls dispose() on it. dispose() then deletes the associated files. You’ll then rejoin the cluster as a worker with nothing recruited, and then likely be immediately recruited by data distribution as a new storage server.
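The create-then-dispose idea can be sketched roughly as follows. This is plain C++ with a fake in-memory "disk", not the actual worker.actor.cpp code: StoreSketch, FakeDisk and restoreOrDiscard are hypothetical names, and the clusterHealthy flag stands in for whatever health check you end up implementing.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical fake filesystem: the set of file names that currently exist.
using FakeDisk = std::set<std::string>;

// Hypothetical stand-in for a storage instance backed by some files.
class StoreSketch {
public:
    StoreSketch(FakeDisk& disk, std::vector<std::string> files)
      : disk_(disk), files_(std::move(files)) {}

    // Mirrors what is described for dispose(): delete the associated files.
    void dispose() {
        for (const auto& f : files_) disk_.erase(f);
        files_.clear();
    }

private:
    FakeDisk& disk_;
    std::vector<std::string> files_;
};

// Scan the stores found on disk. If the cluster is healthy, create each store
// and immediately dispose of it; otherwise keep it for normal recovery.
// Returns how many stores were kept.
std::size_t restoreOrDiscard(FakeDisk& disk,
                             const std::vector<std::vector<std::string>>& storesOnDisk,
                             bool clusterHealthy) {
    std::size_t kept = 0;
    for (const auto& files : storesOnDisk) {
        StoreSketch store(disk, files);
        if (clusterHealthy) {
            store.dispose();  // files removed; nothing is recruited from them
        } else {
            ++kept;           // the real code would start the role here
        }
    }
    return kept;
}
```

With clusterHealthy set, the function returns 0 and the fake disk ends up empty, which corresponds to rejoining as a worker with nothing recruited.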

This will probably mean moving the code that does recovery out of the constructor of KeyValueStoreMemory and into the init() method, which IKeyValueStore offers but which nothing except Redwood actually uses…
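A rough illustration of why that move matters, again in plain C++ with hypothetical names (MemoryStoreSketch is not KeyValueStoreMemory, and the real IKeyValueStore::init() signature is not reproduced here): if recovery runs in the constructor, you pay for it before you get a chance to call dispose(); if it runs in init(), a store can be constructed and disposed of without replaying anything.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical memory-backed store whose recovery is deferred to init().
class MemoryStoreSketch {
public:
    // The constructor only records where the on-disk log is; no replay here.
    explicit MemoryStoreSketch(std::vector<std::string> logOnDisk)
      : log_(std::move(logOnDisk)) {}

    // Recovery lives here instead of in the constructor: replay the log
    // into the in-memory data structure.
    void init() {
        for (const auto& entry : log_) {
            data_.push_back(entry);
            ++entriesReplayed_;
        }
    }

    // Drop everything without ever having replayed the log.
    void dispose() {
        log_.clear();
        data_.clear();
    }

    std::size_t entriesReplayed() const { return entriesReplayed_; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<std::string> log_;
    std::vector<std::string> data_;
    std::size_t entriesReplayed_ = 0;
};
```

A caller that decides the cluster is healthy would construct the store and call dispose() directly; only the normal path calls init() and pays the recovery cost.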
