FoundationDB on SSDs with atomic write support

Passion · February 14, 2019, 11:19pm

Hello,

I wonder whether it makes sense to enhance FoundationDB's storage engines (SQLite and the upcoming Redwood) so that they can natively leverage the atomic write support of SSDs. Because of their indirect write nature, SSDs can easily support atomic writes for a request spanning multiple consecutive LBAs, and such SSDs are emerging on the market. If the underlying SSDs support atomic writes, the FoundationDB storage engine could disable journaling/WAL (SQLite) and indirect mapping (Redwood), which may lead to lower write amplification and even higher performance. I would sincerely appreciate any comments and advice!

SteavedHams · February 16, 2019, 10:06am

I think you mean non-consecutive. Atomic write support enables a set of writes to arbitrary LBAs to be made atomically.

And yes, this feature should make it possible to atomically update SQLite’s B-Tree without the use of a WAL.

In Redwood, however, the indirection layer [ (logical page, version) → physical page ] is not just for facilitating atomic tree updates; it also enables old versions of pages to be kept for a configurable amount of time so that clients can efficiently read from the database at older data versions. Atomic writes provided by an SSD will not fill this need. However, it is possible that Redwood could make use of atomic writes when configured not to retain any version history.
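To make the versioned indirection concrete, here is a minimal sketch of a (logical page, version) → physical page map. It is a toy model, not Redwood's actual code: the class `VersionedPageMap` and its methods `remap`, `lookup`, and `expire` are hypothetical names chosen for illustration, and it assumes each page's versions are recorded at most once.

```python
import bisect
from collections import defaultdict


class VersionedPageMap:
    """Toy (logical page, version) -> physical page map (illustrative only)."""

    def __init__(self):
        # logical page id -> sorted list of versions at which it was remapped
        self._versions = defaultdict(list)
        # (logical page id, version) -> physical page id
        self._physical = {}

    def remap(self, logical, version, physical):
        """Record that `logical` lives at `physical` as of `version`."""
        bisect.insort(self._versions[logical], version)
        self._physical[(logical, version)] = physical

    def lookup(self, logical, read_version):
        """Return the physical page visible at `read_version`, or None."""
        versions = self._versions.get(logical, [])
        i = bisect.bisect_right(versions, read_version)
        if i == 0:
            return None  # page did not exist yet at that version
        return self._physical[(logical, versions[i - 1])]

    def expire(self, oldest_readable):
        """Drop mappings invisible to any reader at >= oldest_readable.

        Returns the physical pages that could now be reused.
        """
        freed = []
        for logical, versions in self._versions.items():
            i = bisect.bisect_right(versions, oldest_readable)
            drop = max(i - 1, 0)  # keep the newest version <= oldest_readable
            for v in versions[:drop]:
                freed.append(self._physical.pop((logical, v)))
            del versions[:drop]
        return freed


m = VersionedPageMap()
m.remap(7, 10, 100)   # logical page 7 written to physical page 100 at version 10
m.remap(7, 20, 101)   # rewritten to physical page 101 at version 20
old_view = m.lookup(7, 15)   # reader at version 15 still sees physical page 100
new_view = m.lookup(7, 20)   # reader at version 20 sees physical page 101
freed = m.expire(20)         # once version 15 is no longer readable, page 100 is freed
```

In this model an SSD-level atomic write could replace the map only if `expire` always ran immediately, that is, if no history were retained; the old-version reads in `lookup` are the part a device cannot provide.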

Passion · February 16, 2019, 1:32pm

Thank you very much for sharing your comments. For SSDs to support atomic writes over non-consecutive LBAs, one has to enhance the interface so that applications/filesystems can pass the atomic group information to SSDs. This may not be trivial and is not supported by current standards like NVMe and SATA (to my understanding). What I meant was an atomic write spanning consecutive LBAs (it may not require an interface change if we can ensure the write over consecutive LBAs falls into one BIO at the Linux block layer), which however seems to be too strict to be useful to SQLite and Redwood. Thanks!

SteavedHams · February 16, 2019, 3:03pm

The nature of B-Trees and B+Trees is such that making changes to the key space at or between existing keys will result in an essentially random block write pattern. So atomic linear writes are not useful here. In contrast, a log-structured merge tree always issues serial block writes regardless of where the key space is changing.
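The contrast can be illustrated with a toy model of which blocks a batch of updates touches. This is only a sketch under simplifying assumptions: keys are integers, each B-tree leaf covers a fixed key range, and the function names and the `leaf_to_block` layout are hypothetical, not taken from SQLite, Redwood, or any LSM implementation.

```python
def btree_blocks_touched(keys, keys_per_leaf, leaf_to_block):
    """Blocks rewritten by in-place updates: one per distinct leaf touched."""
    leaves = {k // keys_per_leaf for k in keys}
    return sorted(leaf_to_block[leaf] for leaf in leaves)


def lsm_blocks_touched(keys, keys_per_block, next_free_block):
    """Blocks written by an append-only flush: the next free run of blocks."""
    n_blocks = -(-len(keys) // keys_per_block)  # ceiling division
    return list(range(next_free_block, next_free_block + n_blocks))


def is_consecutive(blocks):
    """True if the block numbers form one contiguous run."""
    return all(b - a == 1 for a, b in zip(blocks, blocks[1:]))


updates = [3, 250, 9001, 42]

# Hypothetical on-disk placement of the leaves these keys fall into.
leaf_to_block = {0: 17, 2: 4, 90: 231}

btree_blocks = btree_blocks_touched(updates, 100, leaf_to_block)  # [4, 17, 231]
lsm_blocks = lsm_blocks_touched(updates, 2, 500)                  # [500, 501]
```

Here the B-tree update set is scattered across the device, so an atomic write limited to consecutive LBAs cannot cover it, while the append-only flush lands in one contiguous run.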

I have not looked into implementation details or the APIs for this feature. I had the impression that arbitrary non-consecutive LBAs in the same atomic write were meant to be supported, based on these slides proposing an atomic write interface from several years ago, but I guess what vendors ended up doing is different…

Passion · February 16, 2019, 3:29pm

Yes, over the years quite a few academic papers picked the low-hanging fruit of realizing atomic write support over non-consecutive LBAs. Unfortunately, only Fusion-io implemented it, many years ago, with its own NVMFS. Many thanks!
