Reasons for not co-locating tlog and SS? IO characteristics of SS

SteavedHams · March 19, 2020, 10:18pm

For #1, two other considerations are

The logs call fsync() for every commit version, which means hundreds of times per second, while storage servers only call it once or twice per second. I think most drives incur some small hiccup in performance while an fsync is pending.
An SSD’s write throughput for each pattern (linear vs. random) under a mixed linear+random workload usually falls short of what each pattern can achieve on its own. I’m not entirely sure why this is, but it’s a thing. In other words, if a drive can do 300MB/s of linear writes or 50MB/s of random writes, then doing 25MB/s of random writes does not leave you with 150MB/s of linear write budget; what remains is something less.
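
To make the arithmetic in that example explicit, here is a minimal sketch of the naive "proportional budget" model that the 150MB/s figure comes from. The function name and the model itself are illustrative assumptions, not anything FDB computes; the point is that the result is only an upper bound, and a real drive under a mixed workload delivers less.

```python
def naive_linear_budget(linear_max_mbps, random_max_mbps, random_load_mbps):
    """Remaining linear write throughput under a naive proportional model.

    Assumption (hypothetical model): the fraction of the drive's random-write
    capacity in use is subtracted from 1, and the rest is available for linear
    writes at the full linear rate. Real SSDs do worse than this.
    """
    if not 0 <= random_load_mbps <= random_max_mbps:
        raise ValueError("random load must be between 0 and the random maximum")
    random_fraction = random_load_mbps / random_max_mbps
    return linear_max_mbps * (1.0 - random_fraction)


# The numbers from the post: 300MB/s linear, 50MB/s random, 25MB/s random load.
budget = naive_linear_budget(300, 50, 25)
print(budget)  # 150.0 (an optimistic upper bound, not what the drive delivers)
```

If you are sizing disks, it is safer to benchmark the mixed workload directly than to rely on a formula like this one.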

For #2, I certainly agree that storage server I/O characteristics should be better explained and in one place. Probably the single most detailed source of this information right now is my presentation and slide deck from the 2019 summit. The slides can be found here; the video is not yet linked but should be soon. FoundationDB Summit 2019: Redwood Storage Engine Update

Regarding the write queue depth: FDB uses SQLite on top of a file caching layer that holds all writes in memory until commit time and then issues them to disk all at once. This is to coalesce multiple writes of the same pages during the commit cycle. So yes, the write queue depth is large while writes are being issued, but for much of the time no writes are being issued, and the bottleneck is the single-threaded writer reading the uncached pages it requires as it traverses the tree for each mutation and applies its changes.
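
As a rough illustration of the coalescing behavior described above, here is a toy sketch of a page cache that buffers writes in memory and issues them in one burst at commit. This is not FDB's actual caching layer; the class name, method names, and the 4096-byte page size are all assumptions made for the example.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size, for illustration only


class CoalescingPageCache:
    """Hypothetical cache: holds page writes in memory until commit()."""

    def __init__(self, path):
        self._file = open(path, "w+b")
        self._dirty = {}  # page number -> latest image of that page

    def write_page(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        # A later write to the same page replaces the earlier one in memory,
        # so only the final image ever reaches the disk.
        self._dirty[page_no] = data

    def commit(self):
        """Issue every dirty page at once, then fsync. Returns pages written."""
        for page_no in sorted(self._dirty):
            self._file.seek(page_no * PAGE_SIZE)
            self._file.write(self._dirty[page_no])
        flushed = len(self._dirty)
        self._dirty.clear()
        self._file.flush()
        os.fsync(self._file.fileno())
        return flushed

    def close(self):
        self._file.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        cache = CoalescingPageCache(os.path.join(tmp, "pages.bin"))
        for i in range(100):           # 100 logical writes to page 7
            cache.write_page(7, bytes([i]) * PAGE_SIZE)
        cache.write_page(3, b"\x01" * PAGE_SIZE)
        flushed = cache.commit()       # only 2 physical page writes
        cache.close()
        return flushed


print(demo())  # 2
```

In this sketch the disk sees nothing between commits and then a burst of page writes, which matches the pattern of a deep write queue for a short time followed by long stretches with no writes at all.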
