Someone actually did my recommended benchmarking
If you are not using enterprise-grade SSDs, I think you'd actually be much better off not RAID 0'ing your SSDs, and instead running one (or more) fdbservers per disk, rather than many fdbservers on one RAID 0'ed volume. When FDB calls `fsync` on a RAID 0 volume, the `fsync` has to hit every SSD in the array, so every drive could end up doing O(fdbservers) fsyncs, versus O(fdbservers/disks) fsyncs if you don't RAID them. More fsyncs typically means shorter drive lifetime. The enterprise-grade caveat is that the better classes of SSDs include a large enough capacitor to flush out all buffered data if power is lost, which lets them turn `fsync` into a no-op, so the extra fsyncs don't matter anyway.
I later re-read and saw you're running on AWS, so disk lifetime will be less of a concern for you, but I'm still not sure you'll get better behavior from RAID-ing. Either way, I'll leave the above in case anyone ever reads this with physical disks in mind.
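To make the O(fdbservers) vs O(fdbservers/disks) arithmetic concrete, here's a toy calculation (the 8-process/4-disk numbers are made up for illustration, not from anyone's actual cluster):

```python
def fsync_streams_per_drive(fdbservers: int, disks: int, raid0: bool) -> float:
    """How many fdbservers' worth of fsync traffic each physical drive absorbs."""
    if raid0:
        # A RAID 0 fsync must reach every member disk, so each disk
        # sees the fsync traffic of every fdbserver on the array.
        return fdbservers
    # With one filesystem per disk, each disk only serves its own processes.
    return fdbservers / disks

print(fsync_streams_per_drive(8, 4, raid0=True))   # 8
print(fsync_streams_per_drive(8, 4, raid0=False))  # 2.0
```

Same number of logical fsyncs either way; the difference is how many times each individual drive has to absorb one.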
Be careful about sharing disks between transaction logs, because they care just as much about bandwidth as they do fsync latency, and having transaction logs compete for fsyncs is much more dangerous for overall write throughput than storage servers competing for fsyncs.
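As a sketch of what the one-process-per-disk layout might look like, a `foundationdb.conf` can pin each fdbserver's `datadir` to its own mount point and give the transaction-class process a disk to itself. The mount points and ports here are hypothetical, not a recommendation for any specific hardware:

```ini
# /etc/foundationdb/foundationdb.conf (excerpt; mount points are made up)
[fdbserver.4500]
datadir = /mnt/disk0/foundationdb   # transaction log gets a dedicated disk
class = transaction

[fdbserver.4501]
datadir = /mnt/disk1/foundationdb
class = storage

[fdbserver.4502]
datadir = /mnt/disk2/foundationdb
class = storage
```

That way the transaction log never competes with storage servers for fsyncs on its disk.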
Your math seems about right, except that the scaling probably isn’t linear. I’ve gone and dusted off my benchmarking results database, and I had run write bandwidth tests with single replication and got:
1 proxy 1 tlog = 50MB/s
3 proxy 3 tlog = 120MB/s
5 proxy 5 tlog = 170MB/s
7 proxy 7 tlog = 200MB/s
That was for a workload I don't remember the details of, but probably 100-byte values. The absolute numbers are irrelevant, but a quadratic regression of -2.5x^2 + 45x + 7.5 turns out to be a very good fit. Your baseline is 1.5x better than mine, which… hrm. The quadratic peaks around ~10 processes, which at your scaling is about 300MB/s. So… it actually turns out I don't have advice for you here, because I don't have data lying around past what you can do with ~12 transaction-class processes.
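If anyone wants to check the fit against the four data points above (this is just reproducing the regression, nothing new):

```python
import numpy as np

# (transaction processes, MB/s) from the single-replication runs above
x = np.array([1, 3, 5, 7])
y = np.array([50, 120, 170, 200])

# Quadratic least-squares fit; with these four points it happens to be exact
a, b, c = np.polyfit(x, y, 2)
print(a, b, c)            # -2.5, 45.0, 7.5

peak_x = -b / (2 * a)     # vertex of the parabola
peak_y = a * peak_x**2 + b * peak_x + c
print(peak_x, peak_y)     # peaks at 9 processes, ~210 MB/s on my hardware

# At a baseline 1.5x better than mine, that peak lands near 300MB/s
print(1.5 * peak_y)       # 315.0
```

Obviously extrapolating a quadratic past the measured range is dubious, which is exactly why I won't guess beyond ~12 processes.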
My guess is that you'll land somewhere around 10-15 logs, an equal number of proxies, and then a few more processes to give yourself a bit of headroom. I am concerned, though, that you're pushing up against the limits of what FDB can do for write throughput right now, so if your use case is going to grow over time, you might be boxing yourself in here. FDB scales reads much better than writes, so a 40:1 ratio favoring writes is going the wrong way for us.
FWIW, FDB 6.2 will make a couple of things default that resulted in a ~25% write bandwidth improvement in my tests. You can get them now via `fdbcli> configure log_version:=3 log_spill:=2`. But a 25% change isn't really going to make or break your use case here…