We just ran YCSB workloads on the latest FoundationDB (with the Redwood storage engine) on SSDs with built-in, hardware-based transparent compression, and measured how much transparent compression reduces the physical dataset footprint and the write amplification.

One thing we observed is that workload concurrency (i.e., the number of YCSB clients) has a significant impact on write amplification. For example, under a 100%-update YCSB workload with 64B records and 4KB pages, write amplification (i.e., total_write_IO_traffic_volume / total_size_of_updated_records) increases by 3x when we reduce the number of clients from 32 to 4. Could anyone please shed light on what the reason could be?

We are interested in write amplification (in addition to storage capacity) because emerging QLC flash has very limited cycling endurance, so reducing write amplification is critical. Thanks.
Tong Zhang, ScaleFlux
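
P.S. To make the metric above concrete, here is a minimal Python sketch of the write amplification ratio as defined in the post. The function name and all numbers in the example are hypothetical placeholders, not measurements from our runs; only the 64B record size and 4KB page size come from the setup described above.

```python
def write_amplification(total_write_io_bytes, total_updated_record_bytes):
    """Ratio of total write IO traffic volume to total size of updated records."""
    if total_updated_record_bytes <= 0:
        raise ValueError("total_updated_record_bytes must be positive")
    return total_write_io_bytes / total_updated_record_bytes


# Hypothetical example values (NOT measured results).
record_size = 64            # bytes per record, as in the workload described
page_size = 4096            # bytes per page, as in the workload described
num_updates = 1_000_000     # assumed number of record updates
pages_written = 250_000     # assumed number of page writes observed

logical_bytes = record_size * num_updates    # total size of updated records
write_io_bytes = page_size * pages_written   # total write IO traffic volume

wa = write_amplification(write_io_bytes, logical_bytes)
print(f"write amplification: {wa:.1f}")
```

With these placeholder numbers, each page write carries four 64B updates on average. If fewer updates were batched into each page write, the ratio would rise proportionally.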