Sorry for not clarifying the above statement. In addition to compression type (e.g., lz4 vs. zlib) and data compressibility (e.g., text strings vs. floating-point numbers), the size of the data chunk to which compression is applied (e.g., 4kB vs. 16kB) also plays an important role. This is because general-purpose compression algorithms (e.g., Snappy, lz4, zlib, and ZSTD) all use an LZ search, which looks for repeated byte sequences in the to-be-compressed data chunk and replaces each repeat with a pointer to an earlier occurrence. In general, the larger the to-be-compressed data chunk is, the higher the probability of finding repeated byte sequences, and hence the better the compression ratio. In my experience, compression over 16kB chunks achieves a 10% to 30% better compression ratio than compression over 4kB chunks.
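To make the chunk-size effect concrete, here is a minimal sketch that compresses the same buffer in independent 4kB and 16kB chunks with zlib and compares the totals. The synthetic key=value data and the helper names (`make_sample`, `compressed_size`) are my own assumptions for illustration, so the gap it prints will not necessarily match the 10% to 30% I see on real data.

```python
import random
import zlib


def make_sample(n_bytes: int, seed: int = 0) -> bytes:
    """Build deterministic, text-like data (hypothetical key=value records)."""
    rng = random.Random(seed)
    vocab = [f"key{j:04d}" for j in range(500)]
    parts = []
    size = 0
    while size < n_bytes:
        rec = f"{rng.choice(vocab)}={rng.randint(0, 99999)};"
        parts.append(rec)
        size += len(rec)
    return "".join(parts).encode()[:n_bytes]


def compressed_size(data: bytes, chunk_size: int) -> int:
    """Total size after compressing each chunk_size slice independently."""
    return sum(
        len(zlib.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    )


sample = make_sample(1 << 18)  # 256kB of synthetic records
size_4k = compressed_size(sample, 4 * 1024)
size_16k = compressed_size(sample, 16 * 1024)
print(f"4kB chunks: {size_4k} bytes, 16kB chunks: {size_16k} bytes")
```

Each chunk is compressed on its own, so matches cannot cross chunk boundaries; that is the same restriction a storage engine faces when it compresses page by page.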
As you pointed out, aligning compressed data with the storage sector boundary (e.g., 4kB) is always desirable (especially for storage engines that use a B-tree structure, like InnoDB and Redwood). Take the transparent page compression feature in InnoDB as an example: with the default 16kB page size, InnoDB compresses each 16kB page and then aligns the compressed page to the 4kB boundary (e.g., if a 16kB page compresses to 10kB, the page is stored as 12kB, wasting 2kB). To simplify data management, InnoDB relies on sparse-file and hole-punching support in the underlying filesystem to return the unused 4kB physical sectors to the filesystem. Because the file is sparse, the LBA allocation does not need to change when the compression results change. Yes, this certainly creates fragmentation, but that is not a big problem for modern SSDs.
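The alignment arithmetic can be sketched as follows. This is only the rounding logic, not InnoDB's actual code; `page_layout` and its field names are hypothetical, and the sketch assumes the compressed page is never larger than the original page.

```python
SECTOR = 4 * 1024  # assumed physical sector size in bytes


def page_layout(page_size: int, compressed_bytes: int, sector: int = SECTOR) -> dict:
    """Round a compressed page up to the sector boundary and report the waste."""
    if not 0 < compressed_bytes <= page_size:
        raise ValueError("compressed size must be in (0, page_size]")
    stored = -(-compressed_bytes // sector) * sector  # ceiling to a sector multiple
    return {
        "stored": stored,                                 # bytes occupied on disk
        "padding": stored - compressed_bytes,             # bytes wasted by alignment
        "punched_sectors": (page_size - stored) // sector,  # sectors a hole punch can free
    }


# The example from above: a 16kB page that compresses to 10kB.
print(page_layout(16 * 1024, 10 * 1024))
# A 4kB page that compresses to 3kB still occupies one full sector.
print(page_layout(4 * 1024, 3 * 1024))
```

In the first case one 4kB sector is returned to the filesystem and 2kB is lost to padding; in the second case nothing can be returned at all.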
The 4kB-alignment requirement can indeed cause significant space waste or even make page compression infeasible (e.g., InnoDB/Redwood with a 4kB page size, where a compressed page still occupies a full 4kB sector). One of the major selling points of LSM-trees (e.g., RocksDB) is that their compression is not subject to the 4kB-alignment constraint, which is why MyRocks claims to achieve better compression efficiency than InnoDB. Prefix key encoding certainly also helps MyRocks improve its compression efficiency.
The Pager interface sounds interesting. Could you please point me to relevant documents or code so that I can learn more about it? Thanks!