I’ve noticed a consistent pattern on our cluster in the Ratekeeper WorstTLogQueue metric. We also found this GH issue from @alexmiller: https://github.com/apple/foundationdb/issues/620 . So far I have not been able to correlate the pattern with specific performance issues, but as noted in the GH issue, this spillage could impact write workload performance.
I am wondering whether anyone else has encountered this pattern and what actions were taken. Is this generally an indicator that we should add more logs (per @ajbeamon’s comment), or is it best to attempt to tune with --knob_server_mem_limit if RAM is available?