Understanding slow log servers

Daniel-B-Smith · September 18, 2020, 7:30pm

I have a write-heavy workload where, at steady state, regular writes are sometimes throttled and batch transactions are fairly consistently throttled. The output of status json says that the factor limiting batch transactions is a long log queue. From looking at the ProcessMetrics, MachineMetrics and our own system metrics, the problem does not appear to be a network or CPU bottleneck. My working hypothesis is that the log processes are bottlenecked on disk, but I’m wondering how best to measure that. Storage servers can maintain a deep enough disk queue that percentage of AWS published IOPS is a meaningful measure. The tlogs seem to intentionally avoid building up queue depth, so I’m not quite sure what metric to look at.
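
For concreteness, here is a rough sketch of how the per-tlog disk numbers could be pulled out of status json. The field names (`cluster.processes.*.roles[].role`, `disk.busy`, `disk.writes.hz`, `input_bytes.counter`, `durable_bytes.counter`) are what I believe the status schema exposes, but treat them as assumptions and check them against your version. The function name and the "input minus durable" queue estimate are my own illustration, not anything FoundationDB defines.

```python
def log_process_disk_stats(status):
    """Summarize disk metrics for every process hosting a log role.

    `status` is the parsed output of `status json` (a dict).
    Field names are assumed from the status schema and may differ by version.
    """
    rows = []
    processes = status.get("cluster", {}).get("processes", {})
    for pid, proc in processes.items():
        # Keep only the roles on this process that are transaction logs.
        log_roles = [r for r in proc.get("roles", []) if r.get("role") == "log"]
        if not log_roles:
            continue
        disk = proc.get("disk", {})
        # Rough queue estimate (assumption): bytes received but not yet durable.
        queue_bytes = sum(
            r.get("input_bytes", {}).get("counter", 0)
            - r.get("durable_bytes", {}).get("counter", 0)
            for r in log_roles
        )
        rows.append({
            "process": pid,
            "address": proc.get("address"),
            "disk_busy": disk.get("busy"),  # fraction of time the disk was busy
            "writes_hz": disk.get("writes", {}).get("hz"),
            "queue_bytes": queue_bytes,
        })
    return rows
```

Something like `fdbcli --exec 'status json'` piped through `json.load` should give a dict to pass in. Comparing `disk_busy` and `queue_bytes` across tlogs over time would at least show whether the long queue lines up with a busy disk.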
