The one thing that stands out to me is that your tests are specifying a transaction rate that they aren’t achieving, which suggests that they’re saturating the cluster. To measure meaningful latencies, you’ll want to run at a rate lower than the maximum throughput of the cluster. For example, you could try choosing a transaction rate that is half the throughput your saturating test achieved and measure latencies there. Or even better, you could measure latencies at several different rates to get a sense for how latency responds to load as you approach the saturation point.
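
Here is a rough sketch of what such a sweep could look like. It is only an illustration: `run_benchmark` is a hypothetical callable standing in for whatever load tool you are using (assumed to take a target rate and return the achieved rate plus a latency figure), and the fractions and the 95% "kept up" threshold are arbitrary choices of mine, not anything the tool prescribes.

```python
from typing import Callable, Dict, Sequence, Tuple


def sweep_rates(saturation_tps: float,
                fractions: Sequence[float] = (0.25, 0.5, 0.75, 0.9)) -> list:
    """Target rates expressed as fractions of the measured saturation throughput."""
    return [saturation_tps * f for f in fractions]


def latency_vs_load(saturation_tps: float,
                    run_benchmark: Callable[[float], Tuple[float, float]],
                    fractions: Sequence[float] = (0.25, 0.5, 0.75, 0.9),
                    keep_up_ratio: float = 0.95) -> Dict[float, dict]:
    """Run one test per target rate and record whether the cluster kept up.

    run_benchmark is hypothetical: it takes a target rate (tx/s) and returns
    (achieved rate in tx/s, latency in ms).
    """
    results = {}
    for rate in sweep_rates(saturation_tps, fractions):
        achieved_tps, latency_ms = run_benchmark(rate)
        results[rate] = {
            "achieved_tps": achieved_tps,
            "latency_ms": latency_ms,
            # If the achieved rate falls well short of the target, the run was
            # saturated and its latency mostly reflects queueing, not the cluster.
            "saturated": achieved_tps < keep_up_ratio * rate,
        }
    return results
```

Any run flagged as saturated is one where the target rate was not met, so its latency number should be read with the same caution as your original results.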