Large amount of log messages with SlowSSLoopx100

Hi everyone (my first post here).

We have started using FoundationDB with a Kubernetes operator installation in AWS EKS. After adding a sidecar container to ship the trace logs with Severity > 20 to a central location, we noticed that all storage servers are constantly logging SlowSSLoopx100 messages (roughly every 15 s).

Thinking we must have made some configuration mistake (the production cluster we are testing has somewhat different specs than the ones below), I did a very basic install of

… and the results were similar: a lot of SlowSSLoopx100 events.

trace.10.100.29.70.4501.1750674795.uJrSIG.0.1.xml:<Event Severity="20" Time="1750675166.314823" DateTime="2025-06-23T10:39:26Z" Type="SlowSSLoopx100" ID="c6437fbcc6166a71" Elapsed="0.107017" ThreadID="1680275776998010791" Machine="10.100.29.70:4501" LogGroup="test-cluster" Roles="RV,SS" />
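
To put a number on "every about 15s", the trace lines can be parsed and summarized. The following is a minimal sketch, not an official FoundationDB tool: the function names `parse_events` and `summarize` are made up for illustration, and it assumes each event sits on one line as a self-closing `<Event ... />` element, as in the line above.

```python
import re
import xml.etree.ElementTree as ET

# Matches one self-closing <Event ... /> element and ignores any
# grep-style "filename:" prefix in front of it.
EVENT_RE = re.compile(r"<Event\b[^>]*/>")


def parse_events(lines, event_type="SlowSSLoopx100"):
    """Return (time, elapsed, machine) tuples for events of the given Type."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        attrs = ET.fromstring(match.group(0)).attrib
        if attrs.get("Type") != event_type:
            continue
        events.append(
            (float(attrs["Time"]), float(attrs["Elapsed"]), attrs.get("Machine"))
        )
    return events


def summarize(events):
    """Count events, and report the largest Elapsed and the mean gap between events."""
    times = sorted(t for t, _, _ in events)
    gaps = [b - a for a, b in zip(times, times[1:])]
    return {
        "count": len(events),
        "max_elapsed": max((e for _, e, _ in events), default=None),
        "mean_gap_s": sum(gaps) / len(gaps) if gaps else None,
    }


# The single log line quoted in this post.
SAMPLE = (
    'trace.10.100.29.70.4501.1750674795.uJrSIG.0.1.xml:<Event Severity="20" '
    'Time="1750675166.314823" DateTime="2025-06-23T10:39:26Z" '
    'Type="SlowSSLoopx100" ID="c6437fbcc6166a71" Elapsed="0.107017" '
    'ThreadID="1680275776998010791" Machine="10.100.29.70:4501" '
    'LogGroup="test-cluster" Roles="RV,SS" />'
)

events = parse_events([SAMPLE])
print(summarize(events))
```

Feeding it the output of a grep over all trace files of one storage server gives the mean gap between events, which makes it easier to compare runs before and after a configuration change.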

My main question: is anyone else seeing the same messages in their clusters?
Has anyone troubleshot something like this before? It is especially puzzling because I have looked at CPU and disk and nothing stands out; everything is as idle as it can be. It makes no sense for this event loop to take over 50 ms (as per the code in foundationdb/fdbserver/storageserver.actor.cpp at 2b9b0f778a6ccdcb82bee1b945cbd99e17e1c1b3, apple/foundationdb on GitHub).

Thanks everyone :slight_smile:

The default example cluster is not meant for performance testing; it only provides an example configuration. By default the operator sets the limits to the same values as the requests (see fdb-kubernetes-operator/docs/manual/warnings.md at main, FoundationDB/fdb-kubernetes-operator on GitHub). In the case of the example cluster, the following resources are specified:

              resources:
                requests:
                  cpu: 100m
                  memory: 128Mi

So you are probably hitting CPU throttling.
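
A rough way to reason about this, sketched in Python. It assumes the default 100 ms CFS period used by Kubernetes and a cgroup v2 container, where `cpu.stat` exposes `nr_periods` and `nr_throttled`. The helper names and the example `cpu.stat` contents are made up for illustration and are not measurements from this cluster.

```python
def cfs_quota_ms(cpu_millicores, period_ms=100.0):
    """CPU time a container may use per CFS period for a given CPU limit."""
    return cpu_millicores * period_ms / 1000.0


def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup v2 cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_fraction(stats):
    """Fraction of CFS periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


# Hypothetical cpu.stat contents, for illustration only.
EXAMPLE_CPU_STAT = """\
usage_usec 5000000
nr_periods 1000
nr_throttled 250
throttled_usec 12000000
"""

# A 100m limit allows 10 ms of CPU time per 100 ms period.
print(cfs_quota_ms(100))
# With the made-up numbers above, a quarter of the periods were throttled.
print(throttled_fraction(parse_cpu_stat(EXAMPLE_CPU_STAT)))
```

With a 100m limit, a process that uses up its 10 ms can be paused for the remainder of the 100 ms period, which would be enough on its own to push a loop iteration past 50 ms. Inside a cgroup v2 container the real file is typically `/sys/fs/cgroup/cpu.stat`; a non-zero and growing `nr_throttled` would support the throttling explanation.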

Thank you for the link. I went ahead and disabled CPU quotas in k8s and increased the requests to 1 CPU, but I still see these messages constantly.

After a few more tests, it looks like this may be a false positive caused by the DB being mostly idle (maybe, hopefully here). I configured the knob knob_no_recent_updates_duration down to ~15ms and almost all of the SlowSSLoopx100 warnings disappeared; the logs are clear now. Is this a wrong assumption?

Thank you!