Thanks @alexmiller. The information in this thread, "All Coordinators Crashed At Same Time (Foundationdb 6.2 - fdbserver going out of memory)", is very useful.
We can now stably support 2000 connected clients after increasing the open file limit to `LimitNOFILE=262144` at the process level.
However, when we keep adding more clients, we start to see behavior similar to what we saw in our earlier tests:
- #clients > 2000: 1-2 proxies failed with `Fatal Error: Network connection failed`
- #clients > 3000: More stateless processes crashed and some VMs dropped out of the cluster. In some tests the coordinators crashed too.
We also observed that the count reported by `ls -l /proc/<process-pid>/fd/ | wc -l` keeps growing, but we never saw it exceed the max limit we set.
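For anyone who wants to reproduce this check, here is a minimal Python sketch of how the open descriptor count could be compared against the process limit on Linux. It is not the exact tooling we used. The function names and the `proc_root` parameter are made up for illustration; the only things it relies on are the standard `/proc/<pid>/fd` directory and the `Max open files` row of `/proc/<pid>/limits`. Note that `ls -l | wc -l` also counts the leading `total` line, so it reports one more than this script does.

```python
import os
import sys


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of entries in <proc_root>/<pid>/fd (open descriptors)."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def parse_nofile_soft_limit(limits_text):
    """Extract the soft 'Max open files' value from the text of /proc/<pid>/limits.

    Returns None when the limit is 'unlimited'.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns: soft limit, hard limit, units
            soft = line[len(label):].split()[0]
            return None if soft == "unlimited" else int(soft)
    raise ValueError("no 'Max open files' line found")


def fd_usage(pid, proc_root="/proc"):
    """Return (open_fd_count, soft_limit) for the given pid."""
    with open(os.path.join(proc_root, str(pid), "limits")) as f:
        limit = parse_nofile_soft_limit(f.read())
    return count_open_fds(pid, proc_root), limit


if __name__ == "__main__":
    # Usage: python fd_usage.py <pid-of-fdbserver>
    pid = int(sys.argv[1])
    used, limit = fd_usage(pid)
    print(f"pid={pid} open_fds={used} soft_limit={limit}")
```

Running it periodically against each fdbserver pid (as the same user or as root, since `/proc/<pid>/fd` is not readable otherwise) should show how close each process gets to the limit as clients are added.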
@alexmiller What is the maximum number of connected clients that you have ever tested on a large-scale cluster?
What would be the limit, or the ideal number, of connected clients for FoundationDB in production?