However, when we check status json for processes with role=proxy, we see only 5, while fdb_trace_role_master_proxy_server_value == 1 shows 7 processes.
Where can we find the correct number of proxies running in the cluster?
fdbserver --version
FoundationDB 6.2 (v6.2.15)
source version 20566f2ff06a7e822b30e8cfd91090fbd863a393
protocol fdb00b062010001
Any insight here is much appreciated. To give more context, we noticed performance issues with one of our prod clusters and observed that the proxy process was running hot. We decided to raise the number of configured proxies to 6. We still see the proxy process using almost 100% CPU all the time.
We have 21 machines with 202 active storage processes. The foundationdb.conf files are not the same across all nodes, but 17 nodes have 12 storage processes and 4 stateless processes each; the remaining 4 nodes are dedicated to coordinators and Tx processes. We don’t specify any other role preferences for processes.
This is a bug that was fixed in 6.2.16. Prior to that, status could only label at most 5 processes as proxies, even if the cluster was actually running more.
So is it safe to assume the cluster will be running the desired number of proxies or more?
I'm a bit confused here, since fdb_trace_role_master_proxy_server_value (a wavefront-fdb-tailer metric) shows we have 7 proxies.
I’m not that familiar with that metric, but if it reads the trace logs and looks for processes that log proxy-related events, it could be getting confused by special trace events that in some cases keep being repeated even when a process is no longer serving a role.
My guess is that you do in fact have 6 proxies. There are probably a couple of ways you could confirm this:
Look at the trace logs for recent events with Type=ProxyMetrics and TrackLatestType=Original.
Run fdbcli --exec 'status json' multiple times and take the union of proxies you get back. This is a bit hacky and maybe difficult if you have a lot of proxies, but the idea is that each time you run fdbcli you should be assigned 5 random proxies. Doing this enough times (especially with only 6 proxies) should give you all of them.
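The second approach could be scripted roughly as below. This is only a sketch: it assumes that fdbcli is on the PATH, that fdbcli --exec 'status json' prints nothing but the JSON document, and that the document lists processes under cluster.processes, each with an address and a roles list whose entries carry a role field. The function names proxy_addresses and sample_proxies are made up for illustration.

```python
import json
import subprocess


def proxy_addresses(status: dict) -> set:
    """Return the addresses of processes that report a proxy role."""
    found = set()
    # Assumed layout: cluster.processes maps a process ID to an object
    # with an "address" string and a "roles" list of {"role": ...} objects.
    processes = status.get("cluster", {}).get("processes", {})
    for process_id, process in processes.items():
        roles = {r.get("role") for r in process.get("roles", [])}
        if "proxy" in roles:
            # Fall back to the process ID if no address is reported.
            found.add(process.get("address", process_id))
    return found


def sample_proxies(runs: int = 20) -> set:
    """Run fdbcli repeatedly and union the proxies seen across runs."""
    seen = set()
    for _ in range(runs):
        # A fresh fdbcli per run, so each run can report a different subset.
        out = subprocess.run(
            ["fdbcli", "--exec", "status json"],
            capture_output=True, text=True, check=True,
        ).stdout
        seen |= proxy_addresses(json.loads(out))
    return seen


if __name__ == "__main__":
    proxies = sample_proxies()
    print(len(proxies), "distinct proxies:", sorted(proxies))
```

The default of 20 runs is arbitrary. With 6 proxies and 5 reported per run, a handful of runs is usually enough, but a larger proxy count would need more runs before the union stops growing.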
I set configure proxies=3 on my fdb cluster but fdbcli --exec "status json" | grep "prox" returns that only 1 process has the role proxy.
There are three fdbserver processes with the class = stateless in the cluster, one of them receives the proxy role. When I stop the machine running this process the other two stateless processes receive the proxy role. When the first machine starts again, these two processes lose the proxy role and the initial process receives it.
So I can’t get 3 proxy processes running at all. There are only ever 1 or 2 processes with the proxy role.
How can I force the fdb cluster to run 3 proxy processes?