Identifying number of proxies


In one of our clusters, we tried increasing the number of proxies to 6.

fdb> configure proxies=6

status shows the following:

Desired Proxies - 6
Desired Resolvers - 6
Desired Logs - 8

However, when we check status json for processes with role=proxy, we only see 5.
fdb_trace_role_master_proxy_server_value == 1 shows 7 processes.
Where can we find the correct number of proxies running in the cluster?

fdbserver --version
FoundationDB 6.2 (v6.2.15)
source version 20566f2ff06a7e822b30e8cfd91090fbd863a393
protocol fdb00b062010001

Is it safe to treat the processes with role=proxy in status json as the currently active proxies?

Any insight here is much appreciated. To give more context, we noticed performance issues with one of our prod clusters and observed that the proxy process was running hot. We decided to raise the number of configured proxies to 6. We still see the proxy process using almost 100% CPU all the time.

I think status json has some latency in reflecting the current cluster status. But that latency is really small, less than 3 minutes in most cases…

How many machines do you have in the cluster? Posting your foundationdb.conf may give some clues about why the cluster didn’t recruit enough proxies.

I don’t think it is a latency issue; it still shows 5.

fdbcli --exec "status json" | grep "prox"
            "proxies" : 6,
                        "role" : "proxy"
                        "role" : "proxy"
                        "role" : "proxy"
                        "role" : "proxy"
                        "role" : "proxy"
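
As an alternative to grep, the same comparison can be done on the parsed JSON. This is only a rough sketch: it assumes that status json lists processes under cluster.processes, each with an "address" and a "roles" list, and that the configured count lives under cluster.configuration.proxies. The function names are made up for illustration, and it assumes fdbcli is on the PATH and prints only JSON on stdout.

```python
import json
import subprocess


def fetch_status(fdbcli="fdbcli"):
    """Run `fdbcli --exec "status json"` and return the parsed document.

    Assumes fdbcli is on PATH and writes only the JSON document to stdout.
    """
    result = subprocess.run(
        [fdbcli, "--exec", "status json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def configured_proxies(status):
    """Return the desired proxy count, or None if the field is missing."""
    return status.get("cluster", {}).get("configuration", {}).get("proxies")


def proxy_addresses(status):
    """Return the sorted addresses of processes that report a proxy role."""
    processes = status.get("cluster", {}).get("processes", {})
    found = set()
    for proc_id, proc in processes.items():
        roles = proc.get("roles", [])
        if any(role.get("role") == "proxy" for role in roles):
            # Fall back to the process id if no address is reported.
            found.add(proc.get("address", proc_id))
    return sorted(found)
```

Calling proxy_addresses(fetch_status()) gives the addresses behind the "role" : "proxy" lines above, which makes it easier to see which processes are reported rather than just how many.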

We have 21 machines with 202 active storage processes.
The foundationdb.conf files are not the same across all nodes, but 17 nodes have 12 storage processes and 4 stateless processes each; the remaining 4 nodes are dedicated to coordinators and Tx processes. We don’t specify any other role preferences for processes.

This is a bug that was fixed in 6.2.16. Prior to that, status could only label at most 5 processes as proxies, even if the cluster was actually running more.

Thank you @ajbeamon

So is it safe to assume the cluster will be running the desired number of proxies or more?
I’m a bit confused here, since fdb_trace_role_master_proxy_server_value (a wavefront-fdb-tailer metric) shows we have 7 proxies.

I’m not that familiar with that metric, but if it’s reading from trace logs and trying to look for processes that are logging proxy related events, it could be getting confused by the presence of special trace events that get repeated even in some cases when a process is no longer serving a role.

My guess is that you do in fact have 6 proxies. There are probably a couple of ways you could confirm this:

  1. Look at the trace logs for recent events with Type=ProxyMetrics and TrackLatestType=Original.
  2. Run fdbcli --exec 'status json' multiple times and take the union of proxies you get back. This is a bit hacky and maybe difficult if you have a lot of proxies, but the idea is that each time you run fdbcli you should be assigned 5 random proxies. Doing this enough times (especially with only 6 proxies) should give you all of them.
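
The second option could be scripted roughly as follows. This is a sketch under assumptions, not a tested tool: it assumes status json exposes roles under cluster.processes.<id>.roles and an "address" per process, that fdbcli is on the PATH, and that each fdbcli invocation reports a fresh sample of proxies as described above. All function names are hypothetical.

```python
import json
import subprocess


def proxies_in(status):
    """Return the set of addresses labelled with role=proxy in one status document."""
    processes = status.get("cluster", {}).get("processes", {})
    return {
        proc.get("address", proc_id)
        for proc_id, proc in processes.items()
        if any(role.get("role") == "proxy" for role in proc.get("roles", []))
    }


def fetch_from_fdbcli():
    """Start a new fdbcli process so each call can see a different set of proxies."""
    result = subprocess.run(
        ["fdbcli", "--exec", "status json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def union_of_proxies(fetch, attempts=20, expected=None):
    """Call fetch() up to `attempts` times and union the proxies seen.

    Stops early once `expected` distinct proxies have been found, if given.
    """
    seen = set()
    for _ in range(attempts):
        seen |= proxies_in(fetch())
        if expected is not None and len(seen) >= expected:
            break
    return seen
```

For the cluster in this thread, union_of_proxies(fetch_from_fdbcli, expected=6) would stop as soon as six distinct addresses have shown up. If the union stays at 5 after many attempts, that would suggest the cluster really is running fewer proxies than configured.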

Thanks again @ajbeamon. Let me try what you suggested.