Hi!
I’m currently running a POC for a new project that will use FoundationDB as its backend. The workload is inherently bursty, with a mix of reads and writes (roughly 20/80).
My current configuration is the following:
Configuration:
Redundancy mode - triple
Storage engine - ssd-2
Coordinators - 7
Desired Proxies - 7
Desired Resolvers - 3
Desired Logs - 3
Cluster:
FoundationDB processes - 72
Machines - 18
Memory availability - 3.7 GB per process on machine with least available
>>>>> (WARNING: 4.0 GB recommended) <<<<<
Retransmissions rate - 0 Hz
Fault Tolerance - 2 machines
Server time - 05/30/19 16:46:55
Each of the machines is a c5d.2xlarge instance. I currently have three machines (4 processes on each) dedicated to the stateless class, so 12 stateless processes in total.
My issue is the following:
I cannot get the cluster to recruit more than one resolver. I’ve set the desired amount to 3. I’ve even gone as far as to manually run setclass <ip:port> resolution, but for some reason FDB is refusing to recruit any additional process as a resolver.
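To make the mismatch concrete, here is a minimal Python sketch that compares the desired resolver count with the number of processes actually holding the resolver role. It assumes the status json layout as I understand it (`cluster.configuration.resolvers` for the desired count, and `cluster.processes.<id>.roles[].role` for recruited roles); the `sample` dict is hand-written for illustration and is not taken from my cluster.

```python
from collections import Counter


def recruited_roles(status):
    """Count how many processes currently hold each role."""
    counts = Counter()
    processes = status.get("cluster", {}).get("processes", {})
    for proc in processes.values():
        for role in proc.get("roles", []):
            counts[role.get("role")] += 1
    return counts


def resolver_shortfall(status):
    """Return (desired, recruited) resolver counts.

    Assumes the desired count is stored under cluster.configuration.resolvers.
    """
    config = status.get("cluster", {}).get("configuration", {})
    desired = config.get("resolvers")
    recruited = recruited_roles(status)["resolver"]
    return desired, recruited


# Hypothetical sample in the assumed shape, not real cluster output.
sample = {
    "cluster": {
        "configuration": {"resolvers": 3, "proxies": 7, "logs": 3},
        "processes": {
            "p1": {"class_type": "stateless", "roles": [{"role": "resolver"}]},
            "p2": {"class_type": "stateless", "roles": [{"role": "proxy"}]},
            "p3": {"class_type": "resolution", "roles": []},
        },
    }
}

desired, recruited = resolver_shortfall(sample)
print(f"desired resolvers: {desired}, recruited resolvers: {recruited}")
```

On a real cluster the input could come from saving the output of `fdbcli --exec 'status json'` to a file and loading it with `json.load`.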
The odd part is, some of my stateless processes have zero roles assigned to them. So you’d think they’d be perfect candidates for the role.
Normally this wouldn’t be an issue, but it seems that the resolver is in fact becoming my bottleneck. My write throughput starts hitting a ceiling at around 250k Hz with my resolver process’s CPU maxed out at 100%.
I’ve attached a recent status json report [1] for anybody who is willing to help.
Most notably, if you search for 10.49.58.119:4503 you will see that the class type is set to resolution, yet the process has zero roles assigned to it.
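Rather than searching the report by hand, something like the following Python sketch could list every process that has a class set but no roles. It assumes each entry under `cluster.processes` carries `address`, `class_type`, and a `roles` list; the helper name `idle_processes` and the sample data are hypothetical, apart from the 10.49.58.119:4503 entry, which mirrors what the report shows.

```python
def idle_processes(status, class_type=None):
    """Return (address, class_type) pairs for processes with no roles.

    If class_type is given, only processes of that class are returned.
    """
    idle = []
    processes = status.get("cluster", {}).get("processes", {})
    for proc in processes.values():
        if proc.get("roles"):
            continue  # process already has at least one role
        if class_type is not None and proc.get("class_type") != class_type:
            continue
        idle.append((proc.get("address"), proc.get("class_type")))
    # Sort by address so the output is stable between runs.
    return sorted(idle, key=lambda item: str(item[0]))


# Hypothetical sample; only the first entry reflects the attached report.
sample = {
    "cluster": {
        "processes": {
            "a": {"address": "10.49.58.119:4503",
                  "class_type": "resolution", "roles": []},
            "b": {"address": "10.0.0.1:4500",
                  "class_type": "stateless", "roles": [{"role": "resolver"}]},
            "c": {"address": "10.0.0.2:4501",
                  "class_type": "stateless", "roles": []},
        }
    }
}

for address, cls in idle_processes(sample):
    print(f"{address} class={cls} has no roles")
```

Passing `class_type="resolution"` narrows the list to processes that were explicitly set to the resolution class but never recruited.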
[1] https://gist.github.com/rickysaltzer/cf6327a26a7fc45a553253f1ed51e19a
Ricky