I’m currently running a POC for a new project that will use FoundationDB as its backend. The workload is inherently bursty, with a read/write mix of roughly 20/80.
My current configuration is the following:
Configuration:
  Redundancy mode        - triple
  Storage engine         - ssd-2
  Coordinators           - 7
  Desired Proxies        - 7
  Desired Resolvers      - 3
  Desired Logs           - 3

Cluster:
  FoundationDB processes - 72
  Machines               - 18
  Memory availability    - 3.7 GB per process on machine with least available
                           >>>>> (WARNING: 4.0 GB recommended) <<<<<
  Retransmissions rate   - 0 Hz
  Fault Tolerance        - 2 machines
  Server time            - 05/30/19 16:46:55
Each machine is a c5d.2xlarge instance. I currently have three machines (4 processes on each) dedicated to the stateless class, so 12 stateless processes in total.
My issue is the following:
I cannot get the cluster to recruit more than one resolver. I’ve set the desired amount to 3. I’ve even gone as far as to manually run setclass <ip:port> resolution, but for some reason FDB is refusing to recruit any additional process as a resolver. The odd part is, some of my stateless processes have zero roles assigned to them. So you’d think they’d be perfect candidates for the role.
Normally this wouldn’t be an issue, but it seems that the resolver is in fact becoming my bottleneck. My write throughput starts hitting a ceiling at around 250k Hz with my resolver process’s CPU maxed out at 100%.
I’ve attached a recent status json report for anybody who is willing to help.
Most notably, if you search for 10.49.58.119:4503 you will see that the class type is set to resolution, yet the process has zero roles assigned to it.
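For anyone who would rather not search the report by hand, here is a rough sketch of how the idle processes could be pulled out of a status json dump. It assumes the usual layout, where cluster.processes maps a process ID to an object with address, class_type and a roles list. The sample data is made up for illustration: only 10.49.58.119:4503 comes from my cluster, and the other addresses and process IDs are placeholders.

```python
import json


def processes_without_roles(status):
    """Return sorted (address, class_type) pairs for processes holding no roles."""
    idle = []
    for proc in status.get("cluster", {}).get("processes", {}).values():
        if not proc.get("roles"):
            idle.append((proc.get("address", "?"), proc.get("class_type", "?")))
    return sorted(idle)


def count_role(status, role_name):
    """Count how many processes currently hold the given role."""
    count = 0
    for proc in status.get("cluster", {}).get("processes", {}).values():
        for role in proc.get("roles", []):
            if role.get("role") == role_name:
                count += 1
    return count


def load_status(path):
    """Load a status json dump from disk (the path is whatever you saved it as)."""
    with open(path) as handle:
        return json.load(handle)


# Hypothetical, heavily trimmed sample; a real report has many more fields.
SAMPLE_STATUS = {
    "cluster": {
        "processes": {
            "proc-a": {
                "address": "10.0.0.1:4500",  # placeholder address
                "class_type": "stateless",
                "roles": [{"role": "resolver"}],
            },
            "proc-b": {
                "address": "10.49.58.119:4503",
                "class_type": "resolution",
                "roles": [],
            },
            "proc-c": {
                "address": "10.0.0.2:4500",  # placeholder address
                "class_type": "stateless",
                "roles": [{"role": "proxy"}],
            },
        }
    }
}

for address, class_type in processes_without_roles(SAMPLE_STATUS):
    print(f"{address} class={class_type} has no roles")
print("resolvers recruited:", count_role(SAMPLE_STATUS, "resolver"))
```

Against the real report, load_status("status.json") (or whatever file name the dump was saved under) can be passed in place of SAMPLE_STATUS.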