Hello! We have been experimenting with FDB for a while and would like to use it as our primary DB. It seems like it’d be greatly beneficial to have at least an elementary understanding of how user-specified process classes and automatic recruitment work together. I have a few questions to clarify my understanding, and also would like to validate some rules-of-thumb I extracted out of prior discussions in this forum. So here goes:
- Can a process with an assigned class (i.e. not `unset`) still get recruited to another class if there is a need? My understanding is that user-specified assignments are more like guidelines, and can be overridden in run-time if necessary. Is that correct?
- This is a follow-up question to the above. Let’s say I have a host with a single storage device and two processes whose classes are `transaction` and `stateless`. Does such an assignment imply that this host will not store any data (modulo transaction logs), or does it simply state that this host prioritizes transaction log processing over data storage? If the behavior is the former, one would need to take this fact into account when thinking about replicas and such.
- Does it ever make sense to configure a host so that `num processes` > `num cores`? In such a case, my understanding is that extra processes will mostly stay idle and only do meaningful work if the first `num cores` processes are all stuck waiting on things. Is that even plausible? I’d think processes would refrain from synchronous waits so that such a situation is avoided. Therefore, I’m inclined to use `num processes` == `num cores`. What am I missing?
- After studying some discussions on this forum, here are the rules-of-thumb I extracted to come up with a reasonable “default configuration” for our FDB clusters. I’d be glad if you could eye them and let me know if any of them doesn’t make sense.
  - `num disks` = `num storage processes` + `num log processes` for each host
  - `num total proxy processes` = `num total log processes`
  - Proxy and log processes are not on the same host (so that they don’t compete for BW)
  - Storage and log processes do not use the same storage device (so that they don’t compete)
  - When using SSD storage devices, `num of storage processes` / `num of log processes` should roughly be around 8.
  - Any extra processes that we don’t know what to do with should be of the class `stateless` instead of `unset` so that they do not get auto-assigned to `storage`.
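
To make the rules above concrete, here is a minimal Python sketch that checks a planned layout against them. It is not anything FDB ships: `Proc`, `check_layout`, the role labels (`"storage"`, `"log"`, `"proxy"`, `"stateless"`, `"unset"`), the device names, and the 50% tolerance around the ratio of 8 are all assumptions made up for illustration, and the labels are not necessarily the exact class strings FDB accepts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Proc:
    host: str
    role: str                   # sketch-only label, see the note above
    disk: Optional[str] = None  # hypothetical device name; None if diskless


def check_layout(procs: List[Proc], disks_per_host: Dict[str, int],
                 target_ratio: float = 8.0, tolerance: float = 0.5) -> List[str]:
    """Return a list of rule-of-thumb violations (empty if none were found)."""
    problems = []
    totals = Counter(p.role for p in procs)

    # Rule 1: on each host, num disks == num storage + num log processes.
    for host, n_disks in disks_per_host.items():
        n = sum(1 for p in procs if p.host == host and p.role in ("storage", "log"))
        if n != n_disks:
            problems.append(f"{host}: {n_disks} disks but {n} storage+log processes")

    # Rule 2: total proxy processes == total log processes.
    if totals["proxy"] != totals["log"]:
        problems.append(f"{totals['proxy']} proxies vs {totals['log']} logs")

    # Rule 3: proxy and log processes do not share a host.
    for host in sorted({p.host for p in procs}):
        roles = {p.role for p in procs if p.host == host}
        if {"proxy", "log"} <= roles:
            problems.append(f"{host}: proxy and log processes share a host")

    # Rule 4: storage and log processes do not share a storage device.
    storage = {(p.host, p.disk) for p in procs if p.role == "storage" and p.disk}
    logs = {(p.host, p.disk) for p in procs if p.role == "log" and p.disk}
    for host, disk in sorted(storage & logs):
        problems.append(f"{host}: storage and log share device {disk}")

    # Rule 5: storage/log ratio is roughly target_ratio (assumed SSD guideline).
    if totals["log"]:
        ratio = totals["storage"] / totals["log"]
        if abs(ratio - target_ratio) > tolerance * target_ratio:
            problems.append(f"storage/log ratio is {ratio:.1f}, expected ~{target_ratio}")

    # Rule 6: leftover processes are stateless rather than unset.
    if totals["unset"]:
        problems.append(f"{totals['unset']} processes left as unset")

    return problems


# Example: host "a" has 9 disks (8 storage + 1 log); host "b" is diskless.
good = [Proc("a", "storage", f"d{i}") for i in range(8)]
good += [Proc("a", "log", "d8"), Proc("b", "proxy"), Proc("b", "stateless")]
good_disks = {"a": 9, "b": 0}
print(check_layout(good, good_disks))
```

For this example layout the function finds no violations. Note that the sketch only encodes the rules as stated in this post; whether the rules themselves are right is exactly what I am asking about.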
I know this is a long post… thanks in advance for all the responses; they will be greatly appreciated.