I have 3 servers for testing FDB; each has 48 cores, 512 GB of RAM, and 8 SSDs. I want to find a topology that gives the best performance. When I use more storage and log processes, the performance test results don't improve (I use performance.py from the source code and go-ycsb). Is there a guide or manual on how to decide the number of processes and their roles?
Hi
I haven’t had the chance (yet) to deploy an FDB cluster, but @meln1k wrote a blog post about deploying one that you may find useful.
There’s a family of threads about performance tuning:
- Cluster tuning cookbook and the threads linked therein
- How to troubleshoot throughput performance degrade?
Note that both python_performance.py and go-ycsb only test the amount of work that you can generate from a single client, so it’s possible that the bottleneck in your benchmark is the client. The thread "Why doesn't my cluster performance scale when I double the number of machines?" has instructions on how to run multitest and benchmark FDB clusters using FDB itself, in a way that runs across multiple client processes.
thanks, I will study these
I have a small question about using large machines for FDB.
Because a single fdbserver process cannot utilise more than one CPU core, I should run several processes on the same machine for better performance.
But these processes are then no longer isolated at the hardware level. What happens with replication in this case? Must I specify locality_machineid explicitly for such a configuration, or is it done automatically by FDB?
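To make the "several processes per machine" setup concrete, here is a minimal sketch that renders the per-process sections of a foundationdb.conf, one `[fdbserver.<port>]` section per process with a `class` setting. The helper name `render_fdbserver_sections` and the process counts are hypothetical placeholders for illustration, not a sizing recommendation.

```python
def render_fdbserver_sections(layout, base_port=4500):
    """Return foundationdb.conf text with one [fdbserver.<port>] section per process.

    layout maps a process class name (e.g. "storage", "log", "stateless")
    to the number of processes of that class to run on this machine.
    """
    lines = []
    port = base_port
    for process_class, count in layout.items():
        for _ in range(count):
            lines.append(f"[fdbserver.{port}]")
            lines.append(f"class = {process_class}")
            lines.append("")  # blank line between sections
            port += 1
    return "\n".join(lines)


# Hypothetical split for one machine; the counts are assumptions, not advice.
example_layout = {"storage": 8, "log": 2, "stateless": 4}
print(render_fdbserver_sections(example_layout))
```

The generated text would be appended to the existing `[fdbserver]` defaults in foundationdb.conf; the right counts per class depend on the workload and are best found by measurement.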
locality_machineid is automatically set if not specified.
It’s done by opening /dev/shm/fdbserver_shared_memory_id and either using the ID there or generating a new one. So all processes on the same machine will get the same randomly generated machine ID. If you manually specify locality_zoneid, then that takes precedence for replica placement; otherwise, locality_machineid is used as the zone ID.
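The fallback rules described above can be modelled in a few lines. This is only a sketch of the described behaviour, not FDB's actual code: the function name `resolve_locality` is hypothetical, and a plain dict stands in for the shared memory segment that processes on one machine can all see.

```python
import uuid


def resolve_locality(shared_memory, machineid=None, zoneid=None):
    """Model of the described defaults (illustrative only, not FDB's code).

    shared_memory: a dict standing in for the /dev/shm visible to the process.
    machineid, zoneid: explicitly configured values, or None if unset.
    Returns the (machine id, zone id) pair the process would end up with.
    """
    if machineid is None:
        # Reuse the ID another process already stored, or generate and store one.
        machineid = shared_memory.setdefault("machine_id", uuid.uuid4().hex)
    if zoneid is None:
        # Without an explicit zone ID, the machine ID doubles as the zone ID.
        zoneid = machineid
    return machineid, zoneid


host_shm = {}  # one shared memory area for the whole machine
first = resolve_locality(host_shm)
second = resolve_locality(host_shm)
print(first == second)  # both processes see the same stored machine ID
```

Passing an explicit `zoneid` overrides the fallback, which mirrors how a manually specified locality_zoneid takes precedence for replica placement.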
There is one caveat: if you run multiple processes so that their /dev/shm/ directories are not mutually visible, for example by running each process in its own Docker container, then you can end up with fdbservers on the same machine with different machine IDs. If you don’t specify locality_zoneid in this case, a single machine failure could lose all replicas of a shard of data.
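One way to catch this situation is to check whether any physical host ends up carrying more than one zone ID. The sketch below assumes you can collect, by whatever means you have, a list of (physical host, zone ID) pairs for your processes; the function name, host names, and IDs are made up for illustration.

```python
from collections import defaultdict


def hosts_with_multiple_zones(processes):
    """Return the hosts whose processes report more than one zone ID.

    processes: iterable of (physical_host, zone_id) pairs.
    A non-empty result means replicas placed in "different" zones could
    still sit on the same physical machine.
    """
    zones_by_host = defaultdict(set)
    for host, zone in processes:
        zones_by_host[host].add(zone)
    return {
        host: sorted(zones)
        for host, zones in zones_by_host.items()
        if len(zones) > 1
    }


# Hypothetical inventory: two containers on host-a generated separate IDs.
inventory = [
    ("host-a", "id-1"),
    ("host-a", "id-2"),
    ("host-b", "id-3"),
    ("host-b", "id-3"),
]
print(hosts_with_multiple_zones(inventory))
```

If the check reports any host, setting locality_zoneid to the same value for every process on that host restores the intended fault domain.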
Does the FDB Kubernetes operator set locality_zoneid, or does it allow setting it manually?
The operator always sets the locality_zoneid, based on the fault domain options you configure. The Multi-Kubernetes Replication configuration allows specifying a hardcoded fault domain value for every process associated with a FoundationDBCluster object. Is that what you’re looking for?