I have 3 servers for testing FDB; each has 48 cores, 512 GB RAM, and 8 SSD disks. I want to find a topology that gives the best performance. When I use more storage and log processes, the performance test results don't improve (I use performance.py from the source code, and go-ycsb). Is there a guide or manual on how to decide the number of processes and their roles?
There’s a family of threads about performance tuning:
- Cluster tuning cookbook and the threads linked therein
- How to troubleshoot throughput performance degrade?
Note that both python_performance.py and go-ycsb test only the amount of work you can generate from a single client, so it's possible that the bottleneck in your benchmark is the client itself. Why doesn't my cluster performance scale when I double the number of machines? has instructions on how to run multitest and benchmark FDB clusters in a way that spreads the load across multiple client processes.
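To make the "number of processes and their roles" question concrete, here is a minimal foundationdb.conf sketch for one 48-core, 8-SSD machine. The port numbers, process counts, and class split are assumptions to illustrate the mechanism (one `[fdbserver.<port>]` section per process, each with an explicit `class` and its own `datadir`), not a tuning recommendation:

```ini
[fdbserver]
# Options here are inherited by every fdbserver section below.

# Storage processes: one per SSD, each with a dedicated data directory.
[fdbserver.4500]
class = storage
datadir = /ssd1/foundationdb/4500

[fdbserver.4501]
class = storage
datadir = /ssd2/foundationdb/4501

# Log processes on their own disks, separate from storage.
[fdbserver.4510]
class = log
datadir = /ssd7/foundationdb/4510

# Stateless process for proxies/resolvers/coordination roles.
[fdbserver.4520]
class = stateless
datadir = /ssd8/foundationdb/4520
```

Extending this pattern (more `storage` sections per disk, a couple of `log` and `stateless` processes per machine) and measuring with multiple clients is usually how the right ratio is found for a given workload.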
Thanks, I will study these.
I have a small question about using large machines for FDB. Because a single fdbserver process cannot utilize more than one CPU core, for better performance I should run several processes on the same machine. But these processes are then not hardware-isolated. What happens with replication in this case? Must I specify locality_machineid explicitly for such a configuration, or is it done automatically by FDB?
locality_machineid is automatically set if not specified. It's done by opening /dev/shm/fdbserver_shared_memory_id and either using the ID found there or generating a new one, so all processes on the same machine get the same randomly generated machine ID. If you manually specify locality_zoneid, that takes precedence for replica placement; otherwise, locality_machineid is used as the zone ID.
There is one caveat: if you run multiple processes such that their /dev/shm/ are not mutually visible (for example, by running each process in its own Docker container), then you can end up with fdbserver processes on the same machine that have different machine IDs. If you don't specify locality_zoneid in that case, you could end up with a single machine failure losing all replicas of a shard of data.
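For that containerized case, a sketch of the workaround is to set the same zone ID explicitly for every fdbserver on a given physical host; the value below is a placeholder you would derive from the host's identity:

```ini
# In each container's foundationdb.conf on physical host "host-a".
# Assumed placeholder value; the point is that it is identical across
# all containers on the same machine, so replica placement treats
# them as one fault domain.
[fdbserver.4500]
locality_zoneid = host-a
```

With this set, replicas of a shard will never all land on fdbservers sharing that zone ID, restoring the machine-level fault isolation that the shared-memory mechanism normally provides.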
Does the FDB Kubernetes operator set locality_zoneid, or does it allow setting it manually?
The operator always sets locality_zoneid, based on the fault domain options you configure. The Multi-Kubernetes Replication configuration allows specifying a hardcoded fault domain value for every process associated with a FoundationDBCluster object. Is that what you're looking for?
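As a sketch of what that looks like in the FoundationDBCluster spec (field names per the operator's fault-domain documentation; the version and values here are placeholder assumptions):

```yaml
apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  name: sample-cluster
spec:
  version: 7.1.26
  faultDomain:
    # Assumed multi-Kubernetes setup: every process in this cluster
    # reports the same hardcoded fault domain value, so each
    # Kubernetes cluster acts as one zone for replica placement.
    key: foundationdb.org/kubernetes-cluster
    value: kc-1
```

Each participating Kubernetes cluster would use a different `value`, and the operator derives locality_zoneid for its processes from that setting.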