Supporting 100K concurrent clients

We need to support a very large number of concurrent clients. We cannot use a proxy because that would increase latency.

I believe the limiting factor is the cluster controller, which is a single process.
Is it possible to separate out the client and server controllers, so that there could be more than one client controller? A client controller could then be chosen in a round-robin or data-center-aware way (see the sketch below).
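To make the idea concrete, here is a minimal, purely illustrative Python sketch of how a client might rotate through multiple client-controller endpoints. Everything in it is hypothetical: FoundationDB has no separate “client controller” today, and the endpoint addresses are made up.

```python
# Purely illustrative: if there were multiple "client controllers",
# clients could rotate through them. All names and addresses here are
# hypothetical; no such component exists in FoundationDB today.
import itertools

client_controllers = [
    "10.0.1.10:4500",  # hypothetical endpoint in DC 1
    "10.0.2.10:4500",  # hypothetical endpoint in DC 2
    "10.0.3.10:4500",  # hypothetical endpoint in DC 3
]
_rotation = itertools.cycle(client_controllers)

def pick_controller():
    # Simple round-robin; a data-center-aware policy would instead
    # prefer endpoints in the caller's own DC and fall back to the rest.
    return next(_rotation)
```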

These are just my initial thoughts; I am sure I have overlooked plenty of scenarios that need to be thought through.

I would like to start a discussion on this.

Note that each of the 100K clients may also connect to any given storage server; it’s not just the cluster controller that will be under stress in this scenario. The cluster controller does little for clients, basically just forwards them (rare) changes to configuration information, so it should be easy to have multiple places to go for that. But without adding an extra hop to the read path, with huge numbers of clients you are either going to have huge numbers of open connections on storage servers, or lots of connections being opened and closed. It’s possible that the extra hop is in fact the highest-performance solution for your use case.
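For a rough sense of the scale involved (the numbers are made up for illustration): if each of 100,000 clients holds an open connection to, say, 20 storage servers, the cluster is carrying on the order of 100,000 × 20 = 2,000,000 open connections. Funnel the same clients through 50 intermediary servers instead, and the total drops to roughly 100,000 + 50 × 20 = 101,000.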

Also note that to find the cluster controller, clients first have to talk to the coordinators…
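For reference, clients learn the coordinators from the cluster file (fdb.cluster), which lists them in the form `description:ID@address,address,...`; a made-up example:

```
mydb:abcd1234@10.0.1.1:4500,10.0.1.2:4500,10.0.1.3:4500
```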

But without adding an extra hop to the read path, with huge numbers of clients you are either going to have huge numbers of open connections on storage servers, or lots of connections being opened and closed.

Is a connection kept open to the storage servers even when there is no read or transaction ongoing?

How does the client get to know about changes in data partitioning (i.e., which storage server holds which key ranges)?

I’m curious about your use case: what exactly do you call a ‘client’, if you have at least 100K of them? And do they really need to talk directly to the database cluster?

If your definition of a client is a unique process running somewhere, would it be possible to have these 100K clients talk to some API backend, with a lot fewer servers, and then have those servers connect to the database? With database connection pooling, this could dramatically reduce the load on the cluster, and even though it is a “proxy”, the latency might even decrease overall (due to batching, reduced memory overhead, …). A rough sketch of what I mean follows below.
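Here is a minimal sketch of such a backend, using the FoundationDB Python binding and the standard-library HTTP server. The port, URL scheme, and API version are assumptions for illustration, not a recommendation; the point is that each backend process shares a single database handle, so the cluster sees a handful of backends instead of 100K direct clients.

```python
# Hypothetical pooled API backend: many end clients call this HTTP
# service, while each process shares ONE FoundationDB database handle
# (the Python binding multiplexes all requests over its own network
# thread, so the handle is safe to use from many server threads).
import fdb
from http.server import ThreadingHTTPServer, BaseHTTPRequestHandler

fdb.api_version(630)  # assumes an FDB 6.3.x client; adjust to your install
db = fdb.open()       # one shared handle per process

class KVHandler(BaseHTTPRequestHandler):
    """Serves GET /kv/<key>; the URL scheme is invented for this sketch."""

    def do_GET(self):
        if not self.path.startswith('/kv/'):
            self.send_response(404)
            self.end_headers()
            return
        key = self.path[len('/kv/'):].encode()
        # Database-level read: runs its own transaction internally and
        # returns the value bytes, or None if the key does not exist.
        value = db[key]
        if value is None:
            self.send_response(404)
            self.end_headers()
        else:
            body = bytes(value)
            self.send_response(200)
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)

if __name__ == '__main__':
    # One such process can front thousands of end clients.
    ThreadingHTTPServer(('', 8080), KVHandler).serve_forever()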