I know this has been asked before, but this post covers a slightly different issue from the other one.
We are using the Record Layer to model our data and are dealing with data volumes in the range of ~10 TB.
Below is the essence of our data model -
message Parent {
    int64 primary_id = 1; // Primary key
    Nested1 nested1 = 2;
    Nested2 nested2 = 3;
}
message Nested1 {
    int64 foo = 1;
    int64 bar = 2;
}
message Nested2 {
    string baz = 1;
    string bat = 2;
}
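For completeness, the primary key on Parent is declared in the record metadata rather than in the .proto itself; a minimal sketch of that setup follows (MyRecordsProto stands in for our generated protobuf outer class, and the usual RecordTypeUnion message is assumed to be defined in the .proto):

import com.apple.foundationdb.record.RecordMetaData;
import com.apple.foundationdb.record.RecordMetaDataBuilder;
import com.apple.foundationdb.record.metadata.Key;

// Minimal sketch of the metadata setup. MyRecordsProto stands in for the generated
// protobuf outer class containing Parent, Nested1, and Nested2; the .proto is assumed
// to also define the usual RecordTypeUnion message.
RecordMetaDataBuilder metaDataBuilder = RecordMetaData.newBuilder()
        .setRecords(MyRecordsProto.getDescriptor());
metaDataBuilder.getRecordType("Parent")
        .setPrimaryKey(Key.Expressions.field("primary_id"));
RecordMetaData metaData = metaDataBuilder.getRecordMetaData();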
We have a legacy relational database from which we read records and bulk load (hydrate) the data into a Record Layer-fronted FDB cluster (6.2.19).
We have repeatedly run into bottlenecks on the write throughput we can achieve on the cluster.
Cluster shape -
- Number of machines - 16
- Per-machine config - RAM: 600 GB, disk: 5.5 TB (NVMe SSDs), cores: 40
- Resolvers - 4
- Proxies - 4
- Storage processes - 84 (12 dedicated machines as storage servers)
- TLogs - 4 (separate from storage)
We use a SINGLE record store, since model-wise that makes sense for us.
Essentially, our keyspace looks as follows -
/<some_parent>/<some_static_id>/<enviroment_value>/<record-store-name>/<message_type>/<primary_key>
The prefix is constant for all records, varying only at the level of the primary_key, which is the record identifier.
Our bulk-loading processes run in parallel so that more FDB network threads are available to scale the write throughput. However, we invariably notice that after some time one of the storage processes hits 100% CPU, and as soon as that happens the cluster write throughput slows to a crawl. I'm guessing Ratekeeper kicks in at that point and limits the throughput.
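For context, each loader behaves roughly like the simplified sketch below (BATCH_SIZE and the work partitioning are illustrative, not our exact code); metaData is the record metadata sketched above, and storePath is the keyspace path shown further down -

import com.apple.foundationdb.record.RecordMetaData;
import com.apple.foundationdb.record.provider.foundationdb.FDBDatabase;
import com.apple.foundationdb.record.provider.foundationdb.FDBDatabaseFactory;
import com.apple.foundationdb.record.provider.foundationdb.FDBRecordStore;
import com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpacePath;
import com.google.protobuf.Message;
import java.util.List;

// Simplified bulk-load worker: writes its share of records in fixed-size batches,
// one Record Layer transaction per batch. BATCH_SIZE and the way work is partitioned
// are placeholders, not our exact code.
public class BulkLoadWorker implements Runnable {
    private static final int BATCH_SIZE = 100;

    private final FDBDatabase database = FDBDatabaseFactory.instance().getDatabase();
    private final RecordMetaData metaData;     // metadata built from the Parent descriptor
    private final KeySpacePath storePath;      // resolved keyspace path (see below)
    private final List<Message> recordsToLoad; // records read from the legacy RDBMS

    public BulkLoadWorker(RecordMetaData metaData, KeySpacePath storePath, List<Message> recordsToLoad) {
        this.metaData = metaData;
        this.storePath = storePath;
        this.recordsToLoad = recordsToLoad;
    }

    @Override
    public void run() {
        for (int start = 0; start < recordsToLoad.size(); start += BATCH_SIZE) {
            final List<Message> batch =
                    recordsToLoad.subList(start, Math.min(start + BATCH_SIZE, recordsToLoad.size()));
            // Each batch is saved in its own transaction via FDBDatabase.run.
            database.run(context -> {
                FDBRecordStore store = FDBRecordStore.newBuilder()
                        .setContext(context)
                        .setKeySpacePath(storePath)
                        .setMetaDataProvider(metaData)
                        .createOrOpen();
                batch.forEach(store::saveRecord);
                return null;
            });
        }
    }
}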
Record Layer specifics -
This is how we define the record store KeySpace -
import com.apple.foundationdb.record.provider.foundationdb.keyspace.DirectoryLayerDirectory;
import com.apple.foundationdb.record.provider.foundationdb.keyspace.KeySpace;

private static final KeySpace KEY_SPACE = new KeySpace(
        new DirectoryLayerDirectory("some_parent")
                .addSubdirectory(new DirectoryLayerDirectory("some_static_id")
                        .addSubdirectory(new DirectoryLayerDirectory("enviroment_value"))));
Finally -
- Is there a way we can scale the write throughput with the above model in place?
- We currently see a write rate of around ~800,000 Hz, but the rate of increase of key-value size is ~3.5 GB/min, which is almost the same as it was when the write rate was ~400,000 Hz. Is there a way to get better control over shard assignment? I suspect our keyspace prefix is preventing the data from being partitioned more evenly.