Torture test gRPC facade w/ Record Layer

panghy · August 2, 2021, 10:46am

I had been trying to find a good torture test for the Lionrock gRPC facade for FoundationDB and decided to go down the path of building a gRPC Record Layer client: https://mvnrepository.com/artifact/io.github.panghy.lionrock/lionrock-fdb-record-layer-client (it currently depends on a snapshot build of the Record Layer from my PR)

Needed to refactor RL to support a non-native FoundationDB Java client: Refactor FDBDatabase and FDBDatabaseFactory by panghy · Pull Request #1344 · FoundationDB/fdb-record-layer · GitHub (currently open against the record-layer)
On my private branch, swapped the real FDBDatabaseFactory for the gRPC version: Hard-wire to test the record-layer with a locallly running lionrock s… · panghy/fdb-record-layer@ccc6d16 · GitHub
Ran the tests, fixed some bugs, repeated =p
Only the following tests would fail (expectedly):
- FDBDatabaseImplTest.performNoOpAgainstFakeCluster()
- FDBRecordContextTest.timeoutTalkingToFakeCluster()

Apart from the odd transaction_too_old exception from a couple of large tests (they are flaky because the database responds more slowly than the real thing), all of the tests pass!

Some flaky ones include:

TextIndexTest.[4]Index{‘Simple$text_suffixes’, text}#4
OnlineIndexerBuildRankIndexTest$Unsafe.addWhileBuildingParallelRank

With that, one can connect the RL library to an FDB cluster over gRPC without native binaries. The tests normally pass in about 6-7 minutes but take about 14-15 minutes via gRPC. That's mostly because there is no client-side caching (RYW or otherwise), so every get/getRange call needs to hit the network.

scgray · August 2, 2021, 4:02pm

Hi Clement!

Yes, thanks for contributing the refactoring of FDBDatabaseFactory and FDBDatabase! I saw that @MMcM made some comments, and I'm doing a pass as well (I'll try to get you some feedback over the next day or two). I'm not too surprised about the performance difference. As you noted, without a RYW cache sitting on the client, there is a lot of back-and-forth that wouldn't otherwise have been necessary. Each of these low-level calls now also goes through a proxy in the RPC service, and when you get to large chained pipelines of work (for example, in complex queries), the latencies are magnified.

There had previously been some discussion about a formal RPC interface for FDB (FoundationDB RPC Layer Requirements · apple/foundationdb Wiki · GitHub) and I had had similar concerns over the resulting performance without a RYW cache in the client. Even then, I think that for really complex operations, it would require something akin to a Record Layer Service to allow the complexity to be pushed closer to the database.

panghy · August 2, 2021, 4:58pm

Thanks! Yeah, I was mulling over actually writing the cache in the client itself, but that means keeping track of read ranges and layering them with mutations (and tracking keys that are invalid if read), given that one can turn RYW on and off. That seemed a bit too dangerous (but we now have quite a bit of stress testing that can be applied against it at least =p).
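
To make the idea concrete, here is a minimal sketch of what the simplest form of such a client-side cache might look like for point reads only. Everything in it is hypothetical: `RywOverlaySketch` and its `remoteGet` function are illustrative stand-ins, not Lionrock, Record Layer, or FoundationDB APIs. It deliberately ignores range reads, atomic mutations, keys that become unreadable, and turning RYW off, which are exactly the parts that make the real thing risky.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/**
 * Hypothetical sketch of a per-transaction read-your-writes overlay.
 * Point reads only; not an actual Lionrock or FoundationDB class.
 */
public class RywOverlaySketch {
    /** Stand-in for a get() that goes over the network (assumed, not a real API). */
    private final Function<String, Optional<String>> remoteGet;
    /** Buffered mutations; Optional.empty() marks a key cleared in this transaction. */
    private final Map<String, Optional<String>> writes = new HashMap<>();
    /** Results of remote reads already performed in this transaction. */
    private final Map<String, Optional<String>> readCache = new HashMap<>();
    private int remoteCalls = 0;

    public RywOverlaySketch(Function<String, Optional<String>> remoteGet) {
        this.remoteGet = remoteGet;
    }

    /** Buffers a write locally; nothing is sent until a (not shown) commit. */
    public void set(String key, String value) {
        writes.put(key, Optional.of(value));
    }

    /** Buffers a clear locally. */
    public void clear(String key) {
        writes.put(key, Optional.empty());
    }

    /** Reads a key, preferring buffered writes, then cached reads, then the network. */
    public Optional<String> get(String key) {
        Optional<String> buffered = writes.get(key);
        if (buffered != null) {
            return buffered; // the transaction sees its own write or clear
        }
        Optional<String> cached = readCache.get(key);
        if (cached != null) {
            return cached; // repeated read served without a round trip
        }
        remoteCalls++;
        Optional<String> fetched = remoteGet.apply(key);
        readCache.put(key, fetched);
        return fetched;
    }

    /** Number of reads that actually went to the remote side. */
    public int remoteCalls() {
        return remoteCalls;
    }
}
```

Even in this toy form, repeated reads of the same key and reads of keys just written never leave the client. Extending it to getRange is where the read-range tracking and the layering of mutations come in, since a range result has to be merged with buffered sets and clears that fall inside the range.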
