Query latency increase in fdb-java library version 6.1.8 onwards

srujann · December 7, 2020, 2:26pm

We recently upgraded our fdb-java library version from 5.2.8 to 6.2.22 and observed that latency for some of our queries increased by ~18%. The FDB client and server versions used are both 6.2.20, and the only change is the fdb-java library version. So we tried out different fdb-java versions (available on Maven) and found that the increase starts with fdb-java version 6.1.8. Looking at the fdb-java library code, it looks like there were quite a few changes between 6.0.18 (the latest version where we don’t see the latency increase) and 6.1.8. Below is our query latency data for different versions of the fdb-java library. We would appreciate any help with addressing this problem.

FDB client and server version: 6.2.20

FoundationDB CLI 6.2 (v6.2.20)
source version 77b5171e81754f2fda8869703d662e59d85b7f23
protocol fdb00b062010001

Query latencies for different versions of fdb-java library:

FDB Java Lib version	Query latency average % difference from baseline	Query latency range
6.2.22	+18%	~19-21 seconds
6.2.10	+18%	~19-21 seconds
6.1.9	+18%	~19-21 seconds
6.1.8	+18%	~19-21 seconds
6.0.18	0%	~16-17 seconds
6.0.15	0%	~16-17 seconds
5.2.8 (baseline)	0%	~16-17 seconds
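
For readers who want to reproduce the second column, here is a minimal sketch of the arithmetic, assuming the percentage is the change in average latency relative to the 5.2.8 baseline (the post does not spell out the formula). The class name `LatencySummary`, its methods, and the sample values are hypothetical and are not the measurements behind the table.

```java
import java.util.Arrays;

final class LatencySummary {
    /** Arithmetic mean of the samples, in seconds. */
    static double average(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    /** Percentage change of value relative to baseline (positive means slower). */
    static double percentChange(double baseline, double value) {
        return (value - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Hypothetical per-run latencies in seconds; NOT the data from this thread.
        double[] baselineRuns = {10.0, 10.0, 10.0};
        double[] candidateRuns = {11.0, 12.0, 13.0};

        double baselineAvg = average(baselineRuns);
        double candidateAvg = average(candidateRuns);
        double min = Arrays.stream(candidateRuns).min().orElse(Double.NaN);
        double max = Arrays.stream(candidateRuns).max().orElse(Double.NaN);

        System.out.printf("average: %.1f s vs baseline %.1f s (%+.0f%%)%n",
                candidateAvg, baselineAvg, percentChange(baselineAvg, candidateAvg));
        System.out.printf("range: ~%.0f-%.0f seconds%n", min, max);
    }
}
```

Reporting the range next to the average, as the table does, helps show whether the runs for two versions overlap at all.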

ajbeamon · December 7, 2020, 5:36pm

What does your query look like inside the block being timed?

srujann · December 8, 2020, 5:04pm

Below is a snippet of code that shows the FDB queries in the timed block. It involves multiple queries to FDB. First, a range query is issued to FDB, and for each of the rows returned by that first query we issue another range query (which typically yields only 1 row). So, it's n+1 range queries to FDB. We do batch the queries using a BatchReader in our open source library.

// first query to fetch metadata
Iterator<KeyValue> metadata = db.getBatchReader().getRangeAsync(
          (F<ReadTransaction, AsyncIterable<KeyValue>>) input ->
              input.getRange(startRange, endRange, 1000, false, StreamingMode.WANT_ALL)).get(1, TimeUnit.MINUTES);

// queries for each entry in metadata
Iterator<CompletableFuture<List<KeyValue>>> records = transform(metadata,
        metadataRow -> {
          byte[] rowKey = extractRowKey(metadataRow);
          return db.getBatchReader().
              getRangeAsync((F<ReadTransaction, AsyncIterable<KeyValue>>)
                  input -> input.getRange(Range.startsWith(rowKey)));
        });

ajbeamon · December 9, 2020, 12:19am

Does this code involve the tuple layer or creating Database objects in Java? There were some substantive changes to both of these so ruling either or both out could make this easier.

srujann · December 9, 2020, 1:25am

No new instances of the com.apple.foundationdb.Database class are created in this code. We reuse an instance that was created at startup, but new Transaction objects are created.

Can you clarify what you mean by tuple layer? As part of the timed code, rows are decoded using calls like Tuples.getLong() and Tuples.getBytes(), but this Tuples class is within our code and not from the fdb-java library. So it did not change between the different versions of the bindings we used.

ajbeamon · December 9, 2020, 1:41am

I was referring to the FDB Tuple layer in the bindings that helps with packing and unpacking tuples into byte strings suitable for keys and values. It sounds like you aren’t using it.

I’m not sure what else is different between the two versions, but the rest of the changeset should be small. I’ll have to take a deeper look.

ajbeamon · December 9, 2020, 9:11pm

There’s not much else that seems to have changed between 6.0.18 and 6.1.8 that I can see. There is one change to make it so that we are no longer creating a throwable when we have a non-error result, which you could try undoing and seeing if it makes a difference:

There are also some changes in ByteArrayUtil, though unless you are calling the changed functions directly, I think the only way you could encounter the changes is by using the Tuple layer. They are also intended to benefit performance:

I guess I should also check, you aren’t using the Subspace or Directory functionality from the Java bindings, are you? These make use of the Tuple layer.
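
To make the first change mentioned above concrete, here is a minimal sketch of the difference between building an exception object for every result and building one only when the result is an error. This is not the bindings' actual code: `ResultErrorSketch`, `SketchException`, `checkEager`, and `checkLazy` are hypothetical names used only for illustration, and the sketch counts allocations rather than measuring time.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ResultErrorSketch {
    /** Counts how many exception objects have been constructed. */
    static final AtomicInteger created = new AtomicInteger();

    /** Hypothetical stand-in for an error type; NOT the bindings' FDBException. */
    static final class SketchException extends RuntimeException {
        final int code;

        SketchException(int code) {
            super("error code " + code); // the Throwable constructor also captures a stack trace
            this.code = code;
            created.incrementAndGet();
        }
    }

    /** Eager variant: constructs an exception for every result, then inspects its code. */
    static void checkEager(int code) {
        SketchException e = new SketchException(code);
        if (e.code != 0) {
            throw e;
        }
    }

    /** Lazy variant: constructs an exception only when the code signals an error. */
    static void checkLazy(int code) {
        if (code != 0) {
            throw new SketchException(code);
        }
    }

    public static void main(String[] args) {
        int results = 100_000;
        for (int i = 0; i < results; i++) {
            checkEager(0); // code 0 stands for a successful result
        }
        System.out.println("eager: exceptions created = " + created.getAndSet(0));
        for (int i = 0; i < results; i++) {
            checkLazy(0);
        }
        System.out.println("lazy: exceptions created = " + created.get());
    }
}
```

In the eager variant every successful result pays for an allocation and a stack trace capture, which is the kind of overhead the change was meant to remove, so reverting it is a way to rule it out rather than a likely fix.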

srujann · December 9, 2020, 11:55pm

We are not using the Subspace, Directory, or ByteArrayUtil classes in this code, so we are likely not hitting the Tuple layer. I will try undoing the exception handling change you mentioned. Is there anything else I can try to narrow down the problem?

ajbeamon · December 10, 2020, 5:38pm

I don’t have any other specific things to check. You could always try bisecting the various changes that went into it, though, if it turns out the one I suggested above isn’t it. I don’t know if this is completely comprehensive, but here are the PRs from our release notes that should cover the major changes:

* Java: Successful commits and range reads no longer create ``FDBException`` objects, which avoids wasting resources and reduces memory pressure. `(Issue #1235) <https://github.com/apple/foundationdb/issues/1235>`_
* The API to create a database has been simplified across the bindings. All changes are backward compatible with previous API versions, with one exception in Java noted below. `(PR #942) <https://github.com/apple/foundationdb/pull/942>`_
* Java: Deprecated ``FDB.createCluster`` and ``Cluster``. The preferred way to get a ``Database`` is by using ``FDB.open``, which should work in both new and old API versions. `(PR #942) <https://github.com/apple/foundationdb/pull/942>`_
* Java: Removed ``Cluster(long cPtr, Executor executor)`` constructor. This is API breaking for any code that has subclassed the ``Cluster`` class and is not protected by API versioning. `(PR #942) <https://github.com/apple/foundationdb/pull/942>`_
* Java: Several methods relevant to read-only transactions have been moved into the ``ReadTransaction`` interface.
* Java: Tuples now cache previous hash codes and equality checking no longer requires packing the underlying Tuples. `(PR #1166) <https://github.com/apple/foundationdb/pull/1166>`_
* Java: Tuple performance has been improved to use fewer allocations when packing and unpacking. `(Issue #1206) <https://github.com/apple/foundationdb/issues/1206>`_
* Java: Unpacking a Tuple with a byte array or string that is missing the end-of-string character now throws an error. `(Issue #671) <https://github.com/apple/foundationdb/issues/671>`_
* Java: Unpacking a Tuple constrained to a subset of the underlying array now throws an error when it encounters a truncated integer. `(Issue #672) <https://github.com/apple/foundationdb/issues/672>`_
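
When bisecting, it helps to time the same workload the same way against each build of the bindings. Below is a minimal sketch of such a harness in plain Java. `BisectTimer` and `time` are hypothetical names, and the workload in `main` is only a placeholder for the block of FDB queries being timed.

```java
import java.util.Arrays;
import java.util.Random;

final class BisectTimer {
    /** Runs the workload warmup + runs times and returns the timed runs in milliseconds. */
    static double[] time(Runnable workload, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            workload.run(); // warm-up iterations are executed but not recorded
        }
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    public static void main(String[] args) {
        // Placeholder workload (sorting random ints); replace with the FDB query block.
        Runnable workload = () -> Arrays.sort(new Random(42).ints(200_000).toArray());

        double[] samples = time(workload, 3, 10);
        double avg = Arrays.stream(samples).average().orElse(Double.NaN);
        double min = Arrays.stream(samples).min().orElse(Double.NaN);
        double max = Arrays.stream(samples).max().orElse(Double.NaN);
        System.out.printf("avg=%.2f ms min=%.2f ms max=%.2f ms%n", avg, min, max);
    }
}
```

Running the same harness against a build from each candidate commit, with the cluster and data unchanged, gives numbers that can be compared directly.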
