Scanning a large range with Locality API hangs

Hello!

I need to scan a huge key range in FDB. Because that much data cannot be processed within the 5-second transaction limit, I split the source key range into smaller pieces with LocalityUtil.getBoundaryKeys and then process the pieces. The database does not change during the scan, so I do not worry about consistency.

My code looks like this (in Java):

final CloseableAsyncIterator<byte[]> boundaryKeysIter = LocalityUtil.getBoundaryKeys(db, keyFrom, keyTo);
byte[] lastKey = keyFrom;

while (boundaryKeysIter.hasNext()) {
  final byte[] nextKey = boundaryKeysIter.next();
  // do something long with the range lastKey .. nextKey
  lastKey = nextKey;
}
// do something long with the range lastKey .. keyTo

The problem is that boundaryKeysIter.hasNext() hangs after about 1500 iterations.

If I change the loop to iterate over AsyncUtil.collectRemaining(boundaryKeysIter).get() instead of iterating boundaryKeysIter directly, it runs without hanging, but this solution will not work when the collection of boundary keys does not fit in memory.

It seems boundaryKeysIter hangs after some period of time. When I do something long during the iteration, some transaction timeout occurs. But I’d prefer to receive an exception, so that I can reinitialise boundaryKeysIter from lastKey, rather than have it hang.
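
One way to approximate that re-initialisation without waiting for an exception is to never keep a boundary iterator open across the slow work: read a bounded batch of boundaries, close the iterator, process the batch, then open a fresh iterator starting at the last key seen. The following is only a minimal sketch under assumptions. BoundaryBatchReader and its openIterator factory are hypothetical names, not part of the FoundationDB bindings; in real code the factory would presumably be something like (b, e) -> LocalityUtil.getBoundaryKeys(db, b, e). Whether this avoids the hang described above has not been verified here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical helper, not part of the FoundationDB bindings.
final class BoundaryBatchReader {
    // Opens a boundary iterator for [begin, end). Assumption: in real code this
    // would wrap LocalityUtil.getBoundaryKeys(db, begin, end).
    private final BiFunction<byte[], byte[], Iterator<byte[]>> openIterator;
    private final byte[] end;
    private final int batchSize;
    private byte[] resumeKey;
    private boolean exhausted;

    BoundaryBatchReader(BiFunction<byte[], byte[], Iterator<byte[]>> openIterator,
                        byte[] begin, byte[] end, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.openIterator = openIterator;
        this.resumeKey = begin;
        this.end = end;
        this.batchSize = batchSize;
    }

    // Returns up to batchSize boundaries after the last one handed out;
    // an empty list means the range is finished.
    List<byte[]> nextBatch() {
        List<byte[]> batch = new ArrayList<>();
        if (exhausted) {
            return batch;
        }
        Iterator<byte[]> it = openIterator.apply(resumeKey, end);
        try {
            while (batch.size() < batchSize && it.hasNext()) {
                byte[] key = it.next();
                // Skip the resume key in case it is reported again as a boundary.
                if (!Arrays.equals(key, resumeKey)) {
                    batch.add(key);
                }
            }
        } finally {
            closeQuietly(it);
        }
        if (batch.size() < batchSize) {
            exhausted = true; // the iterator ran dry before the batch filled
        } else {
            resumeKey = batch.get(batch.size() - 1);
        }
        return batch;
    }

    // Closes the iterator if it is closeable (the real boundary iterator is).
    private static void closeQuietly(Object it) {
        if (it instanceof AutoCloseable) {
            try {
                ((AutoCloseable) it).close();
            } catch (Exception ignored) {
                // Nothing useful to do if closing fails in this sketch.
            }
        }
    }
}
```

The caller would loop on nextBatch() until it returns an empty list, doing the long work between calls. Memory use is bounded by batchSize, and each iterator lives only as long as it takes to read one batch.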

What is the right way to iterate over a large key range?

I’m not sure about the behavior of boundaryKeysIter, but scanning a large amount of data is quite a common pattern. You will find a lot of discussions about it on the forum. For example, see this for one of the approaches:

Thank you for the fast answer.

I read the topic "Large Range Scans - avoid 5s limit" and saw this snippet. Its problem is that it processes each key sequentially.

I’d like to iterate over subranges (not individual keys) sequentially and start processing each subrange in parallel, with a limit on the number of ranges being processed simultaneously. So sometimes I have to wait until some ranges have finished; that takes some time and makes hasNext() on the getBoundaryKeys iterator hang.
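
The "at most N ranges at once" part can be sketched with a plain java.util.concurrent.Semaphore. This is an illustration under assumptions, not a tested FoundationDB recipe: BoundedRangeScheduler, processAll and processRange are hypothetical names, and the boundaries are assumed to arrive as an ordinary Iterator over keys already held in memory (for example one bounded batch), so that blocking on the semaphore does not stall a live boundary iterator.

```java
import java.util.Iterator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Semaphore;
import java.util.function.BiConsumer;

// Hypothetical helper: runs one task per subrange with bounded parallelism.
final class BoundedRangeScheduler {

    // Walks the boundaries in order and submits [last, next) for each one,
    // then [last, to). Returns once every submitted range has finished.
    static void processAll(Iterator<byte[]> boundaries, byte[] from, byte[] to,
                           int maxInFlight, ExecutorService pool,
                           BiConsumer<byte[], byte[]> processRange)
            throws InterruptedException {
        Semaphore permits = new Semaphore(maxInFlight);
        byte[] last = from;
        while (boundaries.hasNext()) {
            byte[] next = boundaries.next();
            submit(pool, permits, processRange, last, next);
            last = next;
        }
        submit(pool, permits, processRange, last, to);
        // Taking every permit back means all running tasks have completed.
        permits.acquire(maxInFlight);
    }

    private static void submit(ExecutorService pool, Semaphore permits,
                               BiConsumer<byte[], byte[]> processRange,
                               byte[] begin, byte[] end)
            throws InterruptedException {
        permits.acquire(); // blocks while maxInFlight ranges are still running
        try {
            pool.execute(() -> {
                try {
                    processRange.accept(begin, end);
                } finally {
                    permits.release();
                }
            });
        } catch (RuntimeException e) {
            permits.release(); // the task never started, so give the permit back
            throw e;
        }
    }
}
```

Inside processRange each subrange would be read in its own transaction or transactions, which is outside the scope of this sketch.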