FoundationDB as backend for JanusGraph - Iterate through all vertices

I’m using JanusGraph 0.4.1 with FoundationDB 6.2.22 as the storage backend, via the FDB-JG adaptor code to connect, store, and fetch data.

The problem appears when the data size is large, e.g. 2M vertices, and we want to iterate through all of them, for example:

transaction.vertices().forEachRemaining(v -> { /* ... */ });

It fails in

public RecordIterator<KeyValueEntry> getSlice(KVQuery query, StoreTransaction txh) throws BackendException
{
        final FoundationDBTx tx = getTransaction(txh);
        final StaticBuffer keyStart = query.getStart();
        final StaticBuffer keyEnd = query.getEnd();
        final KeySelector selector = query.getKeySelector();
        final List<KeyValueEntry> result = new ArrayList<>();
        final byte[] foundKey = db.pack(keyStart.as(ENTRY_FACTORY));
        final byte[] endKey = db.pack(keyEnd.as(ENTRY_FACTORY));

at

final List<KeyValue> results = tx.getRange(foundKey, endKey, query.getLimit());

with the following error:

Permanent failure in storage backend, followed by Max transaction reset count exceeded

On further investigation of the error, I found the exception below:

java.util.concurrent.ExecutionException: com.apple.foundationdb.FDBException: Transaction is too old to perform reads or be committed
        at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
        at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
        at com.experoinc.janusgraph.diskstorage.foundationdb.FoundationDBTx.getRange(FoundationDBTx.java:183)
        at com.experoinc.janusgraph.diskstorage.foundationdb.FoundationDBKeyValueStore.getSlice(FoundationDBKeyValueStore.java:138)
        at org.janusgraph.diskstorage.keycolumnvalue.keyvalue.OrderedKeyValueStoreAdapter.getKeys(OrderedKeyValueStoreAdapter.java:116)
        at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getKeys(KCVSProxy.java:56)
        at org.janusgraph.diskstorage.BackendTransaction$4.call(BackendTransaction.java:384)
        at org.janusgraph.diskstorage.BackendTransaction$4.call(BackendTransaction.java:381)
        at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
        at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
        at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:469)
        at org.janusgraph.diskstorage.BackendTransaction.edgeStoreKeys(BackendTransaction.java:381)
        at org.janusgraph.graphdb.database.StandardJanusGraph.getVertexIDs(StandardJanusGraph.java:412)
        at org.janusgraph.graphdb.transaction.VertexIterable$1.<init>(VertexIterable.java:43)
        at org.janusgraph.graphdb.transaction.VertexIterable.iterator(VertexIterable.java:41)
        at com.google.common.collect.Iterables$6.iterator(Iterables.java:589)
        at org.janusgraph.graphdb.tinkerpop.JanusGraphBlueprintsTransaction.vertices(JanusGraphBlueprintsTransaction.java:128)
        at com.ibm.fci.graph.algorithm.risk.VertexMapCreator.createVertexMap(VertexMapCreator.java:216)
        at com.ibm.fci.graph.algorithm.risk.VertexMapCreator.runCreateVertexMap(VertexMapCreator.java:305)
        at com.ibm.fci.graph.algorithm.risk.VertexMapCreator.main(VertexMapCreator.java:322)

What’s the best way to get the desired outcome?


See if this helps.

Gaurav,
Thanks for the link. I tried a few things that were mentioned there.

The idea is:

  1. Get the startKey, endKey, and limit from query.getLimit(), then fetch an Iterable<KeyValue> range
  2. Start consuming the key-values
  3. On error → create a new transaction → get the next key from where the error occurred → go to step 1
  4. When everything is consumed, break
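For what it’s worth, the restart-and-resume loop in those steps can be sketched like this. The fetchRange helper, TX_READ_BUDGET, and the in-memory TreeMap are stand-ins I made up so the sketch is self-contained and runnable; against FDB you would instead catch FDBException with code 1007 (transaction_too_old) from tx.getRange, open a fresh transaction, and resume with KeySelector.firstGreaterThan(lastKey):

```java
import java.util.*;

public class ResumableScan {
    // Simulated "transaction too old" failure, standing in for
    // com.apple.foundationdb.FDBException (error code 1007, transaction_too_old).
    static class TxTooOld extends RuntimeException {}

    static final NavigableMap<String, String> STORE = new TreeMap<>();
    static int readsInTx = 0;            // reads served by the current "transaction"
    static final int TX_READ_BUDGET = 3; // stands in for the ~5 s transaction lifetime

    // Hypothetical stand-in for tx.getRange(begin, end, limit): returns up to
    // `limit` entries with key >= begin, or throws once the "transaction" expires.
    static List<Map.Entry<String, String>> fetchRange(String begin, int limit) {
        if (readsInTx >= TX_READ_BUDGET) throw new TxTooOld();
        readsInTx++;
        List<Map.Entry<String, String>> batch = new ArrayList<>();
        for (Map.Entry<String, String> e : STORE.tailMap(begin, true).entrySet()) {
            if (batch.size() == limit) break;
            batch.add(e);
        }
        return batch;
    }

    // Steps 1-4 from the post: fetch a range, consume it, restart a fresh
    // transaction on error and resume from the last consumed key, break when
    // a fetch comes back empty.
    static List<String> scanAll(int limit) {
        String cursor = "";              // resume point: first key not yet consumed
        List<String> consumed = new ArrayList<>();
        while (true) {
            List<Map.Entry<String, String>> batch;
            try {
                batch = fetchRange(cursor, limit);
            } catch (TxTooOld e) {
                readsInTx = 0;           // "create a new transaction"
                continue;                // retry from the same cursor
            }
            if (batch.isEmpty()) break;  // nothing left in the range: done
            for (Map.Entry<String, String> kv : batch) consumed.add(kv.getKey());
            // advance strictly past the last consumed key
            // (with the real API: KeySelector.firstGreaterThan(lastKey))
            cursor = batch.get(batch.size() - 1).getKey() + "\u0000";
        }
        return consumed;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) STORE.put(String.format("k%02d", i), "v" + i);
        System.out.println(scanAll(2));
    }
}
```

The key design point is that the cursor tracks the last key actually consumed, not the last key requested, so no entries are lost or duplicated across transaction restarts.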

Two questions here:

  1. Is this approach okay?
  2. If yes, what would be the break condition, and how do I detect it?

Thanks!!

The transaction will automatically complete successfully (.join() on the transaction future will return success) when there are no more keys to be read in the given range.

Or you can add your own break conditions at the beginning of the transaction, such as maximum time spent, maximum number of keys read, or maximum number of retries due to continuation.
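Those extra break conditions can be combined into one small guard checked before each new transaction. This is only a sketch; the class name and the budget numbers are illustrative, not anything the adaptor provides:

```java
// Hypothetical composite stop-condition for a resumable scan loop:
// stop when any configured budget (keys, wall-clock time, retries) is exhausted.
public class ScanGuard {
    final int maxKeys;
    final long maxMillis;
    final int maxRetries;
    final long startMillis = System.currentTimeMillis();

    ScanGuard(int maxKeys, long maxMillis, int maxRetries) {
        this.maxKeys = maxKeys;
        this.maxMillis = maxMillis;
        this.maxRetries = maxRetries;
    }

    boolean shouldStop(int keysRead, int retries) {
        return keysRead >= maxKeys
            || retries >= maxRetries
            || System.currentTimeMillis() - startMillis >= maxMillis;
    }
}
```

Called as, e.g., `if (guard.shouldStop(consumed.size(), txRestarts)) break;` at the top of the loop body, before opening the next transaction.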