JanusGraph FoundationDB storage adapter

Hello,
I’ve spent some time since the FoundationDB open-sourcing announcement last week putting together a first, crude version of a FoundationDB storage adapter for the JanusGraph graph database. JanusGraph is a Linux Foundation-hosted fork of TitanDB. We’ll be evaluating it to see whether FDB is a good fit for the JanusGraph access patterns/use cases, but on the face of it, Janus is lacking a good, distributed ACID option for folks, and I think FDB is quite compelling in this department. I plan on picking the experts’ brains here as I get a bit further along, to make sure my implementation/data modeling is sympathetic to FDB best practices.

In the meantime, if anyone is interested in living dangerously, the work in progress can be found here.

–Ted


Welcome! This is very exciting to see!

Before the Apple acquisition, a FoundationDB employee had been working on a Titan storage adapter for FDB. As I recall, the trouble we ran into was that Titan’s execution engine wasn’t architected in a way that allowed for easy pipelining of reads. This in turn made it tricky to get satisfactory latency from a distributed storage implementation.

I’d be really curious to hear whether this issue has been fixed, and if not then whether you figure out a clever way to work around it. Please do let us know if we can be helpful with any of this!

Will

This sounds like a terrific idea! I’m very glad to see projects like this. Please keep us up to date on where you get with this… I’m sure lots of folks on the forums will be happy to give advice 🙂

The first thing that came to mind as a pitfall is pretty much exactly what Will said: the simplest way to satisfy a read API will be to create a new transaction for every read (or do the reads purely sequentially). This will probably work, but may leave you wanting more speed. Two good practices in general for any use of the key-value API are:

  1. Each transaction is doing a good number of reads. Transactions need to do the “get read version” step at the start, and it’s best if that cost is amortized over many reads, each of which is very cheap.
  2. Lots of reads are “outstanding” at the same time. This will almost always take the form of lots of reads issued on a Transaction (either get() or getRange()) and the resulting Future objects put into some container. If the code creates many Futures before calling get() on any one of them, the database will process them all in parallel. (A minimal sketch of this pattern follows the list.)
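Concretely, the two practices together might look something like the sketch below, using the FDB Java bindings. The API version, the db.read() retry wrapper, and the ("vertex", id) tuple keyspace are purely illustrative assumptions on my part, not anything from the JanusGraph adapter:

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.tuple.Tuple;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ParallelReads {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(710); // match your installed client version
        try (Database db = fdb.open()) {
            List<String> ids = Arrays.asList("v1", "v2", "v3"); // hypothetical vertex ids
            db.read(tr -> {
                // Issue every read up front; tr.get() returns a future
                // immediately, so all requests are in flight together
                // inside one transaction (one "get read version" cost,
                // amortized over all of the reads).
                List<CompletableFuture<byte[]>> futures = new ArrayList<>();
                for (String id : ids) {
                    futures.add(tr.get(Tuple.from("vertex", id).pack()));
                }
                // Only now block for results, after everything is outstanding,
                // so the reads are processed in parallel.
                for (CompletableFuture<byte[]> f : futures) {
                    byte[] value = f.join();
                    System.out.println(value == null ? "<missing>" : new String(value));
                }
                return null;
            });
        }
    }
}
```

The important shape is that no join() happens until every get() has been issued; if you want to stay fully asynchronous instead of blocking at the end, you can compose the futures with CompletableFuture.allOf() or thenApply() chains.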

Hello and thanks for the input!

We now have the ability to issue parallel, asynchronous read requests against our storage layer of choice within the context of a single transaction. This isn’t enabled by default yet, but I think we’ll have it in good enough shape fairly soon that it can be. As you all have pointed out, this greatly helps with the death-by-a-thousand-sequential-reads situation that arises if we lazily execute the I/O as we traverse the graph (roughly the shape sketched below). I’ll continue to refine the first-cut implementation and report back with progress, updates, and questions.
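For contrast, the lazy traversal shape we’re moving away from looks roughly like this (same illustrative ("vertex", id) keyspace as the sketch above, not our actual data model); each join() blocks for a full round trip before the next read is even issued:

```java
import com.apple.foundationdb.Database;
import com.apple.foundationdb.FDB;
import com.apple.foundationdb.tuple.Tuple;

import java.util.Arrays;
import java.util.List;

public class SequentialReads {
    public static void main(String[] args) {
        FDB fdb = FDB.selectAPIVersion(710); // match your installed client version
        try (Database db = fdb.open()) {
            List<String> ids = Arrays.asList("v1", "v2", "v3"); // hypothetical vertex ids
            db.read(tr -> {
                // Anti-pattern: join() blocks before the next get() is
                // issued, so visiting N vertices costs N serial round trips.
                for (String id : ids) {
                    byte[] value = tr.get(Tuple.from("vertex", id).pack()).join();
                    System.out.println(value == null ? "<missing>" : new String(value));
                }
                return null;
            });
        }
    }
}
```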

–Ted