Apache Arrow and FoundationDB


The Apache Arrow data format is seeing wider use every day. It’s very useful for data analytics because many libraries can compute on it zero-copy, with no deserialization cost. Arrow also defines an IPC format for data transfer.

Has anyone used Arrow with FoundationDB? Has anyone implemented Arrow IPC with the current clients? What are the challenges?


We use FlatBuffers for IPC, which can also be read without deserialization. Is there any particular feature you would hope to get from Arrow?

I’m trying to imagine a scenario where I can load data from FoundationDB directly into Arrow, which would let us go straight to a data frame in many computing libraries (e.g. RAPIDS, pandas …).
I feel it will involve implementing a lot of low-level C++ code on both sides. I’m wondering if anyone has already done it.

I am not sure I understand this use case correctly. Do you want to read some data from FDB and then send it directly to RAPIDS? Or do you want to ingest FDB data into another system (for example, to keep an up-to-date copy of your data in a data warehouse)?

@markus.pilman I want to load data from FDB and use it in RAPIDS with zero copies via the Arrow protocol.