Hi all!
I thought you might like to see something neat - lately I’ve been working on a realtime data processing pipeline / event sourcing system called statecraft. Over the last few days I’ve added FoundationDB backend support.
That means I can seamlessly run my little collaborative text editing demo on top of a foundationdb cluster. Editor demo here!
The editor allows any document name after /edit in the URL and creates documents on the fly. It has full reconnection support, and it’ll render whatever you add as markdown.
Apologies in advance if people have vandalised the page (there’s no access control) or if the server has gone down (there are more bugs - this is still POC territory).
Background & implementation
If you’re interested in some background on this, I worked on Google Wave back in 2011, and then when that got cancelled I reimplemented my own small version of a lot of the tech as ShareJS, which became ShareDB for arbitrary realtime document editing. FoundationDB was the first database I wanted to build ShareDB on top of because of its rock-solid fundamentals. But then FDB disappeared and I was bullied by my company into using redis + mongodb instead.
Anyway, statecraft is in many ways the latest iteration of this long path. It supports … actually, I’ll hold that explanation for later. I’m very pleased with it though.
In terms of realtime support with FDB, it leans heavily on FDB’s versionstamps. Every operation is added to a queue (op/VERSIONSTAMP). Each document is stored with its versionstamp prefixed at the front. The client knows the versionstamp of the document snapshot it’s looking at, and when it sends a change to the server I’m checking for conflicts per-key on top of FDB. (code) Multiple users can edit different documents concurrently without conflicts. But if two users do conflict, I’m reading out the operation log and doing a separate eventual consistency pass. (Code, though it is far from straightforward)
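To make the per-key check a bit more concrete, here is a toy in-memory sketch in Python. It is not statecraft’s actual code and does not touch the real FDB API: the Store class, the integer version counter standing in for a versionstamp, and the (position, string) insert op are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for the database (hypothetical, not the FDB API)."""
    version: int = 0                            # stand-in for a versionstamp
    docs: dict = field(default_factory=dict)    # key -> (version, text)
    oplog: list = field(default_factory=list)   # [(version, key, op)] in commit order

    def submit(self, key, base_version, op):
        """Apply op if `key` is unchanged since the client's base_version.

        Returns (True, new_version) on success, or (False, missed_ops), where
        missed_ops are the logged ops on this key the client has not seen.
        """
        doc_version, text = self.docs.get(key, (0, ""))
        if doc_version > base_version:
            # Conflict on this key: hand back what the client missed so a
            # separate pass can reconcile the edit against those ops.
            missed = [(v, o) for v, k, o in self.oplog
                      if k == key and v > base_version]
            return False, missed
        self.version += 1
        pos, inserted = op                      # op is (position, string to insert)
        self.docs[key] = (self.version, text[:pos] + inserted + text[pos:])
        self.oplog.append((self.version, key, op))
        return True, self.version


store = Store()
store.submit("a", 0, (0, "hello"))   # accepted
store.submit("b", 0, (0, "other"))   # different key, so no conflict
store.submit("a", 0, (0, "X"))       # stale base version on "a": rejected
```

The point of the sketch is only that the version comparison happens per document key, so edits to different documents never reject each other even though they share one version counter. The reconciliation pass that runs after a rejection is left out.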
For realtime messaging, each write operation also writes into a big shared global version key (with conflicts turned off, though this might still be a bottleneck in big systems). Then each frontend server watches that global version key. Whenever it changes, it reads all client operations since that version and sends them to all connected listening clients.
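The fan-out described above can be modelled in a few lines. Again this is a hedged in-memory sketch, not the real implementation: Backend, Frontend, the callback list standing in for a database watch, and the plain lists standing in for client connections are all hypothetical.

```python
class Backend:
    """In-memory stand-in: an op log plus one shared 'latest version' value."""

    def __init__(self):
        self.global_version = 0
        self.oplog = []       # [(version, op)]
        self.watchers = []    # callbacks fired when global_version changes

    def write(self, op):
        # Every write appends to the log and bumps the shared version.
        self.global_version += 1
        self.oplog.append((self.global_version, op))
        for notify in list(self.watchers):
            notify()

    def ops_since(self, version):
        return [(v, op) for v, op in self.oplog if v > version]


class Frontend:
    """One frontend server: watches the shared version, relays new ops."""

    def __init__(self, backend):
        self.backend = backend
        self.seen = backend.global_version
        self.clients = []     # each client is a list acting as an outbox
        backend.watchers.append(self.on_change)

    def on_change(self):
        # Read everything committed since the last version we relayed.
        for version, op in self.backend.ops_since(self.seen):
            for outbox in self.clients:
                outbox.append((version, op))
            self.seen = version
```

Each frontend keeps its own "last seen" version, so a notification that covers several writes still delivers every op exactly once and in commit order.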
It’s a bit bad using a single global current version key. I expect it would be more efficient if I could just follow the raw FDB oplog directly, or watch the versionstamp some other way. With FDB the way it is now I could also instead make a whole set of ‘latest version’ keys scattered throughout the keyspace and have each frontend server just watch all of them and take the max value when they change. But, as I said, this is a proof-of-concept. Eventually I want to rewrite the backend in Rust, though I’m hoping futures land in stable before that happens (and then that someone makes a Rust FDB binding layer which adapts FDB futures into Rust futures).
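The scattered ‘latest version’ keys idea might look something like the following. This is purely a sketch of the arithmetic, with no database involved; ShardedVersion, the shard count, and picking a shard by hashing the document key are my own assumptions for illustration.

```python
import zlib

NUM_SHARDS = 8  # hypothetical; the right number would need measuring


class ShardedVersion:
    """Several 'latest version' slots instead of one hot key."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [0] * num_shards

    def record(self, doc_key, version):
        # Each write touches only one slot, chosen from the document key.
        i = zlib.crc32(doc_key.encode()) % len(self.shards)
        self.shards[i] = max(self.shards[i], version)

    def latest(self):
        # A frontend watching every slot takes the max across them.
        return max(self.shards)
```

Writers spread across the slots, and readers pay for it by watching all of them, which is only a good trade if there are far more writes than frontend servers.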
The current code also re-stores the whole text document with every edit, but this is just because I haven’t tuned it. It should store a collection of recent edits alongside each document, and then bake the edits back into the document itself in the database only when there are enough of them that it makes sense. That way it would be amortized O(1) with the size of the document.
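A minimal sketch of that snapshot-plus-recent-edits layout, under stated assumptions: StoredDoc, apply_edits, the insert-only (position, string) edit shape, and the compact_after threshold are all hypothetical names for illustration, and everything lives in memory rather than in the database.

```python
def apply_edits(text, edits):
    """Replay (position, inserted string) edits on top of a snapshot."""
    for pos, inserted in edits:
        text = text[:pos] + inserted + text[pos:]
    return text


class StoredDoc:
    def __init__(self, text="", compact_after=32):
        self.snapshot = text          # the full document, rewritten rarely
        self.pending = []             # small edits stored alongside it
        self.compact_after = compact_after
        self.snapshot_writes = 0      # counts full-document rewrites

    def append_edit(self, pos, inserted):
        # The common case writes only the small edit, not the whole text.
        self.pending.append((pos, inserted))
        if len(self.pending) >= self.compact_after:
            self.compact()

    def read(self):
        return apply_edits(self.snapshot, self.pending)

    def compact(self):
        # Bake the pending edits back into the snapshot in one write.
        self.snapshot = self.read()
        self.pending = []
        self.snapshot_writes += 1
```

With a fixed threshold like this, the cost of rewriting the document is only divided by a constant. Getting a bound that is independent of document size would presumably need the threshold to grow with the document.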
Enjoy! And AMA!