Success Story: FoundationDB at SkuVault

I’d like to share a success story of FoundationDB at SkuVault, where it is used as a multi-model database for event sourcing, queues, docs and metadata: blog post with more details.

FoundationDB has been running in production on Azure for more than a year. Despite all the network glitches and reboots, it is the most reliable component in the distributed system. System administrators particularly love FDB - unlike the other databases, it never requires any babysitting or time-consuming maintenance.

If you have any questions, please don’t hesitate to ask!

  • How do you deploy/update the fleet of fdb machines?
  • How do you handle schema changes / migrations?

About LMDB you say:

Embrace the synergy between LMDB and FoundationDB, making sure that internal tools (debugging, dumps, REPL, code generation DSL) can target both from the start.

Did you develop an abstraction layer on top of LMDB that mimics the FoundationDB API, so that the same “transaction” function can be used on different backends? Basically, how do you swap between LMDB and FDB?

Thanks in advance!

We deployed the cluster manually to a set of pre-configured virtual machines. The production deployment started at the beginning of 2017 (before FDB was open-sourced), so there were no updates to expect.

FoundationDB was essentially used as a distributed commit log and for intra-cluster coordination. These aspects don’t go through a lot of schema changes. Most of the changes were within the node state (stored in LMDB).

The answer is two-fold:

  1. Node storage was in LMDB and it was highly optimised for in-process access. There was no point in supporting swapping between LMDB and FDB. So the FDB toolset was used to develop against LMDB directly (“toolset”: tuples, key-ranges and modelling techniques).
  2. The cluster interaction layer (everything that ran on FoundationDB) was indeed developed on top of an abstraction. The FDB implementation was plugged in for the cloud deployments (PROD, QA, STAGE etc), while the LMDB implementation was used for local development, demo mode and some tests.
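
To illustrate the second point, here is a minimal Python sketch of such an abstraction. It is not the SkuVault code: the `Tx` and `Store` interfaces, the `MemoryStore` backend and the key layout are all hypothetical, and an in-memory dict stands in for both LMDB and FoundationDB. The idea is only that layer functions such as `append_event` are written against the interface, so the backend can be picked at startup.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class Tx(ABC):
    """Minimal transaction surface shared by every backend (hypothetical)."""

    @abstractmethod
    def get(self, key: bytes) -> Optional[bytes]: ...

    @abstractmethod
    def set(self, key: bytes, value: bytes) -> None: ...

    @abstractmethod
    def get_range(self, begin: bytes, end: bytes) -> List[Tuple[bytes, bytes]]: ...


class Store(ABC):
    """A backend only has to know how to run a function inside a transaction."""

    @abstractmethod
    def transact(self, fn: Callable[[Tx], T]) -> T: ...


class MemoryTx(Tx):
    def __init__(self, data: Dict[bytes, bytes]) -> None:
        self._data = data

    def get(self, key: bytes) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: bytes, value: bytes) -> None:
        self._data[key] = value

    def get_range(self, begin: bytes, end: bytes) -> List[Tuple[bytes, bytes]]:
        # Keys are returned in byte order, begin inclusive, end exclusive.
        return [(k, self._data[k]) for k in sorted(self._data) if begin <= k < end]


class MemoryStore(Store):
    """In-memory stand-in; a real backend would wrap LMDB or FoundationDB."""

    def __init__(self) -> None:
        self._data: Dict[bytes, bytes] = {}

    def transact(self, fn: Callable[[Tx], T]) -> T:
        working = dict(self._data)      # work on a copy
        result = fn(MemoryTx(working))
        self._data = working            # commit only if fn did not raise
        return result


# Layer code depends on Tx only, never on a concrete backend.
def append_event(tx: Tx, stream: bytes, seq: int, payload: bytes) -> None:
    # Big-endian sequence numbers sort in numeric order.
    tx.set(stream + b"/" + seq.to_bytes(8, "big"), payload)


def read_stream(tx: Tx, stream: bytes) -> List[bytes]:
    # b"0" is the byte right after b"/", so this range covers the whole prefix.
    return [value for _, value in tx.get_range(stream + b"/", stream + b"0")]


store: Store = MemoryStore()  # swap in another Store implementation here
store.transact(lambda tx: append_event(tx, b"orders", 2, b"shipped"))
store.transact(lambda tx: append_event(tx, b"orders", 1, b"created"))
events = store.transact(lambda tx: read_stream(tx, b"orders"))
```

Keeping the interface this small is a design choice for the sketch: the fewer operations the layers rely on, the easier it is to give every backend the same behaviour.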

Does this help?


You use LMDB instead of SQLite as backend storage? And you use LMDB in the development environment without using the official client library? I don’t understand where LMDB is placed in your codebase. Is it only a dev dependency?

I apologise for the confusion. By “node storage” I meant “application node storage”.

In other words, application nodes (containing business logic) persist all the relevant state inside an optimised LMDB database (which is sourced from events). These nodes communicate with each other using the various layers on top of FoundationDB. The most important of these layers is the Commit Log.

You can learn more about the overall design by reading through the older posts in the SkuVault story. In particular, I’d recommend checking out System Architecture Overview and LMDB and DSL for it.


No worries.

It seems you use FoundationDB as an event log, as a single source of truth, and create images of consolidated data for a particular service in LMDB.

Is that correct?

Correct.

In other words, LMDB can be used to create various data representations (tables, indexes, graphs etc) that are populated by replaying events from the log and then used to serve write/read requests by the application logic.
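
As a rough illustration of that replay loop, here is a small Python sketch. The event names, payload fields and the `InventoryView` class are invented for the example, and plain dicts stand in for the LMDB tables and indexes.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

Event = Tuple[str, dict]  # (event type, payload); this shape is hypothetical


class InventoryView:
    """Node-local state that is derived purely from the event log."""

    def __init__(self) -> None:
        self.quantity_by_sku: Dict[str, int] = defaultdict(int)        # a "table"
        self.skus_by_location: Dict[str, Set[str]] = defaultdict(set)  # an "index"
        self.position = 0  # number of log entries applied so far

    def apply(self, event: Event) -> None:
        kind, data = event
        if kind == "ItemAdded":
            self.quantity_by_sku[data["sku"]] += data["qty"]
            self.skus_by_location[data["location"]].add(data["sku"])
        elif kind == "ItemRemoved":
            self.quantity_by_sku[data["sku"]] -= data["qty"]
        # Unknown event types are skipped but still advance the position.
        self.position += 1


def replay(log: List[Event], view: Optional[InventoryView] = None) -> InventoryView:
    """Apply every log entry the view has not seen yet."""
    if view is None:
        view = InventoryView()
    for event in log[view.position:]:
        view.apply(event)
    return view


log: List[Event] = [
    ("ItemAdded", {"sku": "A", "qty": 5, "location": "L1"}),
    ("ItemRemoved", {"sku": "A", "qty": 2}),
    ("ItemAdded", {"sku": "B", "qty": 1, "location": "L1"}),
]
view = replay(log)  # rebuild the state from scratch

log.append(("ItemRemoved", {"sku": "B", "qty": 1}))
replay(log, view)   # catch up with the new entry only
```

Because the view remembers its position in the log, it can be thrown away and rebuilt at any time, or caught up incrementally after a restart.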


Awesome, I did not think about that. My database writes 4-tuples (quads, as in RDF) in a tree-like history, as in git. It’s a branchable database. I will put your feedback to good use.

Wow, you guys managed to negotiate a license a year ago with Apple? We always thought it was closed source and not available for licensing since the acquisition and before the open-sourcing.

No, Apple would never do that.

We used FoundationDB under the community license which still worked after Apple bought the company and the database.

I see. I guess as long as you run really small production clusters, it’s still covered. That’s still a pretty gutsy call (to me): we had the source code for the last 4 years and we thought we would ultimately replace it with something else (we were in the process of that exploration when Ben reached out =p). You guys actually started using it a year ago.

On that project it was a small calculated bet on the outstanding level of engineering employed by the FoundationDB team. Besides, no other database even came close in terms of the required features, performance and survivability.

We had no other choice :]


Rinat, are these layers deployed as separate services or are they just a library? How do you handle authorization and authentication?

Denis, these layers were just an infrastructure library, available to the application logic. Application logic was responsible for the authorisation and authentication.

A potential approach for strengthening the security at the app level would be to encrypt all messages with tenant-specific keys. This way FDB would be handed data that is already encrypted.

Regarding auth, I was talking about service -> db auth, not application logic auth. Just to be sure that no other (micro-)service accidentally writes to (or deletes) a keyrange owned by another service.

No, there was nothing like that.

Keyranges were protected by the layer logic (nodes could access FDB only through them, no raw access) and some extra simulation testing on top.
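
A minimal sketch of that kind of layer-level protection, in Python. The `Layer` class and the prefixes are hypothetical, and a dict stands in for the database. The point is only that callers never see raw keys, so one layer cannot touch the range of another.

```python
from typing import Dict, Optional


class Layer:
    """Owns one key prefix; application code never builds raw keys itself."""

    def __init__(self, store: Dict[bytes, bytes], prefix: bytes) -> None:
        self._store = store
        self._prefix = prefix

    def _key(self, key: bytes) -> bytes:
        # Every key is forced under the prefix owned by this layer.
        return self._prefix + key

    def set(self, key: bytes, value: bytes) -> None:
        self._store[self._key(key)] = value

    def get(self, key: bytes) -> Optional[bytes]:
        return self._store.get(self._key(key))

    def clear_all(self) -> None:
        # Deletes only the keys inside this layer's own range.
        for raw in [k for k in self._store if k.startswith(self._prefix)]:
            del self._store[raw]


shared: Dict[bytes, bytes] = {}       # stand-in for the shared database
jobs = Layer(shared, b"jobs/")        # prefixes must not be prefixes of each other
events = Layer(shared, b"events/")

jobs.set(b"42", b"pending")
events.set(b"42", b"JobScheduled")
jobs.clear_all()                      # the events layer is untouched
```

This guards against accidents rather than malicious code: anything that holds the raw store can still bypass the layer, which is why nodes had no raw access.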

Okay, I got it. Raw FDB access is encapsulated by the layer. Thanks!

Exactly.

Also, one of the great things about using layers with FoundationDB is that you can conveniently compose multiple layer calls in a single atomic transaction. For example: mark a job as processed, publish events and update some metadata. If something goes wrong, the entire transaction would roll back.
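
For example, the composition could look roughly like this in Python. This is a sketch under assumptions: `MemoryStore` is an in-memory stand-in for the database, the three layer functions and their key layout are made up, and conflict detection and retries (which real FoundationDB transactions provide) are not modelled.

```python
from typing import Any, Callable, Dict, Tuple

Key = Tuple[str, Any]
TxData = Dict[Key, Any]


class MemoryStore:
    """In-memory stand-in: all writes of a transaction commit together or not at all."""

    def __init__(self) -> None:
        self.data: TxData = {}

    def transact(self, fn: Callable[[TxData], Any]) -> Any:
        working = dict(self.data)  # writes go to a private copy
        result = fn(working)       # an exception here discards the copy
        self.data = working        # commit
        return result


# Three hypothetical layers; each one only writes under its own key prefix.
def mark_job_processed(tx: TxData, job_id: str) -> None:
    tx[("jobs", job_id)] = "processed"


def publish_event(tx: TxData, seq: int, payload: dict) -> None:
    tx[("events", seq)] = payload


def update_metadata(tx: TxData, name: str, value: Any) -> None:
    if value is None:
        raise ValueError("metadata value is required")
    tx[("meta", name)] = value


def complete_job(tx: TxData, job_id: str, seq: int, note: Any) -> None:
    # Calls into three layers share a single transaction.
    mark_job_processed(tx, job_id)
    publish_event(tx, seq, {"type": "JobCompleted", "job": job_id})
    update_metadata(tx, "last_note", note)


store = MemoryStore()
store.transact(lambda tx: complete_job(tx, "job-1", 1, "ok"))

try:
    # The metadata layer rejects None after the other two layers already wrote.
    store.transact(lambda tx: complete_job(tx, "job-2", 2, None))
except ValueError:
    pass  # nothing from the failed transaction is visible in store.data
```

After the failed call, neither the job marker nor the event for `job-2` exists, even though those writes ran before the error.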

Thanks for the write-ups on this and all of the information on your implementation!