Why was Flow developed?

Why was Flow developed for FDB instead of using something like Boost ASIO? Was ASIO not around (or too new) when FDB development began? Were there major design concerns with ASIO or other asynchronous libraries?

I’m taking a look at different asynchronous C++ frameworks and wanted to tap into the wisdom of the FDB development team.


Flow and Boost ASIO are kind of orthogonal to each other. In fact, Flow uses asio internally…

ASIO is a platform-independent abstraction for event-based programming (select, poll, epoll, etc.). Flow, however, is a programming language (or rather a C++ extension) that implements the actor model. Flow uses stackless coroutines to do that. It also provides a rich set of libraries (flow, fdbrpc) for asynchronous programming within said framework. But these don’t actually use operating system functionality directly; they use asio and libeio instead (the exception here is kernel AIO on Linux).

I am probably the wrong person to comment too much on FDB’s history. But AFAIK the motivation behind Flow was testing. FoundationDB (the company) built a simulator before they built a database. And in order to build this simulator they needed a proper abstraction. So they wrote Flow.

Flow also solves some problems that you have in a typical program that uses asio: you don’t get callback hell, which is nice (instead you can use something similar to async and await - in Flow they are called ACTOR and wait). Basically, writing Flow code feels like writing blocking code, but the actor compiler will translate it to callbacks for you. This should also be faster than using a stackful coroutine library.
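For a sense of what that looks like, here is a minimal sketch in Flow syntax. ACTOR, wait, delay, and Future are real Flow primitives, but the function itself is invented, and the snippet only compiles after the actor compiler has rewritten it - a plain C++ compiler will reject it:

```cpp
// Flow source, not plain C++: the actor compiler rewrites this before
// the C++ compiler ever sees it. delay() returns a Future<Void>.
ACTOR Future<int> addAfterDelay(int a, int b) {
    wait(delay(1.0));   // suspend here; resume roughly a second later
    return a + b;       // reads like blocking code, runs as callbacks
}
```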


Flow is more of a stateful distributed system framework than an asynchronous library. It takes a number of highly opinionated stances on how the overall distributed system should be written, and isn’t trying to be a widely reusable building block, like ASIO. It is built out of a number of components, which we can pick apart instead:

The actor compiler, in today’s modern world, is essentially a backport of C++17 resumable functions to C++98. It breaks functions apart at suspend points as a source-to-source transformation. You get to write straight-line, normal-looking code, and it brings you the wonders of async/await that a number of other language camps have found preferable to callback-style code. It’s worth noting that what Flow refers to as actors aren’t actually actors. There is no Erlang- or Akka-style actor system in Flow. An ACTOR could be an infinite loop that listens for messages, which resembles something similar, but there’s no concept of behaviors. The majority of ACTOR functions are just ones that need to suspend at some point during their computation to wait for network, IO, a time delay, or another actor function to finish.

Using the futures-based async/await framework, Flow provides an evented run loop, which uses Boost ASIO. This choice of a run loop is not an important detail of Flow’s implementation. I suspect that it could be replaced by libuv, to trim some dependencies, or DPDK’s run loop, if one wished to explore high-performance networking. The important part is that Flow controls exactly which completed future is run, and in which order. This provides the control over scheduling that is required for deterministic simulation.

Flow also provides a number of smaller libraries that all provide asynchronous and future-ified APIs for interacting with the host system. All file IO, network IO, logging, system metrics gathering, etc., must be done through these libraries. Also transparently provided are simulated/faked implementations of each interface for when a process is being run in simulation mode.

However, programs will not directly use an asynchronous network interface. Instead, Flow offers an RPC framework. An Interface defines a collection of PromiseStream<>s, each one representing an RPC method. An actor is on the other side, waiting on the matching set of FutureStream<>s. Sending an RPC means using a PromiseStream to send a completed Future to the receiver, where the value is the request. Part of the request is a ReplyPromise; the sender waits on its corresponding future for the reply. This models RPCs as just an extension of the future framework, and thus the caller is unaware and uncaring whether the destination is local to the same process or far away across the network. This provides the location agnosticism that is required to collapse all actors into one process for simulated testing.

This now gives us the pieces required to build deterministic simulation, but we need to build a bit more on top of it to gain the full benefit…

In addition to the more commonly discussed master, proxy, resolver, etc., roles that are in FDB, there also exist testers. Testers run workloads, which are roughly just classes that have a start(), run(), and verify(). This is used to build tests that perform specific, composable functions. Killing simulated machines is a workload. Verifying that the final state of a series of operations done to a simulated FDB matches that of an in-memory map is a workload. Fuzzing the client API, and asserting that any errors (or lack of errors) that occur are expected in the client API contract/model, is a workload. Running concurrently with all of these is a comprehensive assortment of fault injection built into all of the simulated variants of the file, network, etc. libraries that programs are required to use. This can also be used to test upgrades of FDB. An older version of FDB is run, and then killed at a predefined moment. A newer version is then started, and the previously run workloads are used to ensure that the resulting cluster is both available and correct.

Built into FDB itself is an extensive amount of code that proactively tries to make dangerous and hostile decisions, using the BUGGIFY macro. For a given test seed, a random assortment of BUGGIFY lines are enabled, and then each enabled line will trigger 5% of the time. This means that FDB is proactively helping fault injection and simulated nemesis workloads to find correctness issues in FDB, by keeping it continuously standing on the brink of disaster.

So viewing Flow and Boost ASIO as equivalents undersells what Flow was built to provide. However, it’s definitely not a silver bullet either. There hasn’t really been any special effort paid to trying to allow Flow to be easily broken out and reused elsewhere. (Although the Document Layer might be a decent example of doing so.) I also think Flow is coming due for some overhauls. I keep eyeing entirely replacing the actor compiler with C++'s resumable functions. A lot of the filesystem code could use some cleanups on all platforms. I’m not sure anyone is overly cheerful about the state of TLS. All that said, I don’t think anyone would be resistant either if one wished to submit changes to make it easier to separate Flow into its own reusable project. :slight_smile:


Thanks @alexmiller and @markus.pilman! I’m doing my initial read-through of Flow so I’m sure I’ll have more questions along the way. I’m interested in Flow’s potential for use outside of FDB (including the RPC mechanism FDB has for Flow). Hopefully I can ask the right questions to help get the design details documented on this forum.

BTW, it’s not hard to build Flow outside FDB. For instance, I’ve tried moving FDB 6.1.9’s flow library out into a separate repo and added a few examples. It shouldn’t be hard to move fdbrpc out too (I have done it before).


If you’re familiar enough with cmake, you can also just add foundationdb as a sub-project and add the flow and fdbrpc targets as dependencies to your own software. That way you won’t have to maintain a fork and building should be fast enough (as cmake shouldn’t rebuild all of fdb - only flow and fdbrpc - if you get your dependencies right).
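Roughly like this, assuming foundationdb is checked out as a subdirectory of your project (the flow and fdbrpc target names match the FDB source tree, but treat the rest as a sketch rather than a verified build file):

```cmake
# Pull FDB in as a sub-project; EXCLUDE_FROM_ALL keeps its other targets
# out of your default build.
add_subdirectory(foundationdb EXCLUDE_FROM_ALL)

add_executable(myapp main.cpp)
# Linking only these targets means cmake builds just flow and fdbrpc,
# not the whole database.
target_link_libraries(myapp PRIVATE flow fdbrpc)
```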
