Flatbuffers for Network Communication

(Markus Pilman) #1

Hi all,

Those of you who were at the FoundationDB Summit already have some context on this topic, but I wanted to make people aware of this ongoing work.

The serializer FoundationDB currently uses for everything (network messages and metadata written to the ‘\xff’ key space) is very simple. You can think of it as copying memory over the network. This simplicity is at the same time a great strength and a great weakness. It is probably as fast as it can be but it comes with some major downsides:

  1. Every node (client or server) has to run the exact same version of the code. This proved to be an operational nightmare for us. The multiversion client makes this a bit better but this is imho a hacky solution…
  2. It is not very safe: if a reader expects a different structure than the one the writer produced, the reader will simply misinterpret the bytes. In other words, you are looking at undefined behavior. Best case, it will segfault; worst case, it will destroy your data. This is mostly solved with versioning, but it has to be done very carefully (think of version X writing something to disk and version Y reading it - if the code doesn’t check the version correctly, bad things can happen).
  3. It is not portable. Should you ever plan to run a heterogeneous environment (like ARM and Intel CPUs in the same cluster), you will probably run into issues. Even compiler versions and operating systems might be problematic in some cases.
  4. Interpreting messages without context is virtually impossible. If you get a byte string of a message, you will not be able to interpret that message. Doing so requires knowledge about endpoints and this knowledge is virtually impossible to obtain.
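To make the second point concrete, here is a minimal sketch of memory-copy serialization. The `RequestV1` and `RequestV2` structs and the `rawWrite`/`rawRead` helpers are hypothetical names made up for this illustration, not FoundationDB code. The sketch only shows how a reader with a different struct layout silently picks up the wrong field.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical message layouts for illustration; not actual FoundationDB types.
struct RequestV1 {
    std::uint32_t id;
    std::uint32_t flags;
};

struct RequestV2 {
    std::uint32_t id;
    std::uint32_t priority; // field inserted by a newer version
    std::uint32_t flags;
};

// "Serialize" by copying the object's bytes verbatim.
template <class T>
std::vector<unsigned char> rawWrite(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    std::vector<unsigned char> buf(sizeof(T));
    std::memcpy(buf.data(), &value, sizeof(T));
    return buf;
}

// "Deserialize" by copying bytes back into a T.
// The only check possible without extra metadata is the length check.
template <class T>
bool rawRead(const std::vector<unsigned char>& buf, T& out) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy needs a trivially copyable type");
    if (buf.size() < sizeof(T)) return false;
    std::memcpy(&out, buf.data(), sizeof(T));
    return true;
}
```

If a new writer sends a `RequestV2` and an old reader decodes it as `RequestV1`, the read succeeds, but `flags` ends up holding the value that was written as `priority`. Nothing in the bytes tells the reader that anything went wrong.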

For this reason, Snowflake replaced the serialization implementation with our own implementation of FlatBuffers. FlatBuffers is a serialization library implemented by Google and the underlying protocol is well documented. It has the following benefits:

  1. It is very fast. Logically it can’t be as fast as the memory-mapping that FDB is currently doing, but in our benchmarks we couldn’t measure a significant performance difference.
  2. It is well tested: FlatBuffers is widely used in production systems.
  3. It is simple: we couldn’t use the official FlatBuffers implementation as it relies on an IDL. However, reimplementing it within FoundationDB was easy (and it is much simpler to do than something like Cap’n Proto).
  4. Messages are backwards and forwards compatible. This means that an old client can communicate with a new server and the other way around. There’s obviously always some additional work needed to make sure that this always works, but so far it has worked well for us. We actually even do online upgrades (we upgrade one machine at a time).
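The compatibility in the last point comes from readers looking fields up through a small table of offsets instead of assuming a fixed layout. The sketch below illustrates the idea with a deliberately simplified format; it is not the real FlatBuffers wire format, and `writeTable` and `readField` are hypothetical helpers invented for this example. It uses native byte order and only 32-bit fields to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Layout used by this sketch (NOT the real FlatBuffers wire format):
//   [u16 fieldCount][u16 offset for each field][u32 value for each field]
Buffer writeTable(const std::vector<std::uint32_t>& fields) {
    const std::uint16_t count = static_cast<std::uint16_t>(fields.size());
    const std::size_t headerSize = sizeof(std::uint16_t) * (1 + count);
    Buffer buf(headerSize + sizeof(std::uint32_t) * count);
    std::memcpy(buf.data(), &count, sizeof(count));
    for (std::uint16_t i = 0; i < count; ++i) {
        // Record where field i lives, then store its value there.
        const std::uint16_t offset =
            static_cast<std::uint16_t>(headerSize + sizeof(std::uint32_t) * i);
        std::memcpy(buf.data() + sizeof(std::uint16_t) * (1 + i), &offset, sizeof(offset));
        std::memcpy(buf.data() + offset, &fields[i], sizeof(std::uint32_t));
    }
    return buf;
}

// Returns field `index`, or `fallback` if the writer did not know that field
// or the buffer is too short to contain it.
std::uint32_t readField(const Buffer& buf, std::uint16_t index, std::uint32_t fallback) {
    std::uint16_t count = 0;
    if (buf.size() < sizeof(count)) return fallback;
    std::memcpy(&count, buf.data(), sizeof(count));
    if (index >= count) return fallback; // message written by older code
    const std::size_t slot = sizeof(std::uint16_t) * (1 + index);
    if (slot + sizeof(std::uint16_t) > buf.size()) return fallback;
    std::uint16_t offset = 0;
    std::memcpy(&offset, buf.data() + slot, sizeof(offset));
    if (offset + sizeof(std::uint32_t) > buf.size()) return fallback;
    std::uint32_t value = 0;
    std::memcpy(&value, buf.data() + offset, sizeof(value));
    return value;
}
```

An old reader that only asks for fields 0 and 1 is unaffected when a new writer appends field 2, and a new reader asking an old message for field 2 gets the default. This is also why appending fields is the easy case, while removing fields needs extra care.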

I have started porting this to the open-source code base. The work in progress can be seen here:

The code is almost complete, but it is not yet working (there are some weird corner cases that need to be taken care of). This first implementation is missing a few features, the most important one being that one can’t remove any fields.

The initial idea for this is the following:

  • FlatBuffers can be turned on and off by a command-line argument. We even support mixed mode - so some processes can use the FlatBuffers serializer while others can use the current serializer. This is made possible by using a bit in the protocol version that denotes whether a client serializes with FlatBuffers (client in this context means the process that opened the connection).
  • For testing, each simulated process will flip a coin to decide whether it will be using FlatBuffers or not.
  • For metadata, nothing changes yet. However, whenever you add a new metadata field that needs to be persisted, you will have the option to use FlatBuffers, which will be easier to use as it makes all version checks virtually unnecessary.
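The mixed-mode flag from the first bullet can be pictured as one reserved bit in a 64-bit protocol version. The following is only a sketch: which bit is used, the constant `kFlatBuffersFlag`, and the helper names are all assumptions made for this illustration and do not reflect the actual patch.

```cpp
#include <cassert>
#include <cstdint>

// Assumed flag position; the real patch may reserve a different bit.
constexpr std::uint64_t kFlatBuffersFlag = 1ULL << 63;

// Set or clear the flag on a protocol version before sending it in the handshake.
constexpr std::uint64_t withFlatBuffers(std::uint64_t version, bool enabled) {
    return enabled ? (version | kFlatBuffersFlag) : (version & ~kFlatBuffersFlag);
}

// Did the process that opened the connection announce the FlatBuffers serializer?
constexpr bool usesFlatBuffers(std::uint64_t version) {
    return (version & kFlatBuffersFlag) != 0;
}

// The version with the flag masked out, for ordinary compatibility checks.
constexpr std::uint64_t baseVersion(std::uint64_t version) {
    return version & ~kFlatBuffersFlag;
}
```

Because the flag travels with the version the connecting process sends, the accepting side can pick the matching deserializer per connection, which is what lets the two serializers coexist in one cluster.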

I am interested in comments, criticism, and code reviews for this change.

(Ryan Worl) #2

I noticed you removed the actor simDeliverDuplicate, which didn’t appear to be used anywhere. Probably off-topic, but that seems to me (not knowing much about the simulator) like it would be kind of important for simulation. The knob it references is also only used there, so I don’t think it was replaced with code somewhere else.

(Markus Pilman) #3

I agree that delivering duplicates seems important. I am not sure whether the simulator does this somewhere else.

But we know for sure this is dead code - which I stumbled upon while working on this and I don’t like dead code (also this way I don’t have to implement SerializeSourceRaw). IMHO: delivering duplicates is outside of the scope of this patch, so I didn’t bother looking into it. The important thing is that this patch doesn’t remove this functionality.

Good catch on MAX_DELIVER_DUPLICATE_DELAY - I missed that. I will remove this knob as well.