Pure in-memory (no disk) instance

(Daniel Dunbar) #1

Is it possible to run the memory storage engine in a mode that won’t make any use of a disk (i.e. no log files, no ability to restore state)?

I can’t find any mention in the documentation of a way to do this, but it seems like it could work (e.g. with triple replication). The only major limitation I see from reading the docs (not the code) is that it appears the model is that fdbserver is restarted to pick up configuration changes, and cannot hot load them. Therefore, in such a mode rolling out config changes would necessarily put the cluster in a less fault tolerant state.

Are there other reasons I missed why this isn’t an option, or is it just a missing feature?

(Markus Pilman) #2

The first obvious problem is that you would always lose all your data whenever you lose more than two processes simultaneously - for most use-cases memory is therefore not an option.

That being said, one could implement this pretty easily. However, this would result in a serious limitation: you would never be able to upgrade to a new major version of FDB without losing all your data.

The reason for that is that FDB does not support online upgrades. We work on our own fork that does support online upgrades. But this is a major change (we completely rewrote the serialization protocol and are currently working on a way to test online upgrades in the simulator), and we don’t know when it will be stable or whether it will be merged back. But something like this would be a requirement for a pure memory instance.

If you want to use FDB as a caching layer (instead of memcached for example), you could simply fork the source and do it yourself. All you need to change is in the file You could just delete all code that writes to the passed IDiskQueue. Alternatively you could implement your own IDiskQueue (an interface you can find in IDiskQueue.h) and pass that to the storage engine (in IIRC). This implementation would simply do nothing. While I am not 100% sure, I think this would work and would be a 30 minute task.
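
To illustrate the "do nothing" queue idea outside of FDB, here is a minimal Python sketch. It is not FDB code: `DurableLog`, `ListLog`, `NullLog`, and `MemoryStore` are hypothetical names, and the real IDiskQueue interface has different methods. The sketch only shows the pattern of a memory store that logs every write through an interface, where swapping in a log that discards everything removes the disk dependency at the cost of any ability to recover state.

```python
from abc import ABC, abstractmethod


class DurableLog(ABC):
    """Hypothetical stand-in for a disk queue interface (not FDB's IDiskQueue)."""

    @abstractmethod
    def push(self, record: tuple) -> None: ...

    @abstractmethod
    def replay(self) -> list: ...


class ListLog(DurableLog):
    """Keeps records in a list; stands in for a real on-disk queue."""

    def __init__(self):
        self.records = []

    def push(self, record):
        self.records.append(record)

    def replay(self):
        return list(self.records)


class NullLog(DurableLog):
    """Discards every record, so nothing can be recovered later."""

    def push(self, record):
        pass

    def replay(self):
        return []


class MemoryStore:
    """In-memory key-value map that logs each write before applying it."""

    def __init__(self, log: DurableLog):
        self.log = log
        self.data = {}
        # Rebuild state from whatever the log retained.
        for key, value in log.replay():
            self.data[key] = value

    def set(self, key, value):
        self.log.push((key, value))
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)
```

A store built on `NullLog` behaves the same as one built on `ListLog` while it is running, but a second `MemoryStore` constructed from the same `NullLog` starts empty, which mirrors the "no ability to restore state" trade-off discussed above.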

(A.J. Beamon) #3

If you simply want to have an implementation of KeyValueStoreMemory that doesn’t use the disk, then I think I agree with Markus that it wouldn’t be too hard to do. If you want your cluster to use no disk at all, however, you’ll have some other files to deal with as well.

I believe for the memory storage engine, the KeyValueStoreMemory is used by the data files on the storage server, the persistent data store on the transaction logs, and the coordinators. I haven’t really considered whether there are any issues (besides lack of durability and upgradability) that could arise from not using the disk in each of these situations.

In addition to the above files, there are trace logs, a cluster file, and a processId file for each process. For the transaction logs, there is also another DiskQueue that is not associated with a KeyValueStoreMemory. Depending on how strict your requirements to not use the disk are, you may be ok to have some of these files on disk given that they don’t get used much. The transaction log’s DiskQueue is very frequently written and synced, however, so using it may not meet your requirements. It’s possible that a similar approach to what Markus suggested could work here (i.e. don’t use the DiskQueue or give it an implementation that does nothing), but I again haven’t given it much thought.

There are also probably a few scattered places that interact with the disk in various ways (for example, to collect metrics). It wouldn’t surprise me if some of those would require changes to work too.

(Daniel Dunbar) #4

Using the disk some is fine; I just wanted to avoid disk usage that is O(N) in the size of the data.

(Daniel Dunbar) #5

Thanks Markus, that all makes sense and matches what I expected.

The one point I wonder about is this:

“That being said, one could implement this pretty easily. However, this would result in a serious limitation: you would never be able to upgrade to a new major version of FDB without losing all your data.”

I see. I was expecting that one could roll one instance at a time (and let the fault tolerance handle it), but now I realize this doesn’t work because FDB currently expects all processes in the cluster to be on the same FDB version at once. Is that it?


(Markus Pilman) #6

Yes. FDB assumes that all processes in a cluster run on the same major version. So you could upgrade with a process-by-process strategy between minor-version upgrades, but not to the next major version.

(A.J. Beamon) #7

To be more precise, we have a versioning scheme that has a major, minor, and patch version (e.g. 5.2.8, with major version 5, minor version 2, and patch version 8). Patch releases are compatible, so processes running 5.2.X can interoperate. Major and minor releases are not compatible, so for example you couldn’t run 5 and 6 together, and you couldn’t run 5.1 and 5.2 together either.
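
The compatibility rule described here can be written down as a small check. This is only a sketch of the rule as stated in this post, not code from FDB; the function names `parse_version` and `can_interoperate` are made up for illustration, and it assumes version strings of the form major.minor.patch.

```python
def parse_version(version: str) -> tuple:
    """Split a 'major.minor.patch' string into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def can_interoperate(a: str, b: str) -> bool:
    """True when both versions share the same major and minor numbers.

    Only the patch number may differ between compatible processes.
    """
    return parse_version(a)[:2] == parse_version(b)[:2]
```

Under this rule, `can_interoperate("5.2.8", "5.2.1")` is true, while `can_interoperate("5.1.0", "5.2.0")` and `can_interoperate("5.2.8", "6.0.0")` are both false.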

(Sam Pullara) #8

This is kind of OK though if writing to disk is temporarily possible. If you need to upgrade at some point you could configure a different storage engine that is persistent, do the upgrade and then revert back to your in-memory only version.