How difficult would this be? Is it just a matter of creating a new IKeyValueStore
implementation which holds an instance of both existing storage engines and passes method calls through to one or the other based on, e.g., a key prefix? This would make recovery from a cold start degrade to the speed of reading all of the memory engine's logs and snapshot off disk, but that is expected.
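To make the routing question concrete, here is a minimal sketch of the dispatch logic. To be clear, the real change would be a C++ IKeyValueStore implementation inside fdbserver; the Python below is only illustrative, and the method names, the wrapped-engine objects, and the \xFE mount point are assumptions rather than the actual interface.

```python
# Illustrative sketch only -- the real work would be a C++ IKeyValueStore
# implementation in fdbserver. Method names are invented stand-ins for the
# interface's point reads, range reads, writes, and commits.

MEMORY_BEGIN = b'\xfe'  # assumed mount point of the memory-backed region
MEMORY_END = b'\xff'    # system keys (\xff...) stay on the SSD engine

class PrefixRoutingStore:
    def __init__(self, ssd_store, memory_store):
        self.ssd = ssd_store
        self.mem = memory_store

    def _route(self, key):
        return self.mem if MEMORY_BEGIN <= key < MEMORY_END else self.ssd

    def set(self, key, value):
        self._route(key).set(key, value)

    def read_value(self, key):
        return self._route(key).read_value(key)

    def read_range(self, begin, end):
        # A range read can cross the mount point, so it is split into
        # per-engine pieces and concatenated back in key order.
        pieces = []
        for lo, hi, store in [(begin, MEMORY_BEGIN, self.ssd),
                              (MEMORY_BEGIN, MEMORY_END, self.mem),
                              (MEMORY_END, end, self.ssd)]:
            lo, hi = max(lo, begin), min(hi, end)
            if lo < hi:
                pieces.extend(store.read_range(lo, hi))
        return pieces

    def commit(self):
        # Both wrapped engines have to commit the same version; keeping the
        # two commits consistent with each other is the part that is not a
        # pure pass-through.
        self.ssd.commit()
        self.mem.commit()
```

The range-read split and the dual commit are the two places where this is not just forwarding calls, so I'd guess that is where most of the actual difficulty lives.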
There are two use cases I'm primarily thinking of here:
- Ingesting event or log data. If this data were to land in the memory engine first, it would be entirely sequential writing with no reading during normal operation, which is ideal for the large-capacity, cheap block storage available in the cloud. Data would be batched into larger chunks and moved elsewhere (such as the SSD engine or directly to S3), as in the sketch after this list. This is similar to what Wavefront does today, from what I understand, except with two different clusters. A new engine is not strictly necessary for this use case, but it would simplify it.
- Change data capture for data that already lives in FDB. Changes to existing records could be written into the memory engine instead of spending double on the IO you'd do with the SSD engine. The change log is immutable and can be written out to S3 as in the prior example, or heavily compressed by a layer before being written back to the SSD engine.
In both cases, I envision the memory portion being used as a temporary place where data is stored for a short period and then deleted. There are surely other use cases I haven't thought of. This does require that the new storage engine be able to handle transactions that span both underlying engines (sketched below).
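For either use case, the layer-level pattern looks the same: append into the memory-backed prefix, then periodically drain a batch into the SSD-backed region (or out to S3) and delete the originals. A rough sketch with the Python bindings, where the \xFE mount point, the key prefixes, and the batch size are all made-up values for illustration:

```python
import fdb

fdb.api_version(600)  # use the API version matching your cluster
db = fdb.open()

HOT = b'\xfeevents/'    # assumed memory-backed mount point plus an app prefix
COLD = b'cold/events/'  # SSD-backed destination (or ship to S3 here instead)

@fdb.transactional
def ingest(tr, seq, payload):
    # Hot path: append-style keys, purely sequential writes, no reads.
    tr[HOT + seq] = payload

@fdb.transactional
def drain_batch(tr, limit=1000):
    # Move one chunk out of the memory-backed prefix into the durable prefix,
    # then delete the originals. This single transaction touches keys on both
    # engines, which is why cross-engine transactions are a requirement.
    batch = list(tr.get_range(HOT, HOT + b'\xff', limit=limit))
    for kv in batch:
        tr[COLD + kv.key[len(HOT):]] = kv.value
        del tr[kv.key]
    return len(batch)

# e.g. ingest(db, b'0000000001', b'log line'), with drain_batch(db) on a timer
```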
I think a single configuration option is enough to make it usable. In the configuration for your storage processes, you choose this storage engine by name and supply a prefix where the memory engine will be mounted, leaving everything else on the SSD engine. You would not be allowed to change this at runtime, and all storage processes would have to use the same value. The system keys would only ever be stored on the SSD engine. An easy default could be for the memory storage engine to run at prefix \xFE, so anyone using the existing layers that come with the bindings would need to explicitly choose to use this feature, for example by creating a directory layer which lives inside \xFE (sketched below).
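To make that last point concrete, here is a sketch with the Python bindings of a directory layer rooted inside \xFE; the specific sub-prefixes are arbitrary choices for illustration:

```python
import fdb

fdb.api_version(600)
db = fdb.open()

# Root both the directory metadata (node subspace) and the allocated content
# under \xFE so that everything this layer creates lands in the memory-backed
# region. The exact sub-prefixes here are illustrative, not prescribed.
hot_directory = fdb.DirectoryLayer(
    node_subspace=fdb.Subspace(rawPrefix=b'\xfe\xfe'),
    content_subspace=fdb.Subspace(rawPrefix=b'\xfe'))

events = hot_directory.create_or_open(db, ('events',))
db[events.pack((1, 'first'))] = b'payload'
```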
Another reason to do this is that it has the potential to delay the need to adapt or write an LSM engine from scratch, which I'm sure some people want just for its ability to absorb lots of small writes without requiring a ton of write IOPS.
Thanks again to the team for indulging my questions and suggestions.