Thanks for the detailed reply, @MMcM. I really appreciate it.
Currently there are no plans for a query execution engine. Please allow me to share with you my current plans.
My first objective for Rust Record Layer is to have something like the DynamoDB API (but on steroids), along with a queuing system API. In addition, I want APIs that can handle non-idempotent transactions.
My second objective is to figure out the operational issues around running a production cluster on AWS: Access Control, Audit, Logging, Metrics, Alerting, Upgrades, DR + Multi-region replication, Failover, Load testing, etc. We will also need to lay the groundwork for regulatory compliance.
For logging, our business case cannot support the cost of current log management solutions, so I am planning on using Quickwit. I also plan on adding FoundationDB/Rust Record Layer support for the Quickwit Metastore. Once that is done, I think it will be helpful for others in the FoundationDB community as well.
Hopefully we’ll be able to meet the above two objectives in 2023.
For my use case, I will need to figure out a way to support:
- Analytical queries. For this, I am thinking of using DuckDB and its extensions mechanism.
- Graph queries. There is a new project in this space called KuzuDB. I am planning on experimenting with it to see if I can make it work. If it works, it will greatly simplify the graph side of the architecture. Being able to do graph queries is very important for us.
- Full-text search. Tantivy would be used for this.
My current thinking is that I’ll use the Rust Record Layer queuing system to integrate the above three features.
That is correct. There is no serde-style serialization within the Rust FDB bindings. I’ll explore this further and see whether support can be introduced.
A few months back I did a survey of the potential serialization formats that I could use for serializing the values. The primary contenders were Protobuf and Avro.
In the Rust ecosystem, there are two libraries for Protobuf: `prost` and `rust-protobuf`. Unfortunately, `prost` does not support Protobuf reflection. `rust-protobuf` supports Protobuf reflection and message descriptors, but its current maintainer is looking for a new maintainer. As you may know, there is no official Protobuf spec. Currently there is no “blessed” Google Rust implementation of Protobuf in the open either.
Even though Protobuf would have given me a lot more capability, I did not want to put myself in a situation where I have to debug a Protobuf serialization issue in production. `prost` is primarily used to support the gRPC use case, not the data storage use case.
On the other hand, Avro has much less capability. However, it is properly specified, and its backward and forward compatibility semantics are well understood. The specification and code base are simple enough for me to be comfortable dealing with any potential issues in the “official” Apache Avro crate. So currently I am leaning towards using the rsgen-avro and apache-avro crates. Additionally, the upcoming arrow2 and parquet2 crates have built-in support for Avro, which would make things a bit easier when building data pipelines.
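To give a sense of how small the specified pieces of Avro are, here is a minimal sketch of the zig-zag, variable-length encoding that the Avro specification uses for `long` values. It uses only the standard library. The function names `encode_long` and `decode_long` are hypothetical and chosen for illustration; they are not the apache-avro crate's API, and this is not how that crate is necessarily implemented.

```rust
/// Encode an `i64` the way the Avro spec encodes a `long`:
/// zig-zag mapping followed by a little-endian base-128 varint.
/// (Illustrative helper; the name is an assumption, not a crate API.)
pub fn encode_long(n: i64) -> Vec<u8> {
    // Zig-zag: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
    // so values of small magnitude get short encodings.
    let mut z = ((n << 1) ^ (n >> 63)) as u64;
    let mut out = Vec::new();
    loop {
        let byte = (z & 0x7f) as u8;
        z >>= 7;
        if z == 0 {
            // Last byte: continuation bit is clear.
            out.push(byte);
            return out;
        }
        // More bytes follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Decode a zig-zag varint from the start of `bytes`.
/// Returns the value and the number of bytes consumed, or `None`
/// if the input is truncated or longer than 10 bytes.
pub fn decode_long(bytes: &[u8]) -> Option<(i64, usize)> {
    let mut z: u64 = 0;
    for (i, &b) in bytes.iter().enumerate() {
        if i >= 10 {
            return None; // a 64-bit value never needs more than 10 bytes
        }
        z |= u64::from(b & 0x7f) << (7 * i);
        if b & 0x80 == 0 {
            // Undo the zig-zag mapping.
            let n = ((z >> 1) as i64) ^ -((z & 1) as i64);
            return Some((n, i + 1));
        }
    }
    None // ran out of input before the final byte
}
```

For example, `encode_long(64)` yields the two bytes `0x80 0x01`, and `decode_long(&[0x80, 0x01])` returns `Some((64, 2))`. Being able to check a stored value by hand like this is part of why I am comfortable with the idea of debugging Avro in production.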