I was wondering if there are any publicly available resources (papers, blog posts, videos) that describe Snowflake’s Entity Layer?
The only information I could find was a mention at 10:22 in an FDB Summit presentation.
Snowflake and Apple have the most operational expertise with FoundationDB.
Before embarking on layer development on top of the Tokio/Rust bindings, I wanted to make sure I’ve learned as much as possible from the prior art.
With the Record Layer, it’s a bit easier.
The areas where I am looking to gain some insight into Snowflake’s Entity Layer are: the type system, how schema migration is handled, and index maintenance.
I am also very open to design-level contributions and suggestions from existing layer developers.
If you think you might adopt Tokio/Rust in the future and would like to shape the design for your needs as well, please let me know!
I’ve created an Entity layer for Java, similar to JPA. The key features are:
Every entity has a primary key
Entities can be queried via primary key or secondary indices.
It has async (CompletableFuture) and sync access modes
Entities can be immutable or mutable
Mutable entities have dirty checking and auto-save at the end of the transaction
Retrieval can be instant fetch (exact), lazy stream (iterator) or reactive stream (iterator)
Entities can have references to other entities
Immediate query, Lazy reference proxies or Eager reference proxies for related entities
Session cache and second level cache
No code generation needed
Indices are Java POJOs too
Indices can be queried using method references instead of string field names or statically generated classes
Covering index queries (returns index POJOs)
Conditional indices (with WHERE clause)
Multi range queries are prefetched in the background
Entity management dashboard with live statistics down to index-level access
Distributed grid computing engine on top of entities with micro-batching and job-embedding support. Jobs and tasks can form a directed acyclic graph, e.g. for background micro-batching jobs such as interest calculation over different account ranges.
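To make the index-related items above a bit more concrete, here is a minimal sketch of how a secondary index can be kept in sync with entity writes on an ordered key-value store. It is not taken from the framework described here or from any existing layer: a TreeMap stands in for FoundationDB's ordered key space, and every name in it (SecondaryIndexSketch, Account, save, findByCustomer, the key format) is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class SecondaryIndexSketch {

    /** Hypothetical immutable entity: a primary key plus one indexed field. */
    public static final class Account {
        private final long id;
        private final long customerId;

        public Account(long id, long customerId) {
            this.id = id;
            this.customerId = customerId;
        }

        public long getId() { return id; }
        public long getCustomerId() { return customerId; }
    }

    // Stand-in for the primary rows, ordered by primary key.
    private final TreeMap<Long, Account> primary = new TreeMap<>();
    // Stand-in for index entries: key = customerId/accountId, value = accountId.
    private final TreeMap<String, Long> byCustomer = new TreeMap<>();

    // Zero-padded so that string order matches numeric order.
    // Assumes non-negative ids.
    private static String indexKey(long customerId, long accountId) {
        return String.format("%019d/%019d", customerId, accountId);
    }

    /** Insert or update: drop the stale index entry, then write the new one. */
    public void save(Account account) {
        Account old = primary.put(account.getId(), account);
        if (old != null && old.getCustomerId() != account.getCustomerId()) {
            byCustomer.remove(indexKey(old.getCustomerId(), old.getId()));
        }
        byCustomer.put(indexKey(account.getCustomerId(), account.getId()), account.getId());
    }

    /** Delete the row together with its index entry. */
    public void delete(long id) {
        Account old = primary.remove(id);
        if (old != null) {
            byCustomer.remove(indexKey(old.getCustomerId(), old.getId()));
        }
    }

    /** Range scan over one customer's index prefix, then fetch each row by primary key. */
    public List<Account> findByCustomer(long customerId) {
        String from = indexKey(customerId, 0L);
        String to = indexKey(customerId, Long.MAX_VALUE);
        List<Account> result = new ArrayList<>();
        for (long id : byCustomer.subMap(from, true, to, true).values()) {
            result.add(primary.get(id));
        }
        return result;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch store = new SecondaryIndexSketch();
        store.save(new Account(1L, 100L));
        store.save(new Account(2L, 100L));
        System.out.println(store.findByCustomer(100L).size()); // prints 2
        store.save(new Account(2L, 200L)); // account 2 moves to another customer
        System.out.println(store.findByCustomer(100L).size()); // prints 1
    }
}
```

In a real layer, the row write and the index write would go into the same transaction so that the index cannot drift from the data. A covering index would store the indexed fields in the index entry itself and skip the second lookup.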
The company I work for created a high performance micropayment engine and an IoT time series collection engine on top of this proprietary framework, and can provide commercial support if someone is interested.
Sample usage:
// Entity and index definition in POJO style:
@IndexDefinition(shape = IxCustomer.class, type = StoreMode.REFERENCE)
public class Account implements Entity<Long> {
    private Long id;
    private Customer customer; // another entity
    private int someOtherField;
    // getters, setters, constructors omitted

    public static class IxCustomer implements Index<Account, Long> {
        private final Customer customer;
        private final int someOtherField;
        // getters, setters, constructors omitted
    }
}

// new entity with related entity
public void create() {
    // this is a lazy proxy, no DB query is performed
    var customer = orm.getReference(Customer.class, 100L);
    var account = new Account(42L, customer, 100); // new entity
    orm.persist(account); // entity gets persisted to memory only
} // At the end of transaction demarcation, the newly persisted entity gets flushed to FDB

// existing entity gets modified
public void update() {
    // this is an eager proxy: the statement returns immediately, but an async
    // query is started in the background right away to fetch the entity
    var account = orm.get(Account.class, 42L);
    account.setBalance(200); // assumed setter; a balance field is not shown in the abbreviated class above
} // At the end of transaction demarcation, the entity gets modified, and the account with the balance of 200 gets written to the DB. All changed indices are updated, if needed.

// query by index
public void index() {
    var customerReference = orm.getReference(Customer.class, 42L);
    var accountsForCustomer = orm.query(Account.class)
        .index(Account.IxCustomer.class, query -> query
            .field(Account.IxCustomer::getCustomer, c -> c.equal(customerReference).ascending())
            .field(Account.IxCustomer::getSomeOtherField, f -> f.between(200, 400).greaterThan(1000))