Snowflake Entity Layer

I was wondering if there are any publicly available resources (papers, blog posts, videos) that describe Snowflake’s Entity Layer?

The only information I could find was a mention at 10:22 in an FDB Summit presentation.

Snowflake and Apple have the most operational expertise with FoundationDB.

Before embarking on layer development on top of the Tokio/Rust bindings, I wanted to make sure I’ve learned as much as possible from existing prior art.

With the Record Layer, it's a bit easier. :slight_smile:

The areas of Snowflake’s Entity Layer I am hoping to gain some insight into are the type system, how schema migration is handled, and index maintenance.

I am also very open to design-level contributions and suggestions from existing layer developers.

If you think you might adopt Tokio/Rust in the future and would like to shape the design for your needs as well, please let me know! :slight_smile:

I’ve created an Entity layer for Java, similar to JPA. The key features are:

  • Every entity has a primary key
  • Entities can be queried via primary key or secondary indices.
  • It has async (CompletableFuture) and sync access modes
  • Entities can be immutable or mutable
  • Mutable entities have dirty checking and auto-save at the end of the transaction
  • Retrieval can be instant fetch (exact), lazy stream (iterator) or reactive stream (iterator)
  • Entities can have references to other entities
  • Immediate query, Lazy reference proxies or Eager reference proxies for related entities
  • Session cache and second level cache
  • No code generation needed
  • Indices are Java POJOs too
  • Indices can be queried using method references instead of string field names or statically generated classes
  • Covering index queries (returns index POJOs)
  • Conditional indices (with WHERE clause)
  • Multi range queries are prefetched in the background
  • Entity management dashboard with live statistics down to index-level access
  • Distributed grid computing engine on top of entities, with micro-batching and job-embedding support. Jobs and tasks can form a directed acyclic graph (e.g. background micro-batching jobs such as interest calculation over different account ranges)
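
To make the "dirty checking and auto-save" item above more concrete, here is a minimal sketch of one common way such a feature can work: keep a snapshot of each value as it was loaded and, at the end of the transaction, write back only the entries that differ. This is not the framework's actual implementation. `DirtyTrackingSession` and its methods are hypothetical names, and a plain `Map` stands in for the database.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical unit-of-work sketch; not the API of the framework described above.
final class DirtyTrackingSession {
    // Stand-in for the database: primary key -> stored value.
    private final Map<Long, String> store;
    // Values as they were when loaded (or last flushed) in this session.
    private final Map<Long, String> snapshots = new HashMap<>();
    // Current in-memory values, possibly modified by the caller.
    private final Map<Long, String> working = new LinkedHashMap<>();

    DirtyTrackingSession(Map<Long, String> store) {
        this.store = store;
    }

    // Loads a value and remembers its original state for later comparison.
    String load(long id) {
        String value = store.get(id);
        snapshots.put(id, value);
        working.put(id, value);
        return value;
    }

    // Modifies the in-memory copy only; nothing is written to the store yet.
    void set(long id, String value) {
        working.put(id, value);
    }

    // Called at the end of the transaction: writes only entries whose value
    // differs from the snapshot, and returns how many writes were issued.
    int flush() {
        int writes = 0;
        for (Map.Entry<Long, String> e : working.entrySet()) {
            if (!Objects.equals(snapshots.get(e.getKey()), e.getValue())) {
                store.put(e.getKey(), e.getValue());
                snapshots.put(e.getKey(), e.getValue());
                writes++;
            }
        }
        return writes;
    }
}
```

Comparing against a snapshot keeps entity classes free of generated code or change-tracking hooks, at the cost of holding two copies of each loaded value for the duration of the session.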

The company I work for has built a high-performance micropayment engine and an IoT time series collection engine on top of this proprietary framework, and can provide commercial support if anyone is interested.

Sample usage:

// Entity and index definition in POJO style:

@IndexDefinition(shape = IxCustomer.class, type = StoreMode.REFERENCE)
public class Account implements Entity<Long> {
  private Long id;
  private Customer customer; // another entity
  private int someOtherField;
  // getters setters constructors omitted
  
  public static class IxCustomer implements Index<Account, Long> {
    private final Customer customer;
    private final int someOtherField;
    // getters setters constructors omitted
  }
}


// new entity with related entity
@Transactional
public void create() {
  var customer = orm.getReference(Customer.class, 100); // this is a lazy proxy, no DB query is performed
  var account = new Account(42L, customer, 100); // new entity
  orm.persist(account); // entity gets persisted to memory only
} // At the end of transaction demarcation, the newly persisted entity gets flushed to FDB

// existing entity gets modified
@Transactional
public void update() {
  var account = orm.get(Account.class, 42L); // this is an eager proxy, the statement immediately returns, but an async query is immediately started in the background to fetch the entity
  account.setBalance(200);
} // At the end of transaction demarcation, the entity gets modified, and the account with the balance of 200 gets written to the DB. All changed indices are updated, if needed.

// query by index
@Transactional
public void index() {

  var customerReference = orm.getReference(Customer.class, 42L);

  var accountsForCustomer = orm.query(Account.class)
    .index(Account.IxCustomer.class, query -> query
      .field(Account.IxCustomer::getCustomer, c -> c.equal(customerReference).ascending())
      .field(Account.IxCustomer::getSomeOtherField, f -> f.between(200, 400).greaterThan(1000))
    )
    .stream()
    .take(200);
}
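
Since the original question asks about index maintenance, here is a minimal sketch of the general technique a layer typically uses on an ordered key-value store: when a record changes, remove the stale index entry and write the new one, then answer index queries with a range scan over the index keys. This is an illustration only, not the framework's code. `IndexedAccountStore` and its methods are hypothetical, a `TreeMap` stands in for FoundationDB's ordered keyspace, and ids are assumed to be non-negative so that zero-padded keys sort correctly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of secondary index maintenance over an ordered map.
final class IndexedAccountStore {
    // Primary records: account id -> customer id.
    private final Map<Long, Long> records = new HashMap<>();
    // Secondary index: ordered keys "customerId/accountId" -> account id.
    private final TreeMap<String, Long> byCustomer = new TreeMap<>();

    // Zero-padded so that string order matches numeric order
    // (assumes non-negative ids).
    private static String indexKey(long customerId, long accountId) {
        return String.format(Locale.ROOT, "%019d/%019d", customerId, accountId);
    }

    // Writes the record and keeps the index in step with it.
    void save(long accountId, long customerId) {
        Long previous = records.put(accountId, customerId);
        if (previous != null && previous != customerId) {
            // The indexed field changed: drop the stale index entry.
            byCustomer.remove(indexKey(previous, accountId));
        }
        byCustomer.put(indexKey(customerId, accountId), accountId);
    }

    // Index query: a range scan over all keys sharing the customer prefix.
    List<Long> accountsOf(long customerId) {
        String from = indexKey(customerId, 0L);
        String to = indexKey(customerId, Long.MAX_VALUE);
        return new ArrayList<>(byCustomer.subMap(from, true, to, true).values());
    }
}
```

In FoundationDB the record write and the index writes would go into the same transaction, so readers never observe a record without its matching index entry.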

