Scheme Request For Implementation

SRFI is a standardization process for the Scheme programming language.

I am pleased to announce that we are reaching the last round of drafts for two proposals I made to the SRFI standardization process, to support an Ordered Key-Value Store (SRFI-167) and a Generic N-Tuple Store (SRFI-168) in Scheme implementations.

The SRFI-167 abstract is as follows:

This library describes an interface for an ordered key-value store that is suitable for implementing a storage engine for the generic tuple store SRFI. It maps cleanly onto existing ordered key-value databases that may or may not provide transactions.

The SRFI-168 abstract is as follows:

This library is a generic approach to the database abstractions known as triple store and quad store. Generic Tuple Store Database implements n-tuple ordered sets and associated primitives for working with them in the context of data management.

As I said, this is the last call before finalization, so if you have feedback on those documents, now is the time!

I made bindings for GNU Guile: https://git.sr.ht/~amz3/guile-foundationdb

Thanks!

SRFI-167 OKVS

I see that okvs-ref is used instead of okvs-get to be consistent within Scheme, but I’m struggling to find another egg that uses *-rm as a function name. Copying from hashtable, is *-clear or *-remove more standard?

I also notice that you’ve elided range clears entirely from your API. Was that intentional? They are inconsistently supported across kv stores…

CONFIG might contain the following options:

key          description
…            …
'isolation   a symbol among 'read-uncommited, 'read-commited and 'snapshot.
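
For concreteness, my reading is that this allows something like the snippet below; I’m assuming CONFIG is an association list, so correct me if the draft intends something else.

;; My guess at what a CONFIG value looks like, assuming an association list:
(define config '((isolation . snapshot)))
;; ... passed wherever [CONFIG] is accepted, for example:
;; (okvs-transaction-begin db config)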

Be careful about using the ANSI definitions of isolation, as they don’t map cleanly to anything that’s not a single-version pessimistically locked database. See “A Critique of ANSI SQL Isolation Levels” and “Generalized Isolation Levels”.

I feel like isolation is something you’d be better off leaving out of the SRFI. Which you kind of did already, as it’s in an optional config struct.

(okvs-transaction-commit transaction config)

Various transaction APIs disagree on whether transactions need to be explicitly committed or aborted, so you’d probably want to be conservative, and mandate in the SRFI that you must commit or roll back any started transaction.
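
Something along these lines is what I have in mind; note that okvs-transaction-roll-back is only my guess at a roll-back procedure sitting next to okvs-transaction-begin and okvs-transaction-commit, and guard/raise come from (scheme base):

;; Sketch: an explicitly started transaction is always either committed
;; or rolled back.  okvs-transaction-roll-back is an assumed name.
(define tr (okvs-transaction-begin db))
(guard (exn (#t (okvs-transaction-roll-back tr)
                (raise exn)))
  (okvs-set! tr #vu8(1) #vu8(2))
  (okvs-transaction-commit tr))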

(okvs-range some start-key start-include? end-key end-include? [CONFIG])

SOME can be an okvs or a transaction object. This procedure should be decorated with okvs-transactional.
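
For example, the way I read the draft (I’m guessing the result is a list or generator of key/value pairs; the document may specify it differently):

;; Hypothetical call: every pair whose key k satisfies #vu8(1) <= k < #vu8(2)
;; lexicographically, run against the transaction object TR.
(okvs-range tr #vu8(1) #t #vu8(2) #f)
;; => something like ((#vu8(1 3) . #vu8(42)) (#vu8(1 7) . #vu8(101)))
;;    (the exact return type is my assumption)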

The only difference between an ordered and an unordered key-value store is whether it supports efficient range operations. If your SRFI requires arbitrary get-ranges, you’re going to require someone to write an almost-equivalent unordered key-value store SRFI, and you also technically ban systems like Cassandra that distribute data such that a known prefix of the key is randomly distributed and a suffix of the key is sequentially distributed.

(okvs-transactional proc) -> procedure

I’m happy to see that your proposal includes a transaction runner, but I don’t recall seeing “decorator” being a common Scheme pattern? I feel like I see explicit transaction runner functions in most other languages’ database bindings. You could just offer a function like (okvs-run okvs proc), and then (okvs-run okvs (lambda (db) (okvs-ref db key))) would be equivalent to the current ((okvs-transactional (lambda (db) (okvs-ref db key))) okvs), the latter of which I find harder to mentally parse.
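
Concretely, a minimal sketch of what I mean, with okvs-run as an invented name built on the draft’s begin/commit procedures and error handling left out:

;; Hypothetical okvs-run: begin a transaction, apply PROC to it, commit,
;; and return whatever PROC returned.
(define (okvs-run okvs proc)
  (let* ((tr (okvs-transaction-begin okvs))
         (out (proc tr)))
    (okvs-transaction-commit tr)
    out))

;; (okvs-run db (lambda (tr) (okvs-ref tr #vu8(42))))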

That said, I have enjoyed @fdb.transaction making life simple in the Python bindings, and I don’t really have a great sense of what is or is not idiomatic for Scheme anyway.

SRFI-168 Tuple Store

I don’t have any additional comments on this, other than: could you provide an example of nstore-from → nstore-where chaining with nstore-select? My brain is breaking slightly trying to understand this generator of bindings getting manipulated.

I’m struggling to find another egg that uses *-rm as a function name. Copying from hashtable, is *-clear or *-remove more standard?

Indeed, I will replace it with okvs-clear!, because the R7RS hash-table API uses hash-table-clear!

I also notice that you’ve elided range clears entirely from your API. Was that intentional? They are inconsistently supported across kv stores…

I will add it.

I feel like isolation is something you’d be better off leaving out of the SRFI. Which you kind of did already, as it’s in an optional config struct.

Ok.

The only difference between an ordered and an unordered key-value store is whether it supports efficient range operations. If your SRFI requires arbitrary get-ranges, you’re going to require someone to write an almost-equivalent unordered key-value store SRFI, and

An unordered key-value store does not seem very useful for composing higher abstractions. I will think about it. I might make okvs-range and okvs-prefix optional.

you also technically ban systems like Cassandra that distribute data such that a known prefix of the key is randomly distributed and a suffix of the key is sequentially distributed.

I don’t have enough experience with Cassandra to support it.

I don’t recall seeing “decorator” being a common Scheme pattern?

It is not really a common pattern, in the sense that there is no word for it. That said, higher-order procedures, that is, procedures that take procedures as arguments, are commonplace. memoize is an example of a higher-order procedure that is also a decorator.
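
For example, here is a minimal memoize for one-argument procedures, just to show the shape of the pattern:

;; memoize takes a procedure and returns a procedure with the same
;; interface that caches results in an association list.
(define (memoize proc)
  (let ((cache '()))
    (lambda (arg)
      (cond ((assoc arg cache) => cdr)
            (else
             (let ((out (proc arg)))
               (set! cache (cons (cons arg out) cache))
               out))))))

okvs-transactional has the same shape: it takes a procedure and returns a procedure that wraps it with transaction handling.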

I feel like I see explicit transaction runner functions in most other languages’ database bindings.

I don’t understand this paragraph:

You could just offer a function like (okvs-run okvs proc), and then (okvs-run okvs (lambda (db) (okvs-ref db key))) would be equivalent to the current ((okvs-transactional (lambda (db) (okvs-ref db key))) okvs), the latter of which I find harder to mentally parse.

Right now, the way to use okvs-ref is as follows:

(define db (okvs config))
(okvs-set! db #vu8(42) #vu8(20 19))
(define value (okvs-ref db #vu8(42)))

Which is equivalent to:

(define db (okvs config))

;; set
(define tr (okvs-transaction-begin db))
(okvs-set! tr #vu8(42) #vu8(20 19))
(okvs-transaction-commit tr)

;; ref
(define tr (okvs-transaction-begin db))
(define value (okvs-ref tr #vu8(42)))
(okvs-transaction-commit tr)

That said, I have enjoyed @fdb.transaction making life simple in the Python bindings, and I don’t really have a great sense of what is or is not idiomatic for Scheme anyway.

As far as I understand, transactional is not very idiomatic in Scheme. The R7RS shepherd proposed something else:

A more Schemey approach would be (in-transaction database proc failure success), which evaluates PROC passing a newly created transaction as its argument. When PROC returns, the transaction is committed and in-transaction applies whatever PROC returned to the SUCCESS procedure and returns its result. If PROC calls “rollback” instead, then the transaction is rolled back, the execution of PROC is abandoned, FAILURE is called with no arguments, and in-transaction returns its result. By default FAILURE is raise and SUCCESS is identity.

My rationale behind transactional (tell me if I am mistaken) is that it is very easy to use procedures decorated with transactional at the REPL. The in-transaction approach requires a little more ceremony.
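
For example, at the REPL it is just the following (a sketch; it assumes extra arguments are passed through to the decorated procedure after the okvs):

;; A decorated procedure, usable directly against the database at the REPL.
;; Extra arguments are assumed to be passed through after the okvs.
(define fetch
  (okvs-transactional
   (lambda (tr key)
     (okvs-ref tr key))))

;; (fetch db #vu8(42))  ; begins, runs and commits a transaction behind the scenes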

It seems like the okvs-run you mention above looks like in-transaction.
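
Here is a rough sketch of how I understand in-transaction on top of the draft’s transaction procedures; passing rollback as a second argument to PROC and the name okvs-transaction-roll-back are my guesses, not something the shepherd specified:

;; Run PROC inside a fresh transaction; commit and apply SUCCESS on normal
;; return; if PROC calls its `rollback` argument, roll back, call FAILURE
;; and return its result instead.
(define (in-transaction database proc failure success)
  (call/cc
   (lambda (return)
     (let ((tr (okvs-transaction-begin database)))
       (define (rollback)
         (okvs-transaction-roll-back tr)
         (return (failure)))
       (let ((out (proc tr rollback)))
         (okvs-transaction-commit tr)
         (success out))))))

;; (in-transaction db
;;                 (lambda (tr rollback) (okvs-ref tr #vu8(42)))
;;                 (lambda () #f)
;;                 (lambda (x) x))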

This is more portable, but I’m afraid less useful for anything non-trivial, because it does not expose that the key-value store might be asynchronous: all the time that could be spent staging more work will instead be spent blocked in fdb-future-block-until-ready.

Ah, thank you, I had forgotten this part.

To pull from some wisdom previously left on our forums:

So to be conservative, you’d need to be async on reads, writes and commit.

Ideally, you’d be able to write code like

(define (do-something tr)
  (let ((a (okvs-get tr "a"))
        (b (okvs-get tr "b"))
        (c (okvs-get tr "c")))
    (+ (bytes->int a) (bytes->int b) (bytes->int c))))

and the get of a, b, and c could be done in parallel.
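
One explicit way to get there, using invented names (okvs-get/async, okvs-future-ref and bytes->int are not in the SRFI; this is only to show the shape):

;; Each get returns immediately with a future-like handle; we only block
;; when the values are combined, so the three reads can be in flight at once.
(define (do-something tr)
  (let ((fa (okvs-get/async tr "a"))
        (fb (okvs-get/async tr "b"))
        (fc (okvs-get/async tr "c")))
    (+ (bytes->int (okvs-future-ref fa))
       (bytes->int (okvs-future-ref fb))
       (bytes->int (okvs-future-ref fc)))))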

This might eventually come down to how a non-blocking version of this would fit with whatever event loop/framework constructions Scheme has, and I have no idea what those are.

I had played around with using proxy objects in Python once, which are potentially an option in dynamic languages, but I suspect that Scheme wouldn’t be amenable to trying to teach (+) that, if it gets something that is-okvs-future?, it needs to wait on it to turn itself into an int.


Thanks for the feedback.

This might eventually come down to how a non-blocking version of this would fit with whatever event loop/framework constructions Scheme has, and I have no idea what those are.

I tried to explain how to do non-blocking sockets on Stack Overflow.

So, in theory, it should be possible to support non-blocking operation using call/1cc or call/cc or similar (when it is fast enough).
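
As a toy illustration of the mechanism (nothing here is SRFI API; run, suspend!, resume!, saved and exit-k are names I made up):

;; Toy suspend/resume with call/cc: suspend! parks the rest of the
;; computation and returns control to the caller of run; resume! continues
;; it later with a value, the way an event loop callback would.
(define saved #f)   ; the parked continuation
(define exit-k #f)  ; where to escape to when we park or finish

(define (run thunk)
  (call/cc
   (lambda (exit)
     (set! exit-k exit)
     (let ((result (thunk)))
       (exit-k result)))))

(define (suspend!)
  (call/cc
   (lambda (k)
     (set! saved k)
     (exit-k 'suspended))))

(define (resume! value)
  (run (lambda () (saved value))))

(run (lambda ()
       (display "before the read that would block") (newline)
       (let ((value (suspend!)))
         (display (list 'got value)) (newline))))
;; control is back here immediately, with 'suspended

(resume! 42)  ; later, e.g. from the event loop: prints (got 42)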

The specification is entering the finalization stage, again!

In particular, I found a way to handle schema-on-write using hooks. This makes it possible to validate writes and deletes before the commit is executed.
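
Roughly, the idea looks like the sketch below; hook-add! and okvs-hook-on-transaction-commit are placeholders for whatever names the specification ends up with:

;; Schema-on-write: register a procedure that runs just before commit and
;; aborts the transaction by raising if the staged writes and deletes do
;; not satisfy the layer's schema.
(define (attach-schema-validation! okvs valid?)
  (hook-add! (okvs-hook-on-transaction-commit okvs)
             (lambda (transaction)
               (unless (valid? transaction)
                 (error "schema-on-write: transaction rejected")))))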

This is put into practice in the nstore; you will find the unit tests at: https://github.com/ashinn/chibi-scheme/pull/587/commits/ac2d4f33a4bc708de0e2ccf5681a23e8bc49dd56

I made a draft pull request adding the necessary machinery to the Python bindings to put this idea into motion: https://github.com/apple/foundationdb/pull/2298

Basically, I propose to add two hooks to the fdb bindings to make it possible to create schema-on-write layers: hook_on_transaction_begin and hook_on_transaction_end. Layers (or abstractions) must add functions to hook_on_transaction_begin to set up the transaction state. Layers must keep the transaction state in sync, if necessary, during layer operations. Before the transaction is committed, hook_on_transaction_commit is run, giving every layer an opportunity to check that everything is OK, and otherwise raise an exception.