Stored procedures

Most transactional usage patterns in FoundationDB require multiple round-trips, for example making transactions idempotent by first checking for the presence of a key.

That is, any transactional logic that depends on reading data, and which isn’t baked into an atomic op, needs to transfer that data out to a client, consuming bandwidth and increasing latency.
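To make the round-trip cost concrete, here is a toy sketch of the check-then-write pattern described above. A plain dict stands in for FoundationDB's key-value API; in a real client, the read would be a network round-trip to a storage server before the write can be committed.

```python
# Toy illustration of the read-before-write pattern: a dict stands in for
# the FoundationDB key-value store. In a real client, get() is a network
# round-trip, and the write only goes out at commit time.

def idempotent_insert(store, key, value):
    """Insert `value` only if `key` is absent; return whether we wrote."""
    existing = store.get(key)   # round-trip 1: ship the value to the client
    if existing is not None:
        return False            # key already present: a retry is a no-op
    store[key] = value          # round-trip 2 (at commit): write the key
    return True

store = {}
assert idempotent_insert(store, b"job/42", b"queued") is True
assert idempotent_insert(store, b"job/42", b"queued") is False  # safe retry
```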

Are stored procedures on the roadmap as a solution to this – for example using a WASM-based ABI?

As FoundationDB is a distributed database, I’m not sure having support for stored procedures would realistically result in the gain that you’re hoping for. The next question would be where the stored procedure is executed and where the data it needs to access lives. If the process executing the stored procedure needs to go across the network to fetch data, then that likely incurs approximately the same cost as having a client do the round trips instead.

One could instead look at this as adopting a Calvin/Fauna-like model, where one could commit a transaction to execute. When a storage server goes to apply the transaction to disk, it executes it, and then writes the result. This would have the same restriction that read and write sets would need to be pre-declared, which complicates index usage. I’d also be highly concerned from an operational perspective about stalling the write loop of one storage server while waiting for reads from a different storage server. I don’t think that we’re likely to go in this direction anytime soon.

For the specific case of wishing to execute a read operation that only requires data from one shard (or is independently parallelizable across multiple shards), pushdown predicates were on the roadmap, but work on them stalled due to other priorities cutting in. Thus, unlike what the roadmap says, they won’t be in 7.0. The idea is that with this feature, one would be able to push down e.g. “count the number of keys between X and Y”, and the aggregation for that could be executed on the storage server.
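A minimal sketch of why that pushdown matters, using a sorted Python list to stand in for one storage server's shard (the function names and data layout are invented for illustration): with pushdown, only the count crosses the network; without it, every key in the range does.

```python
# Hypothetical sketch of the "count the keys between X and Y" pushdown.
# `shard` is a sorted list of keys standing in for one storage server's data.
import bisect

def count_range_pushed_down(shard, begin, end):
    """Executed on the storage server: only the count crosses the network."""
    return bisect.bisect_left(shard, end) - bisect.bisect_left(shard, begin)

def count_range_client_side(shard, begin, end):
    """Without pushdown: every key in the range is shipped to the client."""
    keys = [k for k in shard if begin <= k < end]   # full range read
    return len(keys)

shard = sorted([b"a", b"apple", b"b", b"banana", b"c"])
assert count_range_pushed_down(shard, b"a", b"b") == 2
assert count_range_client_side(shard, b"a", b"b") == 2
```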

Ideally we could offer an API that allows passing in a program, but there are a few constraints that we have on the language and execution environment.

  1. Evaluating a user program must be hard limited to a small execution time (let’s say 1 microsecond), or must be able to pause execution and yield to the event loop every ~1 microsecond.
  2. User-provided code must either provably terminate, or be deterministically timed out/halted if it runs too long. We need to make sure that sending while(1) {} as a pushdown program doesn’t allow one client to kill the database.
  3. The execution engine must be secure. RPCing arbitrary code is generally a security hazard, so great attention must be given to this point.
  4. The execution engine must be able to call into and return from user-provided code with low overhead. One of the major performance wins would be if one could skip inspecting subtrees of a B-Tree during a range read for a pushed-down operation, but doing so would involve repeatedly calling into the execution engine in a performance-sensitive loop.
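Constraints (1) and (2) can both be approximated by fuel-based execution: charge every instruction against a hard budget and halt (or yield) deterministically when it runs out, rather than trusting the program to terminate. Here is a toy sketch, using an invented two-opcode register machine purely for illustration:

```python
# Sketch of fuel-limited execution: the untrusted program runs under a hard
# instruction budget, so while(1){} halts deterministically instead of
# hanging the event loop. The instruction format is invented for the example.

class Halted(Exception):
    pass

def run_with_fuel(program, fuel):
    """Interpret (op, arg) pairs; raise Halted when the budget is exhausted."""
    acc, pc = 0, 0
    while pc < len(program):
        if fuel == 0:
            raise Halted("out of fuel")  # deterministic, not wall-clock based
        fuel -= 1
        op, arg = program[pc]
        if op == "add":
            acc += arg
            pc += 1
        elif op == "jmp":                # unconditional jump: can loop forever
            pc = arg
    return acc

assert run_with_fuel([("add", 2), ("add", 3)], fuel=10) == 5
try:
    run_with_fuel([("jmp", 0)], fuel=1000)   # the while(1) {} case
    assert False
except Halted:
    pass
```

A real engine would instead yield back to the event loop when fuel runs out and resume later, but the accounting is the same.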

In a quick search I had done before, I couldn’t find anything that seemed to satisfy all of these requirements well. LLVM IR would fail at (1), (2) and (3), I think. WASM would fail at (1), (2), and (4), I think. If you’re aware of anything that can hit all 4 points to a good degree, I’d love to hear.

If we had the above, allowing custom atomic ops would also become feasible, which could potentially address your request if what you’re reading and writing is the same (single) key.
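To illustrate what a custom atomic op would buy: the client ships a pure function of the key's old value, which is applied where the data lives, so no read round-trip is needed. The sketch below uses a dict in place of the store, and `saturating_add` is a made-up op, not part of FoundationDB's API.

```python
# Toy sketch of a "custom atomic op": a pure function of the old value is
# applied at the storage server, so the client never reads the key.
# `saturating_add` is an invented example op, not a real FoundationDB op.

def apply_atomic(store, key, op, default=0):
    """Read-modify-write executed where the data lives."""
    store[key] = op(store.get(key, default))

def saturating_add(amount, limit):
    return lambda old: min(old + amount, limit)

store = {}
apply_atomic(store, b"quota", saturating_add(7, limit=10))
apply_atomic(store, b"quota", saturating_add(7, limit=10))
assert store[b"quota"] == 10   # capped, with no client-side read
```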


I don’t know about the internals enough to tell whether this could be implemented without a read potentially stalling the write loop.

The semantics would be the same whether the transaction was executed via a stored procedure or through an API client, but there would be a “happy path” when a transaction (stored procedure) can be completed without going across the network to fetch data.

In terms of what sort of program might constitute a stored procedure:

  1. The WASM code would either yield naturally (e.g. when it requires a network fetch operation), or it would complete in a reasonable amount of time, or it would not – in which case we’d have to kill the thread.
  2. In terms of computing resources in the complete system, having an infinite loop in the storage layer or in a client program is perhaps not all that different. In both cases, someone or something needs to react. In practical terms, the stored procedure in question might be quarantined or subject to rate-limiting to ensure that a single bug won’t bring down the whole system.
  3. I think WASM being supported by major browsers shows that it’s a rather secure execution model. In addition, presumably WASM code in this scenario would be trusted on the same terms as an API client.
  4. In the browser, WASM code calls in and out of “system code” all the time, e.g. to modify DOM elements or make asynchronous network calls. The devil might be in the details of course.
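The yield-on-fetch idea in point (1) can be sketched with Python generators standing in for WASM's suspend/resume: the procedure yields whenever it needs data, so the host event loop stays in control between steps. The request/response protocol here is invented for the example.

```python
# Sketch of a cooperatively yielding stored procedure. Generators stand in
# for WASM suspend/resume; the yield protocol (bytes = read request,
# tuple = write request) is invented for illustration.

def stored_procedure():
    """Yields keys it wants to read; receives values back via send()."""
    balance = yield b"account/alice"              # suspend: host does the fetch
    if balance >= 5:
        yield (b"account/alice", balance - 5)     # suspend: request a write

def host_run(proc, store):
    gen = proc()
    request = next(gen)                  # run until the first yield
    try:
        while True:
            if isinstance(request, tuple):          # write request
                key, value = request
                store[key] = value
                request = gen.send(None)
            else:                                   # read request
                request = gen.send(store.get(request, 0))
    except StopIteration:
        pass                             # procedure finished

store = {b"account/alice": 12}
host_run(stored_procedure, store)
assert store[b"account/alice"] == 7
```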

@malthe I’m curious what application (in production) you want to implement that cannot be achieved with the current system (due to performance issues) and would have to use stored procedures?

A side thought about the discussion above:
The network latency for the web app is WAN latency, while the client-to-DB latency is typically within a datacenter, because the client here is more like a microservice backend.

The network latency difference between these two scenarios can be huge and affects the ROI of implementing stored procedures.

@mengxu well, this is hardly a new idea; see for example the introduction to coprocessors, which describes this functionality in HBase, drawing inspiration from BigTable. Predicate pushdown is also on the roadmap.

I’m sorry. I guess I didn’t express myself well in my last post.

Let me try to be more straightforward.

I understood the general motivation of stored procedures.
I’m curious:

  1. Are you running a production system that uses stored procedures and considering migrating it to FDB? Or
  2. Are you exploring which systems use stored procedures and starting a discussion on this topic for the general good, with no production system in the near future that will use them?

I think this can also help developers prioritize features.

Have you looked at eBPF? It’s the Linux kernel extension VM, designed to run user code in the kernel. The kernel has similar requirements to your 1-4, so the design would be interesting to you.

There is an apache licensed implementation called uBPF.

I have! eBPF is my personally preferred direction to go, and I had also found uBPF. It indeed looks like a great project to use as a starting point: it is Apache2-licensed and thus license compatible, it has both an x86_64 JIT and an interpreter fallback for other platforms, and it lets one reuse the whole eBPF toolchain.

My only point of concern is that the kernel does a good amount of the termination verification, which I don’t believe has been ported to uBPF. So while it theoretically satisfies (2), I’m concerned that it wouldn’t in practice (until/unless verification gets added to uBPF).

It also wouldn’t be the most user-friendly, as one would have to write C for every pushdown operation one wishes to do, and then compile and load it as eBPF. I’ve seen some nice Python/Go tooling that makes this easy, but it’d still be quite a jump for someone writing a Python/Java/Go/etc. layer.

The verification in the kernel has specific tests for the data being accessed, like packet buffers. FoundationDB would likely need the equivalent for things like data buffer access. So even if there were a verifier in uBPF, it would likely need to be extended. Creating an extensible verifier framework that is easy for clients to extend is an interesting problem.

https://www.kernel.org/doc/Documentation/networking/filter.txt has a nice description of the verifier towards the end. FoundationDB usage would have a different set of states to track than the kernel.
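One of the simplest checks that class of verifier performs can be shown in a few lines: rejecting backward jumps statically guarantees termination, because every instruction can run at most once. This sketch uses the same invented instruction format as a toy, not the real eBPF encoding, and elides the state tracking the kernel document describes.

```python
# Illustrative sketch of one static check an eBPF-style verifier performs:
# a program whose jumps only move forward cannot loop, so it must terminate.
# The (op, arg) instruction format is invented for this example.

def verify_no_backward_jumps(program):
    """Accept only programs whose jumps always target a later instruction."""
    for pc, (op, arg) in enumerate(program):
        if op == "jmp" and arg <= pc:
            return False        # a backward (or self) jump could loop forever
    return True

assert verify_no_backward_jumps([("add", 1), ("jmp", 3), ("add", 2), ("exit", 0)])
assert not verify_no_backward_jumps([("jmp", 0)])   # while(1) {} is rejected
```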

Ideally, the user wouldn’t write the eBPF programs for predicates; the client libraries would mostly do that. The Record Layer could do it in the query planner.

The threat model for FoundationDB is somewhat different from the kernel’s. Ideally, FoundationDB has an authentication step in front of submitting a request to the database, whereas in the kernel that path is more wide open. The authors of the client applications are likely working collaboratively, so the threat is more mistakes and bugs than malicious intent. It may also be that FoundationDB doesn’t need all of the VM features in eBPF to push predicates; disallowing certain instructions rather than verifying them might simplify the verifier.
