Consistency of queries using continuation cursor

rishabh · October 22, 2019, 9:05pm

In the Record Layer, long-running queries are supported through continuations.
After issuing a query, the client gets some query results and a continuation as output. It can then use this continuation to open a new transaction and fetch the remaining results. This can be done repeatedly.
Is my understanding correct?

Because we are running one big query across multiple FDB transactions (that is, against multiple database snapshots), the result taken as a whole might be inconsistent.
Is this understanding correct?

Or is there some milestoning logic in the Record Layer, which stores records with a milestone and fetches them as of that milestone, thereby providing consistency for such queries?

The docs say that the Redwood storage engine will support long-running read transactions. Does this mean that it will make continuations redundant?

nblintao · October 23, 2019, 4:31pm

To the first two questions, yes, you are correct. Also note AutoContinuingCursor might be handy for this usage.

I’m not aware of a plan to snapshot the records; I’ll leave that for others to comment on.
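
To illustrate the inconsistency confirmed above, here is a minimal toy sketch. It does not use the Record Layer: a plain dict stands in for the database, each call to the hypothetical `read_page` helper plays the role of one fresh transaction that sees the latest committed state, and the "continuation" is assumed to be simply the last key returned.

```python
# Toy model only: none of these names are Record Layer APIs.

def read_page(db, continuation, limit):
    """Return (rows, next_continuation) for keys strictly after `continuation`."""
    keys = sorted(k for k in db if continuation is None or k > continuation)
    page = keys[:limit]
    rows = [(k, db[k]) for k in page]
    # The continuation is the last key returned; None means the scan is exhausted.
    next_continuation = page[-1] if len(keys) > limit else None
    return rows, next_continuation


db = {"a": 1, "b": 2, "c": 3, "d": 4}

# "Transaction" 1 reads the first page.
page1, cont = read_page(db, None, 2)

# Another writer commits between the two transactions.
db["a"] = 100  # key already returned: the update is never observed
db["c"] = 300  # key not yet returned: the update is observed
del db["d"]    # key not yet returned: the delete is observed

# "Transaction" 2 resumes from the continuation.
page2, cont = read_page(db, cont, 2)

result = page1 + page2
print(result)
```

The combined result mixes the old value of `a` with the new value of `c`, so it matches neither the database state before the concurrent writes nor the state after them.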

alloc · October 23, 2019, 4:49pm

At the moment, no, there isn’t any logic in the Record Layer that adds a long-running, MVCC-style mechanism for that kind of checkpointing. In theory, one could add something like that with application logic (I suppose), but it wouldn’t necessarily be trivial.

I’m not sure about redundant. Perhaps less necessary, as you would no longer need to use continuations to work around the 5-second limit (I suppose), but you might still want them to resume a query if, say, you have two services, one that reads data directly from FDB through the Record Layer and another that reads data through the first service. You may want the first service’s API to offer some kind of pagination (to, for example, decrease time to first byte for the end user and avoid needing to hold large result sets in memory in the first service), and then you could use the continuation and some sort of indication of the read version as a way of resuming a query across pagination requests. (You could also do something like have a gRPC stream where the second service issues “get more” requests to the first one, but then you have to resume from the beginning if the stream closes, and the API is more “stateful”.)
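
As a rough sketch of the "continuation plus read version" idea for a pagination API, the first service could hand the caller an opaque page token that bundles both values. The token layout and the helper names `encode_page_token` and `decode_page_token` are assumptions made for illustration; the Record Layer does not define them. The only thing taken from the library is that a continuation is an opaque byte string.

```python
import base64
import json

# Hypothetical helpers: the token format below is an illustrative assumption.

def encode_page_token(read_version, continuation):
    """Pack a read version (int) and a continuation (bytes) into a URL-safe string."""
    payload = {
        "rv": read_version,
        "c": base64.b64encode(continuation).decode("ascii"),
    }
    raw = json.dumps(payload).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_page_token(token):
    """Inverse of encode_page_token: returns (read_version, continuation)."""
    payload = json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
    return payload["rv"], base64.b64decode(payload["c"])


# Round trip with made-up values.
token = encode_page_token(123456789, b"\x00\x01opaque-continuation")
read_version, continuation = decode_page_token(token)
print(read_version, continuation)
```

On each "next page" request the first service would decode the token, start a new transaction at the recorded read version, and resume the query from the continuation. Because the token carries all the state, the service itself stays stateless between requests.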

alloc · October 23, 2019, 4:51pm

Perhaps @SteavedHams has more on long-running read transactions with Redwood.

rishabh · October 23, 2019, 5:09pm

Thanks a lot. This answers my questions regarding the Record Layer.

It would be helpful to know whether Redwood will provide an API to open a read transaction as of a specific database version. That would make implementing a consistent pagination API straightforward.

alloc · October 23, 2019, 5:12pm

We actually already have a way of specifying the database version when you begin a read transaction, as it’s independent of the storage server. We expose it in the Record Layer through the setReadVersion method on the FDBRecordContext (on fairly recent versions of the library), and FDB transactions have that as well. All reads will then see only that version.

The problem right now is that if you set it to a version that is more than five seconds in the past, then the transaction will immediately fail with transaction_too_old, and Redwood is needed to avoid that error.
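
The interaction between a pinned read version and the five-second window can be sketched with a toy multi-version store. This is not FDB or Record Layer code: `ToyStore` and `TransactionTooOld` are hypothetical names, and the model assumes versions advance at roughly 1,000,000 per second, so that five seconds corresponds to 5,000,000 versions.

```python
# Toy model only; the rate constant is an assumption used for illustration.
VERSIONS_PER_SECOND = 1_000_000
MAX_READ_AGE_VERSIONS = 5 * VERSIONS_PER_SECOND


class TransactionTooOld(Exception):
    """Stands in for the transaction_too_old error."""


class ToyStore:
    def __init__(self):
        self.history = []  # (version, key, value) in commit order
        self.current_version = 0

    def commit(self, key, value, elapsed_seconds):
        """Advance the clock by elapsed_seconds, then write key=value."""
        self.current_version += int(elapsed_seconds * VERSIONS_PER_SECOND)
        self.history.append((self.current_version, key, value))
        return self.current_version

    def read(self, key, read_version):
        """Return the value of key as of read_version, if still readable."""
        if self.current_version - read_version > MAX_READ_AGE_VERSIONS:
            raise TransactionTooOld(read_version)
        value = None
        for version, k, v in self.history:
            if k == key and version <= read_version:
                value = v  # the latest write at or before read_version wins
        return value


store = ToyStore()
pinned = store.commit("k", "old", 1.0)   # the version we pin our reads to
store.commit("k", "new", 2.0)            # a later write, 2 seconds on
fresh_read = store.read("k", pinned)     # still sees "old"

store.commit("other", "x", 4.0)          # pinned version is now 6 seconds old
try:
    store.read("k", pinned)
    too_old = False
except TransactionTooOld:
    too_old = True
print(fresh_read, too_old)
```

While the pinned version is inside the window, reads at that version ignore later writes, which is what a consistent pagination API would need. Once the version ages out, the read fails, which is the limitation described above.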
