FDB_STREAMING_MODE_ITERATOR ignores iteration


(Kirill Titov) #1

Hello everyone. I’m working on a Swift FoundationDB client. From day one I have used FDB_STREAMING_MODE_WANT_ALL as the default streaming mode in the range get function (link), as I feel it suits most of my use cases best. In the rare cases when I used a different mode (I think it was only FDB_STREAMING_MODE_ITERATOR), I didn’t notice anything suspicious.

However, I recently started working on a stable LTS version of my client and wanted to add some handy iterator(-ish) sugar to it (link), which would manage the iteration counter for the user and return nil when there are no more records in the DB (though it’s not compatible with Swift’s built-in IteratorProtocol, as my implementation can throw an error).

When I was covering this new functionality with tests (link), I noticed some quite strange behaviour: iterator mode wasn’t working as expected. Say I insert 10 rows into FDB (each value is just one byte, incrementing from 1 to 10), and then I’d like to get this range as an iterator.

First, I noticed that iterator mode ignores the limit argument (I wanted to get 3 records per iteration). That wasn’t a big surprise, as I eventually found in the FDB sources that FDB returns increasingly large batches of rows in this mode. No real problem here, though it would be nice if it obeyed the limit when I do want one.

But the funniest thing is that it returns only the first 4 records every time, no matter the iteration counter. I even added ultra-verbose printing of all C function invocations, and there’s no mistake (or at least it seems so to me):

// First invocation here is WANT_ALL to return all rows
// from this range and confirm that there are actually 10 rows
[FDB] [Transaction] Calling C function fdb_transaction_get_range(
    FDBTransaction* tr: 0x0000000100c19660
    uint8_t const* begin_key_name: [..., 114, 0, 0]
    int begin_key_name_length: 42
    fdb_bool_t begin_or_equal: 0
    int begin_offset: 1
    uint8_t const* end_key_name: [..., 114, 0, 255]
    int end_key_name_length: 42
    fdb_bool_t end_or_equal: 0
    int end_offset: 1
    int limit: 0
    int target_bytes: 0
    FDBStreamingMode mode: FDBStreamingMode(rawValue: -2)
    int iteration: 1
    fdb_bool_t snapshot: 0
    fdb_bool_t reverse: 0
)
["all": [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]]

// After that I call `next()` four times.
// You can see that iteration is incremented every time,
// but only the first four rows are returned each time
[FDB] [Transaction] Calling C function fdb_transaction_get_range(
    FDBTransaction* tr: 0x0000000100c217d0
    uint8_t const* begin_key_name: [..., 114, 0, 0]
    int begin_key_name_length: 42
    fdb_bool_t begin_or_equal: 0
    int begin_offset: 1
    uint8_t const* end_key_name: [..., 114, 0, 255]
    int end_key_name_length: 42
    fdb_bool_t end_or_equal: 0
    int end_offset: 1
    int limit: 0
    int target_bytes: 0
    FDBStreamingMode mode: FDBStreamingMode(rawValue: -1)
    int iteration: 1
    fdb_bool_t snapshot: 0
    fdb_bool_t reverse: 0
)
[[1], [2], [3], [4]]
[FDB] [Transaction] Calling C function fdb_transaction_get_range(
    FDBTransaction* tr: 0x0000000100c217d0
    uint8_t const* begin_key_name: [..., 114, 0, 0]
    int begin_key_name_length: 42
    fdb_bool_t begin_or_equal: 0
    int begin_offset: 1
    uint8_t const* end_key_name: [..., 114, 0, 255]
    int end_key_name_length: 42
    fdb_bool_t end_or_equal: 0
    int end_offset: 1
    int limit: 0
    int target_bytes: 0
    FDBStreamingMode mode: FDBStreamingMode(rawValue: -1)
    int iteration: 2
    fdb_bool_t snapshot: 0
    fdb_bool_t reverse: 0
)
[[1], [2], [3], [4]]
[FDB] [Transaction] Calling C function fdb_transaction_get_range(
    FDBTransaction* tr: 0x0000000100c217d0
    uint8_t const* begin_key_name: [..., 114, 0, 0]
    int begin_key_name_length: 42
    fdb_bool_t begin_or_equal: 0
    int begin_offset: 1
    uint8_t const* end_key_name: [..., 114, 0, 255]
    int end_key_name_length: 42
    fdb_bool_t end_or_equal: 0
    int end_offset: 1
    int limit: 0
    int target_bytes: 0
    FDBStreamingMode mode: FDBStreamingMode(rawValue: -1)
    int iteration: 3
    fdb_bool_t snapshot: 0
    fdb_bool_t reverse: 0
)
[[1], [2], [3], [4]]
[FDB] [Transaction] Calling C function fdb_transaction_get_range(
    FDBTransaction* tr: 0x0000000100c217d0
    uint8_t const* begin_key_name: [..., 114, 0, 0]
    int begin_key_name_length: 42
    fdb_bool_t begin_or_equal: 0
    int begin_offset: 1
    uint8_t const* end_key_name: [..., 114, 0, 255]
    int end_key_name_length: 42
    fdb_bool_t end_or_equal: 0
    int end_offset: 1
    int limit: 0
    int target_bytes: 0
    FDBStreamingMode mode: FDBStreamingMode(rawValue: -1)
    int iteration: 4
    fdb_bool_t snapshot: 0
    fdb_bool_t reverse: 0
)
[[1], [2], [3], [4]]

I’m genuinely confused, please help. Either I’m missing something (most probably) or there is a bug in FDB.

I’m using FoundationDB v6.0.18 with header API version 600.

Thanks in advance!


(A.J. Beamon) #2

I’m not immediately spotting why your test isn’t working, but I did run a test of my own with kv pairs of roughly 32 bytes that demonstrates the expected behavior (iteration 0 is WANT_ALL):

Got 2518 items for iteration 0
Got 18 items for iteration 1
Got 65 items for iteration 2
Got 261 items for iteration 3
Got 391 items for iteration 4
Got 585 items for iteration 5
Got 877 items for iteration 6
Got 1315 items for iteration 7
Got 1972 items for iteration 8
Got 2518 items for iteration 9
Got 2518 items for iteration 10
Got 2518 items for iteration 11
Got 2518 items for iteration 12
Got 2518 items for iteration 13
Got 2518 items for iteration 14
Got 2518 items for iteration 15
Got 2518 items for iteration 16
Got 2518 items for iteration 17
Got 2518 items for iteration 18
Got 2518 items for iteration 19

The way iterator mode works is that each successive iteration increases the byte limit, up to the maximum used in WANT_ALL. When byte limits are specified (i.e. when not using EXACT mode without a byte limit), I believe it’s possible for the query to return early in some cases, such as if your range crosses a shard boundary. That probably wouldn’t happen if your database is small, but you could rule it out by running your WANT_ALL and ITERATOR mode queries in the same transaction. I would expect WANT_ALL to return early in the same cases as ITERATOR would, so if you see a difference then something else is probably going on.
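
To make the growing byte limit concrete, here is a minimal toy model in Python. It is only a sketch: it does not call FoundationDB, and the names `HYPOTHETICAL_BYTE_LIMITS`, `byte_limit_for` and `fake_get_range` as well as the limit values are made up for illustration (the real progression lives in the FDB sources). It assumes that a batch ends once the byte budget for that iteration has been reached.

```python
# Toy model only: not the FoundationDB client, and not its real limits.
HYPOTHETICAL_BYTE_LIMITS = [64, 128, 256, 512]  # made-up budgets for iterations 1..4


def byte_limit_for(iteration):
    """Budget grows with the iteration number, then stays at the last (largest) value."""
    index = min(iteration, len(HYPOTHETICAL_BYTE_LIMITS)) - 1
    return HYPOTHETICAL_BYTE_LIMITS[index]


def fake_get_range(rows, begin, byte_limit):
    """Return rows starting at index `begin` until the byte budget is used up."""
    batch, used = [], 0
    for key, value in rows[begin:]:
        batch.append((key, value))
        used += len(key) + len(value)
        if used >= byte_limit:
            break
    return batch


# 40 rows of 16 bytes each (15-byte key, 1-byte value).
rows = [(b"k%014d" % i, bytes([i])) for i in range(1, 41)]

# Same begin position on every call; only the iteration number changes.
counts = [len(fake_get_range(rows, 0, byte_limit_for(it))) for it in range(1, 7)]
print(counts)  # [4, 8, 16, 32, 32, 32]
```

In this model the batch size grows with the iteration number and then levels off, which is the same shape as the item counts listed above. Note that every call here starts from the same position, so each batch begins at the first row; the iteration number changes only how many rows come back.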

Another possibility is that the 4th kv pair in your range is large enough to exceed the limits specified in the first several iterations. That may be a stretch, but if it’s the case then you could try more iterations to confirm that it eventually gets past it. See here for the limit progression: