Ranges returned by `Subspace#range` are inclusive on both ends

nicolascouvrat · November 10, 2023, 1:19pm

Hello!

I’ve noticed that Subspace#range returns a range that is in fact inclusive on both ends:

    var s = new Subspace(new byte[] {(byte)0xf0});
    var r = s.range();
    System.out.println(hex(r.begin)); // 0xf000
    System.out.println(hex(r.end)); // 0xf0ff

This is made clear enough by the Javadoc:

Gets a Range representing all keys strictly in the Subspace.

However, I found that the documentation for ReadTransaction#getRange is quite misleading:

Gets an ordered range of keys and values from the database. The begin and end keys are specified by byte[] arrays, with the begin key inclusive and the end key exclusive. Ranges are returned from calls to Tuple.range() and Range.startsWith(byte[]).

Yes, this says that the end is exclusive, but it also suggests that you can read a whole subspace with transaction.getRange(subspace.range()). In fact, that call will not read the very last key (0xf0ff in this case). Why is this so? Should the documentation be more explicit that this pattern does not work exactly as expected? It seems natural to use it to iterate over a whole subspace (and in fact I’ve found it in multiple places in our codebase…), and it’s unfortunate that doing so misses the last key.
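To make the boundary behaviour concrete, here is a minimal standalone sketch (no FoundationDB dependency) that treats the two byte arrays printed above as a half-open range and checks a few candidate keys. The class and the `inHalfOpenRange` helper are hypothetical names made up for illustration; the only assumption is that keys are ordered as unsigned bytes, lexicographically.

```java
import java.util.Arrays;

// Standalone sketch: only byte-order comparisons, no database involved.
class SubspaceRangeSketch {
    // Hypothetical helper: true if begin <= key < end in unsigned
    // lexicographic byte order.
    static boolean inHalfOpenRange(byte[] key, byte[] begin, byte[] end) {
        return Arrays.compareUnsigned(begin, key) <= 0
            && Arrays.compareUnsigned(key, end) < 0;
    }

    public static void main(String[] args) {
        byte[] begin = {(byte) 0xf0, 0x00};        // prefix + 0x00
        byte[] end   = {(byte) 0xf0, (byte) 0xff}; // prefix + 0xff

        byte[] inside    = {(byte) 0xf0, (byte) 0xfe};
        byte[] lastKey   = {(byte) 0xf0, (byte) 0xff};
        byte[] afterLast = {(byte) 0xf0, (byte) 0xff, 0x01};

        System.out.println(inHalfOpenRange(inside, begin, end));    // true
        System.out.println(inHalfOpenRange(lastKey, begin, end));   // false
        System.out.println(inHalfOpenRange(afterLast, begin, end)); // false
    }
}
```

With an exclusive end, 0xf0ff itself and every key that extends it fall outside the range, which is the behaviour described above.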

Imperatorx · November 10, 2023, 1:59pm

This is indeed interesting

    fdb> set \x00\x00 1
    Committed (349243765927)
    fdb> set \x00\x01 2
    Committed (349246846354)
    fdb> set \x00\xff 3
    Committed (349250806671)
    fdb> set \x01 4
    Committed (349440253516)
    fdb> getrange \x00\x00 \x00\xff  <- the end the range returns

    Range limited to 25 keys
    `\x00\x00' is `1'
    `\x00\x01' is `2'

    fdb> getrange \x00\x00 \x01 <- the proper end to use?

    Range limited to 25 keys
    `\x00\x00' is `1'
    `\x00\x01' is `2'
    `\x00\xff' is `3'

However, I think that if you always use the tuple encoding for your keys, this (the first byte inside a subspace being \xff) cannot happen: the first byte is the type code of the first tuple item, and no type code maps to \xff:

    private static final byte nil                   = 0x00;
    private static final byte BYTES_CODE            = 0x01;
    private static final byte STRING_CODE           = 0x02;
    private static final byte NESTED_CODE           = 0x05;
    private static final byte INT_ZERO_CODE         = 0x14;
    private static final byte POS_INT_END           = 0x1d;
    private static final byte NEG_INT_START         = 0x0b;
    private static final byte FLOAT_CODE            = 0x20;
    private static final byte DOUBLE_CODE           = 0x21;
    private static final byte FALSE_CODE            = 0x26;
    private static final byte TRUE_CODE             = 0x27;
    private static final byte UUID_CODE             = 0x30;
    private static final byte VERSIONSTAMP_CODE     = 0x33;
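As a rough check of that claim, the sketch below takes only the type codes listed above (not necessarily the full set in the bindings), places each one directly after a one-byte prefix, and tests whether the resulting key falls inside [prefix + 0x00, prefix + 0xff). The class and method names are hypothetical, and the code does not use the FoundationDB library.

```java
import java.util.Arrays;

class TupleCodeRangeSketch {
    // Type codes copied from the list above; this is not claimed to be complete.
    static final byte[] LISTED_CODES = {
        0x00, 0x01, 0x02, 0x05, 0x14, 0x1d, 0x0b, 0x20, 0x21, 0x26, 0x27, 0x30, 0x33
    };

    // Hypothetical helper: begin <= key < end in unsigned lexicographic order.
    static boolean inHalfOpenRange(byte[] key, byte[] begin, byte[] end) {
        return Arrays.compareUnsigned(begin, key) <= 0
            && Arrays.compareUnsigned(key, end) < 0;
    }

    // Does the key (prefix, firstByte) fall in [prefix + 0x00, prefix + 0xff)?
    static boolean coveredByRange(byte prefix, byte firstByte) {
        byte[] begin = {prefix, 0x00};
        byte[] end = {prefix, (byte) 0xff};
        return inHalfOpenRange(new byte[] {prefix, firstByte}, begin, end);
    }

    public static void main(String[] args) {
        boolean all = true;
        for (byte code : LISTED_CODES) {
            all &= coveredByRange((byte) 0xf0, code);
        }
        System.out.println(all); // true
        // A raw (non-tuple) key whose first byte after the prefix is 0xff is not covered.
        System.out.println(coveredByRange((byte) 0xf0, (byte) 0xff)); // false
    }
}
```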

harikb · November 13, 2023, 7:48pm

Ideally, the Subspace.range() method should have used something like this:

github.com/apple/foundationdb: Add byte array "increment" method to go bindings
(opened 05:59PM - 16 Oct 18 UTC, closed 08:03PM - 01 Nov 18 UTC, by alecgrieser)

Some of the other bindings, e.g., the Java bindings, have methods to increment byte arrays (to produce the next byte array that doesn't include that array as a prefix). Here's the Java one: https://github.com/apple/foundationdb/blob/c93c426bc6cc4f87c986601ce645d69808ce9b9a/bindings/java/src/main/com/apple/foundationdb/tuple/ByteArrayUtil.java#L344-L353 This makes things like querying for a range of things that start with a given prefix easier, for example, or querying until the end of a range. This came up in this forum discussion: https://forums.foundationdb.org/t/ranges-without-explicit-end-go/773
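The "increment" idea described in that issue can be sketched in a few lines. This is a standalone re-implementation for illustration, not the code from the bindings (the linked ByteArrayUtil.java is the authoritative version): it drops trailing 0xff bytes and adds one to the last remaining byte, which yields the first key that no longer has the input as a prefix. The class name is hypothetical.

```java
import java.util.Arrays;

class StrincSketch {
    // Sketch of the increment: strip trailing 0xff bytes, then bump the
    // last remaining byte by one.
    static byte[] strinc(byte[] key) {
        int len = key.length;
        while (len > 0 && key[len - 1] == (byte) 0xff) {
            len--;
        }
        if (len == 0) {
            // Empty or all-0xff input: no key sorts after every key with this prefix.
            throw new IllegalArgumentException("key must contain a byte other than 0xff");
        }
        byte[] out = Arrays.copyOf(key, len);
        out[len - 1]++;
        return out;
    }

    public static void main(String[] args) {
        byte[] prefix = {(byte) 0xf0};
        byte[] end = strinc(prefix); // {0xf1}
        byte[] lastKey = {(byte) 0xf0, (byte) 0xff};

        // [prefix, strinc(prefix)) contains every key that starts with the prefix.
        boolean covered = Arrays.compareUnsigned(prefix, lastKey) <= 0
            && Arrays.compareUnsigned(lastKey, end) < 0;
        System.out.println(covered); // true
    }
}
```

For the prefix \x00 used in the fdbcli session above, this produces \x01, the end key that returned all three keys.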

SteavedHams · November 17, 2023, 5:21am

I think this is the key misunderstanding here. The key \xf0\xff should not be placed into the subspace.

Subspaces are meant to contain the same keyspace that you can store if you are not in a Subspace. If you are not in a Subspace, then keys starting with \xff are not allowed as that is the reserved system space of FDB. While keys could start with \xff after any non-empty prefix, it’s not a supported use case.

This is why the range for a Subspace with prefix P which includes all keys in the subspace is <P> to <P>\xff. From the Subspace perspective, this range is inclusive of all keys expected to be in the Subspace. The raw range returned for use with the raw-key getRange() is exclusive of the end key, but this does not exclude any expected keys in the Subspace, because keys that begin with \xff after the prefix are not a supported use case.
