Ranges without explicit end (Go)

There really needs to be a big red warning at the top of the documentation about tuples and tuple encoding, because I see a lot of people falling victim to the same pitfall:

  • Strings, in the tuple encoding, are encoded as a 0x02 type byte followed by the UTF-8 bytes, and terminated by a NUL byte, so "\xFF" is not the FF byte (a single byte), but the string composed of a single character with code point 255, which translates to 02 FF 00 (well, not really: UTF-8 encodes code point 255 as the two bytes C3 BF, so it would be 02 C3 BF 00, but you get the point).
  • Integers use a smart trick to preserve ordering: 0 is encoded as the single byte 14 (hex), small integers (1 to 255) are encoded as 15 xx, then 16 xx yy for 2-byte integers, and so on.

So the tuple {…, "\xFF"}, once transformed into bytes by the tuple encoding, represents a key that actually sorts BEFORE the tuple {…, 1}! (… 02 FF 00 < … 15 01, because the string type byte 02 is less than the integer type byte 15.)

If the end key selector is less than the begin key selector, the set is by definition empty and your get_range will never return anything! :slight_smile:

If you want to generate a byte sequence that will always be greater than whatever value (or type!) you could fit in the last element for the age value, you would need to provide a custom token that the tuple encoder recognizes as a special value and outputs as a single FF byte. So not a string, integer, or any other common type. I don’t know if the Go implementation has this special singleton or not…

And of course, you could also use a different encoding scheme than the tuple encoding (which is used by default, but is not the only one possible!) and you’d still want to be able to do that.

If you want a generic way to create such an open-ended range, there are two ways:

  1. add a single FF byte at the end of the result of packing the tuple (i.e., add it to the resulting byte array, not to the tuple’s items). So encode the tuple {"age", } into bytes (will output something like 02 a g e 00) and append the FF byte to that (to get 02 a g e 00 FF). With most tuple encoders, this would NOT be a valid encoding (since FF is not a valid type header). Other encoders may accept the FF anyway, depending on the binding's implementation.

  2. increment the last byte of the key prefix, so for our tuple above, 02 a g e 00 + 1 = 02 a g e 01. Again, this is NOT a valid tuple encoding (un-terminated string). If the last byte is already FF, drop it and increment the byte before it instead (carry the one, like regular addition).

But in both cases, all possible tuples { "age", <some_int> } are less than both 02 a g e 00 FF and 02 a g e 01.
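A quick way to convince yourself, sketched in plain Go without the real binding. The helper names (`appendFF`, `bumpLast`, `sampleKeys`, `allBelow`) are made up for this example, and the keys are packed by hand following the integer layout described earlier.

```go
package main

import (
	"bytes"
	"fmt"
)

// appendFF returns a copy of prefix with a single FF byte added (option 1).
func appendFF(prefix []byte) []byte {
	out := append([]byte{}, prefix...)
	return append(out, 0xFF)
}

// bumpLast returns a copy of prefix with its last byte incremented (option 2).
// Simplified: assumes prefix is non-empty and does not end in FF.
func bumpLast(prefix []byte) []byte {
	out := append([]byte{}, prefix...)
	out[len(out)-1]++
	return out
}

// sampleKeys hand-packs {"age", n} for n = 0, 1, 255 and 256.
func sampleKeys(prefix []byte) [][]byte {
	suffixes := [][]byte{{0x14}, {0x15, 0x01}, {0x15, 0xFF}, {0x16, 0x01, 0x00}}
	keys := make([][]byte, 0, len(suffixes))
	for _, s := range suffixes {
		keys = append(keys, append(append([]byte{}, prefix...), s...))
	}
	return keys
}

// allBelow reports whether every key sorts strictly before end.
func allBelow(keys [][]byte, end []byte) bool {
	for _, k := range keys {
		if bytes.Compare(k, end) >= 0 {
			return false
		}
	}
	return true
}

func main() {
	// Packed form of the tuple {"age"}: 02 'a' 'g' 'e' 00.
	prefix := []byte{0x02, 'a', 'g', 'e', 0x00}
	keys := sampleKeys(prefix)

	fmt.Printf("option 1 end = % x, all keys below: %v\n", appendFF(prefix), allBelow(keys, appendFF(prefix)))
	fmt.Printf("option 2 end = % x, all keys below: %v\n", bumpLast(prefix), allBelow(keys, bumpLast(prefix)))
}
```

Both lines report `true`: every sample key shares the prefix 02 a g e 00 and then continues with a type byte below FF, so it sorts before either end key.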

The first solution works for tuple encoding, because it NEVER* produces items that start with FF (*: some do! :slight_smile: ), but other encodings may not give you that guarantee, so solution 2 is universal, while solution 1 is a bit easier to do in practice and works fine with tuples.

I think in most bindings, there is a method called increment (or with increment in the name) that does that for you. In the Go binding, I believe this is fdb.Strinc.
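If your binding does not expose such a helper, the logic is small enough to write by hand. Below is a rough sketch in plain Go, not the binding's actual code; the function name `increment` is my own. It strips trailing FF bytes and bumps the last remaining byte, and it has to give up when the prefix is empty or made only of FF bytes, since no upper bound exists in that case.

```go
package main

import (
	"errors"
	"fmt"
)

// increment returns the smallest key that sorts after every key
// starting with prefix: drop trailing FF bytes, then add one to
// the last remaining byte.
func increment(prefix []byte) ([]byte, error) {
	end := len(prefix)
	for end > 0 && prefix[end-1] == 0xFF {
		end-- // an FF byte cannot be incremented, so carry to the left
	}
	if end == 0 {
		return nil, errors.New("prefix is empty or all FF bytes: nothing to increment")
	}
	out := append([]byte{}, prefix[:end]...) // copy, leave the input untouched
	out[end-1]++
	return out, nil
}

func main() {
	for _, p := range [][]byte{
		{0x02, 'a', 'g', 'e', 0x00}, // packed {"age"}
		{0x01, 0xFF},                // trailing FF gets dropped
		{0xFF},                      // no valid result
	} {
		next, err := increment(p)
		fmt.Printf("% x -> % x (err: %v)\n", p, next, err)
	}
}
```

The range [prefix, increment(prefix)) then covers exactly the keys that start with prefix, whatever encoding produced them.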


Edit: so yeah, if the Go binding introduced some singleton of a custom type (let’s call it ALEPH_0) that would be encoded as FF, then you could write fdb.LastLessThan(tuple.Tuple{"age", ALEPH_0}) to get what you want. This would not be a canonical encoding with the current spec, though. In the meantime, when dealing with integers, you can cheat a little and think of 2147483648 as the best approximation of ALEPH_0 we can use right now :slight_smile:
