Release: NodeJS FoundationDB mixed order data encoder

Hello, I have just released an encoder library built on top of the Node.js FoundationDB binding by @josephg.

https://github.com/couchplus/collator

There is one particular feature that I am proud of: the ability to collate mixed order data.

A lot of effort went into making this work, and I hope you will like it.

Please leave me any feedback, and thanks in advance for that.


If I understand correctly, this is a replacement for the tuple.pack and tuple.unpack functions that offers the ability to change the order between objects of different types. For example, true can be bigger than 0.1 under one configuration, or 0.1 can be bigger than true under another. Is that correct?

Does msgpack-lite preserve lexical ordering? Pseudo code:

if (a > b) then
  assert msgpack.encode(a) > msgpack.encode(b)
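
To make the property above concrete, here is a minimal TypeScript sketch of how it could be checked. The two encoders are hypothetical stand-ins (big-endian and little-endian 32-bit unsigned integers), not msgpack-lite or any real library; the comparison assumes keys are ordered byte by byte, unsigned, as FoundationDB orders them.

```typescript
// Compare two byte strings byte by byte (unsigned); a strict prefix sorts first.
function compareBytes(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

// Hypothetical encoder: unsigned 32-bit integer, most significant byte first.
function encodeBigEndian(n: number): Uint8Array {
  return new Uint8Array([(n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff]);
}

// Hypothetical encoder: unsigned 32-bit integer, least significant byte first.
function encodeLittleEndian(n: number): Uint8Array {
  return new Uint8Array([n & 0xff, (n >>> 8) & 0xff, (n >>> 16) & 0xff, (n >>> 24) & 0xff]);
}

// True when a < b implies encode(a) < encode(b) for every pair of samples.
function preservesOrder(encode: (n: number) => Uint8Array, samples: number[]): boolean {
  for (const a of samples) {
    for (const b of samples) {
      if (a < b && compareBytes(encode(a), encode(b)) >= 0) return false;
    }
  }
  return true;
}

const orderSamples = [0, 1, 255, 256, 65535, 65536];
const bigEndianOk = preservesOrder(encodeBigEndian, orderSamples);       // true
const littleEndianOk = preservesOrder(encodeLittleEndian, orderSamples); // false: 255 encodes above 256
```

Any encoder with the same signature can be passed to preservesOrder; a passing run over a handful of samples is only evidence, not a proof.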

In the Python bindings of msgpack that I used, there was a problem with UTF-8 strings and bytes. I may be wrong, but my understanding was that it was not possible to encode the equivalent of a JS Object whose values contain both bytes and character strings. Is that the case with the current JavaScript msgpack-lite?

In the readme there is no mention of support for JavaScript Int8Array, which would be the equivalent of Scheme’s bytevector type or Python 3 bytes. Int8Array values are very useful in my experience, because spatial indices relying on a z-curve need to work with the binary form of the coordinates. Converting the bytes into a number and storing that number is not always possible, depending on the precision of the coordinates and whether the host language supports big integers.

I considered changing the byte encoding in my Scheme bindings to better map to Scheme (e.g. adding support for the pair type, the primitive that Lisp lists are made of). So far I have not done it, because re-using the same byte encoding as Python allows direct access to the same FoundationDB cluster from several services, possibly written in different languages. My understanding is that using CouchPlus:Collator will require re-implementing the byte encoding algorithm. Is that correct?

I am not familiar with the mixed order data feature. Can you give an example, please?
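
As an illustration of why a byte-array key type matters for z-curves, here is a small TypeScript sketch that interleaves the bits of two coordinates directly into bytes. The function name mortonKey, the 16-bit coordinate width, and the choice of putting the x bit above the y bit are assumptions made for this example only.

```typescript
// Interleave the bits of two unsigned 16-bit coordinates into a 4-byte,
// big-endian z-curve (Morton) key. For each bit position, the x bit is
// placed just above the y bit.
function mortonKey(x: number, y: number): Uint8Array {
  const out = new Uint8Array(4);
  for (let bit = 15; bit >= 0; bit--) {
    const xBit = (x >>> bit) & 1;
    const yBit = (y >>> bit) & 1;
    const xPos = 2 * bit + 1; // position of the x bit in the 32-bit result
    const yPos = 2 * bit;     // position of the y bit in the 32-bit result
    // Byte 0 holds the most significant bits, so bit position p lives in byte 3 - (p >> 3).
    out[3 - (xPos >> 3)] |= xBit << (xPos & 7);
    out[3 - (yPos >> 3)] |= yBit << (yPos & 7);
  }
  return out;
}

const sampleKey = mortonKey(3, 5); // bytes 0, 0, 0, 27 (binary 011011)
```

Because the result is built byte by byte, the same loop extends to wider coordinates without ever needing an integer type larger than the host language provides.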

There are two typos in the README.md:

After that, you can start using it as follows;

Maybe:

After that, you can start using it as follow:

The sed command would look like:

s/s;/:/

Thanks for sharing.

Hello @amirouche

Your understanding that this library is a replacement for tuple.pack and tuple.unpack is correct. The combination of these methods is called an encoder. The foundationdb library has a key encoder and a value encoder. The msgpack described in the readme serves as the value encoder, which you can replace with any other value encoder. (Side note: msgpack-lite encoding is not lexicographically sorted by default.) My library is best used as the key encoder.

By default, keys are ordered as follows:

asc  - null
asc  - false
asc  - true
asc  - NaN
asc  - number
asc  - string
asc  - array
asc  - object
asc  - undefined
desc - undefined
desc - object
desc - array
desc - string
desc - number
desc - NaN
desc - true
desc - false
desc - null

You can change the collation order by changing the CustomCode prefix values, as shown below:

import { 
  BooleanCollator, 
  ComplexCollator, 
  NumberCollator, 
  StringCollator 
} from "@couchplus/collator";

const CustomCode = { /** Custom order **/ };

const collator = new ComplexCollator(CustomCode);
collator.register(new BooleanCollator(CustomCode));
collator.register(new NumberCollator(CustomCode));
collator.register(new StringCollator(CustomCode));

For CustomCode configuration, please refer to https://github.com/couchplus/collator/blob/master/source/DefaultCode.ts
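
To illustrate the general idea of prefix-driven ordering across types, here is a rough TypeScript sketch. The prefix tables and helper names below are hypothetical and are not the values or API from DefaultCode.ts; the sketch only compares the one-byte type prefix, whereas a real key encoder would also append the encoded value after it.

```typescript
type Scalar = null | boolean | number | string;
type TypeName = "null" | "boolean" | "number" | "string";

// Hypothetical prefix bytes: a smaller prefix sorts earlier in the key space.
const defaultPrefix: Record<TypeName, number> = { null: 0x10, boolean: 0x20, number: 0x30, string: 0x40 };
const numbersFirstPrefix: Record<TypeName, number> = { number: 0x10, null: 0x20, boolean: 0x30, string: 0x40 };

function typeNameOf(v: Scalar): TypeName {
  if (v === null) return "null";
  if (typeof v === "boolean") return "boolean";
  if (typeof v === "number") return "number";
  return "string";
}

// Order values of different types by the prefix byte assigned to their type.
function sortByTypePrefix(values: Scalar[], prefix: Record<TypeName, number>): Scalar[] {
  return values.slice().sort((a, b) => prefix[typeNameOf(a)] - prefix[typeNameOf(b)]);
}

const mixedValues: Scalar[] = ["a", 0.1, true, null];
const byDefault = sortByTypePrefix(mixedValues, defaultPrefix);         // [null, true, 0.1, "a"]
const numbersFirst = sortByTypePrefix(mixedValues, numbersFirstPrefix); // [0.1, null, true, "a"]
```

Swapping the table is enough to make 0.1 sort before or after true, which is the behaviour asked about earlier in the thread.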

The library does not currently support Uint8Array as a key, but I do plan to support spatial data as a key type in the future. As a workaround, you could use my library in certain key spaces and use the default binary encoding in the key spaces where you require a spatial index.

Yes, currently my library has its own byte encoding.

As for mixed order data: suppose I want to use a list/array as a key, with the first element of the array ordered ascending and the second element ordered descending. I can do that as follows:

transaction.set([collator.order.asc(first), collator.order.desc(second)], value);

The resulting order inside the database will be as follows:

["abc", 5]
["abc", 4]
["abc", 3]
["abc", 2]
["abc", 1]
["def", 10]
["def", 9]
["def", 8]
["def", 7]
["def", 6]

Note: the above is lexicographically sorted, which means you get the benefit of range scans while ordering the elements of an array in mixed order.

Thank you for your interest in my library; feel free to ask me anything.
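
For readers wondering how a descending element can still sort lexicographically, here is a minimal TypeScript sketch of one common technique: flipping every bit of the descending element. This is not the library's actual byte encoding; the helper names are hypothetical, strings are assumed to be ASCII without NUL characters, and numbers are assumed to be unsigned 32-bit integers.

```typescript
// Ascending ASCII string: raw bytes followed by a 0x00 terminator.
function encodeAscString(s: string): number[] {
  const bytes: number[] = [];
  for (let i = 0; i < s.length; i++) bytes.push(s.charCodeAt(i) & 0x7f);
  bytes.push(0x00);
  return bytes;
}

// Descending unsigned 32-bit integer: big-endian bytes with every bit flipped,
// so a larger number produces smaller bytes.
function encodeDescUint32(n: number): number[] {
  const flipped = ~n >>> 0;
  return [(flipped >>> 24) & 0xff, (flipped >>> 16) & 0xff, (flipped >>> 8) & 0xff, flipped & 0xff];
}

function encodeMixedKey(first: string, second: number): Uint8Array {
  return new Uint8Array(encodeAscString(first).concat(encodeDescUint32(second)));
}

// Plain unsigned byte-by-byte comparison; a strict prefix sorts first.
function compareKeys(a: Uint8Array, b: Uint8Array): number {
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return a.length - b.length;
}

const rows: [string, number][] = [["def", 7], ["abc", 2], ["abc", 5], ["def", 10], ["abc", 3]];
const sortedRows = rows
  .slice()
  .sort((p, q) => compareKeys(encodeMixedKey(p[0], p[1]), encodeMixedKey(q[0], q[1])));
// sortedRows: ["abc", 5], ["abc", 3], ["abc", 2], ["def", 10], ["def", 7]
```

Since the ordering lives entirely in the bytes, a plain range read over the "abc" prefix would return its rows from the largest second element to the smallest.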

That is great!

The idea of putting together a specific type for geospatial indices makes sense.


Cool! I’m glad that abstraction was useful to you!


The link is not working any more. I am searching for how to do this; can you please share how this was done?

Thanks

@dallen Sorry for the late reply. For certain reasons I had to take down the previous project repository.

But I have rewritten the whole library to be more performant than the previous release and published it at https://github.com/himpun/collator

Do let me know if you have any questions.

@endyjasmi Many thanks for re-sharing your code.