I will leave the inline replies on GitHub, though I’ll post my reply to your summary comments here as it’s more discussion-oriented in some ways:
- I think this probably needs additional tests. There is a tuple test that was recently added, the `tuple_test.go` file, that should probably be updated with a `uint64` example. It would also be great if the bindingtester for go could be made to generate `uint64` types, but that’s a little hard given that it can’t support negative numbers with 64 bits of information.
I did manually add a test for it in a local stash. Since the test data appears to require manual verification, though, I left it uncommitted until I received some feedback and could set aside time to make sure the end result does not interfere with types I may not be using in my keys.
- I think you ran into this (and I had kind of meant to ask you about it on the forums, but it slipped my queue), but I’m a little concerned that it doesn’t honor the contract that if I give you `x` and you tuple pack it and unpack it, I always get `x` back (in that I could give the tuple encoder a `uint64` containing, say, the integer 1066, encode, decode, and then get the `int64` 1066 back). We already don’t quite honor that contract, so maybe it’s not a huge deal, but it’s something I’m somewhat concerned about.
You are right, I did have to pick a side here. These changes externally behave like the Java implementation, returning an `int64` (`Long`) when the value is in range, which preserves the existing behavior for `int64` values.
At first, I had `decodeInt` return an `int64` only for negative values (its minimum through `-1`), and return a `uint64` for every non-negative value the type can represent. In practice, there are `int64` type assertions in the `directory` package that panic under this approach.
I tend to default to the `switch v := v.(type)` pattern and specify the type cases I expect, but it is generally reasonable in Go to simply assert an `int64` type, as the `directory` package does, so there could also be existing user code that breaks due to `int64` type assertions.
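For illustration, this is the switch pattern I mean, sketched against the two integer types a caller now has to expect from a decode (the helper name is mine, not from the bindings):

```go
package main

import "fmt"

// describe uses the switch v := v.(type) pattern to handle both integer
// types a decode can now produce, instead of a bare int64 assertion that
// would panic when handed a uint64 value.
func describe(v interface{}) string {
	switch v := v.(type) {
	case int64:
		return fmt.Sprintf("int64(%d)", v)
	case uint64:
		return fmt.Sprintf("uint64(%d)", v)
	default:
		return fmt.Sprintf("unexpected %T", v)
	}
}

func main() {
	fmt.Println(describe(int64(1066)))
	fmt.Println(describe(uint64(1) << 63)) // out of int64 range
}
```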
I considered adding a new type code and updating the bindings. You probably already realize this, but like the first approach in Go, it wouldn’t be backwards compatible, affecting Java users who might attempt to decode existing data.
So for those reasons, and given that the Java implementation was already converting `BigInteger` to `Long` where possible, I found the current behavior to be the most intuitive.
I do have to use the `switch v := v.(type)` pattern in situations where the value range spans multiple `int` types. I understand the big picture and why this is a reasonable compromise, so it doesn’t bother me personally, but I can see it as a con of the current approach.
I don’t think we’d be opposed to using `big.Int` from `math/big` in the Go tuple implementation. The Java bindings didn’t always use big integers (they used `Long`s), but when they were updated, we just didn’t also update the Go bindings. My main concern would be that, memory and performance wise, using big integers could have more overhead than one might want. (This issue already exists in the Java bindings, and it’s something that we should probably address sooner rather than later.) But handling larger integers in the general case sounds fine to me.
I now see that using `BigInteger` makes more sense in Java, as unsigned 64-bit support on `Long` is a more recent addition (Java has no unsigned primitive types). If the overhead in the Java implementation is noticeable enough that you might consider deprecating `BigInteger` support, in favor of a byte string for example, it would make sense to me to skip implementing `big.Int` and instead encourage the use of byte slices.
That would of course raise the issue of backwards compatibility, though there might not be many users who rely on `BigInteger` encoding/decoding for sizes larger than the current built-in types. That is purely anecdotal, and I am sure a scientist somewhere would disagree, so some data, or at least a poll raising the performance question, might be helpful. I am not opinionated here for this reason.
I’d be interested to hear any thoughts you have on other potential approaches after reading my comments. I’ll reply to your inline comments shortly.
Thanks for quickly reviewing this PR!