We’ve recently discovered a bug in the tuple layer encoding of the integers 2^64-1 and -2^64+1 and are interested in gathering feedback about some of the options available to address it. The bug is limited to the Python bindings, but nothing about the discussion is really specific to Python, and we’d encourage anyone with an opinion to chime in.
The technical details of the problem aren’t super relevant to the discussion, but I’ve included them here in case you are interested.
First I’ll provide some quick background about how integers are encoded in the tuple layer. Each type in the tuple layer has its own type code(s), and for integers there are 19 of them. There is a set of fixed-length type codes that represent integers of different byte lengths and signs (in order, there is a negative 8-byte type code, then a negative 7-byte type code, …, then a type code for 0, then a positive 1-byte type code, …, and finally a positive 8-byte type code).
In addition to the fixed-length type codes, there is a type code on either end for positive and negative variable length integers up to 255 bytes.
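To make the layout concrete, here is a small sketch that enumerates the 19 integer type codes in sort order. The constant and function names are made up for illustration, and the byte values are my reading of the tuple layer specification rather than something copied from the bindings, so treat them as an assumption.

```python
# Hypothetical names; byte values are assumed from the tuple layer spec.
NEG_INT_VARIABLE = 0x0B  # negative, variable length (up to 255 bytes)
INT_ZERO = 0x14          # the integer 0
POS_INT_VARIABLE = 0x1D  # positive, variable length (up to 255 bytes)


def integer_type_codes():
    """Return (type_code, description) pairs for every integer type code,
    ordered the same way the encoded values sort."""
    codes = [(NEG_INT_VARIABLE, "negative, variable length")]
    # Negative fixed-length codes: 8-byte first, 1-byte last.
    for n in range(8, 0, -1):
        codes.append((INT_ZERO - n, f"negative, {n}-byte"))
    codes.append((INT_ZERO, "zero"))
    # Positive fixed-length codes: 1-byte first, 8-byte last.
    for n in range(1, 9):
        codes.append((INT_ZERO + n, f"positive, {n}-byte"))
    codes.append((POS_INT_VARIABLE, "positive, variable length"))
    return codes


if __name__ == "__main__":
    for code, description in integer_type_codes():
        print(f"0x{code:02x}  {description}")
```

Because the type code is the first byte of the encoding, this ordering is what makes larger-magnitude negatives sort before smaller ones, and smaller positives before larger ones.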
The problem is that in Python, the variable length encoding is used for the problem integers, whereas every other binding uses the 8-byte fixed length encoding.
The Python binding uses a different encoding for these two integers than the other bindings do, but it’s not completely incompatible. All bindings will be able to decode both encodings of the integers, and both encodings still sort correctly. This means that you’ll be able to unpack any tuple that you read from a binding that uses the alternate encoding. However, if you explicitly encode one of these integers using the tuple layer, the result in Python will be different than in other bindings. This means that Python may not be able to interoperate with other bindings in a single database if multiple bindings are used to encode these integers, because a key packed by one binding won’t match the bytes that another binding wrote for the same tuple.
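The difference can be illustrated with a standalone sketch of the two encodings. This is not the bindings’ code: the function names are hypothetical, it handles a bare integer rather than a full tuple, and the type code values are assumed from the tuple layer specification.

```python
INT_ZERO = 0x14          # assumed type code for 0
NEG_INT_VARIABLE = 0x0B  # assumed negative variable-length type code
POS_INT_VARIABLE = 0x1D  # assumed positive variable-length type code
LIMIT = 2**64 - 1        # largest magnitude that fits the 8-byte fixed form


def encode_fixed(value):
    """Fixed-length encoding, valid for magnitudes up to 2**64 - 1."""
    if value == 0:
        return bytes([INT_ZERO])
    n = (abs(value).bit_length() + 7) // 8
    if n > 8:
        raise ValueError("magnitude too large for a fixed-length encoding")
    if value > 0:
        return bytes([INT_ZERO + n]) + value.to_bytes(n, "big")
    # Negatives are offset by 256**n - 1 so that byte order matches numeric order.
    return bytes([INT_ZERO - n]) + (value + 256**n - 1).to_bytes(n, "big")


def encode_variable(value):
    """Variable-length encoding: type code, a length byte, then the payload."""
    if value == 0:
        raise ValueError("zero has its own type code")
    n = (abs(value).bit_length() + 7) // 8
    if n > 255:
        raise ValueError("magnitude too large for the tuple layer")
    if value > 0:
        return bytes([POS_INT_VARIABLE, n]) + value.to_bytes(n, "big")
    # The length byte is inverted for negatives so longer values sort first.
    return bytes([NEG_INT_VARIABLE, n ^ 0xFF]) + (value + 256**n - 1).to_bytes(n, "big")


def decode(data):
    """Decode either encoding of a single integer."""
    code = data[0]
    if code == INT_ZERO:
        return 0
    if code == POS_INT_VARIABLE:
        n = data[1]
        return int.from_bytes(data[2:2 + n], "big")
    if code == NEG_INT_VARIABLE:
        n = data[1] ^ 0xFF
        return int.from_bytes(data[2:2 + n], "big") - (256**n - 1)
    n = abs(code - INT_ZERO)
    payload = int.from_bytes(data[1:1 + n], "big")
    return payload if code > INT_ZERO else payload - (256**n - 1)
```

In this sketch, `encode_fixed(LIMIT)` produces `1c` followed by eight `ff` bytes (what the other bindings write), while `encode_variable(LIMIT)` produces `1d 08` followed by eight `ff` bytes (what Python writes). `decode` returns the same integer for both, and both sort between the encodings of `LIMIT - 1` and `LIMIT + 1`, but the byte strings differ, so a key written one way will not be found by a lookup that packs it the other way.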
Because of the difficulty of providing a general-purpose migration to correct this problem for the data in an existing database, our thought is that any change would involve allowing the user to choose which encoding behavior they want. Here are some of the ways we could do that:
Add the ability to opt in to the correct behavior in the Python bindings. The benefit of this is that it’s unlikely to cause any problems for someone already running the Python bindings, but the downside is that it requires new users to take a proactive step to get the desired behavior. Some users likely won’t and may end up with a database that has a binding interoperability problem that they don’t discover until later.
Change the behavior in API version 610 to require users to opt out of the new behavior if they want to maintain the old behavior. This allows us to phase out the incorrect behavior more effectively, but at the risk that an existing user fails to opt out when they needed to. If this happens, keys that they expect to be in the database may appear to have gone missing.
Change the behavior in API version 610 to require users to choose the behavior they want. This approach mostly avoids the problems of the previous two, but the extra step required for all Python Tuple users would definitely be a wart in the API.
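For comparison, the three options differ only in what happens when the user hasn’t said which encoding they want. The sketch below is purely illustrative: `Rollout`, `IntEncoding`, and `resolve_encoding` are hypothetical names, not a proposed API, and it assumes that under the second and third options anything below API version 610 keeps today’s behavior.

```python
from enum import Enum


class IntEncoding(Enum):
    LEGACY = "legacy"        # today's Python behavior (variable-length encoding)
    CORRECTED = "corrected"  # 8-byte fixed-length encoding, as in other bindings


class Rollout(Enum):
    OPT_IN = "opt-in"            # option 1: old behavior unless the user asks
    OPT_OUT = "opt-out"          # option 2: new behavior at 610 unless the user asks
    MUST_CHOOSE = "must-choose"  # option 3: at 610 the user has to pick


NEW_BEHAVIOR_API_VERSION = 610


def resolve_encoding(rollout, api_version, user_choice=None):
    """Return the encoding to use for the two affected integers."""
    # An explicit choice always wins, whichever option we ship.
    if user_choice is not None:
        return user_choice
    # Option 1, or an older API version: nothing changes by default.
    if rollout is Rollout.OPT_IN or api_version < NEW_BEHAVIOR_API_VERSION:
        return IntEncoding.LEGACY
    # Option 2: silently switch to the corrected encoding.
    if rollout is Rollout.OPT_OUT:
        return IntEncoding.CORRECTED
    # Option 3: refuse to guess.
    raise ValueError("an integer encoding must be chosen explicitly")
```

The risk described for the second option corresponds to the `OPT_OUT` branch: an existing user who bumps the API version without setting `user_choice` would start packing keys with different bytes than the ones already stored.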
It’s also possible we could transition between these states in different releases. For example, we could start with an opt-in and move to an opt-out, or we could require the user to specify for a while and then switch to an opt-out.
How do people feel about this? Would any of these options make you uncomfortable if we implemented them?