Thoughts on cross module read error in special-key-space

At present, a getRange that touches more than one module fails with an error called special_keys_cross_module_read.
I am thinking we can use RangeResultRef’s fields readToBegin and readThroughEnd to make such calls smoother, without raising an error.
At present, readToBegin and readThroughEnd are only set to true if you read to \xff\xff or \xff\xff\xff\xff, respectively. That rarely happens, because you will almost always hit special_keys_cross_module_read first: by default, a module covers only a narrow range, such as \xff\xff/transaction/ to \xff\xff/transaction0.

Instead, we can skip the error when key selectors resolve outside the module’s boundaries and just set readToBegin or readThroughEnd, so the user can tell whether the call crossed the module boundary.
If they want cross-module results, they can set the transaction option SPECIAL_KEY_SPACE_RELAXED.

A potential use case is a read whose key selectors may resolve across a module boundary.
For example, getRange(firstLessOrEqual("\xff\xff/transaction/read_conflict_range/foo"), <end_key>) will throw special_keys_cross_module_read if no read conflict range has a begin key <= foo.
This is not easy to use: the user has to handle an error that they have no good way to deal with when writing the code. The only way to avoid the error is to use key selectors conservatively, which loses some flexibility.
Instead, we can just set readToBegin=true and return the result. Then the user knows there is no begin key <= foo.
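
To make the proposed behavior concrete, here is a minimal sketch in Python. It is a toy model over an in-memory sorted key list, not FoundationDB code: `RangeResult`, `get_range_in_module`, and the sample keys are all hypothetical names made up for illustration. The begin selector is firstLessOrEqual(begin_base) and the end is a plain key.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

@dataclass
class RangeResult:
    # Hypothetical stand-in for the result struct discussed above.
    rows: list = field(default_factory=list)
    read_to_begin: bool = False
    read_through_end: bool = False

def get_range_in_module(sorted_keys, module_begin, module_end, begin_base, end_key):
    """Resolve firstLessOrEqual(begin_base) inside one module, clamping at its left edge."""
    # Treat the module as its own little keyspace: only its keys are visible.
    in_module = [k for k in sorted_keys if module_begin <= k < module_end]
    # firstLessOrEqual(base) is the last key <= base.
    idx = bisect_right(in_module, begin_base) - 1
    result = RangeResult()
    if idx < 0:
        # No key <= base in this module: clamp to the module start and flag it
        # instead of raising special_keys_cross_module_read.
        result.read_to_begin = True
        idx = 0
    result.rows = [k for k in in_module[idx:] if k < end_key]
    return result

prefix = b"\xff\xff/transaction/read_conflict_range/"
keys = [prefix + b"goo", prefix + b"zoo"]
res = get_range_in_module(
    keys, b"\xff\xff/transaction/", b"\xff\xff/transaction0",
    prefix + b"foo", prefix + b"z",
)
print(res.read_to_begin, res.rows)
```

With these sample keys there is no key <= foo, so the call returns the in-range rows with `read_to_begin` set rather than failing.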

The implementation is straightforward to change, so the only question is which behavior is better to use.

I think I like this idea - basically this would mean that each subspace/module will be treated like its own little database unless SPECIAL_KEY_SPACE_RELAXED has been set.

I didn’t think of this before, but this would actually give us quite nice semantics.

I think this makes sense when resolving key selectors if the base keys all lie within a single module. If you specify keys that aren’t contained within a module, then I’m not sure what the best choice is. In particular, the precedent for this in other parts of the key-space is that it would be an error to specify base keys outside of the legal key-space, but key selectors that start within the legal space would clamp to the bounds of that space.

One problem here is that readToBegin and readThroughEnd aren’t actually exposed through the c bindings: https://github.com/apple/foundationdb/blob/a167bf344e87946376f6e02243e37c831c7f7299/bindings/c/fdb_c.cpp#L253. Maybe they should be exposed.
I think this sort of thing makes sense generally for using subspaces for multitenancy. The current c api doesn’t really afford it though. Maybe we could add “boundaries” to the read range api, and consider a read as readThroughEnd if it crosses the right boundary, and readToBegin if it crosses the left boundary. Then you could get key selector semantics as if it’s truly isolated from other subspaces.

Yeah, this is what I mean. Thanks for a concise summary.

I see what you mean here.
The difference between the special keyspace and the general keyspace is that we do not know what the legal keyspace is until we see the base keys of the key selectors. The module containing the base key is the legal key-space to read (when SPECIAL_KEY_SPACE_RELAXED is not set, which is the default). So to be clear, the cases I describe here always satisfy your condition of “resolving key selectors if the base keys all lie within a single module”.
For cases where the begin and end key selectors have base keys in two different modules, it will still throw the special_keys_cross_module_read error.
I think the idea is to achieve what you described here: “key selectors that start within the legal space would clamp to the bounds of that space”.

Ah, I did not know they are omitted from the C bindings. Maybe we can add them?
Ha, I see what you mean.
Having some concept like boundaries in range reads should be quite useful, though I think it would take some effort to add it to the existing general get range code.

I’m not sure it’s strictly necessary that you have these fields at the C level. At least, it probably isn’t a blocker to implementing this feature, even though maybe there could be valid uses of having that information.

Yeah, it is not necessary to have them in the C API to implement this. I will implement it first and create an issue about adding those two fields to the C API, which can be discussed separately.

Do you have a reference to the precedent?

I’m wondering: when the error gets thrown, will it still return the results that fall within the legal key space?

For example, say, legal key space is [m, x); the begin key selector’s base key is b; the end key is n. Will the precedent code still return the result in [m, n)?


The proposal here seems to allow the result in [m, n) to be returned.
@zjuLcg Does the proposal throw the error special_keys_cross_module_read in the above example?

For this example, it will throw the special_keys_cross_module_read error. My proposal covers the case where the base keys of the begin and end key selectors belong to the same module. Then, by default (SPECIAL_KEY_SPACE_RELAXED is not set), we treat the getRange as happening only within this module’s range. That means even if a key selector resolves outside the range, we still return the result and set readToBegin or readThroughEnd to true for reference.

More context:
My original post is not clear, and it looks like I cannot update it now. This one should be clearer.
The goal is like Markus said, “Each module is treated like its own little keyspace by default”.
There are two cases:

  1. The base keys of the begin key selector and the end key selector belong to different modules.
  2. They belong to the same module, but one or both of them cross the module’s boundary during resolution.

The current implementation will throw special_keys_cross_module_read in both cases. The proposal is to make the second case legal and set readToBegin or readThroughEnd for reference.
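
The two cases can be sketched as a small check on the base keys. This is an illustrative Python model only, not the actual implementation: the `other` module, the function names, and the choice to also reject base keys that fall in no module are assumptions made for the example.

```python
# Hypothetical module table. Only the transaction range comes from this thread;
# the "other" module is made up for illustration.
MODULES = {
    "transaction": (b"\xff\xff/transaction/", b"\xff\xff/transaction0"),
    "other": (b"\xff\xff/other/", b"\xff\xff/other0"),
}

class SpecialKeysCrossModuleRead(Exception):
    """Toy stand-in for the special_keys_cross_module_read error."""

def module_of(key):
    # The end bound is inclusive so an end key equal to the module end still matches.
    for name, (begin, end) in MODULES.items():
        if begin <= key <= end:
            return name
    return None

def legal_bounds(begin_base, end_base, relaxed=False):
    """Return the (begin, end) range that key selector resolution is clamped to."""
    if relaxed:
        # SPECIAL_KEY_SPACE_RELAXED: no per-module clamping in this model.
        return None
    begin_module = module_of(begin_base)
    # Case 1: base keys in different modules (or, by assumption, in no module).
    if begin_module is None or begin_module != module_of(end_base):
        raise SpecialKeysCrossModuleRead((begin_base, end_base))
    # Case 2: same module, so resolution clamps to this module's bounds.
    return MODULES[begin_module]

print(legal_bounds(b"\xff\xff/transaction/a", b"\xff\xff/transaction/z"))
```

Only the base keys decide whether the read is legal; where the selectors end up after resolution affects just the readToBegin and readThroughEnd flags.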

IIUC, the precedent here refers to normalKeys (“” to “\xff”) and systemKeys (“\xff” to “\xff\xff”). By default (read_system_keys is disabled), if either of your two base keys is in systemKeys, the read throws the error key_outside_legal_range. If both base keys are in normalKeys, the results clamp to the boundary of normalKeys and the read never throws key_outside_legal_range. In other words, key resolution only happens within the normal key space.
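
A rough Python model of that precedent, as I understand it from the description above. It is a simplified sketch over an in-memory key list; the function name, the exact comparison against the maximum key, and the sample keys are assumptions, not the real client code.

```python
from bisect import bisect_right

NORMAL_END = b"\xff"      # end of normalKeys
SYSTEM_END = b"\xff\xff"  # end of systemKeys

class KeyOutsideLegalRange(Exception):
    """Toy stand-in for the key_outside_legal_range error."""

def resolve_first_greater_than(sorted_keys, base, read_system_keys=False):
    """Resolve firstGreaterThan(base), clamped to the end of the legal key space."""
    max_key = SYSTEM_END if read_system_keys else NORMAL_END
    # The base key itself must lie inside the legal space, otherwise it is an error.
    if base > max_key:
        raise KeyOutsideLegalRange(base)
    idx = bisect_right(sorted_keys, base)
    # A selector that starts inside the legal space never errors: it clamps.
    if idx < len(sorted_keys) and sorted_keys[idx] < max_key:
        return sorted_keys[idx]
    return max_key

keys = [b"a", b"m", b"\xff/conf"]
print(resolve_first_greater_than(keys, b"a"))  # next normal key
print(resolve_first_greater_than(keys, b"m"))  # clamps at the end of normalKeys
```

The per-module proposal applies the same rule one level down, with the module range playing the role of normalKeys.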

Ah-ha, this description is crystal clear. Thanks for that!

Looking forward to your PR.

BTW, remember to add the semantics definition of this extension (feature) to the documentation in your PR. :)