FunctionKeyExpression is not honored when evaluating the QueryPlan

FunctionKeyExpression don’t seem to be honored in the evaluation of a query plan.
We came across a test casewhich exposes this .
Is there any plan to make FunctionKeyExpressions allowed to be picked for indexes ?
Are there any workarounds for the same ?

No, queries cannot currently use FunctionKeyExpressions. The main problem is that the they currently have to way to express the structure of the index keys that they produce in a manner that would be meaningful to the planner. For example, you could have a function that picks the third letter of every string field in the record and concatenates them to produce a fan-out index key — there planner has absolutely no insight into what is going on in the guts of the function to know how query arguments could be applied against the index.

There is work going on in the planner that I will let others chime in on that may pave a path to having fully exposed function based indexes.

What sort of index are you trying to create?

Thank you for the insight @scgray.

Our use case is pretty straightforward, We’re trying to create Uppercase string indexes on a few fields of our data.

Oh, ok, then you probably want to take a look at collating indexes. These are in the fdb-record-layer-icu package. Here is an example: https://github.com/FoundationDB/fdb-record-layer/blob/master/fdb-record-layer-icu/src/test/java/com/apple/foundationdb/record/icu/TextCollatorICUTest.java

Ah ok! Let me take a look at that. I’ll come back with my findings.

First, I will second Scott’s recommendation that you try a collating index with case-insensitivity (3) strength. They are designed for just this kind of use.

That said, an index defined with a simple function, such as one that uppercases, on a field can in fact be used in a query. The restriction is that the entire key expression that defines the index must be used verbatim in the query with [in]equality predicates. And the FunctionKeyExpression implementation must be marked as a QueryableKeyExpression.

So, given UPPER(f), you can do the equivalent of `WHERE UPPER(f) >= ‘e’ AND UPPER(f) <= ‘m’ to scan such an index between those bounds.

What doesn’t work is planner support for “understanding” what the index does. An imaginary case of this would be something like WHERE UPPER(f >= 'e' AND f <= 'm').

I have created a pull request to fix the test that you found to demonstrate the simple case that does work today.

Hi Mike,

I have the same use-case. I am trying to create a uppercase testCollate index and referring to FDBCollateQueryTest class but my query is not behaving as expected. I’m not sure if my inference is correct.

Message Order {
         int32 order_id = 1;
         int32 price = 2;
         string buyer = 3;
}

I have inserted data with values of buyer : [“foo”, “bar”, “vibhuti”]. My use-case is that user queries buyer: “FOO” and it returns the order with value “foo” (case-insensitive search). For the above model, the index that I have created is:

KeyExpression NAME_FIELD = function(CollateFunctionKeyExpressionFactoryJRE.FUNCTION_NAME, field("buyer"));
builder.addIndex("order", "buyerIndex", NAME_FIELD );

And the query that I’m firing is (to make sure it picks up this index):

RecordQuery query = RecordQuery.newBuilder().setRecordType("Order").setFilter(Query.field("buyer").equalsValue("vibhuti")).setAllowedIndex("buyerIndex").setRequiredResults(field("buyer)).setSort(field("buyer")).build();

When I fire this query, it throws me an error stating “Cannot sort without appropriate index: Field { ’ buyer’ None}.” The index has already been registered (I have verified this with an IndexScan.). If I remove the setSort method, it does a full range scan instead.

Is the index I have constructed correct? If not, could you please guide me as to how to proceed further?
Thanks.

Your query and sort expressions must match the collating index; that is, they must involve the collate function expression.

Unlike in SQL, where COLLATE clauses affect the behavior of the column itself in indexes and queries, in the record layer that remains unaffected and applications of a collation function are indexed and queried.

Specifically, instead of Query.field("buyer") use Query.keyExpression(NAME_FIELD) and instead of Key.Expressions.field("buyer") again NAME_FIELD. Review the test class you referred to again and note how it does that.

Since there is an equality filter on the collated field, the sort won’t make any difference in this query. But I assume that it will in the actual use case.

Ahh got it. Yes the use-case is more complex than this. It works now.
Thanks a ton.