Batch priority transactions

I have a workload which has three tasks:

A. Inserting new keys
B. Aggregating those inserted keys through some function which takes N keys as input and outputs 1 key. This task performs many small range reads of the un-aggregated keys and decides to aggregate when the number of keys in a given range grows too large (a rough sketch of this step follows the list).
C. Reading the aggregated and un-aggregated keys in a range at snapshot isolation
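
For concreteness, here's roughly what B's read-then-aggregate step looks like, sketched with the Python bindings; `aggregate_fn`, `AGG_THRESHOLD`, and the key layout are all simplified placeholders, not my real code:

```python
import fdb

fdb.api_version(630)
db = fdb.open()

AGG_THRESHOLD = 100  # hypothetical cut-off for "too many un-aggregated keys"

@fdb.transactional
def maybe_aggregate(tr, begin, end, aggregate_fn):
    # Small range read of the un-aggregated keys in [begin, end).
    raw = list(tr.get_range(begin, end))
    if len(raw) < AGG_THRESHOLD:
        return False
    # Collapse the N raw keys into one aggregated key and clear the originals.
    agg_key, agg_value = aggregate_fn(raw)
    tr.clear_range(begin, end)
    tr[agg_key] = agg_value
    return True
```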

Task B is critical to ensuring Task C completes in a timely manner. Task A is also important, but mostly useless if B doesn't happen soon enough.

Is it reasonable to use a batch priority transaction for Task C? Retrying later is generally OK for that task, so I'm hoping batch priority means read operations from Tasks A and B will always be serviced first when they run concurrently with operations from Task C.
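
For reference, this is roughly what I have in mind for Task C, sketched with the Python bindings (whether `.snapshot` is right depends on whether C should take read conflict ranges):

```python
import fdb

fdb.api_version(630)
db = fdb.open()

@fdb.transactional
def task_c_read(tr, begin, end):
    # Batch priority: ratekeeper only lets this transaction through when there
    # is spare capacity, so normal-priority work from A and B goes first.
    tr.options.set_priority_batch()
    # Snapshot read: a consistent view without adding read conflict ranges.
    return list(tr.snapshot.get_range(begin, end))
```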

Batch priority transactions can be indefinitely starved while one storage server is failed. That's acceptable for purely batch work, but I'm assuming it isn't for your use case?

Have you considered Transaction Tagging (added in FoundationDB 6.3) for this? You should be able to let B choose from N random tags while A and C each have one, and thus effectively prioritize B higher than A or C.
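
Roughly, with the Python bindings (the tag names and N here are arbitrary, and this relies on 6.3's tag auto-throttling rather than a hard priority):

```python
import random

import fdb

fdb.api_version(630)
db = fdb.open()

N_TAGS = 8  # arbitrary; more tags spread B's traffic thinner per tag

@fdb.transactional
def task_a_insert(tr, key, value):
    # All of A's traffic shares one tag, so if anything gets auto-throttled
    # on a hot storage server, A's tag is a likely candidate.
    tr.options.set_auto_throttle_tag(b"task_a")
    tr[key] = value

@fdb.transactional
def task_b_read(tr, begin, end):
    # B spreads its traffic across N tags, so each individual tag looks small
    # to the throttler and B is much less likely to be slowed down.
    tag = ("task_b_%d" % random.randrange(N_TAGS)).encode()
    tr.options.set_auto_throttle_tag(tag)
    return list(tr.get_range(begin, end))
```

C would set a single tag of its own the same way as A, so under load the throttler is far more likely to slow A or C than any one of B's tags.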

I haven’t tried that, but I’ll check it out. Thanks.