Watch cancellation and conflicts caused by watches (?)

I have 2 questions regarding watches:

1. The Javadoc says “Because a watch outlives the transaction that creates it, any watch that is no longer needed should be cancelled.” How do I cancel a watch in Java when I do not need it anymore?

2. I got into a situation where a transaction in a concurrent system fails with "Transaction not committed due to conflict with another transaction". I read some keys with snapshot isolation and some in normal mode, set some values, and then set a watch on a key that I have not read.

The conflicts API shows that the conflicting ranges for the transaction contain only the key I set the watch on. Multiple other transactions update the watched key (using set or atomic add), but they do not read it either.

The only conflict range (single key range I set the watch on):
{conflict=FROM ((235, "main-queue", 27)) TO ((235, "main-queue", 27, null))}

I think setting a watch on a key without reading it should not cause a transaction conflict. Am I missing something? Is there a way to set a watch in a transaction where I modify entries and do not care whether the watched value changed while the transaction was ongoing?

Edit: I tried 6.3.24 and built 7.1.21, same result.

Simple reproducer:

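    import com.apple.foundationdb.FDB;
    import com.apple.foundationdb.MutationType;
    import java.util.concurrent.ThreadLocalRandom;
    // The import for @Test depends on the JUnit version in use.

    @Test
    public void watchCausesConsistencyFailure() {

        var db = FDB.selectAPIVersion(710)
                .open();

        // Each updater blindly increments "foo" (atomic add, no read) in a loop.
        Runnable updater = () -> {
            while (true) {
                db.run(tr -> {
                    // Add 1 as a little-endian 64-bit integer.
                    tr.mutate(MutationType.ADD, "foo".getBytes(), new byte[]{1, 0, 0, 0, 0, 0, 0, 0});
                    try {
                        // Keep the transaction open for a while to widen the race window.
                        Thread.sleep(ThreadLocalRandom.current().nextLong(5, 300));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    return null;
                });
            }
        };

        // Start 5 updater threads
        for (int i = 0; i < 5; i++) {
            new Thread(updater)
                    .start();
        }

        while (true) {
            db.run(tr -> {
                // Disable retries so that a conflict surfaces as an exception.
                tr.options()
                        .setRetryLimit(0);

                // Modify unrelated key
                tr.set("bar".getBytes(), ("something-" + ThreadLocalRandom.current().nextInt()).getBytes());

                // Set a watch on "foo" without ever reading it
                tr.watch("foo".getBytes());
                try {
                    Thread.sleep(ThreadLocalRandom.current().nextLong(5, 300));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return null;
            });

            // Printed only when the commit succeeded without a conflict.
            System.out.println("no problem");
        }
    }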

Looks like I missed that the CompletableFuture’s cancel() method is fully implemented in NativeFuture and can be used, so the first question is clear now.

I’ve also hit this second problem.

My expectation is that if I set a watch on a key that has the value A, then even if that value changes to B due to another transaction before the watch is committed, the commit will succeed and the watch will fire immediately.

Is there any way to get that working?

How would you describe the contract of such a watch? Today the contract is: fire when the value differs from what it was at the time this transaction committed. So semantically, this is a read.

I think what you want is different. You want to get notified if a value changes after the start of a transaction, not the end. The way to achieve this is by using two transactions:

  1. Create transaction T1
  2. Create transaction T2 and set the read version to the read version of T1
  3. Create the watch in T2
  4. Commit T2
  5. Do your reads and writes in T1
  6. Commit T1
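
The six steps above can be sketched roughly as follows. To keep the sketch self-contained, `Tx` and `Db` are hypothetical stand-in interfaces, not the real binding types; their method names are chosen to mirror `createTransaction()`, `getReadVersion()`, `setReadVersion(long)`, `watch(byte[])` and `commit()` from the FoundationDB Java binding. The helper name `runWithWatch` is also made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

final class TwoTransactionWatch {
    /** Hypothetical stand-in for the binding's Transaction; not the real type. */
    interface Tx extends AutoCloseable {
        CompletableFuture<Long> getReadVersion();
        void setReadVersion(long version);
        CompletableFuture<Void> watch(byte[] key);
        CompletableFuture<Void> commit();
        @Override
        void close();
    }

    /** Hypothetical stand-in for the binding's Database. */
    interface Db {
        Tx createTransaction();
    }

    /** Runs {@code work} in T1 and returns a watch that was installed by T2 at T1's read version. */
    static CompletableFuture<Void> runWithWatch(Db db, byte[] watchedKey, Consumer<Tx> work) {
        try (Tx t1 = db.createTransaction();     // step 1
             Tx t2 = db.createTransaction()) {   // step 2
            t2.setReadVersion(t1.getReadVersion().join());
            CompletableFuture<Void> watch = t2.watch(watchedKey); // step 3
            t2.commit().join();                                   // step 4
            try {
                work.accept(t1);                                  // step 5
                t1.commit().join();                               // step 6
            } catch (RuntimeException e) {
                watch.cancel(true); // do not leak the watch if T1 fails
                throw e;
            }
            return watch;
        }
    }
}
```

The sketch has no retry loop; with the real binding, T1 would still need the usual retry handling that `db.run` normally provides, and the watch would have to be re-created on each attempt.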

I solved the original issue back in the day (sorry for not updating topic) by

  1. fetching the expected state (a UUID) of the watched key in the original transaction via snapshot isolation (the state at the START of the transaction, not at commit) and saving this expected state: runtime/src/main/java/io/gitlab/qfoundation/watch/WatchManagerImpl.java · main · QFoundation / QFoundation · GitLab
  2. launching a separate transaction in this method that immediately fires a future manually if the expected state has changed, and otherwise sets the watch: runtime/src/main/java/io/gitlab/qfoundation/watch/WatchManagerImpl.java · main · QFoundation / QFoundation · GitLab
  3. Also note that if you are using timeouts and the timeouts fire often (as in my case), you’ll hit the 100,000 active watches limit, because the Java binding does not properly clean up the watches in case of errors. I added a WatchResource wrapper. Because of virtual threads, it uses synchronous blocking instead of exposing the future, but you get the idea:
    1. runtime/src/main/java/io/gitlab/qfoundation/watch/WatchResource.java · main · QFoundation / QFoundation · GitLab
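
A minimal sketch of step 2, under assumptions: `Tx` is a hypothetical stand-in interface whose methods mirror `get(byte[])` and `watch(byte[])` of the Java binding, and `watchUnlessChanged` is a made-up name. It is not the code from the linked repository, only an illustration of "fire immediately if the expected state has changed, otherwise set the watch".

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;

final class ExpectedStateWatch {
    /** Hypothetical stand-in for the binding's Transaction; not the real type. */
    interface Tx {
        CompletableFuture<byte[]> get(byte[] key);
        CompletableFuture<Void> watch(byte[] key);
    }

    /**
     * Returns an already completed future if the key no longer holds the
     * expected value, and a real watch future otherwise.
     */
    static CompletableFuture<Void> watchUnlessChanged(Tx tx, byte[] key, byte[] expected) {
        byte[] current = tx.get(key).join();
        if (!Arrays.equals(current, expected)) {
            // The change already happened: fire manually, install no watch.
            return CompletableFuture.completedFuture(null);
        }
        return tx.watch(key);
    }
}
```

The caller can treat both results the same way, since a completed future behaves like a watch that has already fired.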

You can see the workflow in this class: the worker (poll) loop fetches the watch state in line 70 and saves it as a class variable. Then, when the poll is done, we create the new WatchResource in line 111. If the state has changed since then, this resource will contain an already completed future without setting any watches; otherwise, it will contain a watch future. In the next loop iteration, at line 100, we await the watch, but with a safety timeout so that the poll happens regardless of the watch. The watch is then cancelled and properly cleaned up.
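
The wrapper idea can be sketched with the standard library alone. `WatchHandle` below is a hypothetical class, not the WatchResource from the linked repository; it assumes only that the watch is a `CompletableFuture<Void>`, which is what `watch(byte[])` returns in the Java binding, and that `cancel()` on it releases the watch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical wrapper that guarantees a watch future is cancelled when it goes out of scope. */
final class WatchHandle implements AutoCloseable {
    private final CompletableFuture<Void> future;

    WatchHandle(CompletableFuture<Void> future) {
        this.future = future;
    }

    /** Handle that counts as already fired, for when the state changed before a watch was set. */
    static WatchHandle alreadyFired() {
        return new WatchHandle(CompletableFuture.completedFuture(null));
    }

    /** Blocks until the watch fires or the timeout elapses; returns true only if it fired. */
    boolean await(long timeout, TimeUnit unit) {
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            return false; // safety timeout: the caller polls anyway
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("watch failed", e.getCause());
        }
    }

    /** Cancels the watch; this is a no-op if the future has already completed. */
    @Override
    public void close() {
        future.cancel(true);
    }
}
```

Used in a try-with-resources block, the handle is cancelled on every exit path, including timeouts and exceptions, so abandoned watches do not accumulate.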


Thank you for this example! I was thinking of roughly this workaround but wasn’t aware that the read version could be set. My version was to create the watch under T1, commit it, do all the other work in T2, and only then await T1’s watch, but in my version we risk doing unnecessary work that T2 already caught.

I still think that to get the most performance, having a way of setting a watch without a read conflict would be ideal - otherwise on constantly changing keys even this tiny T2 transaction in your example will, I think, conflict reasonably often.

The contract would be something like: fire when the value is different than what it was at the read version of this transaction.

That’s exactly what my two transaction approach does.

No, it will never conflict. Here’s what’s happening:

  1. Create transaction T1 → just allocates memory, doesn’t really do anything yet
  2. Create transaction T2 with read version of T1 → T1 now needs to get a read version, this will create one network request. If you are super concerned about performance, you can create T2 after you did the first read with T1, then getting the read version will be free. But of course if your T1 only does blind writes, this isn’t an option.
  3. Create watch on T2 → client will read the value of the provided key so you get one round trip to a storage server
  4. Commit T2 → will never conflict, because the write conflict set of T2 is empty. Note that the commit itself won’t even send a commit request to FDB.
    1. In the commit pipeline, T2 might install watches on the storage server (there’s an optimization whereby even this step might be omitted if you have multiple watches on the same key, but this is non-trivial to explain here).
  5. T1 runs normally to completion

In other words: I don’t think there’s a way to do anything more efficient on the FDB side.
