Changefeeds (watching and getting updates on ranges of keys)

I’m afraid that this thread, like the previous one about watches, might leave readers with the impression that watches are a lot harder to use correctly than (I think) they actually are.

The basic idea of a watch is that you have something that you could monitor with a polling loop:

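import time
import fdb
fdb.api_version(710)  # assumption: use the API version your installed client supports
db = fdb.open()

@fdb.transactional
def get_light_switch_state(tr):
    # Keys and values are byte strings in the Python bindings.
    return tr[b"light_switch"] == b"on"

while True:
    state = get_light_switch_state(db)
    set_lamp_state(state)  # set_lamp_state is application-defined
    time.sleep(1)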

But this will either be slow to react (if the sleep is long) or use lots of resources (if the sleep is short). So you can replace the sleep with a watch:

@fdb.transactional
def get_light_switch_state(tr):
    # Read the state and register the watch in the same transaction.
    return tr[b"light_switch"] == b"on", tr.watch(b"light_switch")

while True:
    state, watch = get_light_switch_state(db)
    set_lamp_state(state)
    watch.wait()  # blocks until the watch fires

and for the most part it will “poll” only when the state changes, but pretty quickly when it does, so you get the best of both worlds. Watches should be totally reliable for anything that polling would in principle work for.

Of course, if you change the light switch state and then change it back quickly, either version can “miss” that “ABA” update. Neither of these methods should be used when the goal is to produce a log of updates. The preferred way to do the latter with FDB is to maintain such a log transactionally when doing the updates (as @alloc explains above). FDB also has the ability to log all updates to a given key range (this capability is used by the backup and DR tooling), but this facility is somewhat dispreferred and underdocumented because there is no way to provide FDB’s excellent backward compatibility to applications dependent on the format of these logs. Such applications are likely to need updating to work correctly with new major versions of the database, and that’s a significant disadvantage.
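
To make the ABA point concrete, here is a minimal in-memory sketch. It is not FDB code: `LoggedSwitch`, `set_state` and `read_log` are made-up names, and a Python list stands in for a log that a real layer would write in the same transaction as the update itself.

```python
class LoggedSwitch:
    """In-memory stand-in for a key plus an update log kept alongside it."""

    def __init__(self):
        self.state = "off"
        self.log = []  # append-only list of (sequence_number, new_state)

    def set_state(self, new_state):
        # In FDB these two writes would share one transaction, so the log
        # could never disagree with the data.
        self.state = new_state
        self.log.append((len(self.log), new_state))

    def read_log(self, from_sequence=0):
        """Return every logged update at or after from_sequence."""
        return self.log[from_sequence:]


switch = LoggedSwitch()
seen_before = switch.state   # an observer samples the state: "off"
switch.set_state("on")       # A -> B
switch.set_state("off")      # B -> A, before the observer looks again
seen_after = switch.state    # the observer samples again: still "off"

print(seen_before == seen_after)  # True: sampling saw no change
print(switch.read_log())          # [(0, 'on'), (1, 'off')]
```

An observer that only samples the state (by polling or by a watch) sees "off" both times, while a reader of the log sees both updates.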

Now to the original question: how can a layer or application best implement reactive features using FDB, where the data to be monitored for changes is not limited to a single key? I think the most attractive design is to maintain an index of what is currently being watched, and at update time only keep track of changes for the watched things. Depending on your data model and use case, you could make various decisions about what granularity to track each of these things at. For your requirements, you would probably only track which keys have changed since the watch was created, rather than keeping a log, but the latter option is available for use cases that need it.
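
A rough sketch of that design, again as an in-memory model rather than FDB code. All names here (`WatchIndexStore`, `register`, `write`, `poll`, the `signal` counter) are hypothetical, and the dicts stand in for subspaces; in a real layer each method body would be one transaction. One assumption I am adding: each watcher gets a single "signal" key that is bumped on every relevant change, so the watcher needs only one ordinary single-key watch.

```python
class WatchIndexStore:
    def __init__(self):
        self.data = {}      # the application's key-value data
        self.watchers = {}  # key -> set of watcher ids watching that key
        self.changed = {}   # watcher id -> keys changed since its last poll
        self.signal = {}    # watcher id -> counter the watcher would watch

    def register(self, watcher_id, keys):
        """Start tracking changes to `keys` on behalf of `watcher_id`."""
        for key in keys:
            self.watchers.setdefault(key, set()).add(watcher_id)
        self.changed.setdefault(watcher_id, set())
        self.signal.setdefault(watcher_id, 0)

    def write(self, key, value):
        """Update a key; do extra bookkeeping only if someone watches it."""
        self.data[key] = value
        for watcher_id in self.watchers.get(key, ()):
            self.changed[watcher_id].add(key)
            self.signal[watcher_id] += 1

    def poll(self, watcher_id):
        """Return and clear the keys changed since the last poll."""
        keys = self.changed[watcher_id]
        self.changed[watcher_id] = set()
        return sorted(keys)


store = WatchIndexStore()
store.register("lamp-ui", ["light_switch", "dimmer"])
store.write("light_switch", "on")
store.write("thermostat", "21")    # unwatched: no bookkeeping at all
store.write("light_switch", "off")

print(store.poll("lamp-ui"))  # ['light_switch']
print(store.poll("lamp-ui"))  # []
```

This version tracks only which keys changed, not every intermediate value; swapping the `changed` set for an append-only list would give the log variant for use cases that need it.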

Are there any features at the FDB level that would make this sort of thing more efficient? Range watches analogous to our single-key watches are feasible, and in many cases would provide a nice interface much like the light switch example above. But when one fired you would have to read the entire range to find out what changed, which makes them generally less attractive than single-key watches. I think they would be unlikely to be the most efficient implementation of what you want, and they would come at some performance cost (the data structures for range watches on the storage server being somewhat slower than those for individual keys). So I’m open to suggestions, but my first guess is that this is a very desirable layer feature but that the FDB API already offers the necessary low-level tools.