DR configuration

Three questions about DR:

a) Is it possible to start dr_agent from /etc/foundationdb/foundationdb.conf?

b) Since dr_agent already knows the source and target fdb.cluster files, why does fdbdr require the cluster files as well?

c) To keep a DR database in sync, do you repeatedly call fdbdr start?

Best wishes
Marcus

I… uh… apparently not? @ajbeamon or @SteavedHams, am I confused here? This would almost be possible, except fdbmonitor seems to force a --cluster_file option, and -s and -d don’t have long forms?

DR maintains metadata in the source and destination clusters about the DR, so fdbdr needs to be able to create/modify/delete that metadata in order to affect DRs.

No, once you start the DR between the two clusters with fdbdr, the dr_agent should pick up the work of keeping the two databases in sync.

There are long forms for those parameters, named --source and --destination, but as you say there is no --cluster_file.

However, the reason that --cluster_file seems to be required is that it is typically included in the [general] section, which holds a few special properties as well as any flag that you want to be used in all of your processes. I think if you move --cluster_file to the processes that need it (e.g. [fdbserver] and [backup_agent]), then you should be ok. A working conf may look something like this, then:

...

[general]
restart_delay = 60

[fdbserver]
command = /usr/sbin/fdbserver
# Moved from [general]
cluster_file = /etc/foundationdb/fdb.cluster
...

[fdbserver.4500]

[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent
logdir = /var/log/foundationdb
cluster_file = /etc/foundationdb/fdb.cluster

[backup_agent.1]

[dr_agent]
# Note: the packages don't seem to install dr_agent, so I had to create it manually (a symlink to backup_agent named dr_agent is sufficient).
command = /usr/lib/foundationdb/backup_agent/dr_agent
logdir = /var/log/foundationdb
source = /etc/foundationdb/source.cluster
destination = /etc/foundationdb/dest.cluster

# Must use a different ID than any other process (we can't reuse 1 since it is used by the backup_agent).
[dr_agent.2]
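The dr_agent symlink mentioned in the conf could be created along these lines. This is only a sketch: link_dr_agent is a made-up helper name, and the directory you pass it is assumed to be wherever your packages installed backup_agent.

```shell
# Hypothetical helper: create a dr_agent symlink next to backup_agent,
# unless a symlink with that name is already there.
link_dr_agent() {
    agent_dir=$1
    [ -L "$agent_dir/dr_agent" ] || ln -s backup_agent "$agent_dir/dr_agent"
}
```

With the paths from the conf above, that would be `link_dr_agent /usr/lib/foundationdb/backup_agent`, most likely run as root.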

That said, I seemed to have some trouble running both a backup agent and a DR agent at the same time for reasons I haven’t yet determined. I’ll report back if I figure out what’s wrong.

EDIT: my problem was that the ID used for each process has to be unique, even across different sections. In other words, you can’t use the same number 1 in both [backup_agent.1] and [dr_agent.1]. I’ve updated the configuration above to include a backup agent as well.
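A quick way to catch this kind of ID clash before restarting fdbmonitor might be a small check like the one below. It is a sketch, not an official tool: duplicate_ids is a made-up name, and it simply assumes process sections are written as [name.ID] at the start of a line.

```shell
# Hypothetical helper: print every ID that is used by more than one
# [name.ID] section header in the given conf file.
duplicate_ids() {
    conf=$1
    # Keep only headers of the form [name.ID], reduce each to its ID,
    # then list the IDs that occur more than once.
    sed -n 's/^\[[^].]*\.\([^]]*\)\].*$/\1/p' "$conf" | sort | uniq -d
}
```

Running `duplicate_ids /etc/foundationdb/foundationdb.conf` should print nothing for a conf where every process ID is distinct; any ID it does print is shared by two or more sections.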


If you’re asking why the dr_agent doesn’t do this update in the secondary cluster in response to a modification by fdbdr in the source cluster, I think the main reason is that it’s possible to run multiple DR agents with the same source cluster and different destination clusters. In that case, it would be ambiguous which DR you wanted to interact with if you only specified the source. There’s probably some safety benefit to being explicit, as well.