Run fdbbackup/fdbrestore on a TLS-enabled cluster

We have a TLS-enabled v6.0 cluster. I tried to run “fdbbackup status” and it hangs. fdbcli has command-line options for TLS, but fdbbackup --help shows no TLS-related options. How do I run fdbbackup on a TLS cluster? If fdbbackup doesn’t support TLS, how do I do backup/restore for a TLS-enabled cluster? Thanks in advance.


It looks like there are TLS options to me?

$ ./bin/fdbbackup --help | grep tls
  --tls_certificate_file CERTFILE
  --tls_ca_file CERTAUTHFILE
  --tls_key_file KEYFILE
  --tls_password PASSCODE
  --tls_verify_peers CONSTRAINTS
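
For reference, on binaries that have these flags, an invocation looks roughly like the sketch below. The certificate paths and the `Check.Valid=1` constraint are placeholders; substitute your own certificate material and peer-verification rules.

```shell
# Placeholder paths -- replace with your actual certificate material.
TLS_CERT=/etc/foundationdb/certs/fdb.crt
TLS_KEY=/etc/foundationdb/certs/fdb.key
TLS_CA=/etc/foundationdb/certs/ca.crt

# Only invoke fdbbackup if it is on PATH (this sketch has no cluster to talk to).
if command -v fdbbackup >/dev/null 2>&1; then
  fdbbackup status -C /etc/foundationdb/fdb.cluster \
    --tls_certificate_file "$TLS_CERT" \
    --tls_key_file "$TLS_KEY" \
    --tls_ca_file "$TLS_CA" \
    --tls_verify_peers "Check.Valid=1"
fi
```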

Alex, which version of FDB are you using? I am using 6.0 and fdbbackup has no tls option.

root@xid2-storage-02-6d47f95bdd-2bsdq:~# fdbbackup --help|grep tls
root@xid2-storage-02-6d47f95bdd-2bsdq:~# fdbbackup --help          
FoundationDB 6.0 (v6.0.15)
Usage: fdbbackup (start | status | abort | wait | discontinue | pause | resume | expire | delete | describe | list) [OPTIONS]

  -C CONNFILE    The path of a file containing the connection string for the
                 FoundationDB cluster. The default is first the value of the
                 FDB_CLUSTER_FILE environment variable, then `./fdb.cluster',
                 then `/etc/foundationdb/fdb.cluster'.
  -d, --destcontainer URL
                 The Backup container URL for start, describe, expire, and delete operations.
                 Backup URL forms:

                     blobstore://<api_key>:<secret>@<host>[:<port>]/<name>[?<param>=<value>[&<param>=<value>]...] (Note: The 'bucket' parameter is required.)

  -b, --base_url BASEURL
                 Base backup URL for list operations.  This looks like a Backup URL but without a backup name.
  --blob_credentials FILE
                 File containing blob credentials in JSON format.  Can be specified multiple times for multiple files.  See below for more details.
  --expire_before_timestamp DATETIME
                 Datetime cutoff for expire operations.  Requires a cluster file and will use version/timestamp metadata
                 in the database to obtain a cutoff version very close to the timestamp given in YYYY-MM-DD.HH:MI:SS format (UTC).
  --expire_before_version VERSION
                 Version cutoff for expire operations.  Deletes data files containing no data at or after VERSION.
  --restorable_after_timestamp DATETIME
                 For expire operations, set minimum acceptable restorability to the version equivalent of DATETIME and later.
  --restorable_after_version VERSION
                 For expire operations, set minimum acceptable restorability to the VERSION and later.
                 For describe operations, lookup versions in the database to obtain timestamps.  A cluster file is required.
  -f, --force    For expire operations, force expiration even if minimum restorability would be violated.
  -s, --snapshot_interval DURATION
                 For start operations, specifies the backup's target snapshot interval as DURATION seconds.  Defaults to 864000.
  -e ERRORLIMIT  The maximum number of errors printed by status (default is 10).
  -k KEYS        List of key ranges to backup.
                 If not specified, the entire database will be backed up.
  -n, --dry-run  For start or restore operations, performs a trial run with no actual changes made.
  -v, --version  Print version information and exit.
  -w, --wait     Wait for the backup to complete (allowed with `start' and `discontinue').
  -z, --no-stop-when-done
                 Do not stop backup when restorable.
  -h, --help     Display this help and exit.


     Blob account secret keys can optionally be omitted from blobstore:// URLs, in which case they will be
     loaded, if possible, from 1 or more blob credentials definition files.

     These files can be specified with the --blob_credentials argument described above or via the environment variable
     FDB_BLOB_CREDENTIALS, whose value is a colon-separated list of files.  The command line takes priority
     over the environment but all files from both sources are used.

     At connect time, the specified files are read in order and the first matching account specification (user@host)
     will be used to obtain the secret key.

     The JSON schema is:
        { "accounts" : { "user@host" : { "secret" : "SECRETKEY" }, "user2@host2" : { "secret" : "SECRET" } } }

From downloading and spot-checking a few release binaries, it appears the TLS flags were added in 6.1.

It looks like using the environment variables for TLS should still work on the 6.0 binaries.
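
Concretely, something like the following should work on 6.0 (the paths are placeholders; the `FDB_TLS_*` variable names are the documented ones shared by all FDB client tools):

```shell
# Placeholder paths -- point these at your actual certificate material.
export FDB_TLS_CERTIFICATE_FILE=/etc/foundationdb/certs/fdb.crt
export FDB_TLS_KEY_FILE=/etc/foundationdb/certs/fdb.key
export FDB_TLS_CA_FILE=/etc/foundationdb/certs/ca.crt
export FDB_TLS_VERIFY_PEERS="Check.Valid=1"

# Only run if a binary is available (no live cluster in this sketch).
if command -v fdbbackup >/dev/null 2>&1; then
  fdbbackup status -C /etc/foundationdb/fdb.cluster
fi
```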

After setting environment variables, it works. Thanks Alex.

I responded too early. fdbbackup works with the environment variables, but the backup agent doesn’t.

fdbbackup start -d file:///var/lib/foundationdb/backup -C /var/lib/foundationdb/fdb.cluster

The backup on tag `default’ was successfully submitted but no backup agents are responding.

I tried to start a “TLS-happy” backup agent as suggested in (Solved) Correct setup of TLS for FoundationDB, but backup_agent hangs now.

#FDB_TLS_CERTIFICATE_FILE="/etc/foundationdb/certs/cert/nugraph_fdb.crt" FDB_TLS_KEY_FILE="/etc/foundationdb/certs/cacerts/nugraph_fdb_cacerts" FDB_TLS_CA_FILE="/etc/foundationdb/cert.crt" /usr/lib/foundationdb/backup_agent/backup_agent -C /var/lib/foundationdb/fdb.cluster


Any more suggestions?

If the TLS environment variables worked for fdbbackup, then as far as I know, the exact same variable settings should work for the backup agents. Searching the backup_agent trace files for TLS should probably give you some sort of error message that will point to what is wrong?
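
A quick way to do that search, assuming the agent writes trace files to `/var/log/foundationdb` (adjust `TRACE_DIR` if you started it with a different `--logdir` or working directory):

```shell
# Assumed trace location -- change to wherever your backup_agent writes traces.
TRACE_DIR=/var/log/foundationdb

# Pull out any TLS-related events; pipe failures are expected if no files match.
grep -i "tls" "$TRACE_DIR"/trace.*.xml 2>/dev/null | grep -i "error" || true
```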

The backup agent will keep running until it is killed, talking to the cluster and receiving work for new backup tasks. It may not print anything to standard output. You should be able to see evidence of the backup agents doing work by checking the backup status. If the backup agents aren’t running and connecting, the backup will not make progress.

The backup agent needs to run on all nodes, including stateless/master/log/resolver nodes, right?

No, the backup agent runs independently of the other processes. It gets its data from the cluster, rather than directly from the data volumes, so it can even run on a totally different set of nodes if that makes sense for your deployment.
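
If you want fdbmonitor to manage the agents for you, the stock Linux package ships `backup_agent` sections in `/etc/foundationdb/foundationdb.conf`; a minimal sketch looks like this (paths match the default package layout, adjust for your install):

```ini
; Sketch of the stock backup_agent sections in foundationdb.conf.
; These can live on any host that can reach the cluster.
[backup_agent]
command = /usr/lib/foundationdb/backup_agent/backup_agent
logdir = /var/log/foundationdb

; One [backup_agent.N] section per agent process to run.
[backup_agent.1]
```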