FoundationDB

Fresh LXD container installation of fdb 6.0.15 fails with Database unavailable


(Cloudspeech) #1

Using the latest LXD 3.7 (snap package) on Ubuntu, I launched a fresh Ubuntu 18.04 container.

I apt-installed the .deb packages provided on fdb’s download page for version 6.0.15, client first and then server.

The installation ends with ‘Database unavailable’, and ‘status details’ in fdbcli gives:

Using cluster file `/etc/foundationdb/fdb.cluster'.

Could not communicate with a quorum of coordination servers:
127.0.0.1:4500 (unreachable)

How can I fix this?


(Cloudspeech) #2

After some digging around I am pretty sure this is related to https://github.com/apple/foundationdb/issues/274.

The issue discussed there is ZFS's incompatibility with O_DIRECT file I/O, which FoundationDB uses.

The default install of LXD uses ZFS.
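
To confirm which driver backs a pool, `lxc storage show` prints the pool definition as YAML, including a top-level `driver` key. A minimal sketch, assuming the pool is called `default` (that name is an assumption, check `lxc storage list`); `pool_driver` is a hypothetical helper that only extracts the key:

```shell
# Hypothetical helper: read the YAML printed by `lxc storage show <pool>`
# on stdin and print the value of the top-level `driver:` key.
pool_driver() {
    awk '/^driver:/ { print $2; exit }'
}

# Usage (the pool name "default" is an assumption):
#   lxc storage show default | pool_driver
```

If this prints `zfs`, the container's root filesystem lives on ZFS.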

Strace’ing fdbserver uncovers this line involving O_DIRECT:

openat(AT_FDCWD, "fdb/4500/processId.part", O_RDWR|O_CREAT|O_TRUNC|O_DIRECT, 02600) = -1 EINVAL (Invalid argument)

Which then leads to:

write(2, "ERROR: error creating or opening"..., 71ERROR: error creating or opening process id file `fdb/4500/processId'.
) = 71
write(2, "Fatal Error: Disk i/o operation "..., 39Fatal Error: Disk i/o operation failed
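
A quick way to check a directory for the same problem without strace is to let GNU `dd` write one block with `oflag=direct`, which opens the output file with O_DIRECT. This is only a sketch: `supports_o_direct` is a hypothetical helper, and the data directory in the usage comment is an assumption about the package's default layout.

```shell
# Hypothetical helper: return 0 if a file inside directory $1 can be
# written with O_DIRECT, non-zero otherwise. Relies on GNU dd's oflag=direct.
supports_o_direct() {
    probe="$1/.o_direct_probe.$$"
    dd if=/dev/zero of="$probe" bs=4096 count=1 oflag=direct 2>/dev/null
    rc=$?
    # Always clean up the probe file, whether or not dd succeeded.
    rm -f "$probe"
    return "$rc"
}

# Usage (the path is an assumption, adjust to your datadir):
#   supports_o_direct /var/lib/foundationdb/data && echo ok || echo "O_DIRECT rejected"
```

A non-zero result on the container's filesystem would match the EINVAL seen in the strace output above.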

(Cloudspeech) #3

The issue is indeed resolved by using an LXD storage-pool type that is compatible with FoundationDB (cf. https://lxd.readthedocs.io/en/latest/storage/).

For example, LVM with underlying ext4 works.

(For those who, like me, want to keep their precious container content and just get FoundationDB to work:
lxc storage create lvm lvm
lxc profile copy default fdb
lxc profile edit fdb # change ‘pool’ value to ‘lvm’ in editor
lxc stop my-container
lxc copy my-container fdb-container --container-only --profile fdb
lxc start fdb-container

After that, you may have to apply the advice from the thread “Trying to make docker image; fdbcli says database not created”.
)
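
If you would rather script the steps than use the interactive editor, `lxc profile device set` can change the pool directly. The following is a sketch under assumptions, not a tested recipe: `run` and `migrate_to_lvm` are hypothetical helpers, and the profile's disk device is assumed to be named `root`, as in a stock default profile. With `DRY_RUN=1` the commands are only printed.

```shell
# Hypothetical wrapper: with DRY_RUN=1 print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper mirroring the manual steps: $1 is the existing
# container, $2 the name of the copy placed on the LVM pool.
# Assumes the profile's disk device is called "root".
migrate_to_lvm() {
    src="$1"
    dst="$2"
    run lxc storage create lvm lvm &&
    run lxc profile copy default fdb &&
    run lxc profile device set fdb root pool lvm &&
    run lxc stop "$src" &&
    run lxc copy "$src" "$dst" --container-only --profile fdb &&
    run lxc start "$dst"
}

# Usage: preview first, then run for real.
#   DRY_RUN=1 migrate_to_lvm my-container fdb-container
#   migrate_to_lvm my-container fdb-container
```

The `&&` chain stops at the first failing command, so the original container is not copied if the pool or profile could not be set up.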