FoundationDB

Did something change in fdbmonitor in 5.2.5?


(Clement Pang) #1

It seems that a normal dpkg install of the deb did not bounce the fdbserver processes; we had to run killall fdbmonitor and then start the service to fix it. Just something we noticed.
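For reference, the manual bounce could be scripted roughly as follows. This is only a sketch: the `run` and `bounce_fdbmonitor` helpers and the `DRY_RUN` variable are made up for illustration, and it assumes the service is started with `service foundationdb start` as in the posts below.

```shell
#!/bin/sh
# Sketch of the manual bounce: kill fdbmonitor, then start the service again.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the two steps from the post.
bounce_fdbmonitor() {
    run sudo killall fdbmonitor
    run sudo service foundationdb start
}
```

Calling `bounce_fdbmonitor` with the default settings only prints the two commands, so it is safe to try before running it for real with `DRY_RUN=0`.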


(Christophe Chevalier) #2

We also had something strange happen while upgrading from 5.1.7-1.el6 to 5.2.5-1.el7 on RHEL 7.5.

After the upgrade we had to reboot the VM because the fdbserver processes would not restart properly. Also, before the reboot, attempting to stop/start the service gave an "Access denied" error (even as root, see below).

I just tried again on another set of identically configured VMs, and I’m getting the same symptoms:

...$ sudo rpm -Uvh foundationdb-clients-5.2.5-1.el7.x86_64.rpm \
> foundationdb-server-5.2.5-1.el7.x86_64.rpm
[sudo] password for admin:
Preparing...                          ################################# [100%]
Updating / installing...
   1:foundationdb-clients-5.2.5-1.el7 ################################# [ 25%]
   2:foundationdb-server-5.2.5-1.el7  warning: /etc/foundationdb/foundationdb.conf created as /etc/foundationdb/foundationdb.conf.rpmnew
################################# [ 50%]
Cleaning up / removing...
   3:foundationdb-server-5.1.7-1.el6  ################################# [ 75%]
   4:foundationdb-clients-5.1.7-1.el6 ################################# [100%]

...$ sudo service foundationdb stop
Redirecting to /bin/systemctl stop foundationdb.service
Failed to stop foundationdb.service: Access denied
See system logs and 'systemctl status foundationdb.service' for details.
Failed to get load state of foundationdb.service: Access denied

...$ sudo service foundationdb start
Redirecting to /bin/systemctl start foundationdb.service
Failed to start foundationdb.service: Access denied
See system logs and 'systemctl status foundationdb.service' for details.

...$ sudo systemctl status foundationdb
Failed to get properties: Access denied

AFTER REBOOTING (no other change)

...$ sudo systemctl status foundationdb
[sudo] password for admin:
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; disabled; vendor preset: disabled)
   Active: inactive (dead)

...$ sudo service foundationdb start
Redirecting to /bin/systemctl start foundationdb.service

...$ sudo systemctl status foundationdb
● foundationdb.service - FoundationDB Key-Value Store
   Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; disabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-07-03 14:14:47 EDT; 16s ago
  Process: 1412 ExecStart=/usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run/fdbmonitor.pid --daemonize (code=exited, status=0/SUCCESS)
 Main PID: 1413 (fdbmonitor)
   CGroup: /system.slice/foundationdb.service
           ├─1413 /usr/lib/foundationdb/fdbmonitor --conffile /etc/foundationdb/foundationdb.conf --lockfile /var/run...
           ├─1414 /usr/lib/foundationdb/backup_agent/backup_agent --cluster_file /etc/foundationdb/fdb.cluster --logd...
           ├─1415 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datacenter_id DC01 --datadir /va...
           └─1416 /usr/sbin/fdbserver --cluster_file /etc/foundationdb/fdb.cluster --datacenter_id DC01 --datadir /va...

Jul 03 14:14:47 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbmonitor": Starting fd...500
Jul 03 14:14:47 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbmonitor": Starting fd...501
Jul 03 14:14:47 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4501": Launchi...501
Jul 03 14:14:47 xxxx fdbmonitor[1413]: LogGroup="default" Process="backup_agent.1": Launchi...t.1
Jul 03 14:14:47 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4500": Launchi...500
Jul 03 14:14:47 xxxx systemd[1]: Started FoundationDB Key-Value Store.
Jul 03 14:14:52 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4500": Warning...ds.
Jul 03 14:14:52 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4500":   Check...cli
Jul 03 14:14:52 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4501": Warning...ds.
Jul 03 14:14:52 xxxx fdbmonitor[1413]: LogGroup="default" Process="fdbserver.4501":   Check...cli
Hint: Some lines were ellipsized, use -l to show in full.

Note: if I don’t touch the service right after the update and reboot immediately, the foundationdb service is NOT started automatically, but once I start it manually everything works fine.

I’m guessing the update leaves the foundationdb service in a broken state that requires a reboot + at least one start to fix itself…

(Note: I have upgraded 3 hosts out of 5, and I’m leaving the other 2 untouched in case you need me to check anything…)


(Christophe Chevalier) #3

Just noticed that the service was still disabled (not restarting):

$ sudo systemctl status foundationdb
...
Loaded: loaded (/usr/lib/systemd/system/foundationdb.service; disabled; vendor preset: disabled)
...

I needed to run sudo systemctl enable foundationdb to fix this.
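A small check like the one below could be run after an upgrade to catch this. It is a sketch only: the `ensure_enabled` function name is made up, and it assumes `systemctl is-enabled` prints `enabled` for a unit that starts at boot.

```shell
#!/bin/sh
# Sketch: enable a systemd unit only if it is not already reported as enabled.
# ensure_enabled is a hypothetical helper, not part of the FoundationDB packages.
ensure_enabled() {
    unit="$1"
    # "systemctl is-enabled" prints the enablement state (enabled, disabled, ...).
    state="$(systemctl is-enabled "$unit" 2>/dev/null)"
    if [ "$state" != "enabled" ]; then
        sudo systemctl enable "$unit"
    fi
}
```

Running `ensure_enabled foundationdb` on each upgraded host would then apply the same fix as the manual command above, and do nothing on hosts where the unit is already enabled.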