I have been playing with the recently posted official Docker image and wanted to get clarification on certain details. Thanks in advance.
It seems that this Docker image runs fdbserver directly, without fdbmonitor. Is there a specific reason for not launching FoundationDB via fdbmonitor?
Without the fdbmonitor daemon running, how should we run backup_agent for periodic backups? Are we supposed to start those agents manually?
I think both of your questions boil down to the same thing: the posted Docker image was converted from an image used for testing (e.g. with docker-compose) and was not specifically crafted for a production deployment. That wasn’t a deliberate choice; we just don’t have the knowledge or experience in running Docker-based storage services to make something productionized, and the previous attempt at doing so (#355) stalled out.
The lack of fdbmonitor could also make cluster-wide restarts for upgrades more difficult if the image is used in production. That said, I’ve been told that orchestration systems prefer to have the service executed directly, because then they have better knowledge of if and when it dies.
For the backup agent, you’d currently need to either launch it separately or create a similar Docker image that runs the backup agent. Suggestions, advice, and feedback on all of this are very welcome.
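To make the "launch it separately" option a bit more concrete, here is a rough sketch of what the entrypoint of a hypothetical backup-agent container could look like. It is not part of the official image. It assumes that a `backup_agent` binary is on the PATH inside that container, that the cluster file location is passed in via the `FDB_CLUSTER_FILE` environment variable, and that the fallback path below is appropriate (the official image may keep its cluster file elsewhere). The function names are made up for illustration.

```python
import os

# Assumed fallback location; adjust to wherever your image stores the cluster file.
DEFAULT_CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"


def backup_agent_argv(env, binary="backup_agent"):
    """Build the argument list for a backup agent.

    `env` is a mapping of environment variables. The cluster file is taken
    from FDB_CLUSTER_FILE when set, otherwise from the assumed default.
    """
    cluster_file = env.get("FDB_CLUSTER_FILE", DEFAULT_CLUSTER_FILE)
    # -C tells backup_agent which cluster file to use.
    return [binary, "-C", cluster_file]


def exec_backup_agent(env=None):
    """Replace the current process with backup_agent.

    Using exec (rather than spawning a child) means the container runtime
    supervises backup_agent itself and notices directly when it exits.
    """
    env = dict(os.environ if env is None else env)
    argv = backup_agent_argv(env)
    os.execvpe(argv[0], argv, env)
```

An entrypoint script would simply call `exec_backup_agent()`. Restarting the agent after a crash is then left to the orchestrator's restart policy, which is the job fdbmonitor would otherwise do.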
I missed that @john_brownlee responded on the issue you filed instead. I’ll copy it here, as he did the work so his answer is more accurate than mine:
I decided not to use fdbmonitor in the interest of having each container run as few things as possible. For running backup agents, I think that the best approach would be to run a separate container for them. I think the main advantage that fdbmonitor would provide in a dockerized environment would be to support bouncing the fdbserver processes in a fast and coordinated way to pick up new arguments or (especially) to upgrade FoundationDB. Doing that would require a way to change the monitor conf file without bouncing the container, and that will likely depend on features that are specific to a deployment platform, so I figured they were out of the scope of the initial work.