Dockerized deployment

I’ve noticed that there is an official foundationdb Docker image. That’s fantastic, as Docker support is essential for many of us! But I have some questions and suggestions:

  • can we rely on this docker image to be maintained and updated?
  • the image seems to lack a README; even including the README that sits next to the Dockerfile in the source tree would be helpful!
  • I am looking at https://github.com/apple/foundationdb/issues/1730 and just wanted to point out that local development testing is not the only use case: deployment is, too, as well as developing with pre-populated data, so the ability to mount a volume as the data directory seems essential.
  • it took me a long time to stumble onto the example at https://github.com/apple/foundationdb/tree/master/packaging/docker/samples/local — perhaps this could be added to the README?
  • the example requires a local (host-native) fdbcli; wouldn’t it be better to run fdbcli from the image via docker run?

Lastly, on a more general note, browsing through GitHub issues I see mentions of Kubernetes deployments. I’d say there are several general classes of use cases for docker images:

  • local development/testing (e.g. docker run, docker-compose locally, with a docker-compose network)
  • remote development/testing (same thing, but executed using docker-machine on remote servers)
  • static dockerized deployment (docker-compose or simple docker run, but in production)
  • orchestrated deployment using simple tools (nomad, docker swarm)
  • Kubernetes/OpenStack

I wanted to point this out, because a well-designed docker image is useful to all of these use cases.

Also, while I am personally not a fan of Kubernetes and its complexity, I know that in a number of real-life enterprise scenarios it has become a hard requirement: a vendor selling a solution must be able to deploy it on customer-provided Kubernetes infrastructure. This means that dockerization work is important for FoundationDB adoption.

Updating it is a part of the release process, so it should be updated in coordination with downloads being posted to the website.

For the suggestions, you are welcome to open issues for the docker improvements that you’d like to see. :slight_smile:

This is well known. The struggle is that a well-designed docker image that is useful to a large number of use cases requires someone to do the work who knows how to design docker images well. This is expertise that, I believe, the current set of companies sponsoring FDB work lacks.

I’ve been hopeful that this is an area where the community would help define or create the appropriate solution. If anyone would like to post a design doc for what a well-designed docker image would look like, it’s more than welcome. Early in FDB’s OSS history, there was an initial community PR that tried to do this, and the work stalled out for a variety of reasons. I think it highlights why we likely haven’t seen continued progress, unfortunately. The current docker image was meant as the smallest step forward that still yielded a meaningful improvement.

My current hope is that at the upcoming FDB summit, some folks will give talks and release what they’ve done for Kubernetes/docker, and that will help to push the area along.

Thanks, Alex, this is great to hear!

As for filing issues or helping, I don’t feel qualified enough to do that, as I’m not a Docker expert, and I’m just learning about FoundationDB. There might have been reasons for some decisions, which I do not understand. For example, the FDB docker container doesn’t EXPOSE any ports, and some examples use host networking. In general, the idea behind containerization is that you only use the ports that you exposed and avoid host networking.

Another thing: FDB seems to make a lot of assumptions and actually implements some discovery/coordination/orchestration features on its own. The cluster file is assumed to be the same for all processes connecting to the cluster, which I don’t think is desirable, or even possible with dockerized deployments. Also, the cluster file uses IP addresses, while in docker you’d rather see names, which get resolved through docker/docker-compose, consul, etcd or other discovery services.
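To make the names-versus-IPs point concrete, here is a minimal sketch of how a container entrypoint might turn service names into a cluster file line at startup. It assumes the usual cluster file layout of description:ID@address:port,address:port. The function name, the service name fdb-coordinator, and the "docker" description/ID values are hypothetical and only for illustration; this is not how the official image is known to work.

```python
import socket
from typing import Callable, Iterable, Tuple


def build_cluster_file_line(
    description: str,
    cluster_id: str,
    coordinators: Iterable[Tuple[str, int]],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Build one cluster file line from (hostname, port) pairs.

    Each hostname is resolved to an IPv4 address, since the cluster
    file format discussed above lists coordinators by IP address.
    The resolver is injectable so another discovery mechanism
    (or a fake, in tests) can be substituted for DNS.
    """
    addresses = [f"{resolve(host)}:{port}" for host, port in coordinators]
    return f"{description}:{cluster_id}@{','.join(addresses)}"


if __name__ == "__main__":
    # Hypothetical service name; a fixed fake resolver stands in for
    # the DNS that docker-compose would provide on its network.
    line = build_cluster_file_line(
        "docker", "docker", [("fdb-coordinator", 4500)],
        resolve=lambda host: "172.18.0.2",
    )
    print(line)
```

A sketch like this only resolves names once, at startup, so it does not address what happens when a coordinator container is rescheduled and gets a new IP address.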

Then there is fdbmonitor: I’m not sure what the right granularity is for a dockerized FDB. If we want to go down to single-process level, then fdbmonitor is not desirable. I think it makes more sense to deploy FDB “units”, not individual processes. This leaves open the question of container health checks.
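As one possible starting point for the health check question, here is a minimal sketch of a TCP-level liveness probe that a container health check could call. The function name is hypothetical, and port 4500 is assumed only because it is the default FDB port. A successful connection shows that something is listening on the port, not that the cluster is healthy; a deeper check would probably have to go through fdbcli (for example its status command).

```python
import socket


def tcp_port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts and
        # refused connections) if the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Exit code 0 means "healthy" and 1 means "unhealthy", which is the
    # convention container health checks generally expect.
    raise SystemExit(0 if tcp_port_is_open("127.0.0.1", 4500) else 1)
```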

As I wrote, I do not feel qualified to provide answers and solutions at this point; I just wanted to know if containerization is considered important. I’m very glad to hear that it is and I will try to contribute as much as I can.