I started looking into k8s a month ago. Writing a custom resource is powerful, but I can’t yet judge whether it is the “appropriate” solution here.
To answer this, I looked at different databases and how they scale.
- Cassandra: supports seeding from any node and uses StatefulSet. See doc
- CockroachDB: supports seeding from any node (Scale the cluster) and uses StatefulSet. See: cockroachdb-statefulset.yaml
- Couchbase: supports seeding from any node (server-add) and has a custom k8s controller: Couchbase Operator.
- Redis: supports seeding from any node (See Adding a new node section here) and there are custom controllers available. It can also be deployed as a simple StatefulSet.
Some clarifications
- A few of the DBs above have a second step of enabling data rebalancing, which I have not discussed.
- The redis.conf provided to every node is essentially a static config (except the node's own port) and comes from a k8s ConfigMap.
The clusterfile, by contrast, is a dynamic file.
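To make the static vs. dynamic point concrete, here is a minimal sketch that parses a cluster file line of the form `description:ID@host:port,host:port,...`. The coordinator list after the `@` is the part that changes over time, which is why it does not fit a plain ConfigMap as neatly as redis.conf does. The names `ClusterFile` and `parse_cluster_file` are made up for illustration, and the sketch ignores optional address suffixes such as `:tls` and IPv6 addresses.

```python
import re
from typing import List, NamedTuple, Tuple


class ClusterFile(NamedTuple):
    """Hypothetical container for the parts of a cluster file line."""
    description: str
    cluster_id: str
    coordinators: List[Tuple[str, int]]


# description:ID@host:port,host:port,...
_LINE = re.compile(r"^(?P<desc>[A-Za-z0-9_]+):(?P<id>[A-Za-z0-9]+)@(?P<coords>.+)$")


def parse_cluster_file(text: str) -> ClusterFile:
    """Split a cluster file line into description, ID and coordinator list."""
    match = _LINE.match(text.strip())
    if match is None:
        raise ValueError("not a valid cluster file line: %r" % text)

    coordinators = []
    for entry in match.group("coords").split(","):
        # Split on the last colon so the host part is kept whole.
        host, _, port = entry.strip().rpartition(":")
        if not host or not port.isdigit():
            raise ValueError("bad coordinator address: %r" % entry)
        coordinators.append((host, int(port)))

    return ClusterFile(match.group("desc"), match.group("id"), coordinators)
```

If a coordinator pod comes back with a new IP, only the `coordinators` part of the parsed result would differ; the description and ID stay the same.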
Some other findings
Couchbase has a concept of multi-dimensional scaling (See Multi Dimensional Scaling section here). Dividing fdb pods into 3 categories (stateless, log and storage) and then applying a similar multi-dimensional scaling, where each dimension is a pod type, would be great for scaling in different situations. This is where a custom controller would be great.
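As a rough illustration of what "each dimension is a pod type" could mean, the sketch below keeps one replica count per pod category and scales a single dimension without touching the others. Everything here (the `POD_TYPES` tuple, the `scale_dimension` function, the example counts) is hypothetical and is not taken from any existing controller; a real controller would reconcile these counts against the cluster rather than just compute them.

```python
from typing import Dict

# The three pod categories proposed above; each one is a scaling dimension.
POD_TYPES = ("stateless", "log", "storage")


def scale_dimension(counts: Dict[str, int], pod_type: str, delta: int) -> Dict[str, int]:
    """Return a new replica-count map with one dimension changed by delta."""
    if pod_type not in POD_TYPES:
        raise ValueError("unknown pod type: %r" % pod_type)

    new_count = counts.get(pod_type, 0) + delta
    if new_count < 0:
        raise ValueError("replica count for %s would be negative" % pod_type)

    # Copy so the caller's desired state is not mutated in place.
    updated = dict(counts)
    updated[pod_type] = new_count
    return updated
```

For example, a read-heavy workload might call `scale_dimension(counts, "storage", 2)` while leaving the log and stateless counts alone.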
Conclusions as of now
I am not sure whether vending the clusterfile is a good use case for a custom controller. Implementing and maintaining a custom controller is a significant effort in itself, and many DBs support seeding in general without going the custom controller way for “seeding based on a dynamic file”.
Next Steps
- To start with, could we simply have “fdbmonitor” take a “seed” node (host:port) of an existing cluster to effectively get the cluster file?
- We need to resolve the problem of coordinator IPs changing. Could you suggest an action plan for this?