Preventing machines from joining FDB cluster

Is there a way to distinguish between server-to-server and client-to-server communication in FDB? We’re currently using the same config file for both clients and servers, but this makes it really easy for people to “accidentally” join the production cluster if they have the server module installed locally.

Once we go to production we’d like to prevent this somehow, for example with firewall rules, but then we’d need different ports for server and client communication.


Just want to add that this has happened to us multiple times too 🙂

If the number of app servers is very small, you can preemptively exclude them (using fdbcli), but this may not scale well if you have a lot of servers or dynamically assigned IP addresses. A minimal sketch of that workaround is shown below.
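For example (the addresses here are made up; exclude is the standard fdbcli command for keeping a process from being assigned roles):

# Preemptively exclude known app-server addresses so a stray fdbserver
# on those hosts never receives any data or roles (addresses are hypothetical)
fdbcli --exec 'exclude 10.0.1.11:4500 10.0.1.12:4500'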

So I’m also looking for a more scalable solution to this issue.

To answer your question: I don’t think this is currently possible. Even if you exclude the machines, the servers will still join the cluster (exclude simply means that they won’t get any roles assigned).

However, this looks concerning to me:

This might be clear to you, but I still want to point this out here: if people (I assume this means developers) can accidentally join your production cluster, you probably have a huge security issue. A developer shouldn’t be able to join a production cluster at all (neither with a server nor with a client).

Imagine a developer assumes she is working on a local fdb instance and executes the following command:

fdbcli --exec 'writemode on; clearrange \x00 \xff'

Now your production db is gone!

You should never rely on FDB itself for security. Always keep production and development in separate environments (e.g., using firewall rules and a VPN).

For development you probably don’t want a shared fdb cluster. Instead, each dev should have a local fdb installation.
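As a sketch, on Debian/Ubuntu that setup is just the two packages (the version in the file names is illustrative; the server package configures a single-node cluster bound to 127.0.0.1 by default, so it can’t be joined from other machines):

# Installing clients + server on a dev box yields a local-only cluster
sudo dpkg -i foundationdb-clients_7.1.61-1_amd64.deb foundationdb-server_7.1.61-1_amd64.deb
fdbcli --exec status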

I think a common source of rogue servers is application servers (which have to talk to the cluster) where the server packages were accidentally installed (or, on Windows, where someone forgot to uncheck the server component, which is enabled by default). If you edit their fdb.cluster file (so that the app can run, and also so you can test connectivity locally with fdbcli), then the fdbserver that spins up will also connect. This usually happens on the next reboot of the app servers, so it’s easy to miss.
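A quick way to audit for this (just a standard shell check, nothing FDB-specific):

# On an app server, warn if a local fdbserver process is running when it shouldn't be
pgrep -a fdbserver && echo 'WARNING: fdbserver is running on this app host'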

I think the proper way of preventing this is to make sure that the ports are closed. The difference between clients and servers is that clients don’t accept incoming connections. So if you run an fdbserver process on a client machine with its port blocked, you should see an error, and the corresponding server process will never become functional.
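A minimal sketch with iptables (assuming the default fdbserver port 4500; adjust to your configuration):

# On app-server (client-only) hosts: refuse inbound connections on the
# fdbserver port, so a stray fdbserver can never become functional.
# Outbound client connections to the cluster are unaffected.
iptables -A INPUT -p tcp --dport 4500 -j DROP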

I don’t disagree that it would be good if fdb had more functionality in this realm. For example, I would like to be able to whitelist clients, and I would also like some secure form of authorization for clients and servers. But for now we live with what we have and use other facilities to prevent malicious use of our production systems.

I don’t think FDB should generally try to be a security boundary, because I think the right approach to building secure systems is to have as little software as possible in that position, and to give that software as few responsibilities (besides security) as possible, so that there’s a chance of getting it right.

And I also agree that developers shouldn’t normally have direct access to production machines from their workstations.

However, I think that defending against administrative mistakes is a proper concern of database software. It was probably a design mistake to make the default behavior be that new database servers join the cluster and have data moved to them; there are configurations where this is the right thing, but it would be better as opt-in. I would probably +1 a change so that, after “configure new”, servers are “excluded” by default rather than “included” by default. Adding servers to the database would then normally require an extra step in fdbcli rather than just starting them up with the right cluster file, unless a configuration option is changed to recover the current behavior. It would require much more careful and detailed design, though, and I don’t know if anyone is currently excited about doing that.
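Under that proposal, adding capacity might look something like this (a sketch only: include is an existing fdbcli command, but the default-excluded behavior described above does not exist today, and the address is hypothetical):

# Hypothetical opt-in workflow: the new server joins in an excluded state,
# and an operator must explicitly admit it before it receives data
fdbcli --exec 'include 10.0.2.21:4500'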


We have also seen similar issues. Do the newer versions have anything better for this?