Potential alternatives to "master" and other controversial terms in our codebase

The IETF has a good write-up on potentially oppressive language that is widely used in technical contexts: https://tools.ietf.org/id/draft-knodel-terminology-00.html. The two specific areas that are addressed in that document are “master/slave” and “whitelist/blacklist”. We have some limited uses of the term “whitelist”, which is worth revisiting, but I think the biggest one for FDB is the term “master”, which is used widely due to the presence of the master role. Changing the master role would be a large undertaking, but I think it’s worth considering.

I think it’s worth considering a new name for the master role for a couple of different reasons. One reason is that, though the master role is not being used in a master/slave context explicitly, it does bring with it that connotation, which could be off-putting to some in our community. A second reason is that the term “master” doesn’t seem to clearly define the tasks associated with this process. From informal discussions I’ve had in the past, it seems like the name made more sense when the master process had more of an active role in controlling other processes, but it has become anachronistic as FoundationDB’s architecture has changed over the years. It could be good to take the opportunity to consider alternative names that would communicate the role’s intent more clearly to people using and developing FoundationDB.

Another use of the term, which the IETF document does not directly address, is the use of “master” as the default branch name. I know that that usage has a different context, though I don’t fully understand the etymology. It’s still worth considering whether the “master branch” is the best name, given how it can be tied into connotations that we don’t want. I’m partial to “main” as an alternative name, since it is equally clear or even clearer, and also shorter. I recommended this for our Kubernetes operator repo: https://github.com/FoundationDB/fdb-kubernetes-operator/issues/253. I think it’s worth considering for our other repos as well.

I’d love to get everyone’s feedback on whether these are worthwhile changes, and if there’s other terminology that we can improve on in our codebase.

Thanks John for bringing up this topic for community discussion. I fully support making these changes.

I found the IETF draft to be very helpful, thanks for sharing the link. Another source that folks may want to read is this blog post: A Guide to Nomenclature Selection.

“Main” makes a ton of sense to me as a replacement for the “master branch” – although I know some folks prefer “trunk” since it fits the branch / tree metaphor.

Ideally a change like this could be made across the entire FDB ecosystem (at least those in the /foundationdb/ GitHub org), not just the KV store or individual projects like the K8s operator.

One thing to add is that earlier this week the CEO of GitHub tweeted that they’re working on renaming the default GitHub branch name: https://twitter.com/natfriedman/status/1271253144442253312. Fingers crossed that this work includes redirecting previous branch names – that would make such a change even simpler to adopt.

Thanks for bringing this up.

I think whitelist is only used for the snapshot backup feature. So renaming this should be easy.

Master and MasterProxy are the main ones that we should rename. MasterProxy should just be Proxy anyway. I don’t like the proposed names (trunk or main) as I think they are not very descriptive (neither is “master”, but if we rename the role anyway we might as well try to find a better name). IIRC @Evan was planning to move the recruitment logic and the recovery logic to the cluster controller.

Cluster Controller is imho a good name and describes a role that recruits workers and initiates recoveries well. Master is kind of a stupid name anyways - as this role then doesn’t make any decisions. So maybe just call it “Version Generator”?

One consideration we have to make is that changing these names will break external tooling. So I think we should support (but not document) the old command line arguments etc for a few versions. For traces I am not sure what to do…
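To make the "support but don't document" idea concrete, here is a minimal sketch of accepting legacy role names while steering users toward new ones. Everything in it is an assumption for illustration: the `LEGACY_ROLE_ALIASES` table, the `normalize_role` helper, and the replacement names "sequencer" and "proxy" are hypothetical placeholders borrowed from suggestions in this thread, not existing FoundationDB code or an agreed naming.

```python
import warnings

# Hypothetical mapping from legacy role names to their replacements.
# The new names are placeholders from this thread, not a decision.
LEGACY_ROLE_ALIASES = {
    "master": "sequencer",
    "master_proxy": "proxy",
}


def normalize_role(name: str) -> str:
    """Return the current role name, accepting legacy names with a warning."""
    key = name.strip().lower()
    if key in LEGACY_ROLE_ALIASES:
        new_name = LEGACY_ROLE_ALIASES[key]
        # Old names keep working for a few versions but nudge users to migrate.
        warnings.warn(
            f"role name '{key}' is deprecated; use '{new_name}'",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_name
    # Names that were never renamed pass through unchanged.
    return key
```

External tooling that still passes an old name would keep working under this approach, and the alias table could be dropped once the deprecation window ends. It does not address trace events, which remain an open question.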

I would be willing to help with this effort if/when we reach a consensus for the naming and the rename strategy.

I think “trunk” and “main” were suggested as the branch names, rather than the role names.

I kind of like “trunk” just because it’s a fun word, but yeah, “versioner” or something sounds better for the role name–something with “versions”.

Ohh - I need to work on my reading abilities :) I have to admit that I don’t really care about what that branch is renamed to. Though for simplicity it probably should be whatever GitHub uses as a default?

For what it’s worth: I don’t like trunk: changes are often applied first to a version branch and then merged back into master/main/trunk. I liked “develop” as a name, but I think most people don’t use that.

I’d like to propose Pacer as the new name for the master role. It sets the “pace” of version advancement in the database, it is the first role to reach each new version, and its notion of versions per second is in practice what the cluster will experience as time passes. I think there are a couple of analogies that work here, such as a pace car in racing or a pacemaker setting heart rhythm.

Or perhaps Versionmaker?

Sequencer is a relatively common name which I think could replace Master.

Proxy is also an easy replacement for MasterProxy, and I was honestly confused the first time I went looking for the Proxy code.

If this is happening, it might be worth waiting a little bit to hear the details. If they adopt a new default branch name, then the most sensible choice may be to use what the new default is.

We inherited “redzone” from valgrind in a few places.

ADDRESS_SANITIZER_REDZONE -> ADDRESS_SANITIZER_PADDING?

edit: Actually, that’s from NaCL, not valgrind, and looks like it’s part of their API. Still, it’s something to watch out for.

I think waiting for an official recommendation from GitHub makes sense. I think there are valid concerns around fragmenting the terminology and creating more confusion, but a change to defaults from GitHub would send a good signal that people are converging around a replacement.

It looks like GitHub went forward with main as the new default branch name. They have more details on it in the github/renaming repository (guidance for changing the default branch name for GitHub repositories). Is this something we should re-evaluate?

Record Layer did the rename from master to main at the beginning of the year, shortly after GitHub added tooling to make it easy. Nothing surprising happened.