Permanently remove excluded IP addresses

A dumb question, but is there a way to remove already excluded IP addresses? We tend to resize our clusters every couple of weeks, upgrade them, move them to different locations, etc. The challenge is that we have a fairly small pool of IP addresses (only a /20), so we often get assigned an IP address that was already excluded, and then we have to pay extra attention and make sure we include it again.

It’s a minuscule problem, but if there is no reason to keep the past list around, it would be nice to be able to remove those entries.

fdbcli> include all ? Or are you asking for a variant of exclude that automatically expires after some amount of time?

exclude/include is meant to be driven by whatever automation removes machines from your FDB cluster, so the include for an IP should be run once the excluded machine has been removed and it is certain that it will never try to rejoin the cluster.
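To make the ordering concrete, here is a rough Python sketch of what such automation might issue. It only builds the fdbcli invocations. The helper names (`fdbcli_argv`, `decommission_commands`, `run`) and the cluster file path are my own assumptions, not anything FDB ships, and the actual teardown of the machines is left to whatever tooling you already have.

```python
import subprocess
from typing import List

# Assumed default location of the cluster file; adjust for your setup.
DEFAULT_CLUSTER_FILE = "/etc/foundationdb/fdb.cluster"


def fdbcli_argv(command: str, cluster_file: str = DEFAULT_CLUSTER_FILE) -> List[str]:
    """Build the argv for running one fdbcli command non-interactively."""
    return ["fdbcli", "-C", cluster_file, "--exec", command]


def decommission_commands(addresses: List[str]) -> List[str]:
    """Return the fdbcli commands for retiring machines, in the order to run them."""
    if not addresses:
        raise ValueError("at least one address is required")
    targets = " ".join(addresses)
    return [
        # Step 1: ask the cluster to move data off the machines.
        f"exclude {targets}",
        # (The machines are shut down and destroyed here, outside fdbcli.)
        # Step 2: once they can never rejoin, clear them from the excluded
        # list so the addresses can be reused later.
        f"include {targets}",
    ]


def run(command: str) -> None:
    """Execute one fdbcli command and raise if it exits with an error."""
    subprocess.run(fdbcli_argv(command), check=True)
```

With this, `decommission_commands(["10.0.0.1", "10.0.0.2"])` gives the exclude and include commands for those two hypothetical addresses. Running `include all` instead would clear the whole excluded list in one go.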

It’s a bit hacky, but I get it. At least it solves what I want. Thank you

I don’t think it is hacky. exclude is used to tell the cluster to move the data on the target machine to other machines. So once that process has completed and the machine is gone, it is reasonable and safe to include that IP again.

Or, if your procedure stays within the cluster’s failure tolerance, I think you don’t need to exclude at all. Just leave the cluster to heal itself. For example, say you have a 5-node cluster and want to move two nodes to another location. Just shut down those two nodes and create two new ones to join. This is somewhat unsafe, though, as you may lose data if one additional node fails during the process.

Another possibility is that you may consider maintenance:

fdb> help maintenance 

maintenance [on|off] [ZONEID] [SECONDS]

Mark a zone for maintenance.

Calling this command with `on' prevents data distribution from moving data away
from the processes with the specified ZONEID. Data distribution will
automatically be turned back on for ZONEID after the specified SECONDS have
elapsed, or after a storage server with a different ZONEID fails. Only one
ZONEID can be marked for maintenance. Calling this command with no arguments
will display any ongoing maintenance. Calling this command with `off' will
disable maintenance.
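If you script this, the command strings follow directly from the help text above. A tiny Python sketch, where the function name and the zone ID in the usage note are made up for illustration:

```python
def maintenance_on(zone_id: str, seconds: int) -> str:
    """Build the fdbcli command that marks one zone for maintenance."""
    if not zone_id or " " in zone_id:
        raise ValueError("zone_id must be a single non-empty token")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return f"maintenance on {zone_id} {seconds}"


# Ends maintenance early, as described in the help text.
MAINTENANCE_OFF = "maintenance off"
```

For example, `maintenance_on("zone-a", 600)` produces the command for a ten-minute window on a zone called `zone-a`. Since only one ZONEID can be in maintenance at a time, this would only help when the machines being touched share a zone.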

I’m usually moving tens of them. I also found that the cluster reacts more swiftly when I run exclude.

It’s hacky when you have no understanding of the underlying processes and rely on the meaning of the English words. Going by their names alone, I would never have thought that exclude and include do what they do.

Yeah, natural language is mostly ambiguous :frowning:
