FoundationDB

Unable to fake multiple "machine_id" on the same host for testing


(Christophe Chevalier) #1

Once upon a time, I frequently used the ability to set multiple machine_id=XXX values in the same foundationdb.conf to fake having multiple physical hosts on a single host: this is very useful when testing ‘double’ or ‘triple’ redundancy on a single dev box (with multiple hard disks), and it also saved me multiple times in production.

It does not seem to work anymore with 6.x? I’m using 6.0.15, see the conf/logs at the end.

I saw that machine_id has been renamed to locality_machineid in the .conf file, and updated accordingly, but this still does not work: when doing a “configure double”, the cluster does not come back online and keeps trying, unsuccessfully, to recruit more log workers.

Looking at the code that throws no_more_servers errors, I see that it uses dcId() or zoneId(), but not the machine ID? It also uses the IP address of the node, so I tried using 3 different IPs for the fake nodes (127.0.0.1, and two other IP addresses that my dev box is using) and the results are the same…

Example code that tries to find storage workers:

for( auto& it : id_worker )
    if( workerAvailable( it.second, false ) &&
        !excludedMachines.count(it.second.interf.locality.zoneId()) &&
        ( includeDCs.size() == 0 || includeDCs.count(it.second.interf.locality.dcId()) ) &&
        !addressExcluded(excludedAddresses, it.second.interf.address()) &&
        it.second.processClass.machineClassFitness( ProcessClass::Storage ) <= ProcessClass::UnsetFit ) {
        return std::make_pair(it.second.interf, it.second.processClass);
    }

Is this “feature” still a thing? This is soooo helpful that I’m a bit sad that it seems gone from 6.x :frowning:

Is there another way to fake this? I could also try changing the zone or DC ID, but I’m afraid that this would create other issues. My goal at the moment is only to have at least double redundancy, so that I can test some things…

Here is the foundationdb.conf on my single machine that attempts to simulate 3 machines with 4 processes each:

[fdbmonitor]
restart_delay = 20

[general]
cluster_file=C:\ProgramData\foundationdb\fdb.cluster

## Default parameters for individual fdbserver processes
[fdbserver]
public_address = auto:$ID
listen_address = public
parentpid = $PID
command=C:\Program Files\foundationdb\bin\fdbserver.exe
datadir=C:\ProgramData\foundationdb\data\$ID
logdir=C:\ProgramData\foundationdb\logs

# Fake NODE01
[fdbserver.4500]
locality_machineid = NODE01
[fdbserver.4501]
locality_machineid = NODE01
[fdbserver.4502]
locality_machineid = NODE01
[fdbserver.4503]
locality_machineid = NODE01

# Fake NODE02
[fdbserver.4510]
locality_machineid = NODE02
[fdbserver.4511]
locality_machineid = NODE02
[fdbserver.4512]
locality_machineid = NODE02
[fdbserver.4513]
locality_machineid = NODE02

# Fake NODE03
[fdbserver.4520]
locality_machineid = NODE03
[fdbserver.4521]
locality_machineid = NODE03
[fdbserver.4522]
locality_machineid = NODE03
[fdbserver.4523]
locality_machineid = NODE03

Here is what I get in the fdbcli console, after wiping everything and starting over!

C:\WINDOWS\system32>net start fdbmonitor
The FoundationDB Server Monitor (fdbmonitor) service is starting..
The FoundationDB Server Monitor (fdbmonitor) service was started successfully.


C:\WINDOWS\system32>fdbcli
Using cluster file `C:\ProgramData\foundationdb\fdb.cluster'.

The database is unavailable; type `status' for more information.

Welcome to the fdbcli. For help, type `help'.
fdb> configure single new ssd
Database created
fdb> status details

Using cluster file `C:\ProgramData\foundationdb\fdb.cluster'.

Configuration:
  Redundancy mode        - single
  Storage engine         - ssd-2
  Coordinators           - 1

Cluster:
  FoundationDB processes - 12
  Machines               - 3
  Memory availability    - 3.3 GB per process on machine with least available
                           >>>>> (WARNING: 4.0 GB recommended) <<<<<
  Fault Tolerance        - 0 machines
  Server time            - 12/21/18 19:20:01

Data:
  Replication health     - (Re)initializing automatic data distribution
  Moving data            - unknown (initializing)
  Sum of key-value sizes - unknown
  Disk space used        - 0 MB

Operating space:
  Storage server         - 45.2 GB free on most full server
  Log server             - 45.2 GB free on most full server

Workload:
  Read rate              - 0 Hz
  Write rate             - 0 Hz
  Transactions started   - 0 Hz
  Transactions committed - 0 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  10.10.0.173:4500       (  1% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.1 GB / 3.3 GB RAM  )
  10.10.0.173:4501       (  1% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4502       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4503       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4510       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4511       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4512       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4513       (  1% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4520       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.4 GB RAM  )
  10.10.0.173:4521       (  2% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.1 GB / 3.3 GB RAM  )
  10.10.0.173:4522       (  0% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.0 GB / 3.3 GB RAM  )
  10.10.0.173:4523       (  1% cpu;  6% machine; 0.002 Gbps;  4% disk IO; 0.1 GB / 3.3 GB RAM  )

Coordination servers:
  10.10.0.173:4500  (reachable)

Client time: 12/21/18 19:20:01

fdb> configure double
ERROR: The database is unavailable
Type `configure FORCE <TOKEN>*' to configure without this check
fdb> status details

WARNING: Long delay (Ctrl-C to interrupt)

Using cluster file `C:\ProgramData\foundationdb\fdb.cluster'.

Recruiting new transaction servers.

Need at least 2 log servers, 1 proxies and 1 resolvers.

Have 12 processes on 3 machines.

Timed out trying to retrieve storage servers.

fdb>

Here are some things that I see in the CC’s log:

	Line 559: <Event Severity="20" Time="1545416397.826630" Type="RecruitStorageNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,TL" />
	Line 801: <Event Severity="20" Time="1545416398.426781" Type="RecruitStorageNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />
	Line 1639: <Event Severity="20" Time="1545416480.117065" Type="RecruitFromConfigurationNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />
	Line 1651: <Event Severity="20" Time="1545416480.617592" Type="RecruitTLogMatchingSetNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />
	Line 1653: <Event Severity="20" Time="1545416480.617592" Type="RecruitStorageNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />
	Line 1671: <Event Severity="20" Time="1545416481.619681" Type="RecruitTLogMatchingSetNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />
	Line 1673: <Event Severity="20" Time="1545416481.619681" Type="RecruitStorageNotAvailable" ID="4bcd542f38ab421f" Error="no_more_servers" ErrorDescription="Not enough physical servers available" ErrorCode="1008" Machine="10.10.0.173:4500" LogGroup="default" Roles="CC,SS,TL" />

(A.J. Beamon) #2

I believe the zone ID is now the property used to determine whether different processes are in the same fault domain, so you’ll need to set it in order to replicate the old behavior. Although for me it also still mostly works if I use the old style machine_id=X argument (except that status reports there is only 1 machine). This is because machine_id is actually setting the zone ID, possibly for backward compatibility reasons.

My recommendation, though, is that you set both locality_machineid and locality_zoneid in your configuration if you want to simulate a cluster with multiple hosts where you want each host to be treated as a separate fault domain.
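
As a rough illustration of that recommendation, here is a minimal Python sketch that generates the per-process sections with both keys set. The helper name fake_host_sections, the NODExx/ZONExx naming and the port layout (4500, 4510, 4520, …) are assumptions borrowed from the configuration posted above; nothing here is required by FoundationDB itself.

```python
def fake_host_sections(hosts=3, procs_per_host=4, base_port=4500, port_stride=10):
    """Return foundationdb.conf sections that give each fake host its own zone."""
    if procs_per_host > port_stride:
        raise ValueError("procs_per_host must not exceed port_stride (ports would collide)")
    lines = []
    for host in range(1, hosts + 1):
        name = f"{host:02d}"
        lines.append(f"# Fake NODE{name}")
        for proc in range(procs_per_host):
            port = base_port + (host - 1) * port_stride + proc
            lines.append(f"[fdbserver.{port}]")
            # Set both keys: the machine ID and the zone ID (the fault domain).
            lines.append(f"locality_machineid = NODE{name}")
            lines.append(f"locality_zoneid = ZONE{name}")
        lines.append("")  # blank line between fake hosts
    return "\n".join(lines)


if __name__ == "__main__":
    print(fake_host_sections())
```

The output is meant to be pasted below the shared [fdbserver] section of foundationdb.conf; the sketch does not touch the file itself.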

I’ll also note that status isn’t the most clear here, as you’ve discovered. It reports that you have 12 processes on 3 machines, but that’s not really the thing that it’s interested in anymore. It will happily recruit from 12 processes on 1 machine if the processes are in different fault domains, and it will also fail to recruit from 12 processes on 12 machines if they are all in the same fault domain. We have a task outstanding to clean this up, though it doesn’t look like it’s gotten any traction. I’ll re-raise it as an issue in GitHub.
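
To make the two cases above concrete, here is a deliberately simplified Python sketch. It only counts distinct zone IDs; real recruitment also looks at things like process class and exclusions, and the Process tuple, the enough_fault_domains helper and the choice of replicas=2 for double are assumptions made for illustration.

```python
from typing import NamedTuple


class Process(NamedTuple):
    port: int
    machine_id: str
    zone_id: str


def enough_fault_domains(processes, replicas):
    """True when at least `replicas` distinct zone IDs are available."""
    return len({p.zone_id for p in processes}) >= replicas


# 12 processes on one machine, each in its own zone.
one_machine = [Process(4500 + i, "NODE01", f"ZONE{i:02d}") for i in range(12)]
# 12 processes on 12 machines, all sharing a single zone.
one_zone = [Process(4500 + i, f"NODE{i:02d}", "ZONE01") for i in range(12)]

print(enough_fault_domains(one_machine, replicas=2))  # True
print(enough_fault_domains(one_zone, replicas=2))     # False
```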


(Christophe Chevalier) #3

I added locality_zoneid = ZONExx in foundationdb.conf and the cluster came online instantly:

Config:

# Fake NODE01
[fdbserver.4500]
locality_machineid = NODE01
locality_zoneid = ZONE01
[fdbserver.4501]
locality_machineid = NODE01
locality_zoneid = ZONE01
[fdbserver.4502]
locality_machineid = NODE01
locality_zoneid = ZONE01
[fdbserver.4503]
locality_machineid = NODE01
locality_zoneid = ZONE01

# Fake NODE02
[fdbserver.4510]
locality_machineid = NODE02
locality_zoneid = ZONE02
[fdbserver.4511]
locality_machineid = NODE02
locality_zoneid = ZONE02
[fdbserver.4512]
locality_machineid = NODE02
locality_zoneid = ZONE02
[fdbserver.4513]
locality_machineid = NODE02
locality_zoneid = ZONE02

# Fake NODE03
[fdbserver.4520]
locality_machineid = NODE03
locality_zoneid = ZONE03
[fdbserver.4521]
locality_machineid = NODE03
locality_zoneid = ZONE03
[fdbserver.4522]
locality_machineid = NODE03
locality_zoneid = ZONE03
[fdbserver.4523]
locality_machineid = NODE03
locality_zoneid = ZONE03

Result:

C:\WINDOWS\system32>fdbcli
Using cluster file `C:\ProgramData\foundationdb\fdb.cluster'.

The database is available.

Welcome to the fdbcli. For help, type `help'.
fdb> status details

Using cluster file `C:\ProgramData\foundationdb\fdb.cluster'.

Configuration:
  Redundancy mode        - double
  Storage engine         - ssd-2
  Coordinators           - 1

Cluster:
  FoundationDB processes - 12
  Machines               - 3
  Memory availability    - 2.9 GB per process on machine with least available
                           >>>>> (WARNING: 4.0 GB recommended) <<<<<
  Fault Tolerance        - 0 machines
  Server time            - 12/21/18 21:00:11

Data:
  Replication health     - (Re)initializing automatic data distribution
  Moving data            - unknown (initializing)
  Sum of key-value sizes - unknown
  Disk space used        - 0 MB

Operating space:
  Storage server         - 45.1 GB free on most full server
  Log server             - 45.1 GB free on most full server

Workload:
  Read rate              - 20 Hz
  Write rate             - 7 Hz
  Transactions started   - 8 Hz
  Transactions committed - 4 Hz
  Conflict rate          - 0 Hz

Backup and DR:
  Running backups        - 0
  Running DRs            - 0

Process performance details:
  10.10.0.173:4500       (  7% cpu;  9% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )
  10.10.0.173:4501       (  6% cpu;  9% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4502       (  7% cpu;  9% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4503       (  6% cpu;  9% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )
  10.10.0.173:4510       (  6% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4511       (  6% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4512       (  6% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4513       (  7% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )
  10.10.0.173:4520       (  6% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )
  10.10.0.173:4521       (  7% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )
  10.10.0.173:4522       (  6% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.0 GB / 2.9 GB RAM  )
  10.10.0.173:4523       (  7% cpu; 10% machine; 0.002 Gbps;  7% disk IO; 0.1 GB / 2.9 GB RAM  )

Coordination servers:
  10.10.0.173:4500  (reachable)

Client time: 12/21/18 21:00:11

Though, when I run coordinators auto, I get ERROR: Too few fdbserver machines to provide coordination at the current redundancy level, even though the cluster actually did change to 3 coordinators (the error message initially made me think that the operation had failed).

I thought that 3 machines was “enough” for double replication? Is this a side effect of also changing the zone ID?

...
Coordination servers:
  10.10.0.173:4500  (reachable)

fdb> coordinators auto
ERROR: Too few fdbserver machines to provide coordination at the current redundancy level
fdb> coordinators
Cluster description: 691107463418911801231975
Cluster coordinators (3): 10.10.0.173:4503,10.10.0.173:4513,10.10.0.173:4523
Type `help coordinators' to learn how to change this information.
fdb> status details
...
Coordination servers:
  10.10.0.173:4503  (reachable)
  10.10.0.173:4513  (reachable)
  10.10.0.173:4523  (reachable)

(A.J. Beamon) #4

I’m a little perplexed by this one. 3 should be enough, and I don’t think it should be changing the coordinators when you get back this error, so let me look into it a bit.


(A.J. Beamon) #5

Ok, I think I see what’s going on here, and it’s just a bug in the coordinator changing logic. When you change coordinators, it needs to set a key and commit it, but the result of the transaction will not be known. In your case, this transaction is succeeding and the coordinators get changed. Then it basically goes through the process of trying to set the coordinators again as a way to confirm that the coordinators were changed (and to retry if not).

During this process, it reads the list of workers (corresponding to all the processes in the cluster), but there is a race here with the cluster controller, which is repopulating this list. If the list comes back empty, it goes through the process of checking whether the list of current coordinators is “acceptable”, which uses IP addresses and not locality information to decide (and includes a fixme comment for this). Of course, in your configuration the coordinators share an IP and are rejected. It then tries to allocate coordinators from the list of workers that was read earlier, and since there are none, it reports that there are too few machines.

I think there are two problems here – the first is that there exists a path to confirm the acceptability of coordinators that doesn’t account for locality, which among other things means that coordinators auto may choose not to change coordinators when multiple coordinators share a fault domain.
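
The difference between the two kinds of check can be sketched in a few lines of Python. This is not the FoundationDB code; the function names and the zone_of mapping are hypothetical, and the addresses and zone names are the ones used earlier in this thread.

```python
def distinct_by_ip(coordinators):
    """IP-only check: every coordinator must have a different IP address."""
    ips = [addr.rsplit(":", 1)[0] for addr in coordinators]
    return len(set(ips)) == len(ips)


def distinct_by_zone(coordinators, zone_of):
    """Locality check: every coordinator must sit in a different zone."""
    zones = [zone_of[addr] for addr in coordinators]
    return len(set(zones)) == len(zones)


coordinators = ["10.10.0.173:4503", "10.10.0.173:4513", "10.10.0.173:4523"]
zone_of = {
    "10.10.0.173:4503": "ZONE01",
    "10.10.0.173:4513": "ZONE02",
    "10.10.0.173:4523": "ZONE03",
}

print(distinct_by_ip(coordinators))             # False: all three share one IP
print(distinct_by_zone(coordinators, zone_of))  # True: three different zones
```

The reverse mismatch is also possible: coordinators on different IPs that share a zone would pass the IP-only check while sitting in a single fault domain.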

The other is that this retry logic means that successful coordinator changes may report back to the user as errors. This can probably be resolved by storing the desired coordinators chosen initially and checking for those specific coordinators after the change.
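
A minimal Python sketch of that idea, under stated assumptions: CommitUnknownResult, change_coordinators and the three callbacks are hypothetical names, and the toy "cluster" below is just a dictionary. The point is only that the desired set is chosen once and the retry loop confirms that exact set instead of choosing again.

```python
class CommitUnknownResult(Exception):
    """Hypothetical stand-in for a commit whose outcome the client cannot see."""


def change_coordinators(choose_desired, commit, read_current, max_attempts=5):
    desired = choose_desired()  # chosen once and remembered across retries
    for _ in range(max_attempts):
        try:
            commit(desired)
        except CommitUnknownResult:
            pass  # may or may not have been applied; confirm by reading back
        if sorted(read_current()) == sorted(desired):
            return desired
    raise RuntimeError("could not confirm the coordinator change")


# Toy cluster: the commit is applied, but the client is told "unknown result".
state = {"coordinators": ["10.10.0.173:4500"]}


def commit(new):
    state["coordinators"] = list(new)
    raise CommitUnknownResult()


wanted = ["10.10.0.173:4503", "10.10.0.173:4513", "10.10.0.173:4523"]
print(change_coordinators(lambda: wanted, commit, lambda: state["coordinators"]))
```

Because the worker list is never consulted a second time, an empty list during the confirmation step can no longer turn a successful change into an error.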

I’ll create a GitHub issue for this.


(Christophe Chevalier) #6

Well, it’s always when exploring these edge cases that we find new ways to break the system :slight_smile: Hopefully this will help make it even more robust.

Just a small thing I noticed: the coordinators were all on the 45*3 port (the last port) of each fake host. This is fine since any node can be a coordinator, but when exploring the best configuration for performance (number of processes, classes for each process, etc…), having the coordinators be the last process of each host means that if you want to reduce the number of processes (here I went from 4 to 3 per “host”) you will most likely remove the last entry in the conf file, which was a coordinator. If the coordinator were the “first” (smallest port? first entry in the foundationdb.conf?), that would be less likely. Not really a huge issue, just a small hindrance (I had to manually touch up the fdb.cluster after realizing that I had killed all my coordinators at the same time).
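
A small pre-flight check along these lines could catch that before the processes are removed. This is only a sketch: it assumes the usual description:ID@address,address,… layout of a cluster file line, plain ip:port addresses with no suffix, and the helper names and the sample description and ID are made up.

```python
def coordinators_in_cluster_line(line):
    """Extract coordinator addresses from a 'description:ID@addr,addr,...' line."""
    _, sep, addrs = line.strip().partition("@")
    if not sep:
        raise ValueError("not a cluster file line: missing '@'")
    return [a.strip() for a in addrs.split(",") if a.strip()]


def coordinators_lost(cluster_line, remaining_ports):
    """Coordinators whose port would no longer have an fdbserver section."""
    return [
        addr
        for addr in coordinators_in_cluster_line(cluster_line)
        if int(addr.rsplit(":", 1)[1]) not in remaining_ports
    ]


# Made-up description and ID; the addresses are the ones from this thread.
line = "example:abc123@10.10.0.173:4503,10.10.0.173:4513,10.10.0.173:4523"
# Going from 4 to 3 processes per fake host by dropping the last section of each.
remaining = {4500, 4501, 4502, 4510, 4511, 4512, 4520, 4521, 4522}
print(coordinators_lost(line, remaining))
```

With the values above, all three coordinators show up as lost, which is the situation described in this post.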


(A.J. Beamon) #7

I think the coordinators are being chosen by selecting the first N coordinators that match some criteria from a list ordered by a random process ID, which should mean they are well dispersed throughout the cluster. My own experiments seem to confirm that the ports being used do vary.
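
As a loose illustration of that kind of selection (not the actual implementation), the Python sketch below shuffles the workers as a stand-in for ordering by a random process ID and takes the first N that pass a criterion. The criterion used here, at most one coordinator per zone, is an assumption, as are the function name and the (address, zone) tuples.

```python
import random


def pick_coordinators(workers, count, rng=None):
    """Pick `count` workers in distinct zones from a randomly ordered list.

    workers: list of (address, zone_id) tuples.
    """
    if count <= 0:
        return []
    rng = rng or random.Random()
    shuffled = list(workers)
    rng.shuffle(shuffled)  # stand-in for "ordered by a random process ID"
    chosen, used_zones = [], set()
    for address, zone in shuffled:
        if zone in used_zones:
            continue  # assumed criterion: at most one coordinator per zone
        chosen.append(address)
        used_zones.add(zone)
        if len(chosen) == count:
            return chosen
    raise ValueError("not enough distinct zones for the requested coordinator count")


workers = [
    (f"10.10.0.173:{4500 + 10 * host + proc}", f"ZONE{host + 1:02d}")
    for host in range(3)
    for proc in range(4)
]
print(pick_coordinators(workers, 3))
```

Under this scheme, which port ends up chosen in each zone depends only on the random order, so repeated runs would be expected to land on different ports.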

I do see the benefit of your proposal, although I’m not sure if there are any drawbacks. I can bring it up with others on the team here, though. Either way, if it is valuable to have the coordinators be on the first port, you can always select your coordinators explicitly rather than using auto.
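
For the explicit route, a short Python sketch can build the fdbcli coordinators command from the lowest port in each zone. The helper name and the (ip, port, zone) tuples are hypothetical; the addresses and zones mirror the configuration posted earlier in the thread.

```python
def first_port_coordinators_command(workers):
    """Build an fdbcli `coordinators` command using the lowest port in each zone.

    workers: iterable of (ip, port, zone_id) tuples.
    """
    lowest = {}
    for ip, port, zone in workers:
        if zone not in lowest or port < lowest[zone][1]:
            lowest[zone] = (ip, port)
    addresses = [f"{ip}:{port}" for ip, port in sorted(lowest.values())]
    return "coordinators " + " ".join(addresses)


workers = [
    ("10.10.0.173", 4500 + 10 * host + proc, f"ZONE{host + 1:02d}")
    for host in range(3)
    for proc in range(4)
]
print(first_port_coordinators_command(workers))
# coordinators 10.10.0.173:4500 10.10.0.173:4510 10.10.0.173:4520
```

The printed line can be typed at the fdb> prompt; the sketch itself does not talk to the cluster.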


(Christophe Chevalier) #8

Ok maybe by random luck all the selected ports were the last ones in my case?

Like you said, if you are experimenting maybe it is best to manually specify a list of coordinators to make life easier. I’m not sure if there’s a need to change the current algorithm just for this specific case (unless it is considered safe?).

