Seeing 'FinishMoveKeysTooLong' and 'RelocateShardTooLong' during data rebalancing

Hello,

We added a couple of new nodes with extra storage processes to an existing 5-node cluster.
We occasionally see this error in the logs and are not sure what it means. Could someone help me understand it better? Is this something I need to be worried about?

FDB version - FoundationDB 6.2 (v6.2.15)

Event Severity="30" Time="1595560530.339613" Type="RelocateShardTooLong" ID="0000000000000000" Duration="621.046" Dest="150305ac25ba33d73affbcc775623b27,8c591274b23a33346823f3d74e354919" Src="6262d0fd036dad4ffd2a9d09006e7716,6e22f6457d66f4acd9c6606649a85410" Machine="172.41.21.21:4510" LogGroup="default" Roles="DD"
Event Severity="30" Time="1595560544.320335" Type="FinishMoveKeysTooLong" ID="0000000000000000" Duration="600" Servers="8c591274b23a33346823f3d74e354919,c5d02c8049f05de123f951ed57ae255e" Machine="172.41.21.21:4510" LogGroup="default" Roles="DD"

Data distribution rebalances data across nodes by relocating shards.
The message means that relocating a shard is taking too long (more than 10 minutes, which matches the ~600-second durations in your events).

This may be caused by:

  1. The source storage servers are busy and don't have spare bandwidth to handle the shard-relocation workload; in that case you need to reduce client traffic.
  2. The storage servers may not have enough CPU to handle the traffic.
  3. The network between the nodes may have low bandwidth.
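
If it helps, here is a rough way to check those three things from the machine-readable status. This is just a sketch using the Python bindings and the \xff\xff/status/json special key; the field names are the ones I remember from the 6.2 status JSON, so treat them as assumptions and adjust if your output differs:

```python
import json
import fdb

fdb.api_version(620)
db = fdb.open()  # uses the default cluster file

# The \xff\xff/status/json special key returns the same JSON document
# that `status json` prints in fdbcli.
status = json.loads(bytes(db[b'\xff\xff/status/json']))
cluster = status['cluster']

# How much data is currently queued / in flight for relocation.
moving = cluster['data']['moving_data']
print('in_queue_bytes :', moving['in_queue_bytes'])
print('in_flight_bytes:', moving['in_flight_bytes'])

# Per-process CPU and disk busyness, to spot overloaded storage servers.
for pid, proc in cluster['processes'].items():
    roles = ','.join(r['role'] for r in proc.get('roles', []))
    if 'storage' in roles:
        print(proc['address'], roles,
              'cpu_cores=%.2f' % proc['cpu']['usage_cores'],
              'disk_busy=%.2f' % proc['disk']['busy'])
```

If in_queue_bytes stays high while the storage processes show high CPU or disk busyness, that points to causes 1 or 2 above.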

Thank you @mengxu for the quick response. There was a CPU spike and the disks were under heavy utilization around the time of the errors, but eventually those settled down.
From what you explained, I assume it is a warning that shard relocation is taking too long, but that the shard will eventually be relocated. Forgive my ignorance here, but does this have any other bad effects on the system apart from a longer data rebalance time?

Severity=30 messages are warnings instead of errors.

Data rebalancing affects:

  1. The disk utilization on each node, and
  2. Load balancing of read and write traffic. If a shard is not moved to the destination storage servers quickly, it may become a hot shard (i.e., too much traffic concentrated on a single shard). A hot shard is bad for the cluster because Ratekeeper will eventually kick in and throttle the cluster's traffic.

Back to your question: the behavior you saw is a hazard to the cluster's health because it takes longer to balance the load.
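
If you want to see whether Ratekeeper is already limiting the cluster, the qos section of the same status JSON shows it. Again just a sketch; the field names are assumed from what I remember of the 6.x status output:

```python
import json
import fdb

fdb.api_version(620)
db = fdb.open()

qos = json.loads(bytes(db[b'\xff\xff/status/json']))['cluster']['qos']

# What Ratekeeper is currently limiting the transaction rate on, and how
# the imposed limit compares to the rate clients are actually releasing.
print('limited by   :', qos['performance_limited_by']['name'])
print('tps limit    :', qos['transactions_per_second_limit'])
print('released tps :', qos['released_transactions_per_second'])
```

If I remember correctly, a name of "workload" means no resource limit is being hit; anything else (e.g., a storage server queue reason) means Ratekeeper is throttling.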

Understood and thank you @mengxu.

We had a 5-node cluster (2 x 1.9 TB NVMe per node) with 28 storage processes in total (running on 7 of the NVMe drives) and 3 Tx processes (running on the remaining 3 NVMe drives). The cluster had ~2.9 TB of KV data and ~6 TB of disk used (RF=2).
We added 2 more similar nodes, each with 8 additional storage processes (4 per NVMe) and 2 stateless processes.

We added the 2 nodes in one go; is that okay, or is it recommended to add one node at a time? Are there any recommendations regarding tuning we should do?

It is better to add multiple storage server nodes at once, so what you did is preferred.

Say the replication factor is k (2 in your case); it is better to add >= k storage servers at once when expanding the cluster. Otherwise, every replica team that includes a new storage server must also include existing servers, so some existing storage servers may see a temporary increase in storage usage while the new SS is being filled.

For tLog servers, it does not matter that much, but adding them all at once helps reduce the number of cluster recoveries, so IMO it is still preferred.
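
If you want to sanity-check the expansion afterwards, the same status JSON exposes the recovery state, a generation counter that (as far as I know) increments with each recovery, and the remaining data movement, so you can confirm things have settled down. Another rough sketch under those assumptions:

```python
import json
import time
import fdb

fdb.api_version(620)
db = fdb.open()

# Poll until data distribution has drained its relocation queue after
# the new nodes were added.
while True:
    cluster = json.loads(bytes(db[b'\xff\xff/status/json']))['cluster']
    moving = cluster['data']['moving_data']
    print('recovery_state:', cluster['recovery_state']['name'],
          'generation:', cluster['generation'],
          'in_queue_bytes:', moving['in_queue_bytes'])
    if moving['in_queue_bytes'] == 0 and moving['in_flight_bytes'] == 0:
        break
    time.sleep(60)
```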
