FDB has two parameters under moving_data category:
cluster.data.moving_data.in_flight_bytes
cluster.data.moving_data.in_queue_bytes
We don’t know how to interpret the parameters under different scenarios, such as removing a node, adding a node and cross-DC data movement, and need help.
When I remove a node, the moving_data_in_flight parameter starts at 80GB, then gradually and consistently decreases to 70, 60, … and 0. I interpret this as follows: FDB knows the total amount of data (80GB) to be rebalanced out of the node being removed. This parameter represents the amount of data left to be rebalanced. As the rebalancing continues, the parameter decreases.
When I add a node, however, moving_data_in_flight seems to have a different meaning. It starts at about 5GB, fluctuates a little bit, then becomes almost constant at 4.7GB. I check the disk usage of /var/lib/foundationdb at the newly added node, it is increasing from 10GB to 20, 30, …, 120, etc.
I have a few questions:
How should I interpret moving_data_in_flight (4.7GB) when I add a node?
When cross-region migration/replication is performed, what does moving_data_in_flight represent?
What’s the meaning of moving_data_in_queue?
Also if you can give some details of how FDB rebalance data when a node is added, I’ll highly appreciate it. For example,
Does FDB predetermine the total amount of data to be distributed to the new node, or does it rebalance as it sees fit (dynamically computed, without a predetermined number)?
How much data (in terms of bytes or shards) is pushed to the new node at a time?
Attached is a screenshot of our FDB-Grafana monitoring platform, when I add a node.
in_flight_bytes is the total amount of data that is currently moved among nodes (i.e., storage servers).
in_queue_bytes is the total amount of data in the queue that should be moved.
Data distribution (DD) first put data into the queue and relocate the queued data among storage servers.
The amount of data in flight is dynamically decided by the DD based on the cluster’s busyness: if a cluster is idle, DD will balance data more aggressively and vice versa.
When you add a node, that node will be teamed up with other nodes to form a group of 3-member server teams. Data will be moved to those newly firmly teams with the goal that balance the amount of data and read workload on each server.
When you remove a node, the data on the node will be moved to other nodes with the same goal above.
The amount of data (i.e. shards used in DD) to move per node depends on the type of movement. If it is a more urgent movement (say only 1 replica is left), DD moves more shards at a time (say 5000 shards) than when the movement is less urgent (say when 2 replica is left, DD moves 2500 shards). The related code is getWorkFactor in DataDistributionQueue.actor.cpp
Meng, thank you very much for your detailed explanations. I get a better understanding of data movement in FDB now.
A follow-up question:
I imagine that before data pieces are moved from the queue to storage servers and become in-flight, in_queue_bytes should be greater than 0.
I am guessing why we see in_queue_bytes == 0 in our second chart for hours:
The DD queueing and dequeueing are very fast for the adding node case and only last for a very short time. When we perform our 1-min-interval metrics sampling, the chances are very slim that we can hit the ephemeral durations when in_queue_bytes > 0, and we have been always hit the " in_queue_bytes == 0" moments.