Where is metadata about key and table distribution stored?

(Jeff Baker) #1

When the database says it is “Healthy (Rebalancing)”, what does that mean? Is there a metadata structure I can inspect to see why it is rebalancing, from and to where, when it expects to complete, and so forth?

(Alex Miller) #2

Healthy (Rebalancing) means that there’s ongoing data distribution activity. That activity isn’t re-replicating data after a failure; it’s only trying to even out the data stored across nodes.

I believe there are system keys in the database that record the ongoing transfers, but I’m not aware of anything implemented that exposes that information. @mengxu has been doing some work near this, and might be able to describe it better. I don’t believe any estimate is calculated of when a transfer is expected to complete, either. FDB lacks a good amount of observability into data distribution, and it would make me happy to see that improve.

(Jeff Baker) #3

Thanks. I used the example of “Healthy (Rebalancing)” just to illustrate an opaque thing the database claims about itself. I’d like to be able to get visibility into data distribution and, in my experience, it’s also useful to have the ability to change it. For example, it can be very helpful to pre-split a key range if you know you are about to create a hotspot on that range. APIs for inspecting and mutating this stuff would be swell.

(Meng Xu) #4

As far as I know, FDB does not keep metadata about “which data is relocated from where to where”.
The closest information is the “RelocateShardHasDestination” event in the trace logs, which is produced by this line.
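
For anyone who wants to pull those events out of the trace files, here is a minimal Python sketch. It assumes the XML trace format, where each event is a single self-closing `<Event ... Type="..." />` element on its own line. The function name `relocation_events` is made up for illustration, and the sketch makes no assumption about which attributes the event carries: it simply returns whatever is on the line.

```python
import sys
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Iterator


def relocation_events(
    lines: Iterable[str],
    event_type: str = "RelocateShardHasDestination",
) -> Iterator[Dict[str, str]]:
    """Yield the attributes of every trace event whose Type matches.

    Assumes one self-closing <Event .../> element per line (XML trace format).
    """
    for line in lines:
        line = line.strip()
        # Skip the XML header, the enclosing <Trace> tags, and blank lines.
        if not line.startswith("<Event "):
            continue
        try:
            event = ET.fromstring(line)
        except ET.ParseError:
            # The last line can be truncated if the process died mid-write.
            continue
        if event.get("Type") == event_type:
            yield dict(event.attrib)


if __name__ == "__main__":
    # Usage: python relocations.py <trace file> [<trace file> ...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for attrs in relocation_events(f):
                print(attrs)
```

This only tells you about relocations that were logged on the machines whose trace files you read, so it is a debugging aid rather than a view of the cluster's current state.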

I agree that having such visibility into data movement would be helpful.
If you are interested in contributing to such a feature, I’m willing to help as much as I can.

(David Scherer) #5

You can look at the keyServers metadata in the system keyspace directly. If a range of data is in flight, it will have both “source” and “destination” servers, identified by their UIDs. But I’m not sure there is any index you can use to find the key ranges that are in flight without scanning all of them. And this level of the system does not know “why” anything happens.
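
As a rough illustration of that scan, here is a sketch using the Python bindings. It assumes the entries live under the `\xff/keyServers/` prefix, with each key marking the start of a shard, and that your client supports the API version given (630 is just a placeholder, so adjust it). The values are in an internal, version-dependent binary format, so the sketch returns them raw rather than guessing at how to decode the source and destination UIDs. The helper names are made up for illustration.

```python
from typing import Iterator, List, Optional, Tuple

API_VERSION = 630  # assumption: set this to a version your client supports

KEY_SERVERS_PREFIX = b"\xff/keyServers/"
# "0" is the byte after "/", so this is the first key past the prefix.
KEY_SERVERS_END = b"\xff/keyServers0"


def to_shard_ranges(boundaries: List[bytes]) -> List[Tuple[bytes, bytes]]:
    """Pair consecutive boundary keys into [begin, end) shard ranges."""
    return list(zip(boundaries, boundaries[1:]))


def scan_key_servers(
    cluster_file: Optional[str] = None, batch: int = 1000
) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (shard_begin_key, raw_value) for every keyServers entry."""
    import fdb  # imported lazily so the helpers above work without the client

    fdb.api_version(API_VERSION)
    db = fdb.open(cluster_file)
    begin = KEY_SERVERS_PREFIX
    while True:
        # A fresh transaction per batch keeps each read short-lived.
        tr = db.create_transaction()
        tr.options.set_read_system_keys()
        rows = list(tr.get_range(begin, KEY_SERVERS_END, limit=batch))
        for kv in rows:
            yield kv.key[len(KEY_SERVERS_PREFIX):], kv.value
        if len(rows) < batch:
            return
        # Resume just after the last key we saw.
        begin = rows[-1].key + b"\x00"
```

Because each batch is read in its own transaction, the result is not a consistent snapshot: shards can split, merge, or move while the scan is running.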

It is theoretically possible to disable automatic data distribution (by changing the \xff/dataDistributionMode key) and take it over entirely yourself, but needless to say this is a big job!