Debugging Data Distribution

I’d like to know how I can go about debugging why FDB is or is not moving data. A lot of times a particular process would show up with a ton of CPU and storage queues would go up (throughput suffers), we can pull back on the traffic but DD doesn’t seem to be moving data away from it (I know for a fact that if i just kill -9 the process, the cluster immediately behaves and since this is a double redundancy cluster, it’s not a write hotspot since exactly one SS is hot).

To be fair, these are clusters that are artificially sized to take 1M+ writes (with proxies/resolvers/logs to match)

FoundationDB will only move data for a few reasons:

  1. to restore replication after a failure
  2. to split or merge a shard to a size between ~125MB to 500MB (this range is smaller grows and shrinks with database size, these numbers are the max for the largest databases)
  3. to split or merge a shard because of a lot of recent writes (write hotspot)
  4. to balance the bytes stored across the storage servers

It does not balance based on high read traffic. When it moves a shard it only considers bytes stored, not write traffic.

This means it is possible to randomly assign the same storage server to multiple high read or write traffic ranges.

Your best bet is to look at the StorageMetrics on the hot storage server to see if you observe higher read or write traffic compared to other servers.

You can also look at MovingData and TotalDataInFlight, to see the type of work currently being done by data distribution.

RelocateShardStartSplitx100 will give you a sampled view of why shards are being split, and RelocateShardMergeMetrics will be logged every time two shards are merged.

TeamHealthChanged will tell you when a team of servers has become unhealthy.

Thanks @Evan, yeah, it’s mostly the case that a single process gets too hot and the only recourse is to exclude it and it mostly works. Just wanted to see if there’s a more gentle way to nudging the cluster to move the range.