Database version difference

(Hieu Nguyen) #1

I deployed a 2-region, 3-DC FDB cluster and configured it with:

{
  "regions": [
    {
      "datacenters": [
        {
          "id": "dc1",
          "priority": 2
        },
        {
          "id": "dc2",
          "priority": 0,
          "satellite": 1
        }
      ],
      "satellite_redundancy_mode": "one_satellite_double",
      "satellite_logs": 24
    },
    {
      "datacenters": [
        {
          "id": "dc3",
          "priority": -1
        }
      ]
    }
  ]
}

and then set usable_regions to 1.

I loaded my data into dc1. After loading completed, I set usable_regions to 2 so that the data would be copied over to dc3. After several hours, I checked the status using fdbcli and saw this:

Data:
  Replication health     - Healthy (Rebalancing)
  Moving data            - 0.000 GB
  Sum of key-value sizes - 2.176 TB
  Disk space used        - 16.832 TB

It looks like the data movement completed. However, when I check status json, the data version difference is still -495671 (I expected it to be 0, since the two data centers should have exactly the same data). My questions are:

  1. If the data movement completed, why is the data version difference not zero? If data is still moving, how can I tell when the movement completes?
  2. Sometimes the data version difference is positive and sometimes it is negative. What does that mean? How is the data version difference computed?
(Alex Miller) #2

Versions advance with time, and thus the version at the primary will always be ahead of the secondary.

DatacenterVersionDifference is calculated by pinging the local and remote TLogs, asking each of them what version it has, and then subtracting the remote version from the local one. The two replies are not sampled at the same instant. Depending on the latencies of local vs remote, it’s possible that by the time the remote receives the packet and replies, it has actually fetched versions greater than what local had when we started. Thus it shows up as negative.
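
To make the sign flip concrete, here is a toy model of that measurement. It is only a sketch, not how FDB implements it: the function name and parameters are made up for illustration, and it assumes versions advance at roughly 1,000,000 per second (1,000 per millisecond), which is the usual default rate.

```python
# Toy model only: names and parameters are hypothetical, not FDB code.
# Assumes versions advance at ~1,000,000 per second, i.e. 1,000 per millisecond.
VERSIONS_PER_MS = 1_000


def observed_difference(true_lag_versions, local_reply_ms, remote_reply_ms):
    """Return local_version - remote_version as the pinger would compute it.

    true_lag_versions: how far the remote really trails the local TLogs.
    local_reply_ms / remote_reply_ms: when each side sampled its version,
    in milliseconds after the ping was sent.
    """
    local_version = local_reply_ms * VERSIONS_PER_MS
    remote_version = remote_reply_ms * VERSIONS_PER_MS - true_lag_versions
    return local_version - remote_version


# Both sides sampled at the same instant: the result is the true lag.
same_instant = observed_difference(100_000, local_reply_ms=1, remote_reply_ms=1)

# Remote sampled 500 ms later: it has moved past the local sample,
# so the subtraction comes out negative even though the remote trails.
late_remote = observed_difference(100_000, local_reply_ms=1, remote_reply_ms=501)

print(same_instant, late_remote)
```

In this model a negative reading says more about the gap between the two sample times than about the remote actually being ahead.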

DatacenterVersionDifference is mostly just useful as a rough approximation of “am I caught up”. If it’s less than 5,000,000, then you’re probably caught up.
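
If you want to automate that check, something like the following could work on a saved copy of status json. Treat it as a sketch under assumptions: the key names `cluster.datacenter_lag.versions` and `cluster.datacenter_version_difference` are my recollection of where this number appears in different releases, so verify them against your own output. The helper names are made up, and the 5,000,000 threshold is just the rule of thumb above (roughly 5 seconds of versions, if versions advance at about 1,000,000 per second).

```python
import json

# Rule-of-thumb threshold from this thread, not an official constant.
CAUGHT_UP_THRESHOLD = 5_000_000


def version_difference(status):
    """Pull the datacenter version difference out of a parsed status json dict.

    ASSUMPTION: the key names below may differ between FDB releases;
    check them against your own `status json` output.
    Returns None if neither key is present.
    """
    cluster = status.get("cluster", {})
    lag = cluster.get("datacenter_lag")
    if isinstance(lag, dict) and "versions" in lag:
        return lag["versions"]
    return cluster.get("datacenter_version_difference")


def probably_caught_up(status, threshold=CAUGHT_UP_THRESHOLD):
    """True/False per the rule of thumb, or None if the field is missing."""
    diff = version_difference(status)
    if diff is None:
        return None
    # Negative values are measurement noise, so they also count as caught up.
    return diff < threshold


# Example with a hand-written fragment shaped like the assumed schema.
sample = json.loads('{"cluster": {"datacenter_lag": {"versions": -495671}}}')
print(probably_caught_up(sample))
```

To try it on a real cluster, you could dump the status with `fdbcli --exec 'status json' > status.json` and pass `json.load(open("status.json"))` to `probably_caught_up`.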
