I’m trying to figure out why the backup section of the status json output is not reporting stats in my setup.
This is true for both 6.3 and 7.3 clusters. The backups are done by a dedicated machine that runs all the backup_agent processes and no fdbserver processes.
"layers" : {
  "_valid" : true,
  "backup" : {
    "blob_recent_io" : {
      "bytes_per_second" : 0,
      "bytes_sent" : 0,
      "requests_failed" : 0,
      "requests_successful" : 0
    },
    "instances" : {
      "90201a26e6aac7c3f7760119b57f0385" : {
        "blob_stats" : {
          "recent" : {
            "bytes_per_second" : 0,
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          },
          "total" : {
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          }
        },
        "configured_workers" : 10,
        "id" : "90201a26e6aac7c3f7760119b57f0385",
        "last_updated" : 1741195253.9706519,
        "locality" : {
        },
        "main_thread_cpu_seconds" : 3232.2759629999996,
        "memory_usage" : 997814272,
        "networkAddress" : "172.26.11.150",
        "processID" : 642601,
        "process_cpu_seconds" : 3539.0761820000002,
        "resident_size" : 531697664,
        "version" : "7.3.59"
      },
      "a834a16efb6328aa035f9f8b37810aff" : {
        "blob_stats" : {
          "recent" : {
            "bytes_per_second" : 0,
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          },
          "total" : {
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          }
        },
        "configured_workers" : 10,
        "id" : "a834a16efb6328aa035f9f8b37810aff",
        "last_updated" : 1741195227.3100455,
        "locality" : {
        },
        "main_thread_cpu_seconds" : 3205.2537630000002,
        "memory_usage" : 1028243456,
        "networkAddress" : "172.26.11.150",
        "processID" : 642599,
        "process_cpu_seconds" : 3509.1482339999998,
        "resident_size" : 560922624,
        "version" : "7.3.59"
      },
      "d3cf787bf7922c060530efe18fbd5a0a" : {
        "blob_stats" : {
          "recent" : {
            "bytes_per_second" : 0,
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          },
          "total" : {
            "bytes_sent" : 0,
            "requests_failed" : 0,
            "requests_successful" : 0
          }
        },
        "configured_workers" : 10,
        "id" : "d3cf787bf7922c060530efe18fbd5a0a",
        "last_updated" : 1741195228.3917904,
        "locality" : {
        },
        "main_thread_cpu_seconds" : 3234.0649039999998,
        "memory_usage" : 1033863168,
        "networkAddress" : "172.26.11.150",
        "processID" : 642600,
        "process_cpu_seconds" : 3542.051164,
        "resident_size" : 566521856,
        "version" : "7.3.59"
      }
    },
    "instances_running" : 3,
    "last_updated" : 1741195253.9706519,
    "paused" : false,
    "tags" : {
      "continuous" : {
        "current_container" : "file:///mnt/fdb_pp-continuous/backup-2025-03-03-08-32-59.172878",
        "current_status" : "is differential",
        "last_restorable_seconds_behind" : 15.644265000000001,
        "last_restorable_version" : 96923578965246,
        "mutation_log_bytes_written" : 0,
        "mutation_stream_id" : "c98b8a3f12c84b439b6072adb545b5ef",
        "range_bytes_written" : 361627760053,
        "running_backup" : true,
        "running_backup_is_restorable" : true
      },
      "default" : {
        "current_container" : "file:///mnt/fdb_pp/backup-2024-09-11-23-00-06.228893",
        "current_status" : "has been completed",
        "last_restorable_seconds_behind" : 15101288.106191,
        "last_restorable_version" : 81822306503320,
        "mutation_log_bytes_written" : 0,
        "mutation_stream_id" : "a6719e5c26d98504d022263d2f2541e3",
        "range_bytes_written" : 311872883594,
        "running_backup" : false,
        "running_backup_is_restorable" : false
      }
    },
    "total_workers" : 30
  }
},
I also see that no backup_workers are reported in the configuration section:
"configuration" : {
  "backup_worker_enabled" : 0,
  "blob_granules_enabled" : 0,
  "commit_proxies" : 3,
  "coordinators_count" : 6,
  [...]
},
I’m trying out different backup setups with disks, and it would be useful to see the write speed metrics.
It takes us around 2 days to get through the first snapshot of the backup for a cluster of about 2 TB. This seems too long, so maybe our setup could be improved.
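For reference, 2 TB in 2 days works out to roughly 11.6 MB/s on average. In the absence of reported stats, a minimal sketch of how I could estimate the write speed myself is below. It assumes I poll status json twice and read range_bytes_written for the tag each time; the function names and the sample values are hypothetical, not anything FoundationDB provides.

```python
def average_rate(total_bytes, days):
    """Average throughput in bytes/second for total_bytes written over `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    return total_bytes / (days * 86400)


def write_rate(sample_a, sample_b):
    """Bytes/second between two (unix_timestamp, range_bytes_written) samples.

    Assumption: both samples come from the same backup tag, taken at
    different times from status json.
    """
    (t0, b0), (t1, b1) = sample_a, sample_b
    if t1 <= t0:
        raise ValueError("second sample must be later than the first")
    return (b1 - b0) / (t1 - t0)


if __name__ == "__main__":
    # 2 TB over 2 days, the figures from my setup.
    print(f"{average_rate(2e12, 2) / 1e6:.1f} MB/s average")
    # Hypothetical samples taken 60 seconds apart.
    print(f"{write_rate((0, 0), (60, 600_000_000)) / 1e6:.1f} MB/s recent")
```

This only measures range data, so it would say nothing about mutation log writes.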
Also, is there any way to get the percentage of completion of a snapshot that is currently running?
Can I gauge it from the range_bytes_written value in the backup section? I assume that for the first snapshot to be complete, this value should reach a size similar to the total_kv_size_bytes value in the cluster data section?
I’m guessing this assumption won’t work for subsequent snapshots, though, because the value increases monotonically.
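To make the question concrete, here is a minimal sketch of the estimate I have in mind. It rests on my unconfirmed assumption that range_bytes_written for a full snapshot ends up close to total_kv_size_bytes. The baseline_bytes parameter is a hypothetical workaround for later snapshots: I would record range_bytes_written when a snapshot starts and subtract it.

```python
def snapshot_progress(range_bytes_written, total_kv_size_bytes, baseline_bytes=0):
    """Rough fraction (0.0 to 1.0) of the current snapshot that has been written.

    Assumption: a complete snapshot writes about total_kv_size_bytes of range data.
    baseline_bytes is the range_bytes_written value recorded when the snapshot
    started (0 for the first snapshot of a backup).
    """
    if total_kv_size_bytes <= 0:
        raise ValueError("total_kv_size_bytes must be positive")
    written = range_bytes_written - baseline_bytes
    # Clamp, since the two counters are not guaranteed to line up exactly.
    return max(0.0, min(1.0, written / total_kv_size_bytes))


if __name__ == "__main__":
    # range_bytes_written of the "continuous" tag above, against a
    # hypothetical total_kv_size_bytes of 2 TB.
    fraction = snapshot_progress(361_627_760_053, 2_000_000_000_000)
    print(f"{fraction:.1%} of the snapshot written")
```

If the two values are not actually comparable (for example because of encoding overhead in the backup files), this estimate would be off, which is part of what I am asking.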