Bug #58706
[rbd-mirror] incorrect syncing_percent in get_replay_status
Description
syncing_percent is sometimes reported as greater than 100 when getting the mirror image status (snapshot mode):
bin/rbd --cluster site-b mirror image status -p data --image image1
2023-02-06T17:55:13.385+0530 7f1fb9ee5640 15 rbd::mirror::MirrorStatusUpdater 0x55b3e5bb89a0 set_mirror_image_status: global_image_id=746b4bbc-71a3-425a-aefc-55e2d4b123c4, mirror_image_site_status={state=up+replaying, description=replaying, {"bytes_per_second":0.0,"bytes_per_snapshot":0.0,"local_snapshot_timestamp":1675686173,"remote_snapshot_timestamp":1675686240,"replay_state":"syncing","syncing_percent":128000,"syncing_snapshot_timestamp":1675686240}, last_update=0.000000]}
Updated by Ilya Dryomov about 1 year ago
- Status changed from New to In Progress
- Assignee set to Nithya Balachandran
- Backport set to pacific,quincy
Updated by Ilya Dryomov about 1 year ago
- Subject changed from Incorrect syncing_percent in get_replay_status in rbd mirroring to [rbd-mirror] incorrect syncing_percent in get_replay_status
Updated by Nithya Balachandran about 1 year ago
Analysis:
In get_replay_status():
root_obj["syncing_percent"] = static_cast<uint64_t>(
    100 * m_local_mirror_snap_ns.last_copied_object_number /
    static_cast<float>(std::max<uint64_t>(1U, m_local_object_count)));
A value like 128000 indicates that m_local_object_count is 0 and m_local_mirror_snap_ns.last_copied_object_number is non-zero.
If the rbd-mirror daemon is restarted in the middle of a snapshot copy, scan_local_mirror_snapshots() sets m_local_object_count to 0. It also initializes m_local_mirror_snap_ns to the snap namespace.
m_local_object_count is set to a non-zero value in Replayer<I>::handle_copy_image_progress.
If get_replay_status() is called before handle_copy_image_progress(), syncing_percent will be set to a value > 100.
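The arithmetic in the analysis above can be checked in isolation. The following is a minimal sketch, not the rbd-mirror code itself: the function name syncing_percent_buggy and its two parameters are hypothetical stand-ins for the Replayer members, and the value 1280 for last_copied_object_number is only inferred from the reported 128000 under the assumption that m_local_object_count was 0.

```cpp
#include <algorithm>
#include <cstdint>

// Standalone copy of the expression quoted from get_replay_status().
// Hypothetical helper: the parameters stand in for
// m_local_mirror_snap_ns.last_copied_object_number and m_local_object_count.
constexpr uint64_t syncing_percent_buggy(uint64_t last_copied_object_number,
                                         uint64_t local_object_count) {
  // std::max clamps a zero object count to 1, which avoids a division by
  // zero but turns the result into 100 * last_copied_object_number.
  return static_cast<uint64_t>(
      100 * last_copied_object_number /
      static_cast<float>(std::max<uint64_t>(1U, local_object_count)));
}

// Assumed values: 1280 objects copied before the restart, count reset to 0.
static_assert(syncing_percent_buggy(1280, 0) == 128000,
              "zero object count inflates the percentage");
// With a populated object count the same expression behaves as intended.
static_assert(syncing_percent_buggy(640, 1280) == 50,
              "half of the objects copied");
```

With the object count clamped to 1, the result is simply 100 times the last copied object number, which matches the order of magnitude seen in the log.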
Fix:
Return a syncing_percent of 0 if m_local_object_count is 0.
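One way the proposed fix could look is sketched below. This is an illustration of the guard described above, not the change from the pull request: the function name syncing_percent_guarded and its parameters are hypothetical, and the merged code may be structured differently.

```cpp
#include <cstdint>

// Hypothetical helper showing the guard: report 0 until the object count
// is known, instead of dividing by a clamped value of 1.
constexpr uint64_t syncing_percent_guarded(uint64_t last_copied_object_number,
                                           uint64_t local_object_count) {
  if (local_object_count == 0) {
    // Object count not yet refreshed (for example, right after a daemon
    // restart), so no meaningful percentage can be computed.
    return 0;
  }
  return static_cast<uint64_t>(
      100 * last_copied_object_number /
      static_cast<float>(local_object_count));
}

// Same assumed inputs as the failing case: now reported as 0 rather than 128000.
static_assert(syncing_percent_guarded(1280, 0) == 0,
              "unknown object count yields 0");
static_assert(syncing_percent_guarded(1280, 1280) == 100,
              "fully copied snapshot yields 100");
```

Reporting 0 in this window understates progress briefly, but the value stays within the 0 to 100 range that consumers of the status output would expect.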
Updated by Ilya Dryomov about 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 50096
Updated by Ilya Dryomov about 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot about 1 year ago
- Copied to Backport #58763: quincy: [rbd-mirror] incorrect syncing_percent in get_replay_status added
Updated by Backport Bot about 1 year ago
- Copied to Backport #58764: pacific: [rbd-mirror] incorrect syncing_percent in get_replay_status added
Updated by Backport Bot about 1 year ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".