Bug #36659
[rbd-mirror] forced promotion after killing remote cluster results in stuck state
Status: Resolved
Priority: Normal
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
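The rbd-mirror daemon detects that the image has been locally promoted and attempts to shut down the image replayer. The shutdown hangs because the remote cluster is unresponsive. Because the shutdown is already in progress, the mirror image status update is skipped, so the "force promoted" state description is never published: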
2018-10-31 10:45:32.341 7f32f8ff9700 20 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] on_stop_journal_replay: enter
2018-10-31 10:45:32.341 7f32f8ff9700 20 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] set_state_description: 0 force promoted
2018-10-31 10:45:32.341 7f32f8ff9700 20 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] update_mirror_image_status:
2018-10-31 10:45:32.341 7f32f8ff9700 20 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] start_mirror_image_status_update: shut down in-progress: ignoring update
2018-10-31 10:45:32.341 7f32f8ff9700 15 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] reschedule_update_status_task: canceling existing status update task
2018-10-31 10:45:32.341 7f32f8ff9700 15 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] finish_mirror_image_status_update:
2018-10-31 10:45:32.341 7f32f8ff9700 10 rbd::mirror::ImageReplayer: 0x7f333800e4e0 [1/41ddd4e2-5716-4c14-9568-c1340762addd] shut_down: r=0
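The sequence in the log can be illustrated with a toy model. This is a sketch only and not Ceph code: the class `ToyReplayer`, its attributes, and the initial "replaying" value are hypothetical. Only the method names echo the log messages, and the real daemon may set its shutdown flag at a different point.

```python
class ToyReplayer:
    """Toy stand-in for an image replayer (hypothetical, not Ceph code)."""

    def __init__(self):
        self.shutting_down = False
        self.description = ""         # latest in-memory state description
        self.published = "replaying"  # what a status query would report (assumed)

    def set_state_description(self, text):
        self.description = text

    def update_mirror_image_status(self):
        # Mirrors the "shut down in-progress: ignoring update" log line.
        if self.shutting_down:
            return False
        self.published = self.description
        return True

    def on_stop_journal_replay(self, reason):
        # Assumption: shutdown is already flagged when the description is set.
        self.shutting_down = True
        self.set_state_description(reason)
        return self.update_mirror_image_status()


replayer = ToyReplayer()
updated = replayer.on_stop_journal_replay("force promoted")
print(updated, "|", replayer.description, "|", replayer.published)
```

In this model the in-memory description becomes "force promoted" while the published status stays stale, which matches the symptom described above.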
Related issues
History
#1 Updated by Jason Dillaman over 5 years ago
- Status changed from In Progress to Fix Under Review
#2 Updated by Jason Dillaman over 5 years ago
New status message:
$ rbd --cluster cluster2 mirror pool status --verbose
health: WARNING
images: 1 total
    1 stopping_replay

image1:
  global_id:   79833db6-58fd-4f58-b013-cba7ed26750e
  state:       up+stopping_replay
  description: force promoted
  last_update: 2018-10-31 14:42:37
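For anyone who wants to watch for this state from a script, the following is a minimal sketch that picks out images reporting `stopping_replay` from the plain-text status output shown above. The helper names `parse_images` and `stopping_images` are hypothetical, and the regular expression assumes the field order seen in this report (global_id, state, description, last_update); other releases may print different fields.

```python
import re

# Assumes each image entry lists fields in the order shown in this report.
IMAGE_RE = re.compile(
    r"(?P<name>\S+):\s+global_id:\s*(?P<global_id>\S+)\s+"
    r"state:\s*(?P<state>\S+)\s+"
    r"description:\s*(?P<description>.*?)\s+"
    r"last_update:\s*(?P<last_update>\S+ \S+)",
    re.S,
)


def parse_images(text):
    """Return one dict per image entry found in the status text."""
    return [m.groupdict() for m in IMAGE_RE.finditer(text)]


def stopping_images(text):
    """Return the entries whose state ends in 'stopping_replay'."""
    return [img for img in parse_images(text)
            if img["state"].endswith("stopping_replay")]


SAMPLE = """health: WARNING
images: 1 total
    1 stopping_replay

image1:
  global_id:   79833db6-58fd-4f58-b013-cba7ed26750e
  state:       up+stopping_replay
  description: force promoted
  last_update: 2018-10-31 14:42:37
"""

for img in stopping_images(SAMPLE):
    print(img["name"], img["state"], img["description"])
```

The sample text is copied from the status message in this update; the sketch only reads text and does not invoke `rbd` itself.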
#3 Updated by Mykola Golub over 5 years ago
- Status changed from Fix Under Review to Pending Backport
#4 Updated by Nathan Cutler over 5 years ago
- Copied to Backport #36692: luminous: [rbd-mirror] forced promotion after killing remote cluster results in stuck state added
#5 Updated by Nathan Cutler over 5 years ago
- Copied to Backport #36693: mimic: [rbd-mirror] forced promotion after killing remote cluster results in stuck state added
#6 Updated by Nathan Cutler about 5 years ago
- Status changed from Pending Backport to Resolved