Bug #15993
rbd-mirror can become stuck during live-replay
Status: Resolved
Priority: High
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2016-05-23 07:31:04.045929 7fa7251e1700 -1 JournalPlayer: missing prior journal entry: Entry[tag_tid=2, entry_tid=924, data size=16777183]
2016-05-23 07:31:04.045952 7fa7251e1700 -1 rbd-mirror: ImageReplayer[1/5e4e2ae8944a]::handle_replay_complete: replay encountered an error: (42) No message of desired type
2016-05-23 07:32:35.695730 7fa7251e1700 -1 JournalPlayer: missing prior journal entry: Entry[tag_tid=2, entry_tid=956, data size=1054]
2016-05-23 07:32:35.695751 7fa7251e1700 -1 rbd-mirror: ImageReplayer[1/5e4e2ae8944a]::handle_replay_complete: replay encountered an error: (42) No message of desired type
# rbd --cluster slave --pool pool1 mirror pool status --verbose
health: OK
images: 4 total
    4 replaying
data1:
  global_id:   037458d0-4516-4b68-908f-ab6fce7de7a7
  state:       up+replaying
  description: replaying, master_position=[object_number=287, tag_tid=2, entry_tid=1491], mirror_position=[object_number=150, tag_tid=2, entry_tid=966], entries_behind_master=525
  last_update: 2016-05-23 11:56:13
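For anyone triaging similar output, the lag figure can be cross-checked against the two positions. The following is a minimal sketch, not part of rbd-mirror: `parse_positions` is a hypothetical helper, and it assumes the `description` field keeps the layout shown above and that a simple entry_tid difference is only meaningful when both positions share the same tag_tid.

```python
import re

# Description string copied from the status output in this report.
description = (
    "replaying, master_position=[object_number=287, tag_tid=2, entry_tid=1491], "
    "mirror_position=[object_number=150, tag_tid=2, entry_tid=966], "
    "entries_behind_master=525"
)

# Assumed layout: <name>_position=[object_number=N, tag_tid=N, entry_tid=N]
POSITION_RE = re.compile(
    r"(\w+)_position=\[object_number=(\d+), tag_tid=(\d+), entry_tid=(\d+)\]"
)


def parse_positions(text):
    """Return {name: (object_number, tag_tid, entry_tid)} for each position found."""
    return {
        name: (int(obj), int(tag), int(entry))
        for name, obj, tag, entry in POSITION_RE.findall(text)
    }


positions = parse_positions(description)
master = positions["master"]
mirror = positions["mirror"]

# Only subtract entry_tid values when both positions carry the same tag_tid.
lag = master[2] - mirror[2] if master[1] == mirror[1] else None

# Value reported by rbd itself, for comparison.
reported = int(re.search(r"entries_behind_master=(\d+)", description).group(1))

print("computed lag:", lag, "reported lag:", reported)
```

For this report the computed difference (1491 - 966) agrees with the reported entries_behind_master value. If last_update keeps advancing while mirror_position does not, the replayer is likely stuck rather than merely slow.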
After restarting rbd-mirror, replication proceeded until it became stuck again:
[id=, commit_position=[positions=[[object_number=287, tag_tid=2, entry_tid=1491], [object_number=286, tag_tid=2, entry_tid=1490], [object_number=285, tag_tid=2, entry_tid=1489], [object_number=276, tag_tid=2, entry_tid=1488]]], state=connected]
[id=041ffd0b-8910-47c6-9275-6858c551739d, commit_position=[positions=[[object_number=286, tag_tid=2, entry_tid=1490], [object_number=285, tag_tid=2, entry_tid=1489], [object_number=276, tag_tid=2, entry_tid=1488], [object_number=287, tag_tid=2, entry_tid=1487]]], state=connected]
2016-05-23 12:14:49.563020 7f8a8affd700 20 JournalPlayer: schedule_watch: scheduling watch on journal_data.1.5e4e2ae8944a.284
2016-05-23 12:14:49.563022 7f8a8affd700 20 ObjectPlayer: watch: journal_data.1.5e4e2ae8944a.284 watch
2016-05-23 12:14:49.563023 7f8a8affd700 20 ObjectPlayer: schedule_watch: journal_data.1.5e4e2ae8944a.284 scheduling watch
2016-05-23 12:14:50.563093 7f8ab5d7b700 10 ObjectPlayer: handle_watch_task: journal_data.1.5e4e2ae8944a.284 polling
2016-05-23 12:14:50.563100 7f8ab5d7b700 10 ObjectPlayer: fetch: journal_data.1.5e4e2ae8944a.284
2016-05-23 12:14:50.563797 7f8a8affd700 10 ObjectPlayer: handle_fetch_complete: journal_data.1.5e4e2ae8944a.284, r=-2, len=0
2016-05-23 12:14:50.563802 7f8a8affd700 10 ObjectPlayer: handle_watch_fetched: journal_data.1.5e4e2ae8944a.284 poll complete, r=-2
2016-05-23 12:14:50.564392 7f8a8affd700 20 JournalPlayer: schedule_watch: scheduling watch on journal_data.1.5e4e2ae8944a.284
Again, restarting rbd-mirror fixed the issue.
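The debug log for the second hang shows the player polling journal_data.1.5e4e2ae8944a.284 and getting r=-2 (ENOENT on Linux) before rescheduling the same watch. A rough sketch for spotting that pattern in a larger log follows; `count_failed_fetches` is a hypothetical helper, and the regular expression assumes the `handle_fetch_complete` line format seen in this excerpt.

```python
import errno
import re
from collections import Counter

# Two lines copied from the debug log in this report.
log_lines = [
    "2016-05-23 12:14:50.563100 7f8ab5d7b700 10 ObjectPlayer: fetch: "
    "journal_data.1.5e4e2ae8944a.284",
    "2016-05-23 12:14:50.563797 7f8a8affd700 10 ObjectPlayer: handle_fetch_complete: "
    "journal_data.1.5e4e2ae8944a.284, r=-2, len=0",
]

# Assumed line format: "... handle_fetch_complete: <object>, r=<code>, len=<bytes>"
FETCH_RE = re.compile(
    r"ObjectPlayer: handle_fetch_complete: (\S+), r=(-?\d+), len=(\d+)"
)


def count_failed_fetches(lines):
    """Count fetch completions with a negative return code, keyed by (object, errno name)."""
    failures = Counter()
    for line in lines:
        match = FETCH_RE.search(line)
        if match is None:
            continue
        code = int(match.group(2))
        if code < 0:
            # Translate the negated errno value into its symbolic name when known.
            name = errno.errorcode.get(-code, str(code))
            failures[(match.group(1), name)] += 1
    return failures


failures = count_failed_fetches(log_lines)
for (obj, name), count in failures.items():
    print(obj, name, count)
```

A single object accumulating ENOENT results while the commit positions stay fixed would match the stuck state described here.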
Updated by Jason Dillaman almost 8 years ago
- Status changed from New to In Progress
- Assignee set to Jason Dillaman
Updated by Jason Dillaman almost 8 years ago
- Status changed from In Progress to Fix Under Review
- Backport set to jewel
Updated by Jason Dillaman almost 8 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Jason Dillaman almost 8 years ago
- Copied to Backport #16020: jewel: rbd-mirror can become stuck during live-replay added
Updated by Jason Dillaman almost 8 years ago
- Status changed from Pending Backport to Resolved