Bug #15750
Journal corruption detected during rbd-mirror replay
Status:
Resolved
Priority:
Urgent
Assignee:
Jason Dillaman
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Mirroring was enabled and OS installation started on VM:
# rbd --cluster master journal status 10612ae8944a
minimum_set: 28
active_set: 34
registered clients:
    [id=, commit_position=[positions=[[object_number=138, tag_tid=3, entry_tid=6566], [object_number=133, tag_tid=3, entry_tid=6565], [object_number=132, tag_tid=3, entry_tid=6564], [object_number=131, tag_tid=3, entry_tid=6563]]], state=connected]
    [id=e62a7d53-cb6a-4a62-ad50-d1d7642391bc, commit_position=[positions=[[object_number=117, tag_tid=2, entry_tid=64937], [object_number=112, tag_tid=2, entry_tid=64936], [object_number=115, tag_tid=2, entry_tid=64935], [object_number=118, tag_tid=2, entry_tid=64934]]], state=connected]
Corruption detected in journal object "journal_data.0.10612ae8944a.125":
... snip ...
2016-05-05 21:17:27.309765 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4841, data size=4126] decoded
2016-05-05 21:17:27.309768 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4845, data size=4126] decoded
2016-05-05 21:17:27.309771 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4849, data size=4126] decoded
2016-05-05 21:17:27.309774 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4853, data size=4126] decoded
2016-05-05 21:17:27.309778 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4857, data size=4126] decoded
2016-05-05 21:17:27.309780 7f41b0ff9700 -1 ObjectPlayer: : partial record at offset 14316491
2016-05-05 21:17:27.309781 7f41b0ff9700 -1 ObjectPlayer: : corruption range [14316491, 33554432)
2016-05-05 21:17:27.309783 7f41b0ff9700 10 JournalPlayer: handle_fetched: journal_data.0.10612ae8944a.125: r=-74
2016-05-05 21:17:27.309791 7f41b0ff9700 10 JournalPlayer: process_state: object_num=125, r=-74
2016-05-05 21:17:27.309793 7f41b0ff9700 10 JournalPlayer: notify_complete: replay complete: r=-74
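For anyone triaging similar logs, the two numbers worth pulling out are the end of the corruption range and the return code. The following is a minimal sketch, not part of rbd-mirror: the `summarize` helper and its regular expressions are hypothetical and assume the log line shapes shown above. It shows that the range ends exactly at 32 MiB and that r=-74 corresponds to EBADMSG on Linux.

```python
import re

# On Linux, errno 74 is EBADMSG ("Bad message"); the value differs on other platforms.
LINUX_EBADMSG = 74

# Hypothetical patterns, written against the log lines quoted in this report.
RANGE_RE = re.compile(r"corruption range \[(\d+), (\d+)\)")
RESULT_RE = re.compile(r"notify_complete: replay complete: r=(-?\d+)")


def summarize(log_text):
    """Return (range_start, range_end, r) found in rbd-mirror debug log text."""
    rng = RANGE_RE.search(log_text)
    res = RESULT_RE.search(log_text)
    start, end = (int(rng.group(1)), int(rng.group(2))) if rng else (None, None)
    return start, end, int(res.group(1)) if res else None


log = """\
2016-05-05 21:17:27.309781 7f41b0ff9700 -1 ObjectPlayer: : corruption range [14316491, 33554432)
2016-05-05 21:17:27.309793 7f41b0ff9700 10 JournalPlayer: notify_complete: replay complete: r=-74
"""

start, end, r = summarize(log)
end_mib = end / 2**20                  # 33554432 bytes expressed in MiB
is_ebadmsg = (-r == LINUX_EBADMSG)     # True when the replay failed with EBADMSG
print(start, end, end_mib, r, is_ebadmsg)
```

A range end that lands on a round power-of-two boundary is a hint that the reader stopped at a buffer limit rather than at damaged data.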
Updated by Jason Dillaman about 8 years ago
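The last entry in the journal was a 16MB IO event, which ballooned the final size of the journal object to ~44MB. However, the ObjectPlayer only fetched 32MB (len=33554432 in the line below), so the tail of the object was never read and the truncated data was reported as corruption: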
2016-05-05 21:39:25.732838 7fc7cd7fa700 10 ObjectPlayer: handle_fetch_complete: journal_data.0.10612ae8944a.125, r=0, len=33554432
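The failure mode described above can be reproduced in miniature. The sketch below is a toy model and not the Ceph code or the actual fix: the 4-byte length-prefixed framing, the sizes, and the function names `fetch_once`, `fetch_all`, and `decode` are all assumptions made for illustration. It contrasts a single capped read with a loop that keeps reading until it gets a short read.

```python
def fetch_once(obj, max_fetch):
    """Single read capped at max_fetch bytes (the behaviour that truncates)."""
    return obj[:max_fetch]


def fetch_all(obj, chunk):
    """Read chunk-sized pieces until a short read signals end of object."""
    buf = bytearray()
    offset = 0
    while True:
        part = obj[offset:offset + chunk]
        buf += part
        offset += len(part)
        if len(part) < chunk:
            break
    return bytes(buf)


def decode(buf):
    """Decode toy records (4-byte little-endian length + payload).

    Returns (records, partial_offset); partial_offset is None when the
    buffer ends cleanly on a record boundary.
    """
    records, pos = [], 0
    while pos < len(buf):
        if pos + 4 > len(buf):
            return records, pos
        size = int.from_bytes(buf[pos:pos + 4], "little")
        if pos + 4 + size > len(buf):
            return records, pos
        records.append(buf[pos + 4:pos + 4 + size])
        pos += 4 + size
    return records, None


def record(payload):
    return len(payload).to_bytes(4, "little") + payload


# Three small records followed by one large one that pushes the object
# past the 64-byte cap (standing in for 32MB versus ~44MB).
obj = b"".join(record(b"a" * 10) for _ in range(3)) + record(b"b" * 40)

capped_records, capped_partial = decode(fetch_once(obj, 64))
full_records, full_partial = decode(fetch_all(obj, 64))
print(len(capped_records), capped_partial, len(full_records), full_partial)
```

With the capped read, the decoder stops at the start of the large record and reports a partial record there, even though the stored object is intact. Reading until a short read returns every record.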
Updated by Jason Dillaman about 8 years ago
- Status changed from In Progress to Fix Under Review
Updated by Jason Dillaman about 8 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Jason Dillaman about 8 years ago
- Copied to Backport #15819: jewel: Journal corruption detected during rbd-mirror replay added
Updated by Jason Dillaman about 8 years ago
- Status changed from Pending Backport to Resolved