Bug #15750

closed

Journal corruption detected during rbd-mirror replay

Added by Jason Dillaman almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Mirroring was enabled and an OS installation was started on a VM:

# rbd --cluster master journal status 10612ae8944a
minimum_set: 28
active_set: 34
registered clients: 
    [id=, commit_position=[positions=[[object_number=138, tag_tid=3, entry_tid=6566], [object_number=133, tag_tid=3, entry_tid=6565], [object_number=132, tag_tid=3, entry_tid=6564], [object_number=131, tag_tid=3, entry_tid=6563]]], state=connected]
    [id=e62a7d53-cb6a-4a62-ad50-d1d7642391bc, commit_position=[positions=[[object_number=117, tag_tid=2, entry_tid=64937], [object_number=112, tag_tid=2, entry_tid=64936], [object_number=115, tag_tid=2, entry_tid=64935], [object_number=118, tag_tid=2, entry_tid=64934]]], state=connected]

Corruption detected in journal object "journal_data.0.10612ae8944a.125":

... snip ...
2016-05-05 21:17:27.309765 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4841, data size=4126] decoded
2016-05-05 21:17:27.309768 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4845, data size=4126] decoded
2016-05-05 21:17:27.309771 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4849, data size=4126] decoded
2016-05-05 21:17:27.309774 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4853, data size=4126] decoded
2016-05-05 21:17:27.309778 7f41b0ff9700 20 ObjectPlayer: : Entry[tag_tid=3, entry_tid=4857, data size=4126] decoded
2016-05-05 21:17:27.309780 7f41b0ff9700 -1 ObjectPlayer: : partial record at offset 14316491
2016-05-05 21:17:27.309781 7f41b0ff9700 -1 ObjectPlayer: : corruption range [14316491, 33554432)
2016-05-05 21:17:27.309783 7f41b0ff9700 10 JournalPlayer: handle_fetched: journal_data.0.10612ae8944a.125: r=-74
2016-05-05 21:17:27.309791 7f41b0ff9700 10 JournalPlayer: process_state: object_num=125, r=-74
2016-05-05 21:17:27.309793 7f41b0ff9700 10 JournalPlayer: notify_complete: replay complete: r=-74
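
To make the numbers in the log excerpt above easier to read, here is a minimal sketch that pulls the corruption range and the return codes out of such lines. The regular expressions and the `summarize` helper are illustrative assumptions based only on the lines quoted in this report, not part of any Ceph tooling. The errno lookup is platform dependent; on Linux, 74 is expected to map to EBADMSG.

```python
import errno
import re

# Lines copied from the log excerpt in this report.
LOG_LINES = [
    "2016-05-05 21:17:27.309780 7f41b0ff9700 -1 ObjectPlayer: : partial record at offset 14316491",
    "2016-05-05 21:17:27.309781 7f41b0ff9700 -1 ObjectPlayer: : corruption range [14316491, 33554432)",
    "2016-05-05 21:17:27.309783 7f41b0ff9700 10 JournalPlayer: handle_fetched: journal_data.0.10612ae8944a.125: r=-74",
]

# Hypothetical patterns, written to match the quoted lines only.
RANGE_RE = re.compile(r"corruption range \[(\d+), (\d+)\)")
RESULT_RE = re.compile(r"\br=(-?\d+)")

MIB = 1024 * 1024


def summarize(lines):
    """Collect corruption ranges and r= return codes from log lines."""
    summary = {"ranges": [], "results": []}
    for line in lines:
        m = RANGE_RE.search(line)
        if m:
            summary["ranges"].append((int(m.group(1)), int(m.group(2))))
        m = RESULT_RE.search(line)
        if m:
            summary["results"].append(int(m.group(1)))
    return summary


summary = summarize(LOG_LINES)
start, end = summary["ranges"][0]
print(f"corruption starts at {start / MIB:.2f} MiB, range ends at {end / MIB:.0f} MiB")
for r in summary["results"]:
    # errorcode is platform dependent; fall back to "unknown" if unmapped.
    print(f"r={r} -> {errno.errorcode.get(-r, 'unknown')}")
```

The end of the range, 33554432 bytes, is exactly 32 MiB.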

Related issues 1 (0 open, 1 closed)

Copied to rbd - Backport #15819: jewel: Journal corruption detected during rbd-mirror replay (Resolved, Jason Dillaman)
Actions #1

Updated by Jason Dillaman almost 8 years ago

The last entry in the journal was a 16MB IO event, which ballooned the final size of the journal object to ~44MB. However, the ObjectPlayer only fetched 32MB (33554432 bytes), which matches the upper bound of the corruption range reported above:

2016-05-05 21:39:25.732838 7fc7cd7fa700 10 ObjectPlayer: handle_fetch_complete: journal_data.0.10612ae8944a.125, r=0, len=33554432
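
The failure mode described in this comment can be reproduced in miniature. The sketch below is a toy model, not Ceph's journal format or the actual fix: it assumes a hypothetical 4-byte length-prefixed record layout and scales the sizes down (a 72-byte object with a 32-byte fetch cap). A single capped fetch mistakes the truncated last record for corruption, while a reader that keeps fetching until a short read decodes every record.

```python
import struct

# Hypothetical record layout: 4-byte big-endian length prefix + payload.
HEADER = struct.Struct(">I")


def encode(payloads):
    """Serialize payloads into one length-prefixed byte string."""
    return b"".join(HEADER.pack(len(p)) + p for p in payloads)


def decode(buf):
    """Return (complete records, offset of the first undecoded byte)."""
    records, off = [], 0
    while off + HEADER.size <= len(buf):
        (length,) = HEADER.unpack_from(buf, off)
        end = off + HEADER.size + length
        if end > len(buf):
            break  # payload runs past the data we have: partial record
        records.append(buf[off + HEADER.size:end])
        off = end
    return records, off


def replay_single_fetch(obj, fetch_cap):
    """Fetch at most fetch_cap bytes once; treat leftovers as corruption."""
    fetched = obj[:fetch_cap]
    records, off = decode(fetched)
    if off < len(fetched):
        raise ValueError(
            f"partial record at offset {off}, "
            f"corruption range [{off}, {len(fetched)})"
        )
    return records


def replay_incremental(obj, chunk):
    """Keep fetching chunk-sized pieces until a short read ends the object."""
    records, buf, pos = [], b"", 0
    while True:
        piece = obj[pos:pos + chunk]
        pos += len(piece)
        buf += piece
        decoded, off = decode(buf)
        records.extend(decoded)
        buf = buf[off:]  # keep only the undecoded tail
        if len(piece) < chunk:  # short read: end of object reached
            break
    if buf:
        raise ValueError("partial record at end of object")
    return records


# Two small records followed by one that pushes the object past the cap.
payloads = [b"a" * 10, b"b" * 10, b"c" * 40]
obj = encode(payloads)  # 72 bytes in total

try:
    replay_single_fetch(obj, fetch_cap=32)
except ValueError as exc:
    print("single fetch:", exc)

print("incremental:", len(replay_incremental(obj, chunk=32)), "records")
```

In this model, a partial record only counts as corruption once the end of the object has actually been reached, which is the distinction the single capped fetch cannot make.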

Actions #2

Updated by Jason Dillaman almost 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #3

Updated by Jason Dillaman almost 8 years ago

  • Backport set to jewel
Actions #4

Updated by Jason Dillaman almost 8 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #5

Updated by Jason Dillaman almost 8 years ago

  • Copied to Backport #15819: jewel: Journal corruption detected during rbd-mirror replay added
Actions #6

Updated by Jason Dillaman almost 8 years ago

  • Status changed from Pending Backport to Resolved