Bug #15864

journal replay doesn't properly handle missing entries

Added by Jason Dillaman about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
Start date: 05/12/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

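A client "crash" left the entry tag_tid=1/entry_tid=72 missing at offset 0. Instead, tag_tid=2/entry_tid=0 was the next available record at offset 0, while offset 1 still held tag_tid=1/entry_tid=73. Once the newer tag has been detected, journal replay should skip over all remaining "tag_tid=1" records rather than fail with an "unexpected tag" error (r=-42 in the log below).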

   -38> 2016-05-12 19:05:12.169227 7f94917fa700 20 JournalPlayer: try_pop_front
   -37> 2016-05-12 19:05:12.169232 7f94917fa700 20 JournalPlayer: advance_splay_object: new offset 0
   -36> 2016-05-12 19:05:12.169236 7f94917fa700 20 JournalMetadata: allocated commit tid: commit_tid=72 [object_num=3, tag_tid=1, entry_tid=71]
   -24> 2016-05-12 19:05:12.169339 7f94917fa700 20 JournalPlayer: try_pop_front
   -23> 2016-05-12 19:05:12.169343 7f94917fa700 20 JournalPlayer: verify_playback_ready: new tag 2 detected, adjusting offset to 0
   -22> 2016-05-12 19:05:12.169347 7f94917fa700 20 JournalPlayer: advance_splay_object: new offset 1
   -21> 2016-05-12 19:05:12.169350 7f94917fa700 20 JournalMetadata: allocated commit tid: commit_tid=73 [object_num=0, tag_tid=2, entry_tid=0]
    -9> 2016-05-12 19:05:12.169452 7f94917fa700 20 JournalPlayer: try_pop_front
    -8> 2016-05-12 19:05:12.169456 7f94917fa700 -1 JournalPlayer: unexpected tag in journal entry: Entry[tag_tid=1, entry_tid=73, data size=4126]
    -7> 2016-05-12 19:05:12.169461 7f94917fa700 10 JournalPlayer: notify_complete: replay complete: r=-42
    -6> 2016-05-12 19:05:12.169491 7f946a7fc700 20 librbd::Journal: 0x7f9478009b00 handle_replay_complete: r=-42
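
The expected behaviour can be illustrated with a small, simplified model. This is only a sketch of the idea, not the JournalPlayer code: the Entry type and the replay function are hypothetical names, the entries are given as a flat list in the order the player would pop them, and the final entry (tag_tid=2, entry_tid=1) is invented to show that replay continues after the stale record.

```python
from typing import Iterable, List, NamedTuple, Optional


class Entry(NamedTuple):
    """Hypothetical stand-in for a journal entry (not the Ceph type)."""
    tag_tid: int
    entry_tid: int


def replay(entries: Iterable[Entry]) -> List[Entry]:
    """Return the entries that would be replayed.

    Once an entry with a newer tag has been seen, any later entry that
    still carries an older tag is treated as a leftover from a partially
    complete tag and is skipped instead of aborting the replay.
    """
    active_tag: Optional[int] = None
    replayed: List[Entry] = []
    for entry in entries:
        if active_tag is None or entry.tag_tid > active_tag:
            # First entry, or a newer tag was detected: switch to it.
            active_tag = entry.tag_tid
        elif entry.tag_tid < active_tag:
            # Stale record from an older tag: skip it.
            continue
        replayed.append(entry)
    return replayed


# Pop order taken from the log above, plus one assumed trailing entry.
popped = [Entry(1, 71), Entry(2, 0), Entry(1, 73), Entry(2, 1)]
result = replay(popped)
```

With this input, the tag_tid=1/entry_tid=73 record is dropped and the tag_tid=2 entries are kept, which is the outcome the description asks for.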

History

#1 Updated by Jason Dillaman about 3 years ago

  • Priority changed from Normal to High
  • Backport set to jewel

#2 Updated by Jason Dillaman about 3 years ago

Issue #15665 should be fixed first

#3 Updated by Venky Shankar about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Venky Shankar

#4 Updated by Jason Dillaman about 3 years ago

@Venky: hopefully all the outstanding replay issues will be resolved under issue #15665. If so, we can mark this one as resolved.

#5 Updated by Venky Shankar about 3 years ago

Jason Dillaman wrote:

@Venky: hopefully all the outstanding replay issues will be resolved under issue #15665. If so, we can mark this one as resolved.

Yeah, your patchset (esp. "journal: skip partially complete tag entries during playback") is similar to what I had implemented for this issue, and is a bit better ;)

I tested it out and got a clean journal replay. Thanks. Will mark this as resolved after PR https://github.com/ceph/ceph/pull/9130 is merged?

#6 Updated by Nathan Cutler about 3 years ago

Please remove the "Backport: jewel" in that case.

#7 Updated by Venky Shankar about 3 years ago

  • Backport deleted (jewel)

#8 Updated by Venky Shankar about 3 years ago

  • Status changed from In Progress to Resolved
