Bug #6313

dumpling: FAILED assert(latest->is_update()) from recover_primary()

Added by Samuel Just over 10 years ago. Updated about 10 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Restarting the backfill target for the OSD fixed the issue.

log.osd.6.log.gz (882 KB) Samuel Just, 09/13/2013 12:08 PM


Related issues

Related to Ceph - Bug #7563: osd/ReplicatedPG.cc: 8425: FAILED assert(info.last_complete == info.last_update) Resolved 02/27/2014

History

#1 Updated by Alexandre Marangone over 10 years ago

  • Source changed from other to Support

We restarted osd.4 and osd.1 to fix the issue.

#2 Updated by Samuel Just over 10 years ago

  • Status changed from New to Can't reproduce

#3 Updated by Sage Weil over 10 years ago

  • Subject changed from FAILED assert(latest->is_update()) to FAILED assert(latest->is_update()) from recover_primary()
  • Status changed from Can't reproduce to 12

on dumpling:

2013-12-06 21:57:55.772944 7f5636d6d700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::recover_primary(int, ThreadPool::TPHandle&)' thread 7f5636d6d700 time 2013-12-06 21:57:55.738073
osd/ReplicatedPG.cc: 7028: FAILED assert(latest->is_update())

 ceph version 0.67.4-36-g9875c8b (9875c8b1992c59cc0c40901a44573676cdff2669)
 1: (ReplicatedPG::recover_primary(int, ThreadPool::TPHandle&)+0x5c9) [0x5fd159]
 2: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&)+0x102) [0x620102]
 3: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x1b8) [0x688b58]
 4: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x11) [0x6c81d1]
 5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b4f06]
 6: (ThreadPool::WorkThread::entry()+0x10) [0x8b6d10]
 7: (()+0x7e9a) [0x7f564ae73e9a]
 8: (clone()+0x6d) [0x7f5648fbf3fd]

ubuntu@teuthology:/a/teuthology-2013-12-06_19:00:25-rados-dumpling-testing-basic-plana/134231

#4 Updated by Sage Weil over 10 years ago

The log entry that trips the assert is a delete (op = 3):

(gdb) p *latest
$1 = {op = 3, soid = {oid = {name = {static npos = <optimized out>, _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, 
          _M_p = 0x6ede3a8 "plana2827491-112"}}}, snap = {val = 18446744073709551614}, hash = 2817255090, max = false, pool = 3, nspace = {static npos = <optimized out>, 
      _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0xd30c78 ""}}, key = {static npos = <optimized out>, 
      _M_dataplus = {<std::allocator<char>> = {<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>}, _M_p = 0xd30c78 ""}}}, version = {version = 7409, epoch = 845, __pad = 0}, 
  prior_version = {version = 7406, epoch = 843, __pad = 0}, reverting_to = {version = 0, epoch = 0, __pad = 0}, reqid = {name = {_type = 8 '\b', _num = 4140, static TYPE_MON = 1, static TYPE_MDS = 2, 
      static TYPE_OSD = 4, static TYPE_CLIENT = 8, static NEW = -1}, tid = 4997, inc = 0}, mtime = {tv = {tv_sec = 1386395727, tv_nsec = 611103000}}, snaps = {
    _buffers = {<std::_List_base<ceph::buffer::ptr, std::allocator<ceph::buffer::ptr> >> = {
        _M_impl = {<std::allocator<std::_List_node<ceph::buffer::ptr> >> = {<__gnu_cxx::new_allocator<std::_List_node<ceph::buffer::ptr> >> = {<No data fields>}, <No data fields>}, _M_node = {
            _M_next = 0x217eea0, _M_prev = 0x217eea0}}}, <No data fields>}, _len = 0, append_buffer = {_raw = 0x0, _off = 0, _len = 0}, last_p = {bl = 0x217eea0, ls = 0x217eea0, off = 0, p = {
        _M_node = 0x217eea0}, p_off = 0}}, invalid_hash = false, invalid_pool = false, offset = 0}
...
(gdb) p info.last_update
$3 = {version = 7484, epoch = 990, __pad = 0}
(gdb) p latest->version
$4 = {version = 7409, epoch = 845, __pad = 0}
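To make the gdb output easier to read, here is a minimal sketch of why this entry trips the assert. It is a simplified model, not the Ceph source: `LogOp`, `Version`, `LogEntry`, `entry_from_dump` and `kNoSnap` are hypothetical names, and the only op value taken from the dump is 3 (delete). The values for modify and clone, the reduced `is_update()` definition, and the epoch-then-version ordering are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a PG log op code. Only Delete = 3 comes from the
// gdb dump; Modify = 1 and Clone = 2 are assumed values.
enum class LogOp : int { Modify = 1, Clone = 2, Delete = 3 };

// Stand-in for the {version, epoch} pairs printed by gdb.
struct Version {
  std::uint64_t version;
  std::uint32_t epoch;
};

// Assumed ordering: compare epoch first, then version.
inline bool operator<(const Version& a, const Version& b) {
  return a.epoch != b.epoch ? a.epoch < b.epoch : a.version < b.version;
}

// Hypothetical, reduced log entry with just the fields discussed here.
struct LogEntry {
  LogOp op;
  Version version;
  Version prior_version;

  // Reduced definition: an "update" is anything that leaves an object behind.
  bool is_update() const { return op == LogOp::Modify || op == LogOp::Clone; }
  bool is_delete() const { return op == LogOp::Delete; }
};

// snap.val in the dump is 18446744073709551614, i.e. (uint64_t)-2,
// which is the value Ceph uses for CEPH_NOSNAP (the head object).
constexpr std::uint64_t kNoSnap = static_cast<std::uint64_t>(-2);

// The entry as printed by "p *latest": op = 3, version 845'7409,
// prior_version 843'7406.
inline LogEntry entry_from_dump() {
  return LogEntry{static_cast<LogOp>(3), {7409, 845}, {7406, 843}};
}
```

Under these assumptions, `entry_from_dump().is_update()` is false, which is the condition the assert at osd/ReplicatedPG.cc:7028 checks, and the entry's version (845'7409) is older than info.last_update (990'7484).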

#5 Updated by Sage Weil over 10 years ago

  • Source changed from Support to Q/A

#6 Updated by Sage Weil over 10 years ago

  • Subject changed from FAILED assert(latest->is_update()) from recover_primary() to dumpling: FAILED assert(latest->is_update()) from recover_primary()
  • Assignee deleted (Samuel Just)
  • Priority changed from Normal to Urgent

#7 Updated by Sage Weil about 10 years ago

  • Status changed from 12 to Can't reproduce

#8 Updated by David Zafman about 10 years ago

  • Status changed from Can't reproduce to 12
  • Assignee set to Samuel Just

Seen again in:

/a/dzafman-2014-02-28_12:09:58-rados:thrash-wip-7458-testing-basic-plana/111683

#9 Updated by Samuel Just about 10 years ago

  • Status changed from 12 to Fix Under Review

#10 Updated by Samuel Just about 10 years ago

  • Status changed from Fix Under Review to In Progress

#11 Updated by Samuel Just about 10 years ago

  • Status changed from In Progress to Duplicate

I've got a better explanation in #7563, so I'm marking this one as a dup.
