Bug #362

closed

mds: rejoin crashes on snaptest-2 workload

Added by Sage Weil over 13 years ago. Updated over 7 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:
0%


Description

Saw two crashes during MDS rejoin on the snaptest-2 workload, running commit:c8701f29f0a6f3777c41f8952c054ba4dd41b9d4

mds/CInode.cc: In function 'virtual void CInode::auth_unpin(void*)':
mds/CInode.cc:1439: FAILED assert(auth_pins >= 0)
 1: (CInode::auth_unpin(void*)+0x187) [0x916fff]
 2: (SimpleLock::set_state_rejoin(int, std::list<Context*, std::allocator<Context*> >&)+0x55) [0x8364c9]
 3: (SimpleLock::decode_state_rejoin(ceph::buffer::list::iterator&, std::list<Context*, std::allocator<Context*> >&)+0x40) [0x91bbee]
 4: (CInode::_decode_locks_rejoin(ceph::buffer::list::iterator&, std::list<Context*, std::allocator<Context*> >&)+0x8c) [0x91ac66]
 5: (MDCache::handle_cache_rejoin_ack(MMDSCacheRejoin*)+0xb2d) [0x817109]
 6: (MDCache::handle_cache_rejoin(MMDSCacheRejoin*)+0x135) [0x812597]
 7: (MDCache::dispatch(Message*)+0x7a) [0x82247e]
 8: (MDS::_dispatch(Message*)+0xf91) [0x755aad]
 9: (MDS::ms_dispatch(Message*)+0x38) [0x7549a2]
 10: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x73ff61]
 11: (SimpleMessenger::dispatch_entry()+0x5d1) [0x7316cd]
 12: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x7263d8]
 13: (Thread::_entry_func(void*)+0x23) [0x73ee87]
 14: (()+0x68ba) [0x7f316bda18ba]
 15: (clone()+0x6d) [0x7f316ad5601d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

and

mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_weak(MMDSCacheRejoin*)':
mds/MDCache.cc:3192: FAILED assert(dnl->is_primary())
 1: (MDCache::handle_cache_rejoin_weak(MMDSCacheRejoin*)+0xb61) [0x813153]
 2: (MDCache::handle_cache_rejoin(MMDSCacheRejoin*)+0x10b) [0x81256d]
 3: (MDCache::dispatch(Message*)+0x7a) [0x82247e]
 4: (MDS::_dispatch(Message*)+0xf91) [0x755aad]
 5: (MDS::ms_dispatch(Message*)+0x38) [0x7549a2]
 6: (Messenger::ms_deliver_dispatch(Message*)+0x63) [0x73ff61]
 7: (SimpleMessenger::dispatch_entry()+0x5d1) [0x7316cd]
 8: (SimpleMessenger::DispatchThread::entry()+0x2c) [0x7263d8]
 9: (Thread::_entry_func(void*)+0x23) [0x73ee87]
 10: (()+0x68ba) [0x7f4aee06c8ba]
 11: (clone()+0x6d) [0x7f4aed02101d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
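
The two failed assertions can be read as invariants that the rejoin path violated: a pin count that must never go negative, and a dentry linkage that was expected to be primary. The following is a minimal sketch of those two invariants only. `PinCounted` and `Linkage` are hypothetical stand-ins, not Ceph's `CInode` or dentry types, and the primary/remote/null split is an assumption about how a linkage is modelled. It says nothing about the root cause of either crash.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an object with an auth-pin reference count.
// Only the invariant `auth_pins >= 0` is taken from the trace.
struct PinCounted {
    int auth_pins = 0;

    void auth_pin() { ++auth_pins; }

    // Decrements the count and reports whether the invariant still holds.
    // The real code asserts here; returning a bool lets the failure be
    // observed without aborting the process.
    bool auth_unpin() {
        --auth_pins;
        return auth_pins >= 0;
    }
};

// Hypothetical dentry linkage with three mutually exclusive states
// (assumption): primary (points at a local inode), remote (refers to an
// inode by number), or null (refers to nothing).
struct Linkage {
    const void* inode = nullptr;   // set for a primary link
    std::uint64_t remote_ino = 0;  // nonzero for a remote link

    bool is_primary() const { return remote_ino == 0 && inode != nullptr; }
    bool is_remote() const { return remote_ino != 0; }
    bool is_null() const { return remote_ino == 0 && inode == nullptr; }
};
```

In this model, an `auth_unpin()` without a matching `auth_pin()` drives the count to -1, which is the condition `assert(auth_pins >= 0)` rejects. Likewise, a remote or null `Linkage` would fail a check equivalent to `assert(dnl->is_primary())`.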

Actions #1

Updated by Sage Weil over 13 years ago

  • Target version changed from v0.22 to v0.23

Will work on recovery in v0.23.

Actions #2

Updated by Sage Weil over 13 years ago

  • Target version changed from v0.23 to v0.24

Actions #3

Updated by Sage Weil over 13 years ago

  • Assignee set to Sage Weil

Actions #4

Updated by Sage Weil over 13 years ago

  • Estimated time set to 4:00 h
  • Source set to 3

Actions #5

Updated by Sage Weil over 13 years ago

  • Status changed from New to Rejected

Actions #6

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Target version deleted (v0.24)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
