Bug #385

closed

Failed assertion in Locker::scatter_nudge

Added by Wido den Hollander over 13 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: -
Target version: -
% Done: 0%


Description

I reported this crash as an update to issue #312, but Gregory told me it is a different issue:

19:47 < gregaf> wido: your recent MDS crash is actually a different issue from #312, involving the distributed lock manager
19:48 < gregaf> are your MDSes just refusing to come up now, or is your cluster working again?
19:50 < gregaf> and what version of the code were you running when it crashed the first time?

The last log lines:

10.08.27_08:33:54.023625 7f33ea334710 mds0.journal try_to_expire waiting for nest flush on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023664 7f33ea334710 mds0.locker scatter_nudge auth, scatter/unscattering (inest sync dirty) on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023690 7f33ea334710 mds0.locker simple_lock on (inest sync dirty) on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023716 7f33ea334710 mds0.locker scatter_nudge oh, stable again already.
mds/Locker.cc: In function 'void Locker::scatter_nudge(ScatterLock*, Context*, bool)':
mds/Locker.cc:3290: FAILED assert(!c)
 1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770]
 2: (MDLog::try_expire(LogSegment*)+0x1d) [0x62ec2d]
 3: (MDLog::trim(int)+0x628) [0x62f598]
 4: (MDS::tick()+0x552) [0x498372]
 5: (SafeTimer::EventWrapper::finish(int)+0x269) [0x6b27d9]
 6: (Timer::timer_entry()+0x7bc) [0x6b4bac]
 7: (Timer::TimerThread::entry()+0xd) [0x4777cd]
 8: (Thread::_entry_func(void*)+0xa) [0x48a73a]
 9: (()+0x69ca) [0x7f33edc9c9ca]
 10: (clone()+0x6d) [0x7f33ecc546fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
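To make the backtrace above easier to feed into a symbolizer, here is a minimal sketch that extracts the frame number, symbol, offset and address from each frame line. The function name `parse_frames` and the regular expression are illustrative assumptions, not part of Ceph; the sample frames are copied from the trace above.

```python
import re

# Hypothetical helper, not part of Ceph. Matches frame lines such as
# " 1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770]".
FRAME_RE = re.compile(
    r"^\s*(\d+): \((.*)\+(0x[0-9a-f]+)\) \[(0x[0-9a-f]+)\]\s*$"
)


def parse_frames(text):
    """Return one dict per backtrace frame found in text."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # skip log lines that are not backtrace frames
        index, symbol, offset, address = match.groups()
        frames.append({
            "index": int(index),
            "symbol": symbol,            # "()" when the symbol is unknown
            "offset": int(offset, 16),   # offset into the function
            "address": int(address, 16), # absolute return address
        })
    return frames


SAMPLE = """\
mds/Locker.cc:3290: FAILED assert(!c)
 1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770]
 2: (MDLog::try_expire(LogSegment*)+0x1d) [0x62ec2d]
 9: (()+0x69ca) [0x7f33edc9c9ca]
 10: (clone()+0x6d) [0x7f33ecc546fd]
"""

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        print(frame["index"], hex(frame["address"]), frame["symbol"])
```

The extracted addresses could then be passed to a tool such as `addr2line -e <executable>`, assuming the matching binary is at hand. Frames with addresses in the 0x7f33... range belong to shared libraries and would probably not resolve against the main executable.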

The core dumps, binaries and log files have been uploaded to logger.ceph.widodh.nl:/srv/ceph/issues/mds_crash_locker_scatter_nudge

The timestamps of all the files were preserved.
