Bug #385
Failed assertion in Locker::scatter_nudge
Status: closed
% Done: 0%
Description
I posted this crash as an update to issue #312, but Gregory told me that it is a different issue:
19:47 < gregaf> wido: your recent MDS crash is actually a different issue from #312, involving the distributed lock manager
19:48 < gregaf> are your MDSes just refusing to come up now, or is your cluster working again?
19:50 < gregaf> and what version of the code were you running when it crashed the first time?
The last log lines:
10.08.27_08:33:54.023625 7f33ea334710 mds0.journal try_to_expire waiting for nest flush on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023664 7f33ea334710 mds0.locker scatter_nudge auth, scatter/unscattering (inest sync dirty) on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023690 7f33ea334710 mds0.locker simple_lock on (inest sync dirty) on [inode 10000058e7b [...2,head] /static/kernel/linux/kernel/people/lenb/acpi/ auth v358 f(v5 m10.08.06_21:49:46.000183 3=0+3) n(v47 rc10.08.09_13:39:19.000312 b75920353 3260=3160+100) (inest sync dirty) (ifile sync dirty) (iversion lock) | dirtyscattered dirfrag dirty 0x7631e40]
10.08.27_08:33:54.023716 7f33ea334710 mds0.locker scatter_nudge oh, stable again already.
mds/Locker.cc: In function 'void Locker::scatter_nudge(ScatterLock*, Context*, bool)':
mds/Locker.cc:3290: FAILED assert(!c)
 1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770]
 2: (MDLog::try_expire(LogSegment*)+0x1d) [0x62ec2d]
 3: (MDLog::trim(int)+0x628) [0x62f598]
 4: (MDS::tick()+0x552) [0x498372]
 5: (SafeTimer::EventWrapper::finish(int)+0x269) [0x6b27d9]
 6: (Timer::timer_entry()+0x7bc) [0x6b4bac]
 7: (Timer::TimerThread::entry()+0xd) [0x4777cd]
 8: (Thread::_entry_func(void*)+0xa) [0x48a73a]
 9: (()+0x69ca) [0x7f33edc9c9ca]
 10: (clone()+0x6d) [0x7f33ecc546fd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
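Since the backtrace above needs the matching executable to interpret, it may help to pull the frame addresses out of the log first. The following is only a rough sketch: the names `FRAME_RE` and `parse_frames` are made up for illustration, and the frame layout (`N: (symbol+0xOFFSET) [0xADDRESS]`) is assumed from the frames in this log, not from any documented format.

```python
import re

# Assumed frame layout, taken from the log in this ticket:
#   "1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770]"
FRAME_RE = re.compile(r"(\d+): \((.*?)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]")


def parse_frames(text):
    """Return (index, symbol, offset, address) tuples for each backtrace frame."""
    return [
        (int(idx), sym, int(off, 16), int(addr, 16))
        for idx, sym, off, addr in FRAME_RE.findall(text)
    ]


# A few frames copied from the crash log above.
backtrace = (
    "1: (LogSegment::try_to_expire(MDS*)+0x10f0) [0x636770] "
    "2: (MDLog::try_expire(LogSegment*)+0x1d) [0x62ec2d] "
    "3: (MDLog::trim(int)+0x628) [0x62f598] "
    "9: (()+0x69ca) [0x7f33edc9c9ca] "
    "10: (clone()+0x6d) [0x7f33ecc546fd]"
)

frames = parse_frames(backtrace)
for idx, sym, off, addr in frames:
    # Print one frame per line: index, absolute address, symbol, offset.
    print(f"{idx:>2} {addr:#x} {sym} +{off:#x}")
```

The extracted addresses could then be handed to a symbolizer such as `addr2line -e <executable>` together with the uploaded binary; whether that resolves to useful source lines depends on the binary having debug information.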
The cores, binaries and logfiles are uploaded to logger.ceph.widodh.nl:/srv/ceph/issues/mds_crash_locker_scatter_nudge
The timestamps of all the files were preserved.