Bug #11462 (closed)
kernel: crash (when MDS died?)
Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): kceph
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
http://pulpito.ceph.com/teuthology-2015-04-20_23:18:01-multimds-next-testing-basic-multi/857045/
[3]kdb> bt
Stack traceback for pid 1
0xffff88040cdb0000        1        0  1    3   R  0xffff88040cdb0618 *init
 ffff88040cdbbba8 0000000000000018 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 <#DB>  <<EOE>>
 [<ffffffff81114032>] ? kgdb_panic_event+0x22/0x50
 [<ffffffff8107d2ad>] ? notifier_call_chain+0x4d/0x70
 [<ffffffff8107d400>] ? __atomic_notifier_call_chain+0x70/0xb0
 [<ffffffff8107d395>] ? __atomic_notifier_call_chain+0x5/0xb0
 [<ffffffff8107d456>] ? atomic_notifier_call_chain+0x16/0x20
 [<ffffffff817573f8>] ? panic+0xed/0x1fa
 [<ffffffff8105eb33>] ? do_exit+0xa43/0xb50
 [<ffffffff81766120>] ? _raw_spin_unlock_irq+0x30/0x40
 [<ffffffff8105ece1>] ? do_group_exit+0x51/0xc0
 [<ffffffff8106b91e>] ? get_signal+0x26e/0x760
 [<ffffffff81002503>] ? do_signal+0x33/0xab0
 [<ffffffff8175725b>] ? mm_fault_error+0x130/0x14c
 [<ffffffff81049584>] ? __do_page_fault+0x374/0x4a0
 [<ffffffff81767631>] ? retint_signal+0x11/0x90
 [<ffffffff81002ff8>] ? do_notify_resume+0x78/0xa0
 [<ffffffff81767666>] ? retint_signal+0x46/0x90
The teuthology log shows that one of the MDSes crashed, so I bet the kernel crash and the MDS crash are related.
2015-04-21T11:28:28.094 INFO:tasks.ceph.mds.g.burnupi18.stderr:mds/StrayManager.cc: 538: FAILED assert(!dn->state_test(CDentry::STATE_PURGING))
2015-04-21T11:28:28.094 INFO:tasks.ceph.mds.g.burnupi18.stderr: ceph version 0.94-912-g33bdae7 (33bdae7d62ddc1fd77b758e2dd2876a1353f5db6)
2015-04-21T11:28:28.094 INFO:tasks.ceph.mds.g.burnupi18.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x95c46b]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 2: (StrayManager::eval_stray(CDentry*, bool)+0xb56) [0x6fa996]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 3: (StrayManager::advance_delayed()+0xf6) [0x6face6]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 4: (MDCache::trim(int, int)+0x15d) [0x671b0d]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 5: (MDS::tick()+0xd0) [0x5a4e50]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 6: (MDSInternalContextBase::complete(int)+0x153) [0x7d7f73]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 7: (SafeTimer::timer_thread()+0xec) [0x94db0c]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 8: (SafeTimerThread::entry()+0xd) [0x94eaad]
2015-04-21T11:28:28.095 INFO:tasks.ceph.mds.g.burnupi18.stderr: 9: (()+0x8182) [0x7f2069445182]
2015-04-21T11:28:28.096 INFO:tasks.ceph.mds.g.burnupi18.stderr: 10: (clone()+0x6d) [0x7f2067bb538d]
2015-04-21T11:28:28.096 INFO:tasks.ceph.mds.g.burnupi18.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2015-04-21T11:28:28.096 INFO:tasks.ceph.mds.g.burnupi18.stderr:2015-04-21 11:28:28.061173 7f205f795700 -1 mds/StrayManager.cc: In function 'bool StrayManager::eval_stray(CDentry*, bool)' thread 7f205f795700 time 2015-04-21 11:28:27.842066
2015-04-21T11:28:28.096 INFO:tasks.ceph.mds.g.burnupi18.stderr:mds/StrayManager.cc: 538: FAILED assert(!dn->state_test(CDentry::STATE_PURGING))
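For anyone wanting to check other runs for the same MDS assertion, a rough sketch of pulling `FAILED assert` lines out of a teuthology log follows. The line layout (timestamp, level, source, message, one entry per line) is inferred only from the excerpt above, and the names `find_failed_asserts`, `LINE_RE`, `ASSERT_RE` and `SAMPLE` are made up for this illustration; this is not part of teuthology.

```python
import re

# Assumed layout, inferred from the excerpt above:
#   <ISO timestamp> <LEVEL>:<dotted source>:<message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<level>[A-Z]+):(?P<source>[\w.]+):(?P<msg>.*)$"
)

# Assumed layout of the assertion message: "<file>: <line>: FAILED assert(<expr>)"
ASSERT_RE = re.compile(
    r"(?P<file>\S+): (?P<line>\d+): FAILED assert\((?P<expr>.*)\)\s*$"
)


def find_failed_asserts(lines):
    """Return one dict per log line that reports a failed assertion."""
    hits = []
    for raw in lines:
        entry = LINE_RE.match(raw.strip())
        if entry is None:
            continue  # not a log line in the assumed layout
        failed = ASSERT_RE.search(entry.group("msg"))
        if failed is None:
            continue  # ordinary log line, no assertion
        hits.append({
            "ts": entry.group("ts"),
            "source": entry.group("source"),
            "file": failed.group("file"),
            "line": int(failed.group("line")),
            "expr": failed.group("expr"),
        })
    return hits


# Two lines copied from the log in this report.
SAMPLE = [
    "2015-04-21T11:28:28.094 INFO:tasks.ceph.mds.g.burnupi18.stderr:"
    "mds/StrayManager.cc: 538: FAILED assert(!dn->state_test(CDentry::STATE_PURGING))",
    "2015-04-21T11:28:28.094 INFO:tasks.ceph.mds.g.burnupi18.stderr:"
    " ceph version 0.94-912-g33bdae7 (33bdae7d62ddc1fd77b758e2dd2876a1353f5db6)",
]

if __name__ == "__main__":
    for hit in find_failed_asserts(SAMPLE):
        print(hit["ts"], hit["source"], hit["file"], hit["line"], hit["expr"])
```

The same function can be fed an open file object for a whole log, since it only iterates over lines. It only identifies the daemon-side assertion; it says nothing about whether the kernel client crash was caused by it.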