Bug #7565
Failed assert in check_rstats
Description
This is odd: it happens very reproducibly and is not unique to the tip of master, but apparently isn't happening in the nightlies. So handle with caution; I am suspicious of my environment. Still, here's a ticket so that I don't forget it.
2014-02-28 00:34:37,850.850 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: mds/CDir.cc: In function 'bool CDir::check_rstats()' thread 7fcd74125700 time 2014-02-27 16:34:36.278405
2014-02-28 00:34:37,850.850 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: mds/CDir.cc: 234: FAILED assert(!g_conf->mds_debug_scatterstat || (get_num_head_items() == (fnode.fragstat.nfiles + fnode.fragstat.nsubdirs)))
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: ceph version 0.77-614-gfc33eae (fc33eaed0d21c5659ceea3fa5a081f4f53ea91e2)
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 1: (CDir::check_rstats()+0x1426) [0x748c46]
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 2: (CInode::finish_scatter_gather_update(int)+0x1a29) [0x7783b9]
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 3: (Locker::scatter_writebehind(ScatterLock*)+0x4c2) [0x6ead22]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 4: (Locker::simple_sync(SimpleLock*, bool*)+0x47a) [0x6eb9da]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 5: (Locker::scatter_nudge(ScatterLock*, Context*, bool)+0x7a4) [0x6ed0c4]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 6: (Locker::scatter_tick()+0x336) [0x6ed7e6]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 7: (MDS::tick()+0x32c) [0x5872cc]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 8: (Context::complete(int)+0x9) [0x57ee59]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 9: (SafeTimer::timer_thread()+0x425) [0x8ddec5]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 10: (SafeTimerThread::entry()+0xd) [0x8deafd]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 11: (()+0x7e9a) [0x7fcd79b2ae9a]
2014-02-28 00:34:37,853.853 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: 12: (clone()+0x6d) [0x7fcd7853c3fd]
2014-02-28 00:34:37,853.853 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
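For readers unfamiliar with the failed assertion: it compares the number of head entries in a directory fragment with the file and subdirectory counts recorded in that fragment's fragstat, and only when mds_debug_scatterstat is enabled. Below is a minimal sketch of the shape of that check, assuming simplified stand-in types. `FragStat`, `DirFrag`, and `check_fragstat` are hypothetical names used for illustration only; they are not Ceph's actual classes or the real body of CDir::check_rstats().

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; NOT the real Ceph MDS types.
struct FragStat {
    int64_t nfiles = 0;    // files recorded for this fragment
    int64_t nsubdirs = 0;  // subdirectories recorded for this fragment
};

struct DirFrag {
    std::vector<std::string> head_items;  // entries actually present
    FragStat fragstat;                    // recorded accounting
};

// Mirrors the shape of the assert condition in the log above:
//   !debug_scatterstat || (num_head_items == nfiles + nsubdirs)
// With the debug option off, the check is skipped and always passes.
bool check_fragstat(const DirFrag& dir, bool debug_scatterstat) {
    if (!debug_scatterstat)
        return true;
    return static_cast<int64_t>(dir.head_items.size()) ==
           dir.fragstat.nfiles + dir.fragstat.nsubdirs;
}
```

In this toy model, an empty fragment whose recorded nsubdirs is 1 fails the check when the debug option is on and passes when it is off. That matches the condition in the assert, though whether it is the exact state that triggered this ticket is an assumption.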
MDS logs and a teuthology config attached.
Associated revisions
mds: fix empty fs rstat
In 81bcf43080a7be8a48aa13b88287cbfac0e01e3e we removed the .ceph directory
but did not adjust the rsubdirs back to 0.
Fixes: #7565
Signed-off-by: Sage Weil <sage@inktank.com>
History
#1 Updated by Zheng Yan about 10 years ago
- Priority changed from Normal to Low
It's a bug in CDir::check_rstats(), not rstat corruption.
#2 Updated by Greg Farnum about 10 years ago
- Priority changed from Low to Normal
What's the bug with check_rstats? Is num_head_items just not expected to be valid at this stage of replay?
Either way, this isn't a low-priority bug since it's causing an assert during replay.
#3 Updated by Sage Weil about 10 years ago
- Priority changed from Normal to High
- Source changed from other to Development
#4 Updated by Sage Weil about 10 years ago
- Status changed from New to 12
ubuntu@teuthology:/a/teuthology-2014-03-12_04:55:11-multimds-master-testing-basic-plana/127691
failed on trivial_sync, probably as concise a reproduction as we can hope for.
#5 Updated by Sage Weil about 10 years ago
- Status changed from 12 to Resolved
#6 Updated by Greg Farnum over 7 years ago
- Component(FS) MDS added