Bug #7565

Failed assert in check_rstats

Added by John Spray about 10 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is odd: it happens very reproducibly and is not unique to the tip of master, but it apparently isn't happening in the nightlies. Handle with caution, since I am suspicious of my environment. Still, here's a ticket so that I don't forget it.

2014-02-28 00:34:37,850.850 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: mds/CDir.cc: In function 'bool CDir::check_rstats()' thread 7fcd74125700 time 2014-02-27 16:34:36.278405
2014-02-28 00:34:37,850.850 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]: mds/CDir.cc: 234: FAILED assert(!g_conf->mds_debug_scatterstat || (get_num_head_items() == (fnode.fragstat.nfiles + fnode.fragstat.nsubdirs)))
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  ceph version 0.77-614-gfc33eae (fc33eaed0d21c5659ceea3fa5a081f4f53ea91e2)
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  1: (CDir::check_rstats()+0x1426) [0x748c46]
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  2: (CInode::finish_scatter_gather_update(int)+0x1a29) [0x7783b9]
2014-02-28 00:34:37,851.851 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  3: (Locker::scatter_writebehind(ScatterLock*)+0x4c2) [0x6ead22]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  4: (Locker::simple_sync(SimpleLock*, bool*)+0x47a) [0x6eb9da]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  5: (Locker::scatter_nudge(ScatterLock*, Context*, bool)+0x7a4) [0x6ed0c4]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  6: (Locker::scatter_tick()+0x336) [0x6ed7e6]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  7: (MDS::tick()+0x32c) [0x5872cc]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  8: (Context::complete(int)+0x9) [0x57ee59]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  9: (SafeTimer::timer_thread()+0x425) [0x8ddec5]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  10: (SafeTimerThread::entry()+0xd) [0x8deafd]
2014-02-28 00:34:37,852.852 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  11: (()+0x7e9a) [0x7fcd79b2ae9a]
2014-02-28 00:34:37,853.853 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  12: (clone()+0x6d) [0x7fcd7853c3fd]
2014-02-28 00:34:37,853.853 INFO:teuthology.task.ceph.mds.a.err:[10.214.132.128]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

MDS logs and a teuthology config attached.
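
For readers unfamiliar with the check, the condition in the failed assertion can be sketched in isolation. This is a minimal illustration, not the actual Ceph code: the `FragStat` and `DirFrag` structs and the `fragstat_consistent` function are hypothetical stand-ins, and only the field names `nfiles` and `nsubdirs`, the head-item count, and the `mds_debug_scatterstat` gating are taken from the assertion text above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for fnode.fragstat; not the real Ceph type.
struct FragStat {
  int64_t nfiles = 0;    // files accounted for in this directory fragment
  int64_t nsubdirs = 0;  // subdirectories accounted for in this fragment
};

// Hypothetical stand-in for a directory fragment (CDir in the backtrace).
struct DirFrag {
  FragStat fragstat;
  int64_t num_head_items = 0;  // stands in for get_num_head_items()
};

// Same shape as the failed assertion: when the debug option is off the
// check passes unconditionally; when it is on, the number of head items
// must equal nfiles + nsubdirs.
bool fragstat_consistent(const DirFrag& dir, bool debug_scatterstat) {
  if (!debug_scatterstat) {
    return true;
  }
  return dir.num_head_items ==
         dir.fragstat.nfiles + dir.fragstat.nsubdirs;
}
```

Under these assumptions, a fragment with no entries but a leftover `nsubdirs` of 1 fails the check only when the debug flag is set, which would be consistent with the assert firing in one environment and not in another.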

ceph-mds.a.log.gz (51.7 KB) John Spray, 02/27/2014 04:46 PM

ceph-mds.b-s-a.log.gz (136 KB) John Spray, 02/27/2014 04:46 PM

repro.yaml - Teuthology 3-host config that triggers this (442 Bytes) John Spray, 02/27/2014 04:46 PM

Associated revisions

Revision a72b636b (diff)
Added by Sage Weil about 10 years ago

mds: fix empty fs rstat

In 81bcf43080a7be8a48aa13b88287cbfac0e01e3e we removed the .ceph directory
but did not adjust the rsubdirs back to 0.

Fixes: #7565
Signed-off-by: Sage Weil <>

History

#1 Updated by Zheng Yan about 10 years ago

  • Priority changed from Normal to Low

It's a bug in CDir::check_rstats(), not rstat corruption.

#2 Updated by Greg Farnum about 10 years ago

  • Priority changed from Low to Normal

What's the bug with check_rstats? Is num_head_items just not expected to be valid at this stage of replay?

Either way, this isn't a low-priority bug since it's causing an assert during replay.

#3 Updated by Sage Weil about 10 years ago

  • Priority changed from Normal to High
  • Source changed from other to Development

#4 Updated by Sage Weil about 10 years ago

  • Status changed from New to 12

ubuntu@teuthology:/a/teuthology-2014-03-12_04:55:11-multimds-master-testing-basic-plana/127691

Failed on trivial_sync, which is probably as concise a reproduction as we can hope for.

#5 Updated by Sage Weil about 10 years ago

  • Status changed from 12 to Resolved

#6 Updated by Greg Farnum over 7 years ago

  • Component(FS) MDS added
