Bug #12355

closed

MDS assertion during shutdown (MDLog !capped), in TestStrays.test_migration_on_shutdown

Added by John Spray almost 9 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/teuthology-2015-07-13_23:04:03-fs-master---basic-multi/972783

mds.a crashed with a failed assertion in MDLog::_submit_entry. Log tail and backtrace:

   -20> 2015-07-15 17:00:39.763388 7fb131cc7700 20 mds.1.cache.ino(101) * dirty_old_rstat {}
   -19> 2015-07-15 17:00:39.763390 7fb131cc7700 10 mds.1.cache project_rstat_frag_to_inode [2,head]
   -18> 2015-07-15 17:00:39.763392 7fb131cc7700 20 mds.1.cache   frag           rstat n(v1 rc2015-07-15 16:59:22.180365 10=0+10)
   -17> 2015-07-15 17:00:39.763395 7fb131cc7700 20 mds.1.cache   frag accounted_rstat n(v1 rc2015-07-15 16:59:22.180365 b8388608 11=1+10)
   -16> 2015-07-15 17:00:39.763399 7fb131cc7700 20 mds.1.cache                  delta n(v1 rc2015-07-15 16:59:22.180365 b-8388608 -1=-1+0)
   -15> 2015-07-15 17:00:39.763404 7fb131cc7700 20 mds.1.cache  projecting to [2,head] n(v2 rc2015-07-15 16:59:22.180365 b8388608 12=1+11)
   -14> 2015-07-15 17:00:39.763408 7fb131cc7700 20 mds.1.cache         result [2,head] n(v2 rc2015-07-15 16:59:22.180365 11=0+11)
   -13> 2015-07-15 17:00:39.763412 7fb131cc7700 10 mds.1.cache.dir(101) check_rstats bailing out -- incomplete or non-auth or frozen dir!
   -12> 2015-07-15 17:00:39.763414 7fb131cc7700 10 mds.1.cache.dir(101) check_rstats bailing out -- incomplete or non-auth or frozen dir!
   -11> 2015-07-15 17:00:39.763426 7fb131cc7700 10 mds.1.cache.ino(101) * updated accounted_rstat n(v2 rc2015-07-15 16:59:22.180365 10=0+10) on [dir 101 ~mds1/ [2,head] auth pv=8 v=7 cv=0/0 dir_auth=-2 state=1610743808 f(v0 10=0+10)->f(v0 10=0+10) n(v1 rc2015-07-15 16:59:22.180365 b8388608 11=1+10)->n(v2 rc2015-07-15 16:59:22.180365 10=0+10) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=0 subtreetemp=0 replicated=0 dirty=1 waiter=0 authpin=0 0x4062000]
   -10> 2015-07-15 17:00:39.763445 7fb131cc7700 20 mds.1.cache.ino(101)  final rstat n(v2 rc2015-07-15 16:59:22.180365 11=0+11)
    -9> 2015-07-15 17:00:39.763467 7fb131cc7700 20 mds.1.cache.ino(101) encode_snap_blob snaprealm(101 seq 1 lc 0 cr 0 cps 1 snaps={} 0x4041b40)
    -8> 2015-07-15 17:00:39.763474 7fb131cc7700 10 mds.1.cache.ino(101) finish_scatter_gather_update_accounted 1024 on [inode 101 [...2,head] ~mds1/ auth v2 pv3 ap=1+0 snaprealm=0x4041b40 dirtyparent f(v0 10=0+10)->f(v0 10=0+10) n(v1 rc2015-07-15 16:59:22.180365 b8388608 12=1+11)/n(v0 11=0+11)->n(v2 rc2015-07-15 16:59:22.180365 11=0+11)/n(v0 11=0+11) (inest sync->lock w=1 flushing) (iversion lock) | dirtyscattered=1 lock=1 dirfrag=1 dirtyparent=1 replicated=0 dirty=1 authpin=1 0x4052000]
    -7> 2015-07-15 17:00:39.763495 7fb131cc7700 10 mds.1.cache.ino(101)  journaling updated frag accounted_ on [dir 101 ~mds1/ [2,head] auth pv=8 v=7 cv=0/0 dir_auth=-2 state=1610743808 f(v0 10=0+10)->f(v0 10=0+10) n(v1 rc2015-07-15 16:59:22.180365 b8388608 11=1+10)->n(v2 rc2015-07-15 16:59:22.180365 10=0+10) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=0 subtreetemp=0 replicated=0 dirty=1 waiter=0 authpin=0 0x4062000]
    -6> 2015-07-15 17:00:39.763510 7fb131cc7700 10 mds.1.cache.dir(101) pre_dirty 9
    -5> 2015-07-15 17:00:39.763526 7fb131cc7700 10 mds.1.cache.dir(101) auth_pin by 0x40fda00 on [dir 101 ~mds1/ [2,head] auth pv=9 v=7 cv=0/0 dir_auth=-2 ap=1+0+0 state=1610743808 f(v0 10=0+10)->f(v0 10=0+10) n(v1 rc2015-07-15 16:59:22.180365 b8388608 11=1+10)->n(v2 rc2015-07-15 16:59:22.180365 10=0+10) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=0 subtreetemp=0 replicated=0 dirty=1 waiter=0 authpin=1 0x4062000] count now 1 + 0
    -4> 2015-07-15 17:00:39.763544 7fb131cc7700 10 mds.1.cache.dir(101) assimilate_dirty_rstat_inodes_finish
    -3> 2015-07-15 17:00:39.763547 7fb131cc7700 10 mds.1.cache.ino(60a) auth_pin by 0x40fda00 on [inode 60a [...2,head] ~mds1/stray0/ auth v6 pv8 ap=1+0 dirtyparent f(v0 m2015-07-15 16:59:22.180365)->f(v0 m2015-07-15 16:59:22.180365) n(v0 rc2015-07-15 16:59:22.180365 1=0+1)/n(v0 rc2015-07-15 16:59:22.180365 b8388608 2=1+1)->n(v0 rc2015-07-15 16:59:22.180365 1=0+1) (inest mix) (iversion lock) | lock=0 dirfrag=1 stickydirs=0 stray=0 dirtyrstat=1 dirtyparent=1 replicated=0 dirty=1 waiter=0 authpin=1 0x40557a0] now 1+0
    -2> 2015-07-15 17:00:39.763573 7fb131cc7700 15 mds.1.cache.dir(101) adjust_nested_auth_pins 1/1 on [dir 101 ~mds1/ [2,head] auth pv=9 v=7 cv=0/0 dir_auth=-2 ap=1+1+1 state=1610612736 f(v0 10=0+10)->f(v0 10=0+10) n(v1 rc2015-07-15 16:59:22.180365 b8388608 11=1+10)->n(v2 rc2015-07-15 16:59:22.180365 10=0+10) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=0 subtreetemp=0 replicated=0 dirty=1 waiter=0 authpin=1 0x4062000] by 0x40557a0 count now 1 + 1
    -1> 2015-07-15 17:00:39.763590 7fb131cc7700 10 mds.1.cache.ino(60a) clear_dirty_rstat
     0> 2015-07-15 17:00:39.767176 7fb131cc7700 -1 mds/MDLog.cc: In function 'void MDLog::_submit_entry(LogEvent*, MDSInternalContextBase*)' thread 7fb131cc7700 time 2015-07-15 17:00:39.763606
mds/MDLog.cc: 260: FAILED assert(!capped)

 ceph version 9.0.1-1475-g970195e (970195e86a01503921f248d469a73f3611747197)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0x98cbaf]
 2: (MDLog::_submit_entry(LogEvent*, MDSInternalContextBase*)+0x4fb) [0x80a40b]
 3: (Locker::scatter_writebehind(ScatterLock*)+0x7c9) [0x71a099]
 4: (Locker::simple_lock(SimpleLock*, bool*)+0x34d) [0x71f15d]
 5: (Locker::scatter_nudge(ScatterLock*, MDSInternalContextBase*, bool)+0x728) [0x722638]
 6: (Locker::scatter_tick()+0x33b) [0x722feb]
 7: (Locker::tick()+0x9) [0x723249]
 8: (MDS::tick()+0x33c) [0x5b84fc]
 9: (MDSInternalContextBase::complete(int)+0x1db) [0x7fa2cb]
 10: (SafeTimer::timer_thread()+0x3e5) [0x97e0f5]
 11: (SafeTimerThread::entry()+0xd) [0x97ec8d]
 12: (()+0x7e9a) [0x7fb139e98e9a]
 13: (clone()+0x6d) [0x7fb13885e8bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
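
The backtrace suggests an ordering problem: a timer-driven tick (MDS::tick -> Locker::scatter_tick -> Locker::scatter_writebehind) tried to submit a journal entry after the log had already been capped for shutdown. The toy model below is only a sketch of that pattern, not Ceph code. MiniLog, MiniLocker, tick_unguarded and tick_guarded are hypothetical names invented for illustration, and the guard shown is one possible way to avoid the failure, not necessarily the fix that was applied for this ticket.

```python
class MiniLog:
    """Toy journal (hypothetical): refuses new entries once capped."""

    def __init__(self):
        self.capped = False
        self.entries = []

    def cap(self):
        # Models the point during shutdown after which no entries are expected.
        self.capped = True

    def submit_entry(self, event):
        # Stands in for the failed check reported at mds/MDLog.cc:260.
        if self.capped:
            raise RuntimeError("FAILED assert(!capped)")
        self.entries.append(event)


class MiniLocker:
    """Toy periodic worker (hypothetical) that journals dirty scatter locks."""

    def __init__(self, log, dirty_locks):
        self.log = log
        self.dirty_locks = list(dirty_locks)

    def tick_unguarded(self):
        # Journals every dirty lock without checking the log state,
        # which is the path that fails if the log is already capped.
        while self.dirty_locks:
            self.log.submit_entry("writebehind " + self.dirty_locks[0])
            self.dirty_locks.pop(0)

    def tick_guarded(self):
        # Assumed guard: skip journaling entirely once the log is capped.
        if self.log.capped:
            return 0
        flushed = len(self.dirty_locks)
        self.tick_unguarded()
        return flushed


def demo():
    log = MiniLog()
    locker = MiniLocker(log, ["inest(101)"])
    log.cap()                          # shutdown caps the log first
    skipped = locker.tick_guarded()    # guarded tick does nothing
    try:
        locker.tick_unguarded()        # unguarded tick hits the check
    except RuntimeError as err:
        return skipped, str(err)
    return skipped, None
```

In this sketch, calling demo() shows the guarded tick skipping the write while the unguarded tick trips the capped check, which mirrors the order of events in the backtrace.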
#1

Updated by Greg Farnum almost 9 years ago

  • Priority changed from Normal to Urgent
#3

Updated by John Spray almost 9 years ago

  • Assignee set to John Spray
#4

Updated by John Spray almost 9 years ago

  • Status changed from New to In Progress

Reproduced this locally. Fixing...

#5

Updated by John Spray almost 9 years ago

  • Status changed from In Progress to Fix Under Review
#6

Updated by John Spray almost 9 years ago

  • Status changed from Fix Under Review to Resolved
#7

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added