Bug #17606

multimds: assertion failure during directory migration

Added by Patrick Donnelly almost 3 years ago. Updated 6 months ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version:
Start date: 10/18/2016
Due date:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:

Description

This is from an experiment on Linode with 9 active MDSs, 32 OSDs, and 128 clients building the kernel.

2016-10-18 06:35:57.090410 7ff5c1a6c700  0 mds.1.bal reexporting [dir 1000000138d /tmp.fixVZA/ [2,head] auth{0=1,5=1,8=1} v=393 cv=393/393 dir_auth=1 state=1073741826|complete f(v1 m2016-10-18 06:09:45.949993 2=1+1) n(v45 rc2016-10-18 06:24:57.106327 b623479704 51788=48644+3144) hs=2+0,ss=0+0 | child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=0 authpin=0 0x562e4ebe0c48] pop 0 back to mds.0
2016-10-18 06:35:57.279925 7ff5c1a6c700 -1 /srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.2/src/mds/Migrator.cc: In function 'void Migrator::handle_export_ack(MExportDirAck*)' thread 7ff5c1a6c700 time 2016-10-18 06:35:57.271799
/srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.2/src/mds/Migrator.cc: 1557: FAILED assert(dir->is_frozen_tree_root())

 ceph version v11.0.2-358-g29119aa (29119aaff3fac7e14fd6e0afd31d6bfd6a58098a)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x562e444738d5]
 2: (Migrator::handle_export_ack(MExportDirAck*)+0xa11) [0x562e442be281]
 3: (Migrator::dispatch(Message*)+0xe5) [0x562e442cd5a5] 
 4: (MDSRank::handle_deferrable_message(Message*)+0x63b) [0x562e441219eb]
 5: (MDSRank::_dispatch(Message*, bool)+0x207) [0x562e4412bc37]
 6: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x562e4412cd95]
 7: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x562e441199b3]
 8: (DispatchQueue::entry()+0x7ba) [0x562e446373fa]
 9: (DispatchQueue::DispatchThread::entry()+0xd) [0x562e444ea9fd]
 10: (()+0x7df3) [0x7ff5c898adf3]
 11: (clone()+0x6d) [0x7ff5c7a7701d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Copy of log here: /ceph/cephfs-perf/tmp/2016-10-18/ceph-mds.ceph-mds0.log

History

#1 Updated by Zheng Yan almost 3 years ago

I can't access /ceph/cephfs-perf/tmp/2016-10-18/ceph-mds.ceph-mds0.log. Please change the permissions.

#2 Updated by Patrick Donnelly almost 3 years ago

Fixed!

#3 Updated by Zheng Yan almost 3 years ago

  • Status changed from New to Need More Info

The debug level of the log is too low. This is a dup of a long-standing bug, http://tracker.ceph.com/issues/8405. I can't figure out what happened.

#4 Updated by John Spray almost 3 years ago

  • Priority changed from Normal to High
  • Target version set to v12.0.0

#5 Updated by Zheng Yan almost 3 years ago

  • Status changed from Need More Info to Testing

#6 Updated by Zheng Yan over 2 years ago

  • Status changed from Testing to Resolved

#7 Updated by Patrick Donnelly 6 months ago

  • Category deleted (90)
  • Labels (FS) multimds added
