Bug #17069
[closed] multimds: slave rmdir assertion failure
0%
Description
Assertion: /srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-11.0.0-1382-g253f285/src/mds/Server.cc: 5784: FAILED assert(straydn->first >= in->first)
ceph version v11.0.0-1382-g253f285 (253f28556c8dead17806deeb49917246bdbed8ea)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x91d3fb]
2: (Server::handle_slave_rmdir_prep(std::shared_ptr<MDRequestImpl>&)+0x11d6) [0x6559d6]
3: (Server::dispatch_slave_request(std::shared_ptr<MDRequestImpl>&)+0x71b) [0x666c9b]
4: (Server::handle_slave_request(MMDSSlaveRequest*)+0x8fc) [0x67008c]
5: (Server::dispatch(Message*)+0x69b) [0x670deb]
6: (MDSRank::handle_deferrable_message(Message*)+0x80c) [0x5efdac]
7: (MDSRank::_dispatch(Message*, bool)+0x1e1) [0x5f9be1]
8: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x5fad35]
9: (MDSDaemon::ms_dispatch(Message*)+0xc3) [0x5e8503]
10: (DispatchQueue::entry()+0x78b) [0xab84db]
11: (DispatchQueue::DispatchThread::entry()+0xd) [0x97dded]
12: (()+0x8184) [0x7f365a3f3184]
13: (clone()+0x6d) [0x7f3658d6037d]
In first run:
http://pulpito.ceph.com/pdonnell-2016-08-10_03:03:20-multimds-master---basic-mira/357018/
http://pulpito.ceph.com/pdonnell-2016-08-10_03:03:20-multimds-master---basic-mira/357232/
and in another run:
http://pulpito.ceph.com/pdonnell-2016-08-11_20:40:52-multimds-master-testing-basic-mira/358681/
http://pulpito.ceph.com/pdonnell-2016-08-11_20:40:52-multimds-master-testing-basic-mira/358719/
http://pulpito.ceph.com/pdonnell-2016-08-11_20:40:52-multimds-master-testing-basic-mira/358910/
Here is an excerpt from one of the MDS logs:
2016-08-12 20:20:23.165872 7f1918db9700 20 mds.1.cache.dir(1000000000d) lookup (head, 'copyofsnap1')
2016-08-12 20:20:23.165875 7f1918db9700 20 mds.1.cache.dir(1000000000d) hit -> (copyofsnap1,head)
2016-08-12 20:20:23.165878 7f1918db9700 10 mds.1.cache path_traverse finish on snapid head
2016-08-12 20:20:23.165880 7f1918db9700 10 mds.1.server dn [dentry #1/client.0/tmp/copyofsnap1 [d6,head] rep@2,-2.1 (dn lock) (dversion lock) v=15128 inode=0xd41af90 | request=0 lock=0 inodepin=1 dirty=0 authpin=0 tempexporting=0 clientlease=0 0xd133f40]
2016-08-12 20:20:23.165893 7f1918db9700 10 mds.1.server straydn [dentry #102/stray5/2000000086e [2,head] rep@2,-2.1 NULL (dn lock) (dversion lock) v=0 inode=0 | request=1 0xa77bb60]
2016-08-12 20:20:23.165904 7f1918db9700 20 mds.1.server rollback is 77 bytes
2016-08-12 20:20:23.165907 7f1918db9700 10 mds.1.server no auth subtree in [inode 2000000086e [...da,head] /client.0/tmp/copyofsnap1/ rep@2.1 v33032 f(v1 m2016-08-12 20:20:21.897837) n(v7 rc2016-08-12 20:20:18.090778 b20298 437=398+39)/n(v1 rc2016-08-12 20:19:02.350605 1034=946+88) (ilink lock) (inest lock) (iversion lock) caps={4131=pAsXs/p@9} | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=1 exportingcaps=0 dirtyparent=0 dirty=0 waiter=0 authpin=0 tempexporting=0 0xd41af90], skipping journal
2016-08-12 20:20:23.165933 7f1918db9700 12 mds.1.cache.dir(1000000000d) unlink_inode [dentry #1/client.0/tmp/copyofsnap1 [d6,head] rep@2,-2.1 (dn lock) (dversion lock) v=15128 inode=0xd41af90 | request=1 lock=0 inodepin=1 dirty=0 authpin=0 tempexporting=0 clientlease=0 0xd133f40] [inode 2000000086e [...da,head] /client.0/tmp/copyofsnap1/ rep@2.1 v33032 f(v1 m2016-08-12 20:20:21.897837) n(v7 rc2016-08-12 20:20:18.090778 b20298 437=398+39)/n(v1 rc2016-08-12 20:19:02.350605 1034=946+88) (ilink lock) (inest lock) (iversion lock) caps={4131=pAsXs/p@9} | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=1 exportingcaps=0 dirtyparent=0 dirty=0 waiter=0 authpin=0 tempexporting=0 0xd41af90]
2016-08-12 20:20:23.165963 7f1918db9700 12 mds.1.cache.dir(619) link_primary_inode [dentry #102/stray5/2000000086e [2,head] rep@2,-2.1 NULL (dn lock) (dversion lock) v=0 inode=0 | request=1 0xa77bb60] [inode 2000000086e [...da,head] #2000000086e/ rep@-2.1 v33032 f(v1 m2016-08-12 20:20:21.897837) n(v7 rc2016-08-12 20:20:18.090778 b20298 437=398+39)/n(v1 rc2016-08-12 20:19:02.350605 1034=946+88) (ilink lock) (inest lock) (iversion lock) caps={4131=pAsXs/p@9} | dirtyscattered=0 request=0 lock=0 dirfrag=1 caps=1 exportingcaps=0 dirtyparent=0 dirty=0 waiter=0 authpin=0 tempexporting=0 0xd41af90]
2016-08-12 20:20:23.169559 7f1918db9700 -1 /srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.0/src/mds/Server.cc: In function 'void Server::handle_slave_rmdir_prep(MDRequestRef&)' thread 7f1918db9700 time 2016-08-12 20:20:23.166004
/srv/autobuild-ceph/gitbuilder.git/build/rpmbuild/BUILD/ceph-11.0.0/src/mds/Server.cc: 5784: FAILED assert(straydn->first >= in->first)
ceph version v11.0.0-1464-gec24ff0 (ec24ff0ceeaa735423bb113a4e522bb543e1bbcc)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x9339b5]
2: (Server::handle_slave_rmdir_prep(std::shared_ptr<MDRequestImpl>&)+0x11e9) [0x6511b9]
3: (Server::dispatch_slave_request(std::shared_ptr<MDRequestImpl>&)+0x70b) [0x66966b]
4: (Server::handle_slave_request(MMDSSlaveRequest*)+0x924) [0x672ce4]
5: (Server::dispatch(Message*)+0x6db) [0x673a8b]
6: (MDSRank::handle_deferrable_message(Message*)+0x82c) [0x5ef3fc]
7: (MDSRank::_dispatch(Message*, bool)+0x207) [0x5f96d7]
8: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x5fa835]
9: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x5e7583]
10: (DispatchQueue::entry()+0x78a) [0xadbafa]
11: (DispatchQueue::DispatchThread::entry()+0xd) [0x99820d]
12: (()+0x7dc5) [0x7f191eaccdc5]
13: (clone()+0x6d) [0x7f191dbb821d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
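For anyone reading the excerpt: the failed assert compares the `first` snapid of the stray dentry with that of the inode being linked into it. The sketch below pulls those two values out of the log text and evaluates the same comparison. It is only a reading aid, not Ceph code. It assumes that the bracketed range after each name is `[first,last]`, that snapids print in hexadecimal, and that a leading `...` can be skipped. The helper `first_snapid` and the regex are made up for illustration.

```python
import re

# Assumed format: "[first,last]" after a dentry/inode name, e.g. "[2,head]"
# or "[...da,head]". The optional "..." prefix is skipped.
RANGE_RE = re.compile(r"\[(?:\.\.\.)?([0-9a-f]+),(head|[0-9a-f]+)\]")


def first_snapid(log_fragment: str) -> int:
    """Return the 'first' value of the first range found, parsed as hex (assumption)."""
    match = RANGE_RE.search(log_fragment)
    if match is None:
        raise ValueError("no snapid range found")
    return int(match.group(1), 16)


# Fragments copied from the straydn and inode entries in the excerpt above.
straydn = "[dentry #102/stray5/2000000086e [2,head] rep@2,-2.1 NULL"
inode = "[inode 2000000086e [...da,head] /client.0/tmp/copyofsnap1/"

straydn_first = first_snapid(straydn)
in_first = first_snapid(inode)

# Same comparison as assert(straydn->first >= in->first) in Server.cc.
invariant_holds = straydn_first >= in_first
print(straydn_first, in_first, invariant_holds)
```

Under those assumptions the stray dentry starts at snapid 2 while the inode starts at 0xda, so the comparison is false, which matches the assertion failure in the log.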
Updated by Zheng Yan over 7 years ago
Strange. Have you ever used snapshots on the testing cluster?
Updated by Zheng Yan over 7 years ago
- Status changed from New to 12
Please don't run snapshot tests on multimds; they are known to be broken.
Updated by Zheng Yan over 7 years ago
- Priority changed from High to Low
Snapshot bug; lowering priority.
Updated by John Spray almost 7 years ago
- Status changed from 12 to Closed
Closing because currently we know that snapshots+multimds is broken.
Updated by Patrick Donnelly about 5 years ago
- Category deleted (90)
- Labels (FS) multimds added