Bug #21070
Status: Closed
MDS: MDS is laggy or crashed when deleting a large number of files
Description
We used mdtest to create on the order of one million files in a ceph-fuse mounted directory, with the following command:
./mdtest -d /cephfs/test/ -z 1 -w 10k 1 -T -b 10 -L -n 1000000 -F -C
File creation completed normally, but when we deleted the files with rm -rf, the following error occurred:
services:
  mon: 2 daemons, quorum 192.9.9.221,192.9.9.223
  mgr: 192.9.9.221(active), standbys: 192.9.9.223
  mds: 1/1/1 up {0=192.9.9.223=up:active(laggy or crashed)}
  osd: 19 osds: 19 up, 19 in
The MDS log contains the following error:
2017-08-22 15:29:11.017122 7ffb752f8700  1 mds.0.28 recovery_done -- successful recovery!
2017-08-22 15:29:11.017247 7ffb752f8700  1 mds.0.28 active_start
2017-08-22 15:29:11.018963 7ffb752f8700  1 mds.0.28 cluster recovered.
2017-08-22 15:29:11.023689 7ffb6f2ec700 -1 *** Caught signal (Segmentation fault) **
 in thread 7ffb6f2ec700 thread_name:mds_rank_progr

 ceph version 12.1.0.3 (48ff484a1b75334f6a08c5b3dbdbb0abfa1cb2cf) luminous (dev)
 1: (()+0x58592f) [0x7ffb7e1d792f]
 2: (()+0xf130) [0x7ffb7bd25130]
 3: (Server::handle_client_readdir(boost::intrusive_ptr<MDRequestImpl>&)+0xbb9) [0x7ffb7df54989]
 4: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x9b1) [0x7ffb7df85a51]
 5: (MDSInternalContextBase::complete(int)+0x1eb) [0x7ffb7e162dcb]
 6: (MDSRank::_advance_queues()+0x4a5) [0x7ffb7df0d665]
 7: (MDSRank::ProgressThread::entry()+0x4a) [0x7ffb7df0daca]
 8: (()+0x7df3) [0x7ffb7bd1ddf3]
 9: (clone()+0x6d) [0x7ffb7ae033ed]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Note: version 12.1.0.3 is derived from version 12.1.0.
Files
Updated by huanwen ren over 6 years ago
I enabled debug logging to investigate the problem; the log shows the following:
2017-08-22 16:46:37.846888 7fbda3aae700 10 mds.0.server rdlock_path_pin_ref request(client.1483804:4138670 cr=0x7fbdbc093500) #10000000007
2017-08-22 16:46:37.846890 7fbda3aae700 10 mds.0.locker acquire_locks request(client.1483804:4138670 cr=0x7fbdbc093500) - done locking
2017-08-22 16:46:37.846893 7fbda3aae700 20 Session check_access path /test/#test-dir.0/mdtest_tree.0/mdtest_tree.5
2017-08-22 16:46:37.846894 7fbda3aae700 10 MDSAuthCap is_capable inode(path /test/#test-dir.0/mdtest_tree.0/mdtest_tree.5 owner 0:0 mode 040755) by caller 0:0 mask 1 new 0:0 cap: MDSAuthCaps[allow *]
2017-08-22 16:46:37.846896 7fbda3aae700 10 mds.0.server frag 111111* offset '' offset_hash 16776760 flags 1
2017-08-22 16:46:37.846898 7fbda3aae700 10 mds.0.server handle_client_readdir on [dir 10000000007.111111* /test/#test-dir.0/mdtest_tree.0/mdtest_tree.5/ [2,head] auth v=85827 cv=0/0 state=1073741826|complete f(v0 m2017-08-22 15:28:22.988204) n(v0 rc2017-08-22 15:28:22.988204) hs=0+389,ss=0+0 dirty=389 | child=1 waiter=0 authpin=0 0x7fbdbef91a80]
2017-08-22 16:46:37.846905 7fbda3aae700 10 mds.0.server snapid head
2017-08-22 16:46:37.848174 7fbda3aae700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fbda3aae700 thread_name:mds_rank_progr

 ceph version 12.1.0.3 (48ff484a1b75334f6a08c5b3dbdbb0abfa1cb2cf) luminous (dev)
 1: (()+0x58592f) [0x7fbdb299992f]
 2: (()+0xf130) [0x7fbdb04e7130]
 3: (Server::handle_client_readdir(boost::intrusive_ptr<MDRequestImpl>&)+0xbb9) [0x7fbdb2716989]
 4: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x9b1) [0x7fbdb2747a51]
 5: (MDSInternalContextBase::complete(int)+0x1eb) [0x7fbdb2924dcb]
 6: (MDSRank::_advance_queues()+0x4a5) [0x7fbdb26cf665]
 7: (MDSRank::ProgressThread::entry()+0x4a) [0x7fbdb26cfaca]
 8: (()+0x7df3) [0x7fbdb04dfdf3]
 9: (clone()+0x6d) [0x7fbdaf5c53ed]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
--- begin dump of recent events --- -10000> 2017-08-22 16:46:36.950581 7fbda9aba700 7 mds.0.cache dentry [dentry #100/stray0/100000bf8e
Updated by xie xingguo over 6 years ago
- Assignee set to Patrick Donnelly
- Priority changed from Normal to Urgent
Updated by huanwen ren over 6 years ago
I added some debug prints, and the crash can be reproduced:
1. The correct case (no crash):
2017-08-24 14:48:05.996554 7f8a0c3ca700 20 Session check_access path /test/#test-dir.0/mdtest_tree.0/mdtest_tree.7
2017-08-24 14:48:05.996556 7f8a0c3ca700 10 MDSAuthCap is_capable inode(path /test/#test-dir.0/mdtest_tree.0/mdtest_tree.7 owner 0:0 mode 040755) by caller 0:0 mask 1 new 0:0 cap: MDSAuthCaps[allow *]
2017-08-24 14:48:05.996561 7f8a0c3ca700 10 mds.0.server frag 111111* offset 'file.mdtest.0.730836' offset_hash 16514491 flags 1
2017-08-24 14:48:05.996583 7f8a0c3ca700 10 mds.0.server handle_client_readdir on [dir 0x10000000009.111111* /test/#test-dir.0/mdtest_tree.0/mdtest_tree.7/ [2,head] auth v=82019 cv=82019/82019 state=1073741826|complete f(v0 m2017-08-22 14:55:59.141856 1517=1517+0) n(v0 rc2017-08-22 14:55:59.141856 b15170 1517=1517+0) hs=1517+0,ss=0+0 | child=1 waiter=0 authpin=0 0x7f8a25bd4d80]
2017-08-24 14:48:05.996593 7f8a0c3ca700 10 mds.0.server snapid head
2017-08-24 14:48:05.996594 7f8a0c3ca700 10 mds.0.server test1
2017-08-24 14:48:05.996595 7f8a0c3ca700 10 mds.0.server test2
2017-08-24 14:48:05.996597 7f8a0c3ca700 10 mds.0.server test3
2017-08-24 14:48:05.996598 7f8a0c3ca700 10 mds.0.server test4 589758
2017-08-24 14:48:05.996599 7f8a0c3ca700 10 mds.0.server test5
2017-08-24 14:48:05.996600 7f8a0c3ca700 10 mds.0.server test6
2017-08-24 14:48:05.996601 7f8a0c3ca700 10 mds.0.server test6-1
2017-08-24 14:48:05.996602 7f8a0c3ca700 10 mds.0.server dn1-10x7f8a357dbdc0
2017-08-24 14:48:05.996603 7f8a0c3ca700 10 mds.0.server dn1-2 0x7f8a357dbdc0
2017-08-24 14:48:05.996604 7f8a0c3ca700 10 mds.0.server dn1-3[dentry #0x1/test/#test-dir.0/mdtest_tree.0/mdtest_tree.7/file.mdtest.0.775074 [2,head] auth (dversion lock) pv=0 v=82019 inode=0x7f8a3600d200 state=1073741824 0x7f8a357dbdc0]
2017-08-24 14:48:05.996608 7f8a0c3ca700 10 mds.0.server dn2->inode0x7f8a25ae3000
2017-08-24 14:48:05.996609 7f8a0c3ca700 10 mds.0.server dn3->linkage0x7f8a3600d200
2017-08-24 14:48:05.996610 7f8a0c3ca700 10 mds.0.server dn20x7f8a357dbdc0
2017-08-24 14:48:05.996611 7f8a0c3ca700 10 mds.0.server test6-2
2017-08-24 14:48:05.996612 7f8a0c3ca700 10 mds.0.server test6-3
2017-08-24 14:48:05.996613 7f8a0c3ca700 10 mds.0.server test6-4
2017-08-24 14:48:05.996614 7f8a0c3ca700 10 mds.0.server test6-5
2. The error case (crash):
2017-08-24 14:55:40.930126 7f8a0abc7700 10 mds.0.server frag 111* offset '' offset_hash 16777078 flags 1
2017-08-24 14:55:40.930130 7f8a0abc7700 10 mds.0.server handle_client_readdir on [dir 0x10000000009.111* /test/#test-dir.0/mdtest_tree.0/mdtest_tree.7/ [2,head] auth v=85713 cv=85407/85407 state=1610612738|complete f(v0 m2017-08-24 14:55:39.890505) n(v0 rc2017-08-24 14:55:39.890505) hs=0+153,ss=0+0 dirty=153 | child=1 frozen=0 dirty=1 waiter=0 authpin=0 0x7f8a25101740]
2017-08-24 14:55:40.930138 7f8a0abc7700 10 mds.0.server snapid head
2017-08-24 14:55:40.930152 7f8a0abc7700 10 mds.0.server test1
2017-08-24 14:55:40.930152 7f8a0abc7700 10 mds.0.server test2
2017-08-24 14:55:40.930156 7f8a0abc7700 10 mds.0.server test3
2017-08-24 14:55:40.930157 7f8a0abc7700 10 mds.0.server test4 589758
2017-08-24 14:55:40.930158 7f8a0abc7700 10 mds.0.server test5
2017-08-24 14:55:40.930163 7f8a0abc7700 10 mds.0.server test6
2017-08-24 14:55:40.930164 7f8a0abc7700 10 mds.0.server test6-1
2017-08-24 14:55:40.930165 7f8a0abc7700 10 mds.0.server dn1-10x600000000000099
2017-08-24 14:55:40.930166 7f8a0abc7700 10 mds.0.server dn1-2 0x600000000000099
2017-08-24 14:55:40.934081 7f8a0abc7700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f8a0abc7700 thread_name:fn_anonymous

 ceph version 12.1.2-589-gc8c837f (c8c837f665edc6e32efda5f15dedd8a48698214a) luminous (rc)
 1: (()+0x594dd1) [0x7f8a19ac1dd1]
 2: (()+0xf130) [0x7f8a17600130]
 3: (CDentry::make_path(filepath&, bool) const+0x7) [0x7f8a199ba4b7]
 4: (operator<<(std::ostream&, CDentry const&)+0x77) [0x7f8a199ba607]
 5: (Server::handle_client_readdir(boost::intrusive_ptr<MDRequestImpl>&)+0x110b) [0x7f8a1983af1b]
 6: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x9c1) [0x7f8a1986dc31]
 7: (MDSInternalContextBase::complete(int)+0x1eb) [0x7f8a19a4be6b]
 8: (void finish_contexts<MDSInternalContextBase>(CephContext*, std::list<MDSInternalContextBase*, std::allocator<MDSInternalContextBase*> >&, int)+0xac) [0x7f8a1980982c]
The "test" markers come from the debug prints I added, as follows:
dout(10) << "test5 " << dendl;
for (CDir::map_t::iterator it = start ? dir->begin() : dir->lower_bound(skip_key);
     !end && numfiles < max;
     end = (it == dir->end())) {
  dout(10) << "test6 " << dendl;
  CDentry *dn = it->second;
  ++it;
  dout(10) << "test6-1 " << dendl;
  if (!dn) {
    dout(10) << "test6-1111 " << dendl;
    continue;
  }
  dout(10) << "dn1-1" << dn << dendl;
  dout(10) << "dn1-2 " << dn << dendl;
  dout(10) << "dn1-3" << *dn << dendl;
  dout(10) << "dn2->inode" << dn->get_dir()->inode << dendl;
  dout(10) << "dn3->linkage" << dn->get_linkage()->get_inode() << dendl;
  dout(10) << "dn2" << dn << dendl;
  if (dn->state_test(CDentry::STATE_PURGING))
    continue;
Updated by huanwen ren over 6 years ago
The address of *dn (mds.0.server dn1-1 0x600000000000099) is out of range,
but I have not found the reason yet.
Updated by Zheng Yan over 6 years ago
- File readdir.patch readdir.patch added
Please try the attached patch.
Updated by Zheng Yan over 6 years ago
- Assignee changed from Patrick Donnelly to Zheng Yan
Updated by huanwen ren over 6 years ago
In addition, setting "offset_hash" to 0 when offset_str is empty also avoids this problem.
Modify the code as follows:
 __u32 offset_hash = 0;
 if (!offset_str.empty())
   offset_hash = ceph_frag_value(diri->hash_dentry_name(offset_str));
 else
-  offset_hash = (__u32)req->head.args.readdir.offset_hash;
+  offset_hash = 0;
Updated by Zheng Yan over 6 years ago
Does my patch alone not work? Your change would make seeky readdir on a directory inefficient.
Updated by huanwen ren over 6 years ago
Your patch solves the problem; I tested it twice today. :-)
My modification was only meant to verify that the problem is caused by the dirfrag offset; it is not a good solution.
I have a question: under what circumstances can offset_str be empty while offset_hash still carries offset data? Does it happen when a dirfrag is split at the end of the object?
thank you.
Updated by Zheng Yan over 6 years ago
For a seeky readdir on a large directory:
int fd = open("/mnt/ceph", O_RDONLY | O_DIRECTORY);
lseek(fd, xxxxxx, SEEK_SET);
getdents(fd, ...);
Updated by Zheng Yan over 6 years ago
- Status changed from New to Fix Under Review
- Backport set to luminous
Updated by Zheng Yan over 6 years ago
- Backport changed from luminous to luminous,jewel
Updated by Nathan Cutler over 6 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 6 years ago
- Copied to Backport #21357: luminous: mds: segfault during `rm -rf` of large directory added
Updated by Nathan Cutler over 6 years ago
- Copied to Backport #21450: jewel: MDS: MDS is laggy or crashed When deleting a large number of files added
Updated by Zheng Yan about 6 years ago
- Backport changed from luminous,jewel to luminous
Updated by Nathan Cutler about 6 years ago
- Copied to deleted (Backport #21450: jewel: MDS: MDS is laggy or crashed When deleting a large number of files)
Updated by Zheng Yan about 6 years ago
- Status changed from Pending Backport to Resolved