Bug #21070

closed

MDS: MDS is laggy or crashed when deleting a large number of files

Added by huanwen ren over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We use mdtest to create about 1,000,000 files in a directory on a ceph-fuse mount. The command is as follows:

./mdtest -d /cephfs/test/ -z 1 -w 10k 1 -T -b 10 -L -n 1000000 -F -C

File creation completes normally, but when we delete the files with rm -rf, the MDS is reported as laggy or crashed, as shown in the cluster status:
services:
    mon: 2 daemons, quorum 192.9.9.221,192.9.9.223
    mgr: 192.9.9.221(active), standbys: 192.9.9.223
    mds: 1/1/1 up {0=192.9.9.223=up:active(laggy or crashed)}
    osd: 19 osds: 19 up, 19 in
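
For readers without mdtest at hand, the create-then-delete sequence described above can be approximated with a short Python sketch. This is not the reporter's procedure: the function names, file naming scheme, and scratch directory are hypothetical, and `shutil.rmtree` only roughly stands in for `rm -rf` (both list the directory and then unlink its entries). Point it at a real CephFS mount only on a test cluster.

```python
import os
import shutil
import tempfile


def create_files(directory, count, size=0):
    """Create `count` files of `size` bytes each in a single flat directory."""
    os.makedirs(directory, exist_ok=True)
    payload = b"\0" * size
    for i in range(count):
        # Hypothetical naming scheme; mdtest uses its own file names.
        with open(os.path.join(directory, f"file.{i}"), "wb") as fh:
            fh.write(payload)


def count_entries(directory):
    """Count entries by iterating os.scandir, i.e. a readdir-style listing."""
    with os.scandir(directory) as entries:
        return sum(1 for _ in entries)


def remove_tree(directory):
    """Recursively delete the directory, roughly as `rm -rf` would."""
    shutil.rmtree(directory)


if __name__ == "__main__":
    # Assumption: a local scratch directory, not the /cephfs/test/ path from the report.
    root = tempfile.mkdtemp(prefix="mdtest-sketch-")
    target = os.path.join(root, "test")
    create_files(target, count=1000)  # the report used 1000000
    print("entries created:", count_entries(target))
    remove_tree(target)
    print("directory still exists:", os.path.exists(target))
    shutil.rmtree(root)
```

The file count is kept small here on purpose; the report's failure was observed with a directory holding on the order of a million files.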

The MDS log shows the following error:

2017-08-22 15:29:11.017122 7ffb752f8700  1 mds.0.28 recovery_done -- successful recovery!
2017-08-22 15:29:11.017247 7ffb752f8700  1 mds.0.28 active_start
2017-08-22 15:29:11.018963 7ffb752f8700  1 mds.0.28 cluster recovered.
2017-08-22 15:29:11.023689 7ffb6f2ec700 -1 *** Caught signal (Segmentation fault) **
 in thread 7ffb6f2ec700 thread_name:mds_rank_progr

 ceph version 12.1.0.3 (48ff484a1b75334f6a08c5b3dbdbb0abfa1cb2cf) luminous (dev)
 1: (()+0x58592f) [0x7ffb7e1d792f]
 2: (()+0xf130) [0x7ffb7bd25130]
 3: (Server::handle_client_readdir(boost::intrusive_ptr<MDRequestImpl>&)+0xbb9) [0x7ffb7df54989]
 4: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x9b1) [0x7ffb7df85a51]
 5: (MDSInternalContextBase::complete(int)+0x1eb) [0x7ffb7e162dcb]
 6: (MDSRank::_advance_queues()+0x4a5) [0x7ffb7df0d665]
 7: (MDSRank::ProgressThread::entry()+0x4a) [0x7ffb7df0daca]
 8: (()+0x7df3) [0x7ffb7bd1ddf3]
 9: (clone()+0x6d) [0x7ffb7ae033ed]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
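
When triaging a backtrace like the one above, it can help to pull out the first frame that carries a symbol name. The following is a minimal sketch, assuming the frame layout seen in this log (`N: (symbol+0xOFFSET) [0xADDRESS]`); the regular expression and helper names are illustrative and not part of any Ceph tooling.

```python
import re

# Matches frame lines such as:
#   " 3: (Server::handle_client_readdir(...)+0xbb9) [0x7ffb7df54989]"
FRAME_RE = re.compile(r"^\s*(\d+): \((.*)\+0x([0-9a-f]+)\) \[0x([0-9a-f]+)\]\s*$")


def parse_frames(text):
    """Return (frame_number, symbol, offset) tuples for each frame line."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            number, symbol, offset, _address = match.groups()
            frames.append((int(number), symbol, int(offset, 16)))
    return frames


def first_named_frame(frames):
    """Return the first frame whose symbol was resolved, or None."""
    for number, symbol, offset in frames:
        if symbol != "()":  # "()" marks a frame with no symbol name
            return number, symbol, offset
    return None


# Frames 1-4 copied from the log in this report.
SAMPLE = """\
 1: (()+0x58592f) [0x7ffb7e1d792f]
 2: (()+0xf130) [0x7ffb7bd25130]
 3: (Server::handle_client_readdir(boost::intrusive_ptr<MDRequestImpl>&)+0xbb9) [0x7ffb7df54989]
 4: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0x9b1) [0x7ffb7df85a51]
"""

if __name__ == "__main__":
    print(first_named_frame(parse_frames(SAMPLE)))
```

For this log the first named frame is frame 3, `Server::handle_client_readdir`, which is consistent with the crash occurring while the MDS serves directory listings during the recursive delete.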

Description:
Version 12.1.0.3 is based on version 12.1.0.


Files

readdir.patch (838 Bytes) Zheng Yan, 08/24/2017 10:48 AM

Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #21357: luminous: mds: segfault during `rm -rf` of large directory (Resolved, Zheng Yan)
