Bug #49132

mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",

Added by Kenneth Waegeman about 3 years ago. Updated about 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Community (user)
Tags:
backport_processed
Backport:
pacific, quincy
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[root@mds06 ~]# ceph crash ls-new
ID                                                               ENTITY    NEW 
2021-02-02_13:37:07.487784Z_d50629e2-49f0-47ca-b305-40937ad11dbf mds.mds06  *  
[root@mds06 ~]# ceph crash info 2021-02-02_13:37:07.487784Z_d50629e2-49f0-47ca-b305-40937ad11dbf
{
    "os_version_id": "7", 
    "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE", 
    "utsname_release": "3.10.0-1160.6.1.el7.x86_64", 
    "os_name": "CentOS Linux", 
    "entity_name": "mds.mds06", 
    "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.16/rpm/el7/BUILD/ceph-14.2.16/src/mds/ScatterLock.h", 
    "timestamp": "2021-02-02 13:37:07.487784Z", 
    "process_name": "ceph-mds", 
    "utsname_machine": "x86_64", 
    "assert_line": 85, 
    "utsname_sysname": "Linux", 
    "os_version": "7 (Core)", 
    "os_id": "centos", 
    "assert_thread_name": "fn_anonymous", 
    "utsname_version": "#1 SMP Thu Nov 19 12:50:20 CET 2020", 
    "backtrace": [
        "(()+0xf630) [0x7fba63e1a630]", 
        "(gsignal()+0x37) [0x7fba629f7387]", 
        "(abort()+0x148) [0x7fba629f8a78]", 
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7fba65f65b76]", 
        "(()+0x25ccef) [0x7fba65f65cef]", 
        "(MDCache::truncate_inode(CInode*, LogSegment*)+0x44e) [0x55ea2196d49e]", 
        "(C_MDS_inode_update_finish::finish(int)+0x133) [0x55ea21947b83]", 
        "(MDSContext::complete(int)+0x74) [0x55ea21b0db94]", 
        "(MDSIOContextBase::complete(int)+0x16f) [0x55ea21b0ddef]", 
        "(MDSLogContextBase::complete(int)+0x40) [0x55ea21b0e000]", 
        "(Finisher::finisher_thread_entry()+0x16f) [0x7fba65fef75f]", 
        "(()+0x7ea5) [0x7fba63e12ea5]", 
        "(clone()+0x6d) [0x7fba62abf96d]" 
    ], 
    "utsname_hostname": "mds06.gigalith.os", 
    "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.16/rpm/el7/BUILD/ceph-14.2.16/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7fba5626e700 time 2021-02-02 14:37:07.486072\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.16/rpm/el7/BUILD/ceph-14.2.16/src/mds/ScatterLock.h: 85: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)\n", 
    "crash_id": "2021-02-02_13:37:07.487784Z_d50629e2-49f0-47ca-b305-40937ad11dbf", 
    "assert_func": "void ScatterLock::set_xlock_snap_sync(MDSContext*)", 
    "ceph_version": "14.2.16" 
}

Files

mds-logs.gz (150 KB) - MDS logs for 6/15/2021 crash - Andras Pataki, 06/16/2021 12:17 AM

Related issues 4 (0 open, 4 closed)

Related to CephFS - Bug #59785: crash: void ScatterLock::set_xlock_snap_sync(MDSContext*): assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE) - Closed - Milind Changire

Has duplicate CephFS - Bug #54730: crash: void ScatterLock::set_xlock_snap_sync(MDSContext*): assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE) - Resolved

Copied to CephFS - Backport #58322: quincy: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE", - Resolved - Igor Fedotov
Copied to CephFS - Backport #58323: pacific: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE", - Resolved - Igor Fedotov
Actions #1

Updated by Sage Weil almost 3 years ago

  • Project changed from Ceph to CephFS
Actions #2

Updated by Patrick Donnelly almost 3 years ago

  • Description updated (diff)
Actions #3

Updated by Andras Pataki almost 3 years ago

We also had a crash that resembles this quite closely - I'm attaching the MDS logs. Are there any leads regarding potential causes for this issue?
The cluster is on CentOS 8, Ceph Nautilus 14.2.20.

{
    "crash_id": "2021-06-15_04:23:24.432205Z_167f4609-298f-45e8-b0db-fc04b5bbacd0",
    "timestamp": "2021-06-15 04:23:24.432205Z",
    "process_name": "ceph-mds",
    "entity_name": "mds.popeye-mds-2-09",
    "ceph_version": "14.2.20",
    "utsname_hostname": "popeye-mds-2-09",
    "utsname_sysname": "Linux",
    "utsname_release": "5.10.32.2.fi",
    "utsname_version": "#1 SMP Fri Apr 23 03:57:43 EDT 2021",
    "utsname_machine": "x86_64",
    "os_name": "CentOS Linux",
    "os_id": "centos",
    "os_version_id": "8",
    "os_version": "8",
    "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
    "assert_func": "void ScatterLock::set_xlock_snap_sync(MDSContext*)",
    "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h",
    "assert_line": 85,
    "assert_thread_name": "fn_anonymous",
    "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7fffde9cd700 time 2021-06-14 21:23:24.430600\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: 85: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)\n",
    "backtrace": [
        "(()+0x12b20) [0x7fffecd8bb20]",
        "(gsignal()+0x10f) [0x7fffeb7dc7ff]",
        "(abort()+0x127) [0x7fffeb7c6c35]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a7) [0x7fffeef93a8b]",
        "(()+0x275c54) [0x7fffeef93c54]",
        "(ScatterLock::set_xlock_snap_sync(MDSContext*)+0x1eb) [0x5555556b670b]",
        "(MDCache::truncate_inode(CInode*, LogSegment*)+0x267) [0x555555643b37]",
        "(C_MDS_inode_update_finish::finish(int)+0x133) [0x55555561bbe3]",
        "(MDSContext::complete(int)+0x7f) [0x5555558067df]",
        "(MDSIOContextBase::complete(int)+0x17f) [0x555555806a8f]",
        "(MDSLogContextBase::complete(int)+0x44) [0x555555806d54]",
        "(Finisher::finisher_thread_entry()+0x18d) [0x7fffef02462d]",
        "(()+0x814a) [0x7fffecd8114a]",
        "(clone()+0x43) [0x7fffeb8a1f23]" 
    ]
}

    -1> 2021-06-14 21:23:24.430 7fffde9cd700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7fffde9cd700 time 2021-06-14 21:23:24.430600
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: 85: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)

 ceph version 14.2.20 (36274af6eb7f2a5055f2d53ad448f2694e9046a0) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x156) [0x7fffeef93a3a]
 2: (()+0x275c54) [0x7fffeef93c54]
 3: (ScatterLock::set_xlock_snap_sync(MDSContext*)+0x1eb) [0x5555556b670b]
 4: (MDCache::truncate_inode(CInode*, LogSegment*)+0x267) [0x555555643b37]
 5: (C_MDS_inode_update_finish::finish(int)+0x133) [0x55555561bbe3]
 6: (MDSContext::complete(int)+0x7f) [0x5555558067df]
 7: (MDSIOContextBase::complete(int)+0x17f) [0x555555806a8f]
 8: (MDSLogContextBase::complete(int)+0x44) [0x555555806d54]
 9: (Finisher::finisher_thread_entry()+0x18d) [0x7fffef02462d]
 10: (()+0x814a) [0x7fffecd8114a]
 11: (clone()+0x43) [0x7fffeb8a1f23]

     0> 2021-06-14 21:23:24.430 7fffde9cd700 -1 *** Caught signal (Aborted) **
 in thread 7fffde9cd700 thread_name:fn_anonymous

 ceph version 14.2.20 (36274af6eb7f2a5055f2d53ad448f2694e9046a0) nautilus (stable)
 1: (()+0x12b20) [0x7fffecd8bb20]
 2: (gsignal()+0x10f) [0x7fffeb7dc7ff]
 3: (abort()+0x127) [0x7fffeb7c6c35]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a7) [0x7fffeef93a8b]
 5: (()+0x275c54) [0x7fffeef93c54]
 6: (ScatterLock::set_xlock_snap_sync(MDSContext*)+0x1eb) [0x5555556b670b]
 7: (MDCache::truncate_inode(CInode*, LogSegment*)+0x267) [0x555555643b37]
 8: (C_MDS_inode_update_finish::finish(int)+0x133) [0x55555561bbe3]
 9: (MDSContext::complete(int)+0x7f) [0x5555558067df]
 10: (MDSIOContextBase::complete(int)+0x17f) [0x555555806a8f]
 11: (MDSLogContextBase::complete(int)+0x44) [0x555555806d54]
 12: (Finisher::finisher_thread_entry()+0x18d) [0x7fffef02462d]
 13: (()+0x814a) [0x7fffecd8114a]
 14: (clone()+0x43) [0x7fffeb8a1f23]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Actions #4

Updated by Andras Pataki over 2 years ago

Any ideas about the cause of these crashes? Lately, on one of our large clusters, we're seeing more and more of them - three MDS crashes of this flavor today alone. Here is the JSON crash report for one of them:

    {
        "crash_id": "2021-08-02_15:37:30.810116Z_44963d0c-37da-4bf8-aa06-c602452b0423",
        "timestamp": "2021-08-02 15:37:30.810116Z",
        "process_name": "ceph-mds",
        "entity_name": "mds.popeye-mds-0-08",
        "ceph_version": "14.2.20",
        "utsname_hostname": "popeye-mds-0-08",
        "utsname_sysname": "Linux",
        "utsname_release": "5.10.32.2.fi",
        "utsname_version": "#1 SMP Fri Apr 23 03:57:43 EDT 2021",
        "utsname_machine": "x86_64",
        "os_name": "CentOS Linux",
        "os_id": "centos",
        "os_version_id": "8",
        "os_version": "8",
        "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
        "assert_func": "void ScatterLock::set_xlock_snap_sync(MDSContext*)",
        "assert_file": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h",
        "assert_line": 85,
        "assert_thread_name": "fn_anonymous",
        "assert_msg": "/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: In function 'void ScatterLock::set_xlock_snap_sync(MDSContext*)' thread 7fffde9cd700 time 2021-08-02 08:37:30.808586\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/14.2.20/rpm/el8/BUILD/ceph-14.2.20/src/mds/ScatterLock.h: 85: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE)\n",
        "backtrace": [
            "(()+0x12b20) [0x7fffecd8bb20]",
            "(gsignal()+0x10f) [0x7fffeb7dc7ff]",
            "(abort()+0x127) [0x7fffeb7c6c35]",
            "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a7) [0x7fffeef93a8b]",
            "(()+0x275c54) [0x7fffeef93c54]",
            "(ScatterLock::set_xlock_snap_sync(MDSContext*)+0x1eb) [0x5555556b670b]",
            "(MDCache::truncate_inode(CInode*, LogSegment*)+0x267) [0x555555643b37]",
            "(C_MDS_inode_update_finish::finish(int)+0x133) [0x55555561bbe3]",
            "(MDSContext::complete(int)+0x7f) [0x5555558067df]",
            "(MDSIOContextBase::complete(int)+0x17f) [0x555555806a8f]",
            "(MDSLogContextBase::complete(int)+0x44) [0x555555806d54]",
            "(Finisher::finisher_thread_entry()+0x18d) [0x7fffef02462d]",
            "(()+0x814a) [0x7fffecd8114a]",
            "(clone()+0x43) [0x7fffeb8a1f23]" 
        ]
    }

On our other large cluster we don't see such crashes. One major config difference is that on this cluster, we use daily snapshots (a rolling window of the past 7 days). Could it be related to snapshot trimming?

I'm happy to help with the debugging if I can get some pointers on what would be helpful to do.
Actions #5

Updated by Patrick Donnelly over 2 years ago

  • Status changed from New to Triaged
  • Assignee set to Xiubo Li
  • Target version set to v17.0.0
  • Backport set to pacific
  • Component(FS) MDS added
  • Labels (FS) crash added
Actions #6

Updated by Xiubo Li over 2 years ago

Andras Pataki wrote:

[...]

On our other large cluster we don't see such crashes. One major config difference is that on this cluster, we use daily snapshots (a rolling window of the past 7 days). Could it be related to snapshot trimming?

BTW, what's the 'max_mds' for your cluster? Were you using the kernel client or the fuse client?

Would it be possible to set the MDS debug log level to 25 and reproduce it again? I haven't been able to reproduce it locally.

Actions #7

Updated by Andras Pataki over 2 years ago

max_mds is 1 - we are running a single active MDS only. The clients are all ceph-fuse.
I've tried setting the MDS log level to 25, but it slows down the MDS to unbearable levels (even a simple ls on a directory with a few dozen files takes a minute). So I can't leave this log level on the MDS for extended periods of time. Unfortunately the crashes happen at random times - so I'm open to suggestions on how to better capture the crash state. Would something more than 5 but less than 25 help?

Actions #8

Updated by Xiubo Li over 2 years ago

Andras Pataki wrote:

max_mds is 1 - we are running a single active MDS only. The clients are all ceph-fuse.
I've tried setting the MDS log level to 25, but it slows down the MDS to unbearable levels (even a simple ls on a directory with a few dozen files takes a minute). So I can't leave this log level on the MDS for extended periods of time. Unfortunately the crashes happen at random times - so I'm open to suggestions on how to better capture the crash state. Would something more than 5 but less than 25 help?

Yeah, log level 25 will slow down the MDS. Could you try setting it to 15? At the very least, please set the second number (the in-memory log level that gets dumped on a crash) to 25, something like:

debug mds = 15/25

And also please set:

debug ms = 1
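
If it helps, those settings can also be applied at runtime; for example (assuming the cluster's centralized config database is in use and the daemons can be reached with ceph tell):

ceph config set mds debug_mds 15/25
ceph config set mds debug_ms 1

or, to inject into the running daemons without a restart:

ceph tell mds.* injectargs '--debug_mds=15/25 --debug_ms=1'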

From the attached logs, I can see that many clients are running at the same time and manipulating some shared dirs/files concurrently. The crash happened while one client was deleting or truncating a file that was not open: in that case the MDS calls do_open_truncate() when opening the file and journals the "open truncate" mdlog.

Currently the code sets the lock state to "LOCK_XLOCKDONE" in the early reply, just before flushing the "open truncate" mdlog; while the mdlog is being flushed, another request can change the lock state. But I cannot get any further clue from your logs.

Locally I am also trying to reproduce it with the following script, which emulates your use case; I was using the kclient:

while [ 1 ]; do
  date
  for d in A B C; do
    (
      for i in {1..3}; do
        ./bin/mount.ceph :/ /mnt/kcephfs.$d -o noshare
        rm -rf /mnt/kcephfs.$d/file$i.txt
        rmdir /mnt/kcephfs.$d/.snap/snap$i
        dd if=/dev/zero of=/mnt/kcephfs.$d/file$i.txt bs=1M count=8
        mkdir -p /mnt/kcephfs.$d/.snap/snap$i
        umount -fl /mnt/kcephfs.$d
      done
    ) &
  done
  wait
  date
done

But I have not reproduced it yet.

I will try the fuse clients to see whether that reproduces it.
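
A roughly equivalent loop for ceph-fuse might look like the sketch below; this is only a sketch, assuming ceph-fuse can mount the filesystem root with the default ceph.conf and keyring, and the /mnt/fusecephfs.$d mount points are made up for the example:

while [ 1 ]; do
  date
  for d in A B C; do
    (
      for i in {1..3}; do
        mkdir -p /mnt/fusecephfs.$d
        ceph-fuse /mnt/fusecephfs.$d
        rm -rf /mnt/fusecephfs.$d/file$i.txt
        rmdir /mnt/fusecephfs.$d/.snap/snap$i
        dd if=/dev/zero of=/mnt/fusecephfs.$d/file$i.txt bs=1M count=8
        mkdir -p /mnt/fusecephfs.$d/.snap/snap$i
        fusermount -u -z /mnt/fusecephfs.$d
      done
    ) &
  done
  wait
  date
done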

Actions #9

Updated by Xiubo Li over 2 years ago

  • Pull request ID set to 42861
Actions #10

Updated by Xiubo Li over 2 years ago

  • Status changed from Triaged to Fix Under Review
Actions #11

Updated by Andras Pataki over 2 years ago

I haven't had luck keeping the MDS running well with higher log levels unfortunately. However, I do have one more data point to add to this issue. On another cluster, we just enabled snapshots - and the crashes started happening pretty much immediately:

2021-10-28_15:03:26.632350Z_4c1931a6-2948-478d-8880-2d2beb7bc336 mds.cephmds00
2021-10-29_04:07:15.848755Z_9d765612-d148-452b-9d14-0d73f56368e2 mds.cephmds01
2021-11-01_01:59:55.895573Z_25ab7513-5560-4506-be49-23cc38be36fd mds.cephmds03
2021-11-01_02:02:54.279955Z_6eac3458-3cc1-4289-809b-50b9ba351e79 mds.cephmds04
2021-11-01_23:55:11.440345Z_f0823d54-c4b3-4e37-b290-564258cf8e86 mds.cephmds02 *
2021-11-03_08:12:12.400879Z_e67143ef-2bc4-4f8c-a5f9-00d4c7ddbf2a mds.cephmds01 *
2021-11-03_08:13:03.079998Z_2b94d760-e2ed-43f5-aab6-d910f9547164 mds.cephmds03 *

Prior to enabling snapshots, we had no such MDS crashes on this cluster, so the issue seems to correlate with CephFS snapshots. However, the crash times do not line up with snapshot creation or removal.

Also, I've seen very high MDS CPU utilization with snapshots enabled, seemingly during file removals - I'm not sure if this could be related. Here is a stack trace for the 100% busy thread (called fn_anonymous):

#0 0x0000555555787d4c in CInode::get_oldest_parent_dn (this=0x555843e14700) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.h:780
#1 CInode::is_ancestor_of (this=0x55561ebbee00, other=other@entry=0x555843e14700) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:917
#2 0x00005555557e4147 in SnapRealm::split_at (this=this@entry=0x555557c3a500, child=0x5555c1ad7900) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/SnapRealm.cc:593
#3 0x00005555557a1a5f in CInode::open_snaprealm (this=0x55561ebbee00, nosplit=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:2764
#4 0x00005555557a1e96 in CInode::pop_projected_snaprealm (this=0x55561ebbee00, next_snaprealm=0x555571b24000, early=<optimized out>) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/CInode.cc:586
#5 0x00005555557a2462 in CInode::early_pop_projected_snaprealm (this=this@entry=0x55561ebbee00) at /usr/include/c++/8/bits/stl_list.h:1122
#6 0x00005555555d1dc0 in Server::_unlink_local_finish (this=0x555557055680, mdr=..., dn=0x5558e50845a0, straydn=0x55582c1981e0, dnpv=513) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/Server.cc:7025
#7 0x00005555558067df in Context::complete (r=0, this=0x555715b043c0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/include/Context.h:77
#8 MDSContext::complete (this=0x555715b043c0, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:29
#9 0x0000555555806a8f in MDSIOContextBase::complete (this=0x555715b043c0, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:114
#10 0x0000555555806d54 in MDSLogContextBase::complete (this=<optimized out>, r=0) at /usr/src/debug/ceph-14.2.20-0.el8.x86_64/src/mds/MDSContext.cc:123
#11 0x00007fffef02062d in Finisher::finisher_thread_entry() () from target:/usr/lib64/ceph/libceph-common.so.0
#12 0x00007fffecd7d14a in start_thread () from target:/lib64/libpthread.so.0
#13 0x00007fffeb896dc3 in clone () from target:/lib64/libc.so.6

Not sure if this is related - but it is certainly a bit unusual.
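
For anyone wanting to gather the same kind of data: a per-thread stack like the one above can usually be captured from the live MDS with gdb, assuming the matching debuginfo packages are installed, for example:

# identify the busy thread
top -H -p $(pidof ceph-mds)

# dump backtraces of all threads of the running ceph-mds
gdb -p $(pidof ceph-mds) -batch -ex 'thread apply all bt'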

Actions #12

Updated by Xiubo Li over 2 years ago

Andras Pataki wrote:

[...]

Also, I've seen very high MDS CPU utilization with snapshots enabled, seemingly during file removals - I'm not sure if this could be related. Here is a stack trace for the 100% busy thread (called fn_anonymous):

[...]

I don't think these are the same issue; please open a new tracker for it.

Actions #13

Updated by Igor Fedotov over 1 year ago

  • Backport changed from pacific to pacific, quincy
Actions #14

Updated by Igor Fedotov over 1 year ago

An alternative fix is available at https://github.com/ceph/ceph/pull/48743

Actions #15

Updated by Greg Farnum over 1 year ago

  • Has duplicate Bug #54730: crash: void ScatterLock::set_xlock_snap_sync(MDSContext*): assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE) added
Actions #16

Updated by Xiubo Li over 1 year ago

  • Pull request ID changed from 42861 to 48743

The root cause is:

When doing the open-truncate, the MDS sets the filelock's state to LOCK_XLOCKDONE just before journaling the "open truncate" mdlog, and then releases the mds_lock.

After that, if the loner client sends another request that also tries to acquire an xlock on the filelock, it will switch the lock state to LOCK_LOCK_XLOCK. So when the "open truncate" mdlog is flushed, the callback will crash the MDS daemon.

We need to suspend the filelock's state if it needs to revoke the Fb cap, in order to flush the snap data from the clients before the inode is truncated.

For more detail, please see my previous PR: https://github.com/ceph/ceph/pull/42861.

This will be fixed in the new PR: https://github.com/ceph/ceph/pull/48743.
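
To make this concrete, here is a minimal self-contained sketch of the race. It is an illustration only, not the Ceph source: the state names mirror the MDS lock state machine, but the types and the sequence driver are made up for the example.

// Simplified model of the race described above. NOT the Ceph source; only the
// state names mirror the MDS lock state machine, everything else is invented
// to show why the assert in ScatterLock::set_xlock_snap_sync() fires.
#include <cassert>

enum LockState {
  LOCK_LOCK,        // stable, unlocked
  LOCK_XLOCK,       // exclusively locked for the open-truncate
  LOCK_XLOCKDONE,   // xlock work done, early reply already sent
  LOCK_LOCK_XLOCK,  // transient: another request is acquiring an xlock
  LOCK_XLOCKSNAP    // waiting for clients to flush snapped Fb data
};

struct FileLock {
  LockState state = LOCK_LOCK;

  // Corresponds to the failed assertion at ScatterLock.h:85 -- the caller
  // assumes the xlock taken for the open-truncate is still in place.
  void set_xlock_snap_sync() {
    assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE);
    state = LOCK_XLOCKSNAP;
  }
};

int main() {
  FileLock filelock;

  // 1. Open-truncate path: the xlock is taken and the early reply leaves the
  //    lock in LOCK_XLOCKDONE; the mds_lock is then dropped while the
  //    "open truncate" mdlog entry is flushed to the journal.
  filelock.state = LOCK_XLOCKDONE;

  // 2. While the mdlog flush is in flight, the loner client sends another
  //    request that also wants an xlock on the same filelock, moving it into
  //    the transient LOCK_LOCK_XLOCK state.
  filelock.state = LOCK_LOCK_XLOCK;

  // 3. The mdlog flush completes: MDCache::truncate_inode() runs on the
  //    finisher thread and calls set_xlock_snap_sync(). The assertion fails
  //    and aborts the process -- the same signature seen in the backtraces.
  filelock.set_xlock_snap_sync();
  return 0;
}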

Actions #17

Updated by Venky Shankar over 1 year ago

  • Status changed from Fix Under Review to Pending Backport
Actions #18

Updated by Backport Bot over 1 year ago

  • Copied to Backport #58322: quincy: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE", added
Actions #19

Updated by Backport Bot over 1 year ago

  • Copied to Backport #58323: pacific: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE", added
Actions #20

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed
Actions #21

Updated by Igor Fedotov about 1 year ago

  • Status changed from Pending Backport to Resolved
Actions #22

Updated by Telemetry Bot 12 months ago

  • Related to Bug #59785: crash: void ScatterLock::set_xlock_snap_sync(MDSContext*): assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE) added