Bug #57411
Multiple MDS crashes seen while running DB workloads with regular snapshots and journal flush
Status: Closed
Description
While pgbench workloads were running, with a snapshot taken after each run and the MDS journal flushed regularly, I saw a few MDS crashes. The sequence of steps is described below.
[root@argo013 ceph-ansible]# ceph -s
  cluster:
    id:     5b0eb00f-a9d5-4237-92d0-fd373e70358d
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum argo013,argo017,argo020 (age 35h)
    mgr: argo013(active, since 34h), standbys: argo017
    mds: cephfs:2 {0=argo021=up:active,1=argo024=up:active} 1 up:standby
    osd: 9 osds: 9 up (since 35h), 9 in (since 35h)

  data:
    pools:   2 pools, 40 pgs
    objects: 16.53k objects, 33 GiB
    usage:   6.5 TiB used, 16 TiB / 23 TiB avail
    pgs:     40 active+clean

  io:
    client:   341 B/s rd, 58 MiB/s wr, 0 op/s rd, 29 op/s wr

[root@argo013 ceph-ansible]# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
[root@argo013 ceph-ansible]# ceph fs status
cephfs - 3 clients
======
+------+--------+---------+---------------+-------+-------+
| Rank | State  |   MDS   |    Activity   |  dns  |  inos |
+------+--------+---------+---------------+-------+-------+
|  0   | active | argo021 | Reqs:    1 /s |  697  |  226  |
|  1   | active | argo024 | Reqs:    0 /s |  5143 |  2358 |
+------+--------+---------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata |  108M | 4737G |
|   cephfs_data   |   data   |  102G | 4737G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|   argo022   |
+-------------+
MDS version: ceph version 14.2.11-208.el8cp (6738ba96f296a41c24357c12e8d594fbde457abc) nautilus (stable)
[root@argo013 ceph-ansible]#
On Client 1, with a kernel mount:
for i in {1..100} ; do sudo -u postgres -- /usr/pgsql-14/bin/pgbench -s 500 -i ; sleep 10 ; sudo -u postgres -- /usr/pgsql-14/bin/pgbench -c 100 -j 4 --progress=5 --time=1200 ; sleep 10 ; ceph fs subvolume snapshot create cephfs subv_kernel snapk$i ; sleep 10 ; ssh argo021 ceph daemon mds.argo021 flush journal ; ssh argo024 ceph daemon mds.argo024 flush journal ; done
On Client 2, with a FUSE mount:
#for i in {1..100} ; do sudo -u postgres -- /usr/pgsql-14/bin/pgbench -s 500 -i ; sleep 10 ; sudo -u postgres -- /usr/pgsql-14/bin/pgbench -c 100 -j 4 --progress=5 --time=1200 ; sleep 10 ; ceph fs subvolume snapshot create cephfs subv_fuse snapf$i ; sleep 10 ; ssh argo021 ceph daemon mds.argo021 flush journal ; ssh argo024 ceph daemon mds.argo024 flush journal ; done
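The two loops above are identical except for the subvolume name (subv_kernel vs. subv_fuse) and the snapshot prefix (snapk vs. snapf). For readability, the same sequence reflowed as a sketch; SUBVOL and SNAP_PREFIX are placeholder variables, not part of the commands actually run:

# Reproduction loop, reflowed (sketch only; substitute SUBVOL/SNAP_PREFIX per client)
for i in {1..100}; do
    sudo -u postgres -- /usr/pgsql-14/bin/pgbench -s 500 -i                              # load a scale-500 dataset
    sleep 10
    sudo -u postgres -- /usr/pgsql-14/bin/pgbench -c 100 -j 4 --progress=5 --time=1200   # 20-minute run, 100 clients, 4 threads
    sleep 10
    ceph fs subvolume snapshot create cephfs "$SUBVOL" "${SNAP_PREFIX}${i}"              # snapshot the subvolume after each run
    sleep 10
    ssh argo021 ceph daemon mds.argo021 flush journal                                    # flush the journal on both active MDSs
    ssh argo024 ceph daemon mds.argo024 flush journal
done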
After a day's run, I noticed a few crashes. Client 2's I/O had stopped due to an ssh error. I/O on Client 1 continued, but the journal flush stopped working because the active MDS kept changing after each crash.
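Since the flush step hard-codes argo021/argo024, it stops being useful once a crash promotes the standby. A minimal sketch of a more failover-tolerant flush step, assuming the JSON layout of `ceph fs status -f json` on this release (an "mdsmap" array whose entries carry "name" and "state" fields) and that each MDS name matches an ssh-reachable hostname, as in this cluster:

#!/usr/bin/env bash
# Sketch: flush the journal on whichever MDS daemons are active right now,
# so the loop keeps working across failovers. The JSON field names are
# assumptions; verify against `ceph fs status cephfs -f json` before use.
set -euo pipefail

for name in $(ceph fs status cephfs -f json |
              jq -r '.mdsmap[] | select(.state == "active") | .name'); do
    echo "flushing journal on active mds.${name}"
    ssh "${name}" ceph daemon "mds.${name}" flush journal
done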
[root@argo013 ceph-ansible]# ceph -s
  cluster:
    id:     5b0eb00f-a9d5-4237-92d0-fd373e70358d
    health: HEALTH_WARN
            5 daemons have recently crashed

  services:
    mon: 3 daemons, quorum argo013,argo017,argo020 (age 3d)
    mgr: argo013(active, since 2d), standbys: argo017
    mds: cephfs:2 {0=argo022=up:active,1=argo024=up:active} 1 up:standby
    osd: 9 osds: 9 up (since 3d), 9 in (since 3d)

  data:
    pools:   2 pools, 40 pgs
    objects: 345.08k objects, 703 GiB
    usage:   8.4 TiB used, 14 TiB / 23 TiB avail
    pgs:     40 active+clean

  io:
    client:   426 B/s rd, 8.6 MiB/s wr, 0 op/s rd, 533 op/s wr

[root@argo013 ceph-ansible]# ceph health detail
HEALTH_WARN 5 daemons have recently crashed
RECENT_CRASH 5 daemons have recently crashed
    mds.argo021 crashed on host argo021.ceph.redhat.com at 2022-08-30 19:46:11.620336Z
    mds.argo022 crashed on host argo022.ceph.redhat.com at 2022-08-30 23:25:19.763877Z
    mds.argo021 crashed on host argo021.ceph.redhat.com at 2022-08-30 23:33:38.267202Z
    mds.argo024 crashed on host argo024.ceph.redhat.com at 2022-08-31 00:20:59.128323Z
    mds.argo021 crashed on host argo021.ceph.redhat.com at 2022-09-01 01:08:35.209276Z

[root@argo013 ceph-ansible]# ceph fs status
cephfs - 3 clients
======
+------+--------+---------+---------------+-------+-------+
| Rank | State  |   MDS   |    Activity   |  dns  |  inos |
+------+--------+---------+---------------+-------+-------+
|  0   | active | argo022 | Reqs:    1 /s |  6750 |  2339 |
|  1   | active | argo024 | Reqs:    0 /s |  5554 |  5557 |
+------+--------+---------+---------------+-------+-------+
+-----------------+----------+-------+-------+
|       Pool      |   type   |  used | avail |
+-----------------+----------+-------+-------+
| cephfs_metadata | metadata | 52.6M | 3805G |
|   cephfs_data   |   data   | 2106G | 3805G |
+-----------------+----------+-------+-------+
+-------------+
| Standby MDS |
+-------------+
|   argo021   |
+-------------+
MDS version: ceph version 14.2.11-208.el8cp (6738ba96f296a41c24357c12e8d594fbde457abc) nautilus (stable)

[root@argo013 ceph-ansible]# ceph crash info 2022-08-30_19:46:11.620336Z_58a3f75e-0160-4fab-ab98-8791acfb174c
{
    "crash_id": "2022-08-30_19:46:11.620336Z_58a3f75e-0160-4fab-ab98-8791acfb174c",
    "timestamp": "2022-08-30 19:46:11.620336Z",
    "process_name": "ceph-mds",
    "entity_name": "mds.argo021",
    "ceph_version": "14.2.11-208.el8cp",
    "utsname_hostname": "argo021.ceph.redhat.com",
    "utsname_sysname": "Linux",
    "utsname_release": "4.18.0-348.23.1.el8_5.x86_64",
    "utsname_version": "#1 SMP Tue Apr 12 11:20:32 EDT 2022",
    "utsname_machine": "x86_64",
    "os_name": "Red Hat Enterprise Linux",
    "os_id": "rhel",
    "os_version_id": "8.5",
    "os_version": "8.5 (Ootpa)",
    "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()",
    "assert_func": "void SimpleLock::put_xlock()",
    "assert_file": "/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h",
    "assert_line": 528,
    "assert_thread_name": "fn_anonymous",
    "assert_msg": "/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7fb765b27700 time 2022-08-30 19:46:11.614684\n/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())\n",
    "backtrace": [
        "(()+0x12c20) [0x7fb773ef1c20]",
        "(gsignal()+0x10f) [0x7fb77293937f]",
        "(abort()+0x127) [0x7fb772923db5]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a7) [0x7fb776107359]",
        "(()+0x275522) [0x7fb776107522]",
        "(()+0x317c0b) [0x55f0e3c34c0b]",
        "(Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x92) [0x55f0e3c28432]",
        "(Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x55f0e3c28aca]",
        "(Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x55f0e3c292ce]",
        "(Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x4bc) [0x55f0e3ad134c]",
        "(Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x1cd) [0x55f0e3ad1cfd]",
        "(Server::_link_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CInode*, unsigned long, unsigned long, bool)+0x23a) [0x55f0e3ae99da]",
        "(C_MDS_link_local_finish::finish(int)+0x37) [0x55f0e3b39837]",
        "(MDSContext::complete(int)+0x7f) [0x55f0e3d245df]",
        "(MDSIOContextBase::complete(int)+0x17f) [0x55f0e3d2488f]",
        "(MDSLogContextBase::complete(int)+0x44) [0x55f0e3d24b54]",
        "(Finisher::finisher_thread_entry()+0x18d) [0x7fb776199bed]",
        "(()+0x817a) [0x7fb773ee717a]",
        "(clone()+0x43) [0x7fb7729fedc3]"
    ]
}

[root@argo013 ceph-ansible]# ceph crash info 2022-08-31_00:20:59.128323Z_ffb57911-fa63-4f66-8abf-4970508b978c
{
    "crash_id": "2022-08-31_00:20:59.128323Z_ffb57911-fa63-4f66-8abf-4970508b978c",
    "timestamp": "2022-08-31 00:20:59.128323Z",
    "process_name": "ceph-mds",
    "entity_name": "mds.argo024",
    "ceph_version": "14.2.11-208.el8cp",
    "utsname_hostname": "argo024.ceph.redhat.com",
    "utsname_sysname": "Linux",
    "utsname_release": "4.18.0-348.23.1.el8_5.x86_64",
    "utsname_version": "#1 SMP Tue Apr 12 11:20:32 EDT 2022",
    "utsname_machine": "x86_64",
    "os_name": "Red Hat Enterprise Linux",
    "os_id": "rhel",
    "os_version_id": "8.5",
    "os_version": "8.5 (Ootpa)",
    "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()",
    "assert_func": "void SimpleLock::put_xlock()",
    "assert_file": "/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h",
    "assert_line": 528,
    "assert_thread_name": "fn_anonymous",
    "assert_msg": "/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7fc146861700 time 2022-08-31 00:20:59.125109\n/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())\n",
    "backtrace": [
        "(()+0x12c20) [0x7fc154c2bc20]",
        "(gsignal()+0x10f) [0x7fc15367337f]",
        "(abort()+0x127) [0x7fc15365ddb5]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a7) [0x7fc156e41359]",
        "(()+0x275522) [0x7fc156e41522]",
        "(()+0x317c0b) [0x559fe98bac0b]",
        "(Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x92) [0x559fe98ae432]",
        "(Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x20a) [0x559fe98aeaca]",
        "(Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5e) [0x559fe98af2ce]",
        "(Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x4bc) [0x559fe975734c]",
        "(Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x1cd) [0x559fe9757cfd]",
        "(Server::_link_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CInode*, unsigned long, unsigned long, bool)+0x23a) [0x559fe976f9da]",
        "(C_MDS_link_local_finish::finish(int)+0x37) [0x559fe97bf837]",
        "(MDSContext::complete(int)+0x7f) [0x559fe99aa5df]",
        "(MDSIOContextBase::complete(int)+0x17f) [0x559fe99aa88f]",
        "(MDSLogContextBase::complete(int)+0x44) [0x559fe99aab54]",
        "(Finisher::finisher_thread_entry()+0x18d) [0x7fc156ed3bed]",
        "(()+0x817a) [0x7fc154c2117a]",
        "(clone()+0x43) [0x7fc153738dc3]"
    ]
}

[root@argo013 ceph-ansible]# cephfs-journal-tool --rank=cephfs:0 journal inspect
Overall journal integrity: OK
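To check both ranks' journals and pull metadata for every recent crash in one go, something like the following could be used; this is only a sketch, assuming `ceph crash ls -f json` returns an array of records with a "crash_id" field (adjust the jq filter if the format differs on this release):

# Inspect the journal of each MDS rank, then dump metadata for all recent crashes
for rank in 0 1; do
    cephfs-journal-tool --rank=cephfs:${rank} journal inspect
done

for id in $(ceph crash ls -f json | jq -r '.[].crash_id'); do
    ceph crash info "${id}"
done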
Updated by Kumar Hemanth over 1 year ago
mds logs are copied here - http://magna002.ceph.redhat.com/ceph-qe-logs/hemanth/ceph_tracker/57411/
Updated by Venky Shankar over 1 year ago
Thanks for reproducing this, Hemanth. This has been seen in the past, but there were no logs to debug it. Could you please attach the MDS logs to this tracker using https://docs.ceph.com/en/quincy/man/8/ceph-post-file/?
Updated by Kumar Hemanth over 1 year ago
[root@argo021 ceph]# ceph-post-file -d "tracker-57411" ceph-mds-logs.zip
args: -d tracker-57411 -- ceph-mds-logs.zip
args: -- ceph-mds-logs.zip
/bin/ceph-post-file: upload tag bb4e8974-acba-4333-9289-63f60c685fa7
/bin/ceph-post-file: user: root@argo021.ceph.redhat.com
/bin/ceph-post-file: description: tracker-57411
/bin/ceph-post-file: will upload file ceph-mds-logs.zip
sftp> mkdir post/bb4e8974-acba-4333-9289-63f60c685fa7_root@argo021.ceph.redhat.com_68f71363-fc36-42e1-a54b-5d452cfbdddd
sftp> cd post/bb4e8974-acba-4333-9289-63f60c685fa7_root@argo021.ceph.redhat.com_68f71363-fc36-42e1-a54b-5d452cfbdddd
sftp> put /tmp/tmp.s38anO3LrD user
sftp> put /tmp/tmp.uV9islqurX description
sftp> put ceph-mds-logs.zip
/bin/ceph-post-file: copy the upload id below to share with a dev:
ceph-post-file: bb4e8974-acba-4333-9289-63f60c685fa7
[root@argo021 ceph]#
Updated by Patrick Donnelly over 1 year ago
- Is duplicate of Bug #44565: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) added
Updated by Patrick Donnelly over 1 year ago
- Status changed from New to Duplicate
Apparently this one is known.
Updated by Venky Shankar over 1 year ago
Patrick Donnelly wrote:
Apparently this one is known.
Yeah. And it's only seen when running database workloads on CephFS.