Bug #23972

Ceph MDS Crash from client mounting aufs over cephfs

Added by Sean Sullivan 8 months ago. Updated 7 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
Start date:
05/02/2018
Due date:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:

Description

Here is a rough outline of my topology
https://pastebin.com/HQqbMxyj
---

I can reliably crash all of my CephFS MDS daemons (in my case 2) from a client by trying to mount cephFS under AUFS. I am not sure what the client is doing to cause this, but the MDS will refuse to start until I 1.) reboot my client to stop any more requests and 2.) mark the current active MDS server as failed.

`ceph -s` will report that the monitors are up, but the MDS processes will be dead on both MDS servers:

Ceph health prior to trying to mount aufs over cephfs
----------------------------------------------

ceph -s
cluster:
id: 9f58ee5a-7c5d-4d68-81ee-debe16322544
health: HEALTH_OK
services:
mon: 3 daemons, quorum kh08-8,kh09-8,kh10-8
mgr: kh08-8(active)
mds: cephfs-1/1/1 up {0=kh09-8=up:active}, 1 up:standby
osd: 570 osds: 570 up, 570 in

Client tries to mount aufs: there is no output here, it just hangs.

mount -vvv -t aufs -o br=/cephfs=rw:/mnt/aufs=rw -o udba=reval none /aufs

Monitors now report health_warn state
----------------------------------------------

root@kh08-8:~# ceph -s
cluster:
id: 9f58ee5a-7c5d-4d68-81ee-debe16322544
health: HEALTH_WARN
insufficient standby MDS daemons available
services:
mon: 3 daemons, quorum kh08-8,kh09-8,kh10-8
mgr: kh08-8(active)
mds: cephfs-1/1/1 up {0=kh10-8=up:active(laggy or crashed)}

At this point all mounts hang until I stop the client, mark the mds servers as failed, and restart the mds servers.

I tried installing the following packages (ceph-mds-dbg ceph-mgr-dbg ceph-mon-dbg ceph-osd-dbg ceph-test-dbg)
kh10-8 mds backtrace -- https://pastebin.com/bwqZGcfD
kh09-8 mds backtrace -- https://pastebin.com/vvGiXYVY

The log files are pretty large (one 4.1GB and the other 200MB):

kh10-8 (200MB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh10-8.log
kh09-8 (4.1GB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh09-8.log


I am trying to mount aufs over the cephfs directory /aufstest, so here are the last few lines from kh10-8 (the secondary MDS server at the time) around the mention of aufs.

https://pastebin.com/EL5ALLuE

History

#1 Updated by John Spray 7 months ago

  • Project changed from Ceph to fs

Any chance you can reproduce this with debuginfo packages installed, so that we can get meaningful backtraces?

#2 Updated by Patrick Donnelly 7 months ago

  • Target version changed from v12.2.5 to v14.0.0
  • Source set to Community (user)
  • Tags deleted (mds, cephfs, crash,)
  • Affected Versions deleted (v12.2.4, v12.2.5)
  • ceph-qa-suite deleted (fs)
  • Component(FS) MDS added
  • Labels (FS) crash added

#3 Updated by Sean Sullivan 7 months ago

John Spray wrote:

Any chance you can reproduce this with debuginfo packages installed, so that we can get meaningful backtraces?

Hopefully this helps. I'm a dummy and not exactly sure how to do this well. I also missed this reply, sorry again. I have all the packages ceph-*-dbg installed but this time I attached to ceph-mds with gdb prior to the crash:

https://pastebin.com/kw4bZVZT -- kh09-9
https://pastebin.com/sYZQx0ER -- kh10-9

-----------------------------------------------
List of dbg packages installed on one of the mds servers (same installed on both):
root@kh09-8:~# dpkg -l | grep -i dbg
ii ceph-base-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-base
ii ceph-common-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-common
ii ceph-fuse-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-fuse
ii ceph-mds-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-mds
ii ceph-mgr-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-mgr
ii ceph-mon-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-mon
ii ceph-osd-dbg 12.2.5-1xenial amd64 debugging symbols for ceph-osd
ii libc6-dbg:amd64 2.23-0ubuntu10 amd64 GNU C Library: detached debugging symbols
ii libcephfs2-dbg 12.2.5-1xenial amd64 debugging symbols for libcephfs2
ii librados2-dbg 12.2.5-1xenial amd64 debugging symbols for librados
ii librbd1-dbg 12.2.5-1xenial amd64 debugging symbols for librbd1
ii librgw2-dbg 12.2.5-1xenial amd64 debugging symbols for librbd1
ii radosgw-dbg 12.2.5-1xenial amd64 debugging symbols for radosgw
ii rbd-fuse-dbg 12.2.5-1xenial amd64 debugging symbols for rbd-fuse
ii rbd-mirror-dbg 12.2.5-1xenial amd64 debugging symbols for rbd-mirror
ii rbd-nbd-dbg 12.2.5-1xenial amd64 debugging symbols for rbd-nbd

so I'm not sure why the symbols are not loaded in the original traces. I hope these new traces help.

#4 Updated by Zheng Yan 7 months ago

The crash was at `mdr->tracedn = mdr->dn[0].back()`, because `mdr->dn[0]` is empty. The request that triggered the crash is something like `lookup #0x1//`:

  dout(10) << "reply to stat on " << *req << dendl;
  mdr->tracei = ref;
  if (is_lookup)
    mdr->tracedn = mdr->dn[0].back();
  respond_to_request(mdr, 0);

The following patch prevents the kclient from sending the malformed lookup request. But the real bug is in aufs; it should never revalidate the root dentry.

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index f1d9c6cc0491..3c2b1b553654 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1197,6 +1197,9 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
        struct dentry *parent;
        struct inode *dir;

+       if (IS_ROOT(dentry))
+               return 1;
+
        if (flags & LOOKUP_RCU) {
                parent = READ_ONCE(dentry->d_parent);
                dir = d_inode_rcu(parent);
