Bug #23972

open

Ceph MDS Crash from client mounting aufs over cephfs

Added by Sean Sullivan about 6 years ago. Updated about 5 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Here is a rough outline of my topology:
https://pastebin.com/HQqbMxyj
---

I can reliably crash all of my CephFS MDS daemons (two in my case) from a client by trying to mount CephFS under aufs. I am not sure what the mount does to cause this, but the MDS daemons refuse to start until I (1) reboot my client to stop any further requests and (2) mark the current active MDS server as failed.

`ceph -s` reports that the monitors are up, but the MDS processes are dead on both MDS servers:

Ceph health before trying to mount CephFS with aufs
----------------------------------------------

ceph -s
cluster:
  id: 9f58ee5a-7c5d-4d68-81ee-debe16322544
  health: HEALTH_OK
services:
  mon: 3 daemons, quorum kh08-8,kh09-8,kh10-8
  mgr: kh08-8(active)
  mds: cephfs-1/1/1 up {0=kh09-8=up:active}, 1 up:standby
  osd: 570 osds: 570 up, 570 in

The client tries to mount aufs. There is no output here; the command just hangs.

mount -vvv -t aufs -o br=/cephfs=rw:/mnt/aufs=rw -o udba=reval none /aufs

Monitors now report HEALTH_WARN state
----------------------------------------------

root@kh08-8:~# ceph -s
cluster:
  id: 9f58ee5a-7c5d-4d68-81ee-debe16322544
  health: HEALTH_WARN
    insufficient standby MDS daemons available
services:
  mon: 3 daemons, quorum kh08-8,kh09-8,kh10-8
  mgr: kh08-8(active)
  mds: cephfs-1/1/1 up {0=kh10-8=up:active(laggy or crashed)}
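
The "laggy or crashed" marker in the status output above can be checked from a script. This is only a minimal sketch: the function name `mds_laggy_or_crashed` is hypothetical, and it does nothing more than match the string shown in the `mds:` line.

```shell
#!/bin/sh
# Hypothetical helper: reads `ceph -s` output on stdin and succeeds (exit 0)
# if the mds line carries the "laggy or crashed" marker seen in this report.
mds_laggy_or_crashed() {
    grep -E '^[[:space:]]*mds:' | grep -q 'laggy or crashed'
}

# Usage against a live cluster (needs a working ceph CLI and keyring):
#   if ceph -s | mds_laggy_or_crashed; then echo "MDS is laggy or crashed"; fi
```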

At this point all mounts hang until I stop the client, mark the mds servers as failed, and restart the mds servers.
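
The recovery steps described above could be scripted roughly as follows. This is a sketch under several assumptions that are not in the report: MDS rank 0 (taken from the status output), systemd units named `ceph-mds@<hostname>`, ssh access to the MDS hosts, and a client that has already been stopped or rebooted. The `run` and `recover_mds` names are hypothetical, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch of the recovery steps. DRY_RUN=1 (the default) only prints commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# usage: recover_mds RANK HOST [HOST...]
recover_mds() {
    rank="$1"; shift
    # Mark the active MDS rank as failed.
    run ceph mds fail "$rank"
    # Restart the MDS daemon on each host (unit name is an assumption).
    for host in "$@"; do
        run ssh "$host" systemctl restart "ceph-mds@$host"
    done
}

# Example: recover_mds 0 kh09-8 kh10-8
```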

I tried installing the following debug packages (ceph-mds-dbg ceph-mgr-dbg ceph-mon-dbg ceph-osd-dbg ceph-test-dbg) to get backtraces:
kh10-8 mds backtrace -- https://pastebin.com/bwqZGcfD
kh09-8 mds backtrace -- https://pastebin.com/vvGiXYVY

The log files are fairly large (one is 4.1 GB and the other is 200 MB):

kh10-8 (200MB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh10-8.log
kh09-8 (4.1GB) mds log -- https://griffin-objstore.opensciencedatacloud.org/logs/ceph-mds.kh09-8.log


I am trying to mount aufs over the CephFS directory /aufstest, so here are the last few lines from the kh10-8 log (the secondary MDS server at the time) around the aufs mention:

https://pastebin.com/EL5ALLuE
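
With logs this large, an excerpt like the one linked above can be cut with grep and tail rather than by opening the file. The helper name `log_excerpt`, its defaults, and the log path in the example are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: print the last MAX_LINES lines of numbered context
# around matches of PATTERN in a large log file.
# usage: log_excerpt PATTERN FILE [CONTEXT] [MAX_LINES]
log_excerpt() {
    pattern="$1"; file="$2"; context="${3:-20}"; max="${4:-200}"
    # -n adds line numbers, -C adds CONTEXT lines before and after each match.
    grep -n -C "$context" -- "$pattern" "$file" | tail -n "$max"
}

# Example (log path is an assumption; adjust to the local layout):
#   log_excerpt aufstest /var/log/ceph/ceph-mds.kh10-8.log
```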
