Bug #50468

open

Simultaneous mon daemon crash on cephfs mount

Added by David Prude about 3 years ago. Updated almost 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I have a newly installed 16.2.0 cluster consisting of 5 nodes deployed using cephadm on Ubuntu 18.04.5. I have created four cephfs filesystems:

name: cephfstest1, metadata pool: cephfs.cephfstest1.meta, data pools: [cephfs.cephfstest1.data ]
name: cephfsnfs, metadata pool: cephfs.cephfsnfs.meta, data pools: [cephfs.cephfsnfs.data ]
name: cephfsnfsv3, metadata pool: cephfs.cephfsnfsv3.meta, data pools: [cephfs.cephfsnfsv3.data ]
name: sambatest1, metadata pool: cephfs.sambatest1.meta, data pools: [cephfs.sambatest1.data ]

All were created the same way, for example:

ceph fs volume create sambatest1

Mounting cephfstest1 works via either the kernel client or ceph-fuse. Attempting to mount the cephfs "sambatest1" using either method times out and causes all 5 mon daemons to crash simultaneously. All monitors show the same stack trace:

{
    "crash_id": "2021-04-20T17:20:50.694919Z_085f320d-276c-4f62-9b60-bc0ba1284813",
    "timestamp": "2021-04-20T17:20:50.694919Z",
    "process_name": "ceph-mon",
    "entity_name": "mon.ceph-01",
    "ceph_version": "16.2.0",
    "utsname_hostname": "ceph-01",
    "utsname_sysname": "Linux",
    "utsname_release": "4.15.0-141-generic",
    "utsname_version": "#145-Ubuntu SMP Wed Mar 24 18:08:07 UTC 2021",
    "utsname_machine": "x86_64",
    "os_name": "CentOS Linux",
    "os_id": "centos",
    "os_version_id": "8",
    "os_version": "8",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12b20) [0x7f26f5b68b20]",
        "gsignal()",
        "abort()",
        "/lib64/libstdc++.so.6(+0x9009b) [0x7f26f518609b]",
        "/lib64/libstdc++.so.6(+0x9653c) [0x7f26f518c53c]",
        "/lib64/libstdc++.so.6(+0x96597) [0x7f26f518c597]",
        "/lib64/libstdc++.so.6(+0x967f8) [0x7f26f518c7f8]",
        "/lib64/libstdc++.so.6(+0x92045) [0x7f26f5188045]",
        "/usr/bin/ceph-mon(+0x4d78a6) [0x55bfb12c78a6]",
        "(MDSMonitor::check_sub(Subscription*)+0x819) [0x55bfb12bde69]",
        "(Monitor::handle_subscribe(boost::intrusive_ptr<MonOpRequest>)+0xcd8) [0x55bfb10b0da8]",
        "(Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x4cd) [0x55bfb10d6c7d]",
        "(Monitor::_ms_dispatch(Message*)+0x5f6) [0x55bfb10d8236]",
        "(Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x5c) [0x55bfb11067bc]",
        "(DispatchQueue::entry()+0x126a) [0x7f26f82a7e0a]",
        "(DispatchQueue::DispatchThread::entry()+0x11) [0x7f26f8357b01]",
        "/lib64/libpthread.so.0(+0x814a) [0x7f26f5b5e14a]",
        "clone()" 
    ]
}
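
For anyone triaging this, the claim that all monitors show the same stack trace can be checked mechanically. The following is a minimal sketch, assuming each crash report is available as JSON in the form shown above (for example, as printed by `ceph crash info <crash_id>`). The helper names `normalized_backtrace` and `same_crash` are hypothetical, and the second sample report (`mon.ceph-02` and its addresses) is invented purely for illustration. Runtime addresses in square brackets are stripped before comparing, since they can differ between processes even when the frames are identical.

```python
import json
import re
from typing import List, Tuple

# Matches the bracketed runtime address at the end of a frame,
# e.g. " [0x7f26f5b68b20]". Frames like "gsignal()" have none.
_ADDR = re.compile(r"\s*\[0x[0-9a-f]+\]$")


def normalized_backtrace(crash_json: str) -> Tuple[str, ...]:
    """Return the backtrace frames with trailing runtime addresses removed."""
    report = json.loads(crash_json)
    return tuple(_ADDR.sub("", frame.strip()) for frame in report["backtrace"])


def same_crash(reports: List[str]) -> bool:
    """True if every report has the same normalized backtrace."""
    return len({normalized_backtrace(r) for r in reports}) <= 1


# Abbreviated report taken from the trace above.
mon_a = json.dumps({
    "entity_name": "mon.ceph-01",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12b20) [0x7f26f5b68b20]",
        "gsignal()",
        "(MDSMonitor::check_sub(Subscription*)+0x819) [0x55bfb12bde69]",
    ],
})

# Hypothetical second report: same frames, different load addresses.
mon_b = json.dumps({
    "entity_name": "mon.ceph-02",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12b20) [0x7f1111111b20]",
        "gsignal()",
        "(MDSMonitor::check_sub(Subscription*)+0x819) [0x560000000e69]",
    ],
})

print(same_crash([mon_a, mon_b]))
```

If the reports differ after normalization, comparing the tuples frame by frame shows where the call paths diverge.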

Attached is the crash log for one of the mon daemons. I also have a capture of the mon log taken with full debug logging enabled, which I can supply, but it exceeds the attachment limit.

-David


Files

mon0_crash.log.gz (144 KB) Crash log for mon daemon running on first node. David Prude, 04/21/2021 06:03 PM
#1

Updated by Sage Weil almost 3 years ago

  • Project changed from Ceph to RADOS
  • Category deleted (Monitor)