Backport #20972

closed

jewel ceph-fuse segfaults at mount time, assert in ceph::log::Log::stop

Added by Dan van der Ster over 6 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Release: jewel
Pull request ID:
Crash signature (v1):
Crash signature (v2):


Related issues 2 (0 open, 2 closed)

Has duplicate CephFS - Bug #21363: ceph-fuse crashing while mounting cephfs (Duplicate, Jos Collin, 09/12/2017)

Copied from CephFS - Bug #18157: ceph-fuse segfaults on daemonize (Resolved, Greg Farnum, 12/06/2016)

Actions #1

Updated by Dan van der Ster over 6 years ago

Confirmed that 10.2.9 plus cbf18b1d80d214e4203e88637acf4b0a0a201ee7 does not segfault.

Actions #2

Updated by Nathan Cutler over 6 years ago

  • Tracker changed from Bug to Backport
  • Description updated (diff)

description

10.2.9 introduces a regression in which ceph-fuse segfaults at mount time because it attempts to stop the log service twice.
This bug is basically the jewel version of #18157, because commit 8a2f27cc632c26d7c2b8e8528b4d459b1d78705b was just backported to jewel.

# mount /cephfs-micscratch/
2017-08-10 09:48:35.252428 7f1b6dd12ec0 -1 init, newargv = 0x7f1b798981c0 newargc=11
ceph-fuse[25029]: starting ceph client
ceph-fuse[25029]: starting fuse
# echo $?
255

Here's the backtrace:

(gdb) bt
#0  0x00007f10c14cf23b in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00007f10c29d8e15 in reraise_fatal (signum=6) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:133
#3  <signal handler called>
#4  0x00007f10c02ed1d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#5  0x00007f10c02ee8c8 in __GI_abort () at abort.c:90
#6  0x00007f10c02e6146 in __assert_fail_base (fmt=0x7f10c04373e8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=assertion@entry=0x7f10c2c752db "is_started()", file=file@entry=0x7f10c2c752cf "log/Log.cc",
    line=line@entry=434,
    function=function@entry=0x7f10c2c75580 <ceph::log::Log::stop()::__PRETTY_FUNCTION__> "void ceph::log::Log::stop()")
    at assert.c:92
#7  0x00007f10c02e61f2 in __GI___assert_fail (assertion=0x7f10c2c752db "is_started()", file=0x7f10c2c752cf "log/Log.cc",
    line=434, function=0x7f10c2c75580 <ceph::log::Log::stop()::__PRETTY_FUNCTION__> "void ceph::log::Log::stop()")
    at assert.c:101
#8  0x00007f10c2a14a54 in ceph::log::Log::stop (this=0x7f10ccab0000) at log/Log.cc:434
#9  0x00007f10c2aef056 in CephContext::~CephContext (this=0x7f10cca8e000, __in_chrg=<optimized out>)
    at common/ceph_context.cc:558
#10 0x00007f10c2aef2cc in CephContext::put (this=0x7f10cca8e000) at common/ceph_context.cc:578
#11 0x00007f10c29d4145 in intrusive_ptr_release (cct=<optimized out>) at global/global_init.cc:346
#12 0x00007f10c28c67c9 in ~intrusive_ptr (this=0x7ffdfe632830, __in_chrg=<optimized out>)
    at /usr/include/boost/smart_ptr/intrusive_ptr.hpp:97
#13 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ceph_fuse.cc:83

I guess we need this backported to jewel:

commit cbf18b1d80d214e4203e88637acf4b0a0a201ee7
Author: Greg Farnum <gfarnum@redhat.com>
Date:   Tue Dec 6 15:07:19 2016 -0800

    ceph-fuse: start up log on parent process before shutdown

    Otherwise, we hit an assert in the Ceph context and logging teardown.

    Fixes: http://tracker.ceph.com/issues/18157

    Signed-off-by: Greg Farnum <gfarnum@redhat.com>

diff --git a/src/ceph_fuse.cc b/src/ceph_fuse.cc
index 8bd841be2a..d6983d932e 100644
--- a/src/ceph_fuse.cc
+++ b/src/ceph_fuse.cc
@@ -295,6 +295,8 @@ int main(int argc, const char **argv, const char *envp[]) {
     //cout << "child done" << std::endl;
     return r;
   } else {
+    if (restart_log)
+      g_ceph_context->_log->start();
     // i am the parent
     //cout << "parent, waiting for signal" << std::endl;
     ::close(fd[1]);
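
To illustrate why the two added lines matter, here is a minimal, self-contained sketch of the failure mode. It is not Ceph code: MiniLog, MiniContext, and parent_exits_cleanly are hypothetical stand-ins, and the sketch assumes that the log is stopped before daemonizing (as the restart_log flag in the diff suggests) and that context teardown always calls stop(). The real Log::stop() asserts is_started(); the sketch throws instead so the failure can be reported rather than aborting.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for ceph::log::Log (not the real class).
class MiniLog {
  bool started_ = false;

public:
  bool is_started() const { return started_; }

  void start() {
    assert(!started_);
    started_ = true;
  }

  void stop() {
    // The real code asserts is_started() here; we throw so the
    // caller can observe the failure instead of the process aborting.
    if (!started_)
      throw std::logic_error("is_started()");
    started_ = false;
  }
};

// Hypothetical stand-in for CephContext: teardown stops the log
// unconditionally, mirroring the ~CephContext frame in the backtrace.
struct MiniContext {
  MiniLog log;
  void teardown() { log.stop(); }
};

// Follows the parent branch only. Returns true if teardown succeeds.
bool parent_exits_cleanly(bool apply_fix) {
  MiniContext cct;
  cct.log.start();

  // Assumed pre-daemonize step: stop the log and remember to restart it.
  bool restart_log = false;
  if (cct.log.is_started()) {
    restart_log = true;
    cct.log.stop();  // first stop
  }

  // ... the fork would happen here ...

  // The fix from the diff: restart the log in the parent before shutdown.
  if (apply_fix && restart_log)
    cct.log.start();

  try {
    cct.teardown();  // second stop; fails if the log was never restarted
  } catch (const std::logic_error &) {
    return false;
  }
  return true;
}
```

Calling parent_exits_cleanly(false) models the 10.2.9 behaviour, where the second stop() hits the is_started() check; parent_exits_cleanly(true) models the parent path with the patch applied.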
Actions #3

Updated by Nathan Cutler over 6 years ago

  • Copied from Bug #18157: ceph-fuse segfaults on daemonize added
Actions #4

Updated by Nathan Cutler over 6 years ago

  • Description updated (diff)
  • Status changed from New to In Progress
  • Assignee set to Nathan Cutler
Actions #5

Updated by Nathan Cutler over 6 years ago

Thanks, Dan! Jewel backport staged: https://github.com/ceph/ceph/pull/16963

Actions #6

Updated by Nathan Cutler over 6 years ago

  • Status changed from In Progress to Resolved
  • Target version set to v10.2.10
Actions #7

Updated by Patrick Donnelly over 6 years ago

  • Has duplicate Bug #21363: ceph-fuse crashing while mounting cephfs added