Bug #37721

closed

mds crashes frequently when using snapshots in CephFS on mimic

Added by Soenke Schippmann over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport: mimic
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash, snapshots
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After we started using snapshots in CephFS, we have been experiencing frequent crashes (every couple of hours) of the active MDS daemon. The file system was created with Mimic (13.2.1), so snapshots were already enabled by default.

Here is an example backtrace:

(gdb) bt
#0  raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00005643386d53ae in reraise_fatal (signum=6) at ./src/global/signal_handler.cc:74
#2  handle_fatal_signal (signum=6) at ./src/global/signal_handler.cc:138
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#5  0x00007f10c61c5801 in __GI_abort () at abort.c:79
#6  0x00007f10c77b4080 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib/ceph/libceph-common.so.0
#7  0x00007f10c77b40f7 in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib/ceph/libceph-common.so.0
#8  0x000056433855c4a1 in Locker::snapflush_nudge (this=0x56433afa0640, in=0x5643d0467800) at ./src/mds/Locker.cc:2577
#9  0x000056433855c5f6 in Locker::caps_tick (this=0x56433afa0640) at ./src/mds/Locker.cc:3641
#10 0x0000564338560932 in Locker::tick (this=<optimized out>) at ./src/mds/Locker.cc:127
#11 0x00005643383fe754 in MDSRankDispatcher::tick (this=0x56433b24f000) at ./src/mds/MDSRank.cc:325
#12 0x00005643383ee96c in boost::function1<void, int>::operator() (a0=<optimized out>, this=<optimized out>)
    at ./obj-x86_64-linux-gnu/boost/include/boost/function/function_template.hpp:768
#13 FunctionContext::finish (this=<optimized out>, r=<optimized out>) at ./src/include/Context.h:522
#14 0x00005643383ec719 in Context::complete (this=0x56433b1432c0, r=<optimized out>) at ./src/include/Context.h:77
#15 0x00007f10c77b08ab in SafeTimer::timer_thread() () from /usr/lib/ceph/libceph-common.so.0
#16 0x00007f10c77b1e6d in SafeTimerThread::entry() () from /usr/lib/ceph/libceph-common.so.0
#17 0x00007f10c70c06db in start_thread (arg=0x7f10bc3b7700) at pthread_create.c:463
#18 0x00007f10c62a688f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
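
For triage, it can help to reduce a backtrace like the one above to the frames that live in the MDS sources. The following is only a rough sketch, not part of any Ceph tooling: the helper name `mds_frames` and the regular expression are assumptions made for illustration, and the sample input is copied from frames #7 to #11 above.

```python
import re

# Matches gdb frames that carry a source location, for example:
#   #8  0x... in Locker::snapflush_nudge (this=...) at ./src/mds/Locker.cc:2577
# Frames resolved only to a shared library ("from /usr/lib/...") do not match.
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+"
    r"(?:0x[0-9a-f]+ in )?"
    r"(?P<func>.+?) \(.*\) at (?P<file>\S+):(?P<line>\d+)$"
)


def mds_frames(backtrace):
    """Return (frame number, function, file, line) for frames under src/mds/.

    Hypothetical helper for illustration only.
    """
    frames = []
    for text in backtrace.splitlines():
        m = FRAME_RE.match(text.strip())
        if m and "/src/mds/" in m.group("file"):
            frames.append(
                (int(m.group("num")), m.group("func"),
                 m.group("file"), int(m.group("line")))
            )
    return frames


# Frames #7 to #11 from the backtrace in this report.
SAMPLE = """\
#7  0x00007f10c77b40f7 in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib/ceph/libceph-common.so.0
#8  0x000056433855c4a1 in Locker::snapflush_nudge (this=0x56433afa0640, in=0x5643d0467800) at ./src/mds/Locker.cc:2577
#9  0x000056433855c5f6 in Locker::caps_tick (this=0x56433afa0640) at ./src/mds/Locker.cc:3641
#10 0x0000564338560932 in Locker::tick (this=<optimized out>) at ./src/mds/Locker.cc:127
#11 0x00005643383fe754 in MDSRankDispatcher::tick (this=0x56433b24f000) at ./src/mds/MDSRank.cc:325
"""

if __name__ == "__main__":
    for num, func, path, line in mds_frames(SAMPLE):
        print(f"#{num} {func} at {path}:{line}")
```

On this input, the first frame reported is the one that trips the assertion, `Locker::snapflush_nudge` at `./src/mds/Locker.cc:2577`, reached from the periodic `Locker::caps_tick` path.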

A coredump, the mds log and the ceph.conf have been uploaded with ceph-post-file:

  • ceph-mds.log: 7bc127e2-27e9-4a24-b126-b7bd930a82bd
  • core.safe_timer.2474: 7af3b92f-0970-4885-ad86-f5c860616554
  • ceph.conf: 089f90bf-83e5-4915-bc47-1cafeba0c3ff

Thanks for investigating this issue.


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #37818: mimic: mds crashes frequently when using snapshots in CephFS on mimic (Resolved, Prashant D)
Actions #1

Updated by Greg Farnum over 5 years ago

  • Project changed from RADOS to CephFS
  • Category set to 89
  • Priority changed from Normal to High
  • Component(FS) MDS added
  • Labels (FS) crash, snapshots added
Actions #2

Updated by Zheng Yan over 5 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Patrick Donnelly over 5 years ago

  • Assignee set to Zheng Yan
  • Target version set to v14.0.0
  • Start date deleted (12/20/2018)
  • Source set to Community (user)
  • Backport set to mimic
  • Pull request ID set to 25741
  • Affected Versions deleted (v13.2.1)
  • ceph-qa-suite deleted (fs)
Actions #4

Updated by Patrick Donnelly over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #5

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #37818: mimic: mds crashes frequently when using snapshots in CephFS on mimic added
Actions #6

Updated by Patrick Donnelly about 5 years ago

  • Status changed from Pending Backport to Resolved