Bug #23518

closed

mds: crash when failover

Added by wei jin about 6 years ago. Updated almost 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Community (user)
Tags:
Backport:
luminous
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash, multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2018-03-29 10:25:04.719502 7f5ae5ad2700 -1 /build/ceph-12.2.4/src/mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_ack(MMDSCacheRejoin*)' thread 7f5ae5ad2700 time 2018-03-29 10:25:04.716917
/build/ceph-12.2.4/src/mds/MDCache.cc: 5087: FAILED assert(session)

ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55ba1428d8d2]
2: (MDCache::handle_cache_rejoin_ack(MMDSCacheRejoin*)+0x2422) [0x55ba14071542]
3: (MDCache::handle_cache_rejoin(MMDSCacheRejoin*)+0x233) [0x55ba1407def3]
4: (MDCache::dispatch(Message*)+0xa5) [0x55ba1407e045]
5: (MDSRank::handle_deferrable_message(Message*)+0x5bc) [0x55ba13f6aecc]
6: (MDSRank::_dispatch(Message*, bool)+0x1db) [0x55ba13f7858b]
7: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x55ba13f79355]
8: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x55ba13f62b13]
9: (DispatchQueue::entry()+0x7ca) [0x55ba1458ceda]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x55ba143125ad]
11: (()+0x8064) [0x7f5aea8aa064]
12: (clone()+0x6d) [0x7f5ae999562d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Related issues 2 (0 open, 2 closed)

Related to CephFS - Bug #23503: mds: crash during pressure test (Duplicate, 03/29/2018)
Copied to CephFS - Backport #23946: luminous: mds: crash when failover (Resolved, Prashant D)
Actions #1

Updated by Patrick Donnelly about 6 years ago

  • Status changed from New to Need More Info

Did you evict the client session during this time?

Actions #2

Updated by wei jin about 6 years ago

No. I did nothing.
During a pressure test, I ran into two crashes; the other one is #23503.

Actions #3

Updated by Patrick Donnelly about 6 years ago

Are you still hitting the issue, or has it gone away? If you are still hitting it, logs with `debug mds = 20` would be helpful.
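
For reference, a minimal sketch of how that debug level might be set, assuming the MDS daemons read their settings from a ceph.conf file (the section placement is an assumption about the reporter's setup, not something stated in this report):

```ini
[mds]
    # Raise MDS log verbosity to the level requested above.
    debug mds = 20
```

The daemons would need to pick up the changed setting before the crash is reproduced, and the extra logging is verbose, so the setting is usually reverted once the logs have been captured.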

Actions #4

Updated by Patrick Donnelly about 6 years ago

  • Category set to Correctness/Safety
  • Target version set to v13.0.0
  • Source set to Community (user)
  • Backport set to luminous
  • Severity changed from 3 - minor to 2 - major
  • Component(FS) MDS added
Actions #5

Updated by Patrick Donnelly about 6 years ago

  • Tags set to crash
Actions #6

Updated by Zheng Yan almost 6 years ago

  • Related to Bug #23503: mds: crash during pressure test added
Actions #7

Updated by Zheng Yan almost 6 years ago

This one is related to http://tracker.ceph.com/issues/23503. #23503 can explain why the session was evicted.
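
To illustrate the failure mode in the backtrace, here is a simplified sketch of how a lookup for an already-evicted session can trip `assert(session)`. All types and names below (`Session`, `SessionMap`, `count_live_sessions`) are hypothetical stand-ins, not the real Ceph classes, and the defensive variant is only one possible way to handle a missing session, not necessarily what the actual fix did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a client session (NOT the real Ceph type).
struct Session {
  std::string name;
};

// Hypothetical stand-in for a session table keyed by client id.
class SessionMap {
  std::map<int64_t, Session> sessions_;

public:
  void add(int64_t client, std::string name) {
    sessions_[client] = Session{std::move(name)};
  }

  // Evicting a client removes its session entirely.
  void evict(int64_t client) { sessions_.erase(client); }

  // Returns nullptr when the client has no session, e.g. after eviction.
  Session* get_session(int64_t client) {
    auto it = sessions_.find(client);
    return it == sessions_.end() ? nullptr : &it->second;
  }
};

// Strict variant: mirrors the pattern that crashes. If a message refers to
// a client whose session is gone, the assert aborts the process.
inline void require_session(SessionMap& sm, int64_t client) {
  Session* session = sm.get_session(client);
  assert(session);
  (void)session;
}

// Defensive variant (an assumption for illustration): skip clients whose
// session no longer exists and report how many were still live.
inline std::size_t count_live_sessions(SessionMap& sm,
                                       const std::vector<int64_t>& clients) {
  std::size_t live = 0;
  for (int64_t client : clients) {
    if (sm.get_session(client) != nullptr) {
      ++live;
    }
  }
  return live;
}
```

With clients 1 and 2 added and client 2 then evicted, `require_session(sm, 2)` would abort, while `count_live_sessions(sm, {1, 2})` simply skips the missing entry.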

Actions #8

Updated by Zheng Yan almost 6 years ago

  • Status changed from Need More Info to In Progress
Actions #9

Updated by Zheng Yan almost 6 years ago

  • Assignee set to Zheng Yan
Actions #10

Updated by Zheng Yan almost 6 years ago

  • Status changed from In Progress to Fix Under Review
Actions #11

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Tags deleted (crash)
  • Labels (FS) crash, multimds added
Actions #12

Updated by Nathan Cutler almost 6 years ago

Actions #13

Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved