Bug #23518 (closed)
mds: crash when failover
Description
2018-03-29 10:25:04.719502 7f5ae5ad2700 -1 /build/ceph-12.2.4/src/mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_ack(MMDSCacheRejoin*)' thread 7f5ae5ad2700 time 2018-03-29 10:25:04.716917
/build/ceph-12.2.4/src/mds/MDCache.cc: 5087: FAILED assert(session)
ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x55ba1428d8d2]
2: (MDCache::handle_cache_rejoin_ack(MMDSCacheRejoin*)+0x2422) [0x55ba14071542]
3: (MDCache::handle_cache_rejoin(MMDSCacheRejoin*)+0x233) [0x55ba1407def3]
4: (MDCache::dispatch(Message*)+0xa5) [0x55ba1407e045]
5: (MDSRank::handle_deferrable_message(Message*)+0x5bc) [0x55ba13f6aecc]
6: (MDSRank::_dispatch(Message*, bool)+0x1db) [0x55ba13f7858b]
7: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x55ba13f79355]
8: (MDSDaemon::ms_dispatch(Message*)+0xf3) [0x55ba13f62b13]
9: (DispatchQueue::entry()+0x7ca) [0x55ba1458ceda]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x55ba143125ad]
11: (()+0x8064) [0x7f5aea8aa064]
12: (clone()+0x6d) [0x7f5ae999562d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Patrick Donnelly about 6 years ago
- Status changed from New to Need More Info
Did you evict the client session during this time?
Updated by wei jin about 6 years ago
No. I did nothing.
During a pressure test, I ran into two crashes; the other one is #23503.
Updated by Patrick Donnelly about 6 years ago
Are you still hitting the issue or has it gone away? If you are still hitting it, `debug mds = 20` logs would be helpful.
Updated by Patrick Donnelly about 6 years ago
- Category set to Correctness/Safety
- Target version set to v13.0.0
- Source set to Community (user)
- Backport set to luminous
- Severity changed from 3 - minor to 2 - major
- Component(FS) MDS added
Updated by Zheng Yan almost 6 years ago
- Related to Bug #23503: mds: crash during pressure test added
Updated by Zheng Yan almost 6 years ago
This one is related to http://tracker.ceph.com/issues/23503. #23503 can explain why the session was evicted.
Updated by Zheng Yan almost 6 years ago
- Status changed from Need More Info to In Progress
Updated by Zheng Yan almost 6 years ago
- Status changed from In Progress to Fix Under Review
Updated by Patrick Donnelly almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
- Tags deleted (crash)
- Labels (FS) crash, multimds added
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #23946: luminous: mds: crash when failover added
Updated by Nathan Cutler almost 6 years ago
- Status changed from Pending Backport to Resolved