Bug #23812
mds: may send LOCK_SYNC_MIX message to starting MDS
Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport: luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
From mds.0:
2018-04-20 20:01:26.892 7ff249f42700 1 -- 127.0.0.1:6829/4093013988 _send_message--> mds.1 127.0.0.1:6827/3615748515 -- mdsmap(e 32) v1 -- ?+0 0x2e85f98080
2018-04-20 20:01:26.892 7ff249f42700 1 -- 127.0.0.1:6829/4093013988 --> 127.0.0.1:6827/3615748515 -- mdsmap(e 32) v1 -- 0x2e85f98080 con 0
2018-04-20 20:01:26.892 7ff249f42700 1 -- 127.0.0.1:6829/4093013988 _send_message--> mds.1 127.0.0.1:6827/3615748515 -- lock(a=mix inest 0x1.head) v1 -- ?+0 0x2e7d789440
2018-04-20 20:01:26.892 7ff249f42700 1 -- 127.0.0.1:6829/4093013988 --> 127.0.0.1:6827/3615748515 -- lock(a=mix inest 0x1.head) v1 -- 0x2e7d789440 con 0
From mds.1:
2018-04-20 20:01:26.896 7f018cd36700 1 -- 127.0.0.1:6827/3615748515 <== mds.0 127.0.0.1:6829/4093013988 2 ==== mdsmap(e 32) v1 ==== 780+0+0 (4159823880 0 0) 0x3209318a80 con 0x3209430e00
2018-04-20 20:01:26.896 7f018cd36700 5 mds.a handle_mds_map epoch 32 from mds.0
2018-04-20 20:01:26.896 7f018cd36700 5 mds.a old map epoch 32 <= 32, discarding
2018-04-20 20:01:26.896 7f018cd36700 1 -- 127.0.0.1:6827/3615748515 <== mds.0 127.0.0.1:6829/4093013988 3 ==== lock(a=mix inest 0x1.head) v1 ==== 291+0+0 (4212595484 0 0) 0x32091a5e40 con 0x3209430e00
2018-04-20 20:01:26.896 7f018cd36700 -1 /home/pdonnell/ceph/src/mds/Locker.cc: In function 'void Locker::handle_lock(MLock*)' thread 7f018cd36700 time 2018-04-20 20:01:26.898953
/home/pdonnell/ceph/src/mds/Locker.cc: 3870: FAILED assert(mds->is_rejoin() || mds->is_clientreplay() || mds->is_active() || mds->is_stopping())
ceph version 13.0.2-1597-g94271de (94271de7ff6ed4f05c9415cf81e493677adb1e6d) mimic (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x7f0194551e92]
2: (()+0x298067) [0x7f0194552067]
3: (Locker::handle_lock(MLock*)+0x1c0) [0x32078255b0]
4: (MDSRank::handle_deferrable_message(Message*)+0x545) [0x32076bb5b5]
5: (MDSRank::_dispatch(Message*, bool)+0x62b) [0x32076c71fb]
6: (MDSRankDispatcher::ms_dispatch(Message*)+0x15) [0x32076c78f5]
7: (MDSDaemon::ms_dispatch(Message*)+0xd3) [0x32076b36b3]
8: (DispatchQueue::entry()+0xb5a) [0x7f01945cbaba]
9: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f019466b5cd]
10: (()+0x76ba) [0x7f0193e336ba]
11: (clone()+0x6d) [0x7f01930bd41d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This condition looks wrong: https://github.com/ceph/ceph/blob/6f60a995de5645667f2c330d03459a7a9ca469f9/src/mds/Locker.cc#L893-L894
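For illustration only, here is a minimal sketch of the invariant behind the failed assertion: the receiver accepts lock messages only in the rejoin, clientreplay, active, or stopping states, so a sender would need to check the peer's state against the same set before sending. The enum, its members, and the function names are hypothetical simplifications. They are not Ceph's actual types, and this is not the fix that was merged.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for MDS daemon states (not Ceph's MDSMap type).
enum class MdsState { Starting, Rejoin, ClientReplay, Active, Stopping };

// Mirrors the states listed in the failed assertion in Locker::handle_lock.
constexpr bool can_handle_lock(MdsState s) {
  return s == MdsState::Rejoin || s == MdsState::ClientReplay ||
         s == MdsState::Active || s == MdsState::Stopping;
}

// Receiver side: aborts if a lock message arrives in any other state,
// which is what happened to mds.1 in the log above.
void handle_lock(MdsState self) {
  assert(can_handle_lock(self));
}

// Hypothetical sender-side guard: send only if the peer could accept the message.
bool should_send_lock(MdsState peer) {
  return can_handle_lock(peer);
}
```

Under these assumptions, a sender that consulted `should_send_lock` would skip a peer that is still starting instead of triggering the abort on the receiving side.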
Updated by Patrick Donnelly about 6 years ago
- Status changed from New to Fix Under Review
Updated by Zheng Yan about 6 years ago
- Related to Bug #23814: mds: newly active mds aborts may abort in handle_file_lock added
Updated by Patrick Donnelly about 6 years ago
- Assignee changed from Patrick Donnelly to Zheng Yan
Updated by Patrick Donnelly almost 6 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler almost 6 years ago
- Copied to Backport #23935: luminous: mds: may send LOCK_SYNC_MIX message to starting MDS added
Updated by Nathan Cutler almost 6 years ago
- Status changed from Pending Backport to Resolved