Project

General

Profile

Actions

Bug #54833

open

crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)

Added by Telemetry Bot about 2 years ago. Updated 2 months ago.

Status:
Pending Backport
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

100%

Source:
Telemetry
Tags:
backport_processed
Backport:
reef,quincy,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
crash
Pull request ID:
Crash signature (v1):

0253f38e2bd872a2f74cdd3714f166d569c2c8e87301d000df3d58bdfa526c70
4133b924526414f2bdd07b685dca1b5221ec81576d91e562bb6e4df3a2b561c4


Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=93eb25af7720dff70a580dca12e470b71bb89141ba1e8ac2560b7ef4fd188f3a

Assert condition: lock->get_state() LOCK_LOCK || lock->get_state() LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2
Assert function: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&)

Sanitized backtrace:

    pthread_kill()
    raise()
    Locker::handle_file_lock(ScatterLock*, boost::intrusive_ptr<MLock const> const&)
    Locker::dispatch(boost::intrusive_ptr<Message const> const&)
    MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)
    MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)
    MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)
    DispatchQueue::entry()
    DispatchQueue::DispatchThread::entry()

Crash dump sample:
{
    "assert_condition": "lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2",
    "assert_file": "mds/Locker.cc",
    "assert_func": "void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&)",
    "assert_line": 5732,
    "assert_msg": "mds/Locker.cc: In function 'void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&)' thread 7f6645486640 time 2022-02-20T22:30:46.711103-0500\nmds/Locker.cc: 5732: FAILED ceph_assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)",
    "assert_thread_name": "ms_dispatch",
    "backtrace": [
        "/lib/x86_64-linux-gnu/libc.so.6(+0x46520) [0x7f664afac520]",
        "pthread_kill()",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x172) [0x7f664b736487]",
        "/usr/lib/x86_64-linux-gnu/ceph/libceph-common.so.2(+0x24b5e6) [0x7f664b7365e6]",
        "(Locker::handle_file_lock(ScatterLock*, boost::intrusive_ptr<MLock const> const&)+0x18b3) [0x556fb61a5393]",
        "(Locker::dispatch(boost::intrusive_ptr<Message const> const&)+0x64) [0x556fb61ad254]",
        "(MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x5b7) [0x556fb5fba477]",
        "(MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x53) [0x556fb5fbaa83]",
        "(MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x1bf) [0x556fb5fa643f]",
        "(Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x450) [0x7f664b968420]",
        "(DispatchQueue::entry()+0x647) [0x7f664b965807]",
        "(DispatchQueue::DispatchThread::entry()+0x11) [0x7f664ba26561]",
        "/lib/x86_64-linux-gnu/libc.so.6(+0x98927) [0x7f664affe927]",
        "clone()" 
    ],
    "ceph_version": "16.2.7",
    "crash_id": "2022-02-21T03:30:46.752836Z_d902f215-af3f-4fbb-89d9-7ad51c405b67",
    "entity_name": "mds.dc0af19b7efe873f7993738fb083813d67ff43e2",
    "os_id": "22.04",
    "os_name": "Ubuntu Jammy Jellyfish (development branch)",
    "os_version": "22.04 (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-mds",
    "stack_sig": "0253f38e2bd872a2f74cdd3714f166d569c2c8e87301d000df3d58bdfa526c70",
    "timestamp": "2022-02-21T03:30:46.752836Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-17-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#17-Ubuntu SMP Thu Jan 13 16:27:23 UTC 2022" 
}


Subtasks 1 (0 open1 closed)

Bug #54834: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)DuplicateNeeraj Pratap Singh

Actions

Related issues 3 (1 open2 closed)

Copied to CephFS - Backport #64584: reef: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)ResolvedXiubo LiActions
Copied to CephFS - Backport #64585: squid: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)ResolvedXiubo LiActions
Copied to CephFS - Backport #64586: quincy: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2)In ProgressXiubo LiActions
Actions #1

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v16.2.7 added
Actions #2

Updated by Telemetry Bot almost 2 years ago

  • Project changed from RADOS to CephFS
  • Crash signature (v1) updated (diff)
Actions #3

Updated by Telemetry Bot almost 2 years ago

  • Affected Versions v17.1.0 added
Actions #4

Updated by Milind Changire 10 months ago

  • Related to Bug #54834: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2) added
Actions #5

Updated by Venky Shankar 9 months ago

  • Category set to Correctness/Safety
  • Assignee set to Xiubo Li
  • Target version set to v19.0.0
  • Backport set to reef,quincy,pacific
  • Component(FS) MDS added
  • Labels (FS) crash added
Actions #6

Updated by Xiubo Li 9 months ago

There is no logs and I just went through MDS locker code, it seems buggy here:

4740 void Locker::simple_excl(SimpleLock *lock, bool *need_issue)
4741 { 
4742   dout(7) << "simple_excl on " << *lock << " on " << *lock->get_parent() << dendl;
4743   ceph_assert(lock->get_parent()->is_auth());
4744   ceph_assert(lock->is_stable());
4745   
4746   CInode *in = 0;
4747   if (lock->get_cap_shift())
4748     in = static_cast<CInode *>(lock->get_parent());
4749   
4750   switch (lock->get_state()) {
4751   case LOCK_LOCK: lock->set_state(LOCK_LOCK_EXCL); break;
4752   case LOCK_SYNC: lock->set_state(LOCK_SYNC_EXCL); break;
4753   case LOCK_XSYN: lock->set_state(LOCK_XSYN_EXCL); break;
4754   default: ceph_abort();      
4755   }
4756   
4757   int gather = 0;
4758   if (lock->is_rdlocked())
4759     gather++;
4760   if (lock->is_wrlocked())
4761     gather++;
4762   if (gather && lock->is_cached())
4763     invalidate_lock_caches(lock);
4764 
4765   if (lock->get_parent()->is_replicated() && 
4766       lock->get_state() != LOCK_LOCK_EXCL &&
4767       lock->get_state() != LOCK_XSYN_EXCL) {
4768     send_lock_message(lock, LOCK_AC_LOCK);                                                                                                                                                                                        
4769     lock->init_gather();      
4770     gather++;
4771   }

when the old lock state is LOCK_SYNC and the simple_excl() will set the lock state to LOCK_SYNC_EXCL directly in current MDS. While from mds/lock.c:

 84 const struct sm_state_t filelock[LOCK_MAX] = {
 85                       // stable     loner  rep state  r     rp   rd   wr   fwr  l    x    caps(any,loner,xlocker,replica)
 86     [LOCK_SYNC]      = { 0,         false, LOCK_SYNC, ANY,  0,   ANY, 0,   0,   ANY, 0,   CEPH_CAP_GSHARED|CEPH_CAP_GCACHE|CEPH_CAP_GRD|CEPH_CAP_GLAZYIO,0,0,CEPH_CAP_GSHARED|CEPH_CAP_GCACHE|CEPH_CAP_GRD },

We can see that if the auth MDS's lock old state is LOCK_SYNC, then the replicas' lock state should be LOCK_SYCN.

So in simple_excl() in Line#4752 and Line#4758 it will send the LOCK_AC_LOCK lock request all the replica MDSs and in replicas' MDS it could trigger the assert:

5744 void Locker::handle_file_lock(ScatterLock *lock, const cref_t<MLock> &m)
5745 { 
5746   CInode *in = static_cast<CInode*>(lock->get_parent());
5747   int from = m->get_asker();
5748   
5749   if (mds->is_rejoin()) {
5750     if (in->is_rejoining()) {
5751       dout(7) << "handle_file_lock still rejoining " << *in
5752               << ", dropping " << *m << dendl;    
5753       return;
5754     }
5755   }
5756   
5757   dout(7) << "handle_file_lock a=" << lock->get_lock_action_name(m->get_action())                                                                                                                                                 
5758           << " on " << *lock  
5759           << " from mds." << from << " "      
5760           << *in << dendl;
5761   
5762   bool caps = lock->get_cap_shift();  
5763   
5764   switch (m->get_action()) {
5765     // -- replica --
5766   case LOCK_AC_SYNC:
5767     ceph_assert(lock->get_state() == LOCK_LOCK ||
5768            lock->get_state() == LOCK_MIX ||    
5769            lock->get_state() == LOCK_MIX_SYNC2);
5770     
...
Actions #7

Updated by Xiubo Li 9 months ago

  • Status changed from New to In Progress
Actions #8

Updated by Venky Shankar 9 months ago

If the auth is sending `LOCK_AC_LOCK` then in the replica it should be handled here:

  case LOCK_AC_LOCK:
    switch (lock->get_state()) {
    case LOCK_SYNC: lock->set_state(LOCK_SYNC_LOCK); break;
    case LOCK_MIX: lock->set_state(LOCK_MIX_LOCK); break;
    default: ceph_abort();
    }
Actions #9

Updated by Xiubo Li 9 months ago

Venky Shankar wrote:

If the auth is sending `LOCK_AC_LOCK` then in the replica it should be handled here:

[...]

Okay, you are right. I missed reading the code and I will check it again. Thanks.

Actions #10

Updated by Xiubo Li 6 months ago

Found another case:

Firstly in auth mds the simple_sync() just transmits the filelock state LOCK_MIX --> LOCK_MIX_SYNC, and then sends a LOCK_AC_SYNC message to a replicate mds:

4636 bool Locker::simple_sync(SimpleLock *lock, bool *need_issue)
4637 {
...
4650     switch (lock->get_state()) {
4651     case LOCK_MIX: lock->set_state(LOCK_MIX_SYNC); break;
4652     case LOCK_LOCK: lock->set_state(LOCK_LOCK_SYNC); break;
4653     case LOCK_XSYN: lock->set_state(LOCK_XSYN_SYNC); break;
4654     case LOCK_EXCL: lock->set_state(LOCK_EXCL_SYNC); break;
4655     default: ceph_abort();
4656     }
...
4676     if (lock->get_parent()->is_replicated() && old_state == LOCK_MIX) {
4677       send_lock_message(lock, LOCK_AC_SYNC);
4678       lock->init_gather();
4679       gather++;
4680     }
...
4708     if (gather) {
4709       lock->get_parent()->auth_pin(lock);
4710       if (need_recover)
4711         mds->mdcache->do_file_recover();
4712       return false;
4713     }
4715 
4716   if (lock->get_parent()->is_replicated()) {    // FIXME
4717     bufferlist data;
4718     lock->encode_locked_state(data);
4719     send_lock_message(lock, LOCK_AC_SYNC, data);
4720   }
4721   lock->set_state(LOCK_SYNC);
4722   lock->finish_waiters(SimpleLock::WAIT_RD|SimpleLock::WAIT_STABLE);
4723   if (in && in->is_head()) {
4724     if (need_issue)
4725       *need_issue = true;
4726     else
4727       issue_caps(in);
4728   }
4729   return true;
4730 }

While dues to some reasons just before the replicate mds receiving the LOCK_AC_SYNC msg the active mds restarts or failover, and the replicated mds just detects that the auth mds state is less than MDSMap::STATE_REJOIN, then in handle_file_lock() set the filelock state to LOCK_MIX_SYNC and then in eval_gather() will miss push it further to set the state to LOCK_MIX_SYNC2:

5743 void Locker::handle_file_lock(ScatterLock *lock, const cref_t<MLock> &m)
5744 {           
5745   CInode *in = static_cast<CInode*>(lock->get_parent());
5746   int from = m->get_asker();
5747 
5748   if (mds->is_rejoin()) {
5749     if (in->is_rejoining()) {
5750       dout(7) << "handle_file_lock still rejoining " << *in
5751               << ", dropping " << *m << dendl;
5752       return;
5753     }
5754   }         
5755     
5756   dout(7) << "handle_file_lock a=" << lock->get_lock_action_name(m->get_action())
5757           << " on " << *lock
5758           << " from mds." << from << " " 
5759           << *in << dendl;
5760 
5761   bool caps = lock->get_cap_shift();
5762   
5763   switch (m->get_action()) {
5764     // -- replica --
5765   case LOCK_AC_SYNC:                                                                                                                                                                                                              
5766     ceph_assert(lock->get_state() == LOCK_LOCK ||
5767            lock->get_state() == LOCK_MIX ||
5768            lock->get_state() == LOCK_MIX_SYNC2);
5769      
5770     if (lock->get_state() == LOCK_MIX) {
5771       lock->set_state(LOCK_MIX_SYNC);                    ===> set the state to LOCK_MIX_SYNC
5772       eval_gather(lock, true);                           ===> then call eval_gather()
5773       if (lock->is_unstable_and_locked()) {
5774         if (lock->is_cached())
5775           invalidate_lock_caches(lock);
5776         mds->mdlog->flush();
5777       }
5778       break;
5779     }
5780     
...
1066 void Locker::eval_gather(SimpleLock *lock, bool first, bool *pneed_issue, MDSContext::vec *pfinishers)
1067 {
...
1141       if (!mds->is_cluster_degraded() ||
1142           mds->mdsmap->get_state(auth) >= MDSMap::STATE_REJOIN) {   ===> cluster is degraded and auth MDS is not ready
1143         switch (lock->get_state()) {
1144         case LOCK_SYNC_LOCK:
1145           mds->send_message_mds(make_message<MLock>(lock, LOCK_AC_LOCKACK, mds->get_nodeid()), auth);
1146           break;
1147 
1148         case LOCK_MIX_SYNC:
1149           {
1150             auto reply = make_message<MLock>(lock, LOCK_AC_SYNCACK, mds->get_nodeid());
1151             lock->encode_locked_state(reply->get_data());
1152             mds->send_message_mds(reply, auth);
1153             next = LOCK_MIX_SYNC2;
1154             (static_cast<ScatterLock *>(lock))->start_flush();
1155           }
1156           break;
...

And then what if just after the active mds recovers and then sends a new LOCK_AC_SYNC msg ? Then in handle_file_lock() it will fail:

5743 void Locker::handle_file_lock(ScatterLock *lock, const cref_t<MLock> &m)
5744 {
...
5763   switch (m->get_action()) {
5764     // -- replica --
5765   case LOCK_AC_SYNC:                                                                                                                                                                                                              
5766     ceph_assert(lock->get_state() == LOCK_LOCK ||
5767            lock->get_state() == LOCK_MIX ||
5768            lock->get_state() == LOCK_MIX_SYNC2);
5769     
5770     if (lock->get_state() == LOCK_MIX) {
5771       lock->set_state(LOCK_MIX_SYNC);
5772       eval_gather(lock, true);
5773       if (lock->is_unstable_and_locked()) {
5774         if (lock->is_cached())
5775           invalidate_lock_caches(lock);
5776         mds->mdlog->flush();
5777       }
5778       break;
5779     }
...
Actions #11

Updated by Xiubo Li 5 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 54687
Actions #12

Updated by Venky Shankar 2 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from reef,quincy,pacific to reef,quincy,squid
Actions #13

Updated by Backport Bot 2 months ago

  • Copied to Backport #64584: reef: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2) added
Actions #14

Updated by Backport Bot 2 months ago

  • Copied to Backport #64585: squid: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2) added
Actions #15

Updated by Backport Bot 2 months ago

  • Copied to Backport #64586: quincy: crash: void Locker::handle_file_lock(ScatterLock*, ceph::cref_t<MLock>&): assert(lock->get_state() == LOCK_LOCK || lock->get_state() == LOCK_MIX || lock->get_state() == LOCK_MIX_SYNC2) added
Actions #16

Updated by Backport Bot 2 months ago

  • Tags set to backport_processed
Actions

Also available in: Atom PDF