Bug #1538


mds: all clients can end up becoming unresponsive, mds locker waiting for unfreeze

Added by Brandon Seibel over 12 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%


Description

The cluster MDS config was: mds e72: 2/2/2 up {0=mds02=up:active,1=mds01=up:active}, 2 up:standby-replay

Tried to force the condition to occur again by running rsync --inplace (no renames) into client02, while walking the same directory tree on client01 and running md5sums on all files.
After an hour or so, both the md5sums and the rsyncs started blocking.

While this was happening, all that was occurring on mds1 was:

2011-09-12 17:20:10.507592 7f5514360700 mds1.locker scatter_tick
2011-09-12 17:20:10.507641 7f5514360700 mds1.locker scatter_nudge waiting for unfreeze on [inode 20000007334 [...2,head] /xxx.com/www/htdocs/files/comp-photos/stills_white_0640/ auth{0=1} v11144 ap=0+1 f(v0 m2011-09-12 15:48:40.702360 117=0+117) n(v219 rc2011-09-12 16:32:58.682720 b1040975835 6965=6847+118) (inest mix dirty) (iversion lock) caps={4122=pAsLsXsFs/-@1,4163=pAsLsXsFs/-@2} | ptrwaiter dirtyscattered dirfrag caps replicated dirty 0xfe00d20]
2011-09-12 17:20:10.507663 7f5514360700 mds1.locker scatter_nudge waiting for unfreeze on [inode 10000000006 [...2,head] /xxx.com/www/htdocs/files/ auth{0=1} v67647 ap=0+1 f(v1 m2011-09-12 14:44:45.580215 4342=4331+11) n(v1095 rc2011-09-12 16:32:58.682720 b2010910079 49259=48532+727) (inest mix dirty) (iversion lock) caps={4122=pAsLsXsFs/-@1,4163=pAsLsXsFs/-@1} | ptrwaiter dirtyscattered dirfrag caps replicated dirty 0x6c2e140]
2011-09-12 17:20:10.507700 7f5514360700 mds1.locker scatter_nudge waiting for unfreeze on [inode 10000000002 [...2,head] /xxx.com/www/htdocs/ auth{0=1} v16660 ap=0+1 f(v1 m2011-09-12 11:01:39.000016 25=12+13) n(v1158 rc2011-09-12 16:32:58.682720 b2087649934 54786=53661+1125) (inest mix dirty) (iversion lock) caps={4122=pAsLsXsFs/-@1,4163=pAsLsXsFs/-@4} | ptrwaiter dirtyscattered dirfrag caps replicated dirty 0xfd9b460]
2011-09-12 17:20:10.507721 7f5514360700 mds1.locker scatter_nudge waiting for unfreeze on [inode 20000007330 [...2,head] /xxx.com/www/htdocs/files/comp-photos/ auth{0=1} v92743 ap=0+1 f(v0 m2011-09-12 14:44:48.726817 6=3+3) n(v528 rc2011-09-12 16:32:58.682720 b1379294006 20373=20030+343) (inest mix dirty) (iversion lock) caps={4122=pAsLsXsFs/-@1,4163=pAsLsXsFs/-@1} | ptrwaiter dirtyscattered dirfrag caps replicated dirty 0xfdfeba0]
2011-09-12 17:20:10.507739 7f5514360700 mds1.locker scatter_nudge waiting for unfreeze on [inode 10000000001 [...2,head] /xxx.com/www/ auth{0=1} v15172 ap=0+1 f(v1 m2011-09-12 11:01:38.811432 2=0+2) n(v1119 rc2011-09-12 16:36:39.869162 b2088488587 54828=53701+1127) (inest mix dirty) (iversion lock) caps={4122=pAsLsXsFs/-@1,4163=pAsLsXsFs/-@1} | ptrwaiter dirtyscattered dirfrag caps replicated dirty 0x13c6920]

along with the migrator, which couldn't do anything due to:
2011-09-12 17:20:27.858671 7f5515463700 mds1.migrator can't export, freezing|frozen. wait for other exports to finish first.

So it looks like part of the tree got frozen somewhere and then never unfrozen.
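
For readers less familiar with the MDS internals, the following is a minimal standalone sketch of the freeze/auth-pin interplay visible in the log above. It is not Ceph code; the Subtree type and its methods are invented purely for illustration. The point is that a subtree freeze requested by the migrator cannot complete until every auth pin is dropped, and anything queued behind the unfreeze (like scatter_nudge here) waits indefinitely if that never happens.

// Simplified, hypothetical model of freezing a subtree (not Ceph's actual classes).
#include <functional>
#include <iostream>
#include <vector>

struct Subtree {
  int auth_pins = 0;         // outstanding auth pins on this subtree
  bool freezing = false;     // a freeze has been requested (e.g. for an export)
  bool frozen = false;       // freeze completed: auth_pins drained to 0
  std::vector<std::function<void()>> unfreeze_waiters;

  void auth_pin()   { ++auth_pins; }
  void auth_unpin() { --auth_pins; maybe_finish_freeze(); }

  void start_freeze() {      // migrator wants to export this subtree
    freezing = true;
    maybe_finish_freeze();
  }

  void maybe_finish_freeze() {
    if (freezing && auth_pins == 0) {
      freezing = false;
      frozen = true;
    }
  }

  // e.g. scatter_nudge: it can't touch the lock while freezing|frozen,
  // so it queues itself to run after the unfreeze.
  void wait_for_unfreeze(std::function<void()> fn) {
    unfreeze_waiters.push_back(std::move(fn));
  }
};

int main() {
  Subtree dir;
  dir.auth_pin();       // some pending operation (e.g. a cap update) pins the dir
  dir.start_freeze();   // an export tries to freeze the subtree
  dir.wait_for_unfreeze([] { std::cout << "scatter_nudge can run now\n"; });

  // If the auth pin is never dropped, the freeze never completes, the subtree
  // never unfreezes, and the waiter above never runs -- which is what the
  // repeated "waiting for unfreeze" messages in the log correspond to.
  std::cout << "frozen=" << dir.frozen
            << " pending_waiters=" << dir.unfreeze_waiters.size() << "\n";
  return 0;
}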

Logs can be retrieved here:
http://evul.net/~xnevious/mds01.shortened.log.bz2
http://evul.net/~xnevious/mds02.shortend.log.bz2
http://evul.net/~xnevious/client01.shortened.log.bz2


Files

docapupdate-deadlock.patch (1.2 KB) - Brandon Seibel, 09/29/2011 11:15 AM
#1

Updated by Brandon Seibel over 12 years ago

Narrowed it down to the following scenario:

  • At some point earlier on, the inode was imported with a loner.
  • A client request comes in that triggers a _do_cap_update, and file_max needs to change.
  • The filelock is sync and stable at this point.
  • The caps don't allow a wrlock or a force_wrlock.
  • We're stable and also have a loner, so file_excl is called and we consequently get auth_pinned.
  • We hit:

        if (!in->filelock.can_wrlock(client) &&
            !in->filelock.can_force_wrlock(client)) {
          in->filelock.add_waiter(SimpleLock::WAIT_STABLE, new C_MDL_CheckMaxSize(this, in));
          change_max = false;
        }

    So we wait.
  • A LockAck comes in from the other MDS and is handled by handle_file_lock.
  • There was only one gather left, so eval_gather is called.
  • We hit this conditional in eval_gather:

        if (!lock->is_gathering() &&
            (IS_TRUE_AND_LT_AUTH(lock->get_sm()->states[next].can_rdlock, auth) || !lock->is_rdlocked()) &&
            (IS_TRUE_AND_LT_AUTH(lock->get_sm()->states[next].can_wrlock, auth) || !lock->is_wrlocked()) &&
            (IS_TRUE_AND_LT_AUTH(lock->get_sm()->states[next].can_xlock, auth) || !lock->is_xlocked()) &&
            (IS_TRUE_AND_LT_AUTH(lock->get_sm()->states[next].can_lease, auth) || !lock->is_leased()) &&
            !(lock->get_parent()->is_auth() && lock->is_flushing()) && // i.e. wait for scatter_writebehind!
            (!caps || ((~lock->gcaps_allowed(CAP_ANY, next) & other_issued) == 0 &&
                       (~lock->gcaps_allowed(CAP_LONER, next) & loner_issued) == 0 &&
                       (~lock->gcaps_allowed(CAP_XLOCKER, next) & xlocker_issued) == 0)) &&
            lock->get_state() != LOCK_SYNC_MIX2 && // these states need an explicit trigger from the auth mds
            lock->get_state() != LOCK_MIX_SYNC2
            )

    and fail it. At this point eval_gather returns, with the previous waiter never finishing, which results in the auth pin never being removed.

At this point we're deadlocked: we're waiting for a freeze, which is waiting on an auth pin, which is waiting for lock stability that will never come.
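
To make the cycle concrete, here is a simplified standalone sketch of the chain described above. Again, this is not Ceph code; Lock, add_stable_waiter and the eval_gather stand-in are assumptions made purely for illustration. The cap-update path auth-pins and parks its retry behind a WAIT_STABLE-style waiter; if the eval_gather conditional never passes, that waiter (and the auth_unpin it would perform) never runs, so the freeze waiting on the pin can never complete.

// Hypothetical model of the stuck waiter (not Ceph's actual locker code).
#include <functional>
#include <iostream>
#include <vector>

struct Lock {
  bool stable = false;
  std::vector<std::function<void()>> stable_waiters;  // "WAIT_STABLE" queue

  void add_stable_waiter(std::function<void()> fn) {
    stable_waiters.push_back(std::move(fn));
  }

  // Stand-in for eval_gather: only when the (simplified) condition passes
  // does the lock go stable and run the queued waiters.
  void eval_gather(bool caps_allow_next_state) {
    if (!caps_allow_next_state)
      return;                         // bail out: waiters stay queued forever
    stable = true;
    auto waiters = std::move(stable_waiters);
    for (auto &w : waiters) w();
  }
};

int main() {
  int auth_pins = 0;
  Lock filelock;

  // _do_cap_update-style path: auth-pin, then wait for stability before
  // retrying the max-size change; the waiter is what would drop the pin again.
  ++auth_pins;
  filelock.add_stable_waiter([&auth_pins] { --auth_pins; });

  // The LockAck arrives and eval_gather runs, but the issued-caps check
  // fails, so it returns early: the waiter never finishes, the pin never
  // drops, and a freeze waiting on auth_pins == 0 can never complete.
  filelock.eval_gather(/*caps_allow_next_state=*/false);

  std::cout << "auth_pins=" << auth_pins
            << " stable=" << filelock.stable << "\n";
  return 0;
}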

I'm still learning how the loner stuff is supposed to work, so I haven't been able to find a cause yet, but it seems like we imported the inode with a loner we maybe shouldn't have?

#2

Updated by Brandon Seibel over 12 years ago

I think I have a fix for this; let me know if it makes sense.

#3

Updated by Sage Weil over 12 years ago

  • Status changed from New to Resolved
  • Assignee set to Sage Weil

I applied your patch, slightly modified: I still pass need_issue and call issue_caps explicitly. Same end result, but the code is more consistent with the surrounding calls.

Thanks!
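
For context, here is a heavily simplified sketch of the need_issue convention referred to above. It is not the actual patch, and every name in it is illustrative: the idea is just that state-changing helpers report via an out-parameter that caps need reissuing, and the caller makes a single explicit issue_caps call afterwards, matching the surrounding call sites.

// Hypothetical illustration of the need_issue out-parameter pattern.
#include <iostream>

struct Inode {
  bool excl = false;          // stand-in for "filelock moved toward EXCL"
  bool caps_current = false;  // stand-in for "caps have been (re)issued"
};

void issue_caps(Inode &in) {
  in.caps_current = true;
  std::cout << "issuing caps\n";
}

// Helper that changes lock state; instead of issuing caps itself it just
// records that the caller should do so.
void file_excl_like_helper(Inode &in, bool *need_issue) {
  in.excl = true;             // the state transition itself
  if (need_issue)
    *need_issue = true;       // tell the caller to issue caps explicitly
}

void cap_update_like_caller(Inode &in) {
  bool need_issue = false;
  file_excl_like_helper(in, &need_issue);
  // ... further lock handling could also set need_issue ...
  if (need_issue)
    issue_caps(in);           // one explicit issue_caps call at the end
}

int main() {
  Inode in;
  cap_update_like_caller(in);
  std::cout << "excl=" << in.excl << " caps_current=" << in.caps_current << "\n";
  return 0;
}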

#4

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
