Bug #16288

mds: `session evict` tell command blocks forever with async messenger (TestVolumeClient.test_evict_client failure)

Added by John Spray about 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: High
Category: Code Hygiene
Target version: -
Start date: 06/14/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:

Description

I'm assuming for the moment that this is an MDS bug rather than something getting dropped in the new messenger code.

MDSRankDispatcher::evict_sessions blocks on journal flush. It seems that by doing so we may be preventing the OSD op reply from being serviced.
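
The suspected failure mode can be illustrated with a toy model. This is only a sketch of the hypothesis above, not Ceph code: it assumes a single dispatch thread that both runs the evict handler and delivers the flush reply, and every name in it (`run_blocking_evict`, `evict_handler`, `on_flush_reply`) is hypothetical. The wait is bounded by a timeout so the demonstration terminates; in the reported bug the wait had no such bound.

```python
import queue
import threading


def run_blocking_evict(timeout=0.2):
    """Toy single-threaded dispatcher (hypothetical, not Ceph code).

    Returns True if the simulated flush completed before the handler
    gave up waiting, False otherwise.
    """
    events = queue.Queue()       # work items for the one dispatch thread
    flushed = threading.Event()  # stands in for "journal flush finished"
    result = {}

    def on_flush_reply():
        # Stands in for the OSD op reply that completes the flush.
        flushed.set()

    def evict_handler():
        # Start the "flush": its reply is queued for this same dispatcher.
        events.put(on_flush_reply)
        # Block the dispatcher while waiting for that reply. Nothing else
        # can run on this thread, so the reply cannot be serviced.
        result["completed"] = flushed.wait(timeout)

    events.put(evict_handler)
    while not events.empty():
        events.get()()           # the only thread that services events
    return result["completed"]


if __name__ == "__main__":
    print("flush completed while blocked:", run_blocking_evict())
```

In this model the reply is only handled after the handler gives up, which is why the blocked wait can never succeed on its own.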


Related issues

Copied to fs - Backport #16621: jewel: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClient.test_evict_client failure) Resolved

History

#1 Updated by John Spray about 3 years ago

  • Subject changed from mds: `session evict` tell command blocks forever with async messenger to mds: `session evict` tell command blocks forever with async messenger (TestVolumeClient.test_evict_client failure)

#2 Updated by Greg Farnum about 3 years ago

  • Priority changed from Normal to High

This deadlocks and lockdep makes it crash in our nightlies; we should fix it quickly! :)

#3 Updated by John Spray about 3 years ago

NB: back out part of https://github.com/ceph/ceph-qa-suite/pull/1054 when fixing this; it's switched back to simple messenger for the moment.

#4 Updated by Douglas Fuller about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Douglas Fuller

#5 Updated by Greg Farnum about 3 years ago

John, do you have any logs? The only failure of this test I can find is http://qa-proxy.ceph.com/teuthology/teuthology-2016-05-07_18:04:02-fs-master---basic-smithi/178451, but that's complaining about client counts, not stuck asok requests.

#6 Updated by Greg Farnum about 3 years ago

  • Status changed from In Progress to Need More Info

#7 Updated by John Spray about 3 years ago

  • Status changed from Need More Info to New

Oops, I meant to paste it to begin with. I think it was this one:
/a/jspray-2016-06-13_14:56:46-fs-wip-jcsp-testing-quota-2-distro-basic-mira/257054

#8 Updated by Greg Farnum about 3 years ago

Not to take away Doug's thunder, but I gather he's been unable to reproduce it. The AsyncMessenger may have already been "fixed" so that this isn't a problem, but we should also change the way evict_sessions() works to not block where it does. We should discuss.
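
One way to avoid blocking in a handler, sketched under the same toy assumptions (a single dispatch thread, hypothetical names, not Ceph code and not necessarily what any actual fix does), is to register a continuation and return immediately so the dispatcher stays free to deliver the reply:

```python
from collections import deque


def run_nonblocking_evict():
    """Toy dispatcher where the evict handler never blocks (hypothetical)."""
    events = deque()  # work items for the one dispatch thread
    log = []

    def finish_evict():
        # Continuation: the rest of the eviction, run once the flush is done.
        log.append("evict finished")

    def on_flush_reply(on_done):
        # Stands in for the reply that completes the flush.
        log.append("flush reply handled")
        on_done()

    def evict_handler():
        log.append("evict started")
        # Queue the reply together with the continuation, then return at
        # once instead of waiting, so the dispatcher can keep running.
        events.append(lambda: on_flush_reply(finish_evict))

    events.append(evict_handler)
    while events:
        events.popleft()()  # the only thread that services events
    return log


if __name__ == "__main__":
    print(run_nonblocking_evict())
```

The trade-off is that the caller no longer gets a synchronous result, so whatever reports completion to the user has to be driven from the continuation.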

#9 Updated by Douglas Fuller almost 3 years ago

  • Status changed from New to In Progress

Still no reproducer, but https://github.com/ceph/ceph/pull/9971 may help.

#10 Updated by John Spray almost 3 years ago

  • Status changed from In Progress to Pending Backport
  • Backport set to jewel

#11 Updated by Loic Dachary almost 3 years ago

  • Copied to Backport #16621: jewel: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClient.test_evict_client failure) added

#12 Updated by Greg Farnum almost 3 years ago

  • Category set to Code Hygiene

#13 Updated by Loic Dachary almost 3 years ago

  • Status changed from Pending Backport to Resolved
