Bug #10944

closed

Deadlock, MDS logs "slow request", getattr pAsLsXsFs failed to rdlock

Added by Ilja Slepnev about 9 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A CephFS path hangs during directory listing. While it is stuck there is no OSD or network activity; only the MDS logs anything. Rebooting the client does not help.

  1. Create big files in /ceph/a
  2. cat /ceph/a/* >/dev/null
  3. ls /ceph/a (hangs here)
  4. /ceph/a is now blocked for all clients

Log messages on MDS (repeating):
2015-02-24 16:02:41.564071 7fdb0055c700 0 log_channel(default) log [WRN] : 9 slow requests, 1 included below; oldest blocked for > 14463.448519 secs
2015-02-24 16:02:41.564077 7fdb0055c700 0 log_channel(default) log [WRN] : slow request 1922.318256 seconds old, received at 2015-02-24 15:30:39.245786: client_request(client.66401597:2440 getattr pAsLsXsFs #10000002d68) currently failed to rdlock, waiting
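
To make the warning above easier to read, here is a minimal Python sketch that pulls the fields out of a "slow request" line and spells out the caps mask. The regular expression is written only against the single log line quoted in this report, so other Ceph versions may format it differently. The names `SLOW_REQUEST_RE`, `parse_slow_request` and `decode_caps` are hypothetical helpers, not part of Ceph. The letter meanings (p = pin; A/L/X/F = auth/link/xattr/file; s = shared, x = exclusive, and so on) reflect my understanding of the Ceph capability notation and should be treated as an assumption.

```python
import re

# Matches the "slow request" warning format quoted in this report.
# Assumption: derived from that one sample line only.
SLOW_REQUEST_RE = re.compile(
    r"slow request (?P<age>[\d.]+) seconds old, "
    r"received at (?P<received>\S+ \S+): "
    r"client_request\((?P<client>client\.\d+):(?P<tid>\d+) "
    r"(?P<op>\w+) (?P<caps>\w+) #(?P<ino>[0-9a-f]+)\) "
    r"currently (?P<state>.+)$"
)

# Assumed meaning of the capability letters (not taken from this report).
CAP_GROUPS = {"A": "auth", "L": "link", "X": "xattr", "F": "file"}
CAP_BITS = {"s": "shared", "x": "exclusive", "c": "cache",
            "r": "read", "w": "write", "b": "buffer"}


def parse_slow_request(line):
    """Return a dict of fields from a slow-request line, or None if no match."""
    match = SLOW_REQUEST_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["age"] = float(fields["age"])
    return fields


def decode_caps(mask):
    """Expand a caps mask such as 'pAsLsXsFs' into readable names."""
    names = []
    group = None
    for ch in mask:
        if ch == "p":
            names.append("pin")
            group = None
        elif ch in CAP_GROUPS:
            group = CAP_GROUPS[ch]
        elif group is not None and ch in CAP_BITS:
            names.append(group + " " + CAP_BITS[ch])
        else:
            raise ValueError("unrecognised caps character: " + repr(ch))
    return names


if __name__ == "__main__":
    sample = (
        "2015-02-24 16:02:41.564077 7fdb0055c700 0 log_channel(default) "
        "log [WRN] : slow request 1922.318256 seconds old, received at "
        "2015-02-24 15:30:39.245786: client_request(client.66401597:2440 "
        "getattr pAsLsXsFs #10000002d68) currently failed to rdlock, waiting"
    )
    info = parse_slow_request(sample)
    print(info["client"], info["op"], info["ino"], info["state"])
    print(decode_caps(info["caps"]))
```

Read this way, the blocked request is a getattr from client.66401597 on inode 10000002d68 asking for shared auth, link, xattr and file state, and the MDS cannot take the read lock it needs to answer.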

Configuration of MDS and CephFS client:
CentOS 7.0.1406, Linux 3.10.0-123.20.1.el7.centos.plus.x86_64
ceph-0.87
dmesg: libceph: loaded (mon/osd proto 15/24)
dmesg: ceph: loaded (mds proto 32)

Cluster status is HEALTH_OK. OSD commit/apply latency is unchanged, below 100 ms.

How can I find out what is going wrong and why getattr causes a deadlock?

Is there a workaround?

