Bug #4742

closed

mds: stuck clientreplay request

Added by Greg Farnum about 11 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/teuthology-2013-04-17_01:00:56-fs-master-testing-basic/14246

It has a single request which isn't completing; it's unclear why, or whether we can reproduce it by restarting the MDS.


Files

ceph-mds.b-s-a.log.gz (6.61 MB) Sam Lang, 04/23/2013 10:29 AM
setattr-ceph-client.0.13000.log.bz2 (2.61 MB) Sam Lang, 04/25/2013 10:38 AM
setattr-ceph-mds.a.log.bz2 (7.15 MB) Sam Lang, 04/25/2013 10:38 AM
setattr-ceph-mds.b-s-a.log.bz2 (3.32 MB) Sam Lang, 04/25/2013 10:38 AM
setattr-mds_requests (1.52 KB) Sam Lang, 04/25/2013 10:38 AM
rename-ceph-client.0.12715.log.bz2 (2.61 MB) Sam Lang, 04/25/2013 10:38 AM
rename-ceph-mds.a.log.bz2 (3.34 MB) Sam Lang, 04/25/2013 10:38 AM
rename-ceph-mds.b-s-a.log.bz2 (6.46 MB) Sam Lang, 04/25/2013 10:38 AM
rename-mds_requests (1.16 KB) Sam Lang, 04/25/2013 10:38 AM
Actions #1

Updated by Sam Lang about 11 years ago

Looks like a setattr and a create:

ubuntu@plana72:~$ sudo ceph --admin-daemon /var/run/ceph/ceph-client.0.19374.asok mds_requests
{ "tid": 1219,
"op": "setattr",
"path": "#100000001f5",
"path2": "",
"ino": "100000001f5",
"target_ino": "100000001f5",
"hint_ino": "0",
"sent_stamp": "2013-04-17 03:00:24.344943",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 0,
"got_unsafe": 1,
"uid": 0,
"gid": 0,
"oldest_client_tid": 1211,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0}
{ "tid": 1220,
"op": "create",
"path": "#100000001f5\/fstest_b5c1034e024d6f8e44a438e430391c84",
"path2": "",
"ino": "100000001f5",
"dentry": "fstest_b5c1034e024d6f8e44a438e430391c84",
"hint_ino": "0",
"sent_stamp": "2013-04-17 03:00:25.360236",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 0,
"got_unsafe": 0,
"uid": 0,
"gid": 0,
"oldest_client_tid": 1219,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0}
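
For anyone sifting through similar dumps, here is a minimal Python sketch of how the mds_requests output above could be filtered. The admin socket prints the request objects back to back rather than as one JSON array, so the sketch decodes them one at a time with the standard library's `json.JSONDecoder.raw_decode`. The helper names (`parse_concatenated_json`, `unsafe_pending`) and the trimmed sample string are made up for illustration and are not part of Ceph; treating `got_unsafe == 1` as "unsafe reply received, safe reply still outstanding" is an assumption based on the dump in this comment.

```python
import json


def parse_concatenated_json(text):
    """Decode a string holding several JSON objects written back to back."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def unsafe_pending(requests):
    """Return (tid, op) for requests that already have an unsafe reply."""
    return [(r["tid"], r["op"]) for r in requests if r.get("got_unsafe") == 1]


# Trimmed-down stand-in for the dump above (only the fields used here).
sample = (
    '{ "tid": 1219, "op": "setattr", "got_unsafe": 1}'
    '{ "tid": 1220, "op": "create", "got_unsafe": 0}'
)

requests = parse_concatenated_json(sample)
stuck = unsafe_pending(requests)
print(stuck)
```

On the trimmed sample this singles out the setattr (tid 1219) as the request that has an unsafe reply but is still listed as outstanding.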

Actions #2

Updated by Sam Lang about 11 years ago

Marked #4741 as a duplicate of this bug. It looks like setattr is the culprit. I was able to generate a core file of the mds while it was in this state, and the only request sitting in mds->mdcache->active_requests is the setattr which the client is waiting for (and already has an unsafe reply to). I have the dump of the mds cache as well; all it shows for the inode the setattr is operating on is that it's dirty.

Actions #3

Updated by Sam Lang about 11 years ago

  • Status changed from New to In Progress
  • Assignee set to Sam Lang
Actions #4

Updated by Sam Lang almost 11 years ago

Attaching mds log from mds stuck on clientreplay. Looks like the setattr gets put on the inode waiting list by the locker, after sending the client a caps revoke for pin auth.

Actions #6

Updated by Zheng Yan almost 11 years ago

Looks like a client bug: it may add cap releases to the replayed requests. (encode_cap_releases() should be called when creating the request, instead of when sending it.)
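
To illustrate the timing issue described in this comment, here is a toy Python model. It is not Ceph client code: `ToyClient`, its fields, and the cap names are hypothetical stand-ins. The only point it tries to show is that if the release list is computed at send time, a replayed request can carry releases the original did not, whereas computing it once at creation keeps the replay identical.

```python
class ToyClient:
    """Hypothetical stand-in for a client that attaches cap releases to requests."""

    def __init__(self, encode_at_creation):
        self.encode_at_creation = encode_at_creation
        self.droppable_caps = []  # caps the client could release right now

    def make_request(self, op):
        req = {"op": op, "releases": None}
        if self.encode_at_creation:
            # Snapshot the releases once, when the request is built.
            req["releases"] = list(self.droppable_caps)
        return req

    def send(self, req):
        if self.encode_at_creation:
            releases = req["releases"]
        else:
            # Recomputed on every send, including replays.
            releases = list(self.droppable_caps)
        return {"op": req["op"], "releases": releases}


def original_and_replay(encode_at_creation):
    client = ToyClient(encode_at_creation)
    client.droppable_caps = ["cap_a"]           # placeholder cap name
    req = client.make_request("setattr")
    first = client.send(req)
    client.droppable_caps = ["cap_a", "cap_b"]  # client state changes before replay
    replay = client.send(req)
    return first, replay


first, replay = original_and_replay(encode_at_creation=False)
print(first == replay)

first, replay = original_and_replay(encode_at_creation=True)
print(first == replay)
```

In this model only the encode-at-creation variant replays a request identical to the one originally sent. Whether that matches the actual fix is best checked against the commit referenced when this bug was resolved.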

Actions #7

Updated by Greg Farnum almost 11 years ago

Yeah, we've discussed this some on github around wip-4742 and on irc. :)

Actions #8

Updated by Sage Weil almost 11 years ago

  • Status changed from In Progress to Resolved

commit:5121e56c255c079569f02e0ee852e469f38f470e

Actions #9

Updated by Greg Farnum almost 8 years ago

  • Component(FS) MDS added