Bug #4742: mds: stuck clientreplay request - CephFS - Ceph

closed

mds: stuck clientreplay request

Added by Greg Farnum about 11 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: High
Assignee: Sam Lang
Target version: v0.61 - Cuttlefish
Source: Development
Severity: 3 - minor
Component(FS): MDS

Description

/a/teuthology-2013-04-17_01:00:56-fs-master-testing-basic/14246

The run has a single request that isn't completing. The cause is unknown, and it is unclear whether we can reproduce it by restarting the MDS.

Files

ceph-mds.b-s-a.log.gz (6.61 MB), Sam Lang, 04/23/2013 10:29 AM
setattr-ceph-client.0.13000.log.bz2 (2.61 MB), Sam Lang, 04/25/2013 10:38 AM
setattr-ceph-mds.a.log.bz2 (7.15 MB), Sam Lang, 04/25/2013 10:38 AM
setattr-ceph-mds.b-s-a.log.bz2 (3.32 MB), Sam Lang, 04/25/2013 10:38 AM
setattr-mds_requests (1.52 KB), Sam Lang, 04/25/2013 10:38 AM
rename-ceph-client.0.12715.log.bz2 (2.61 MB), Sam Lang, 04/25/2013 10:38 AM
rename-ceph-mds.a.log.bz2 (3.34 MB), Sam Lang, 04/25/2013 10:38 AM
rename-ceph-mds.b-s-a.log.bz2 (6.46 MB), Sam Lang, 04/25/2013 10:38 AM
rename-mds_requests (1.16 KB), Sam Lang, 04/25/2013 10:38 AM

Updated by Sam Lang about 11 years ago

Looks like the client's outstanding requests are a setattr and a create:

ubuntu@plana72:~$ sudo ceph --admin-daemon /var/run/ceph/ceph-client.0.19374.asok mds_requests
{ "tid": 1219,
"op": "setattr",
"path": "#100000001f5",
"path2": "",
"ino": "100000001f5",
"target_ino": "100000001f5",
"hint_ino": "0",
"sent_stamp": "2013-04-17 03:00:24.344943",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 0,
"got_unsafe": 1,
"uid": 0,
"gid": 0,
"oldest_client_tid": 1211,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0}
{ "tid": 1220,
"op": "create",
"path": "#100000001f5\/fstest_b5c1034e024d6f8e44a438e430391c84",
"path2": "",
"ino": "100000001f5",
"dentry": "fstest_b5c1034e024d6f8e44a438e430391c84",
"hint_ino": "0",
"sent_stamp": "2013-04-17 03:00:25.360236",
"mds": 0,
"resend_mds": -1,
"send_to_auth": 0,
"sent_on_mseq": 0,
"retry_attempt": 0,
"got_unsafe": 0,
"uid": 0,
"gid": 0,
"oldest_client_tid": 1219,
"mdsmap_epoch": 0,
"flags": 0,
"num_retry": 0,
"num_fwd": 0,
"num_releases": 0}

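For triaging dumps like the one above, a minimal sketch in Python (not part of Ceph) that splits the back-to-back JSON objects printed by mds_requests and lists the requests that already have an unsafe reply but are still outstanding. The helper names parse_mds_requests and awaiting_safe_reply are hypothetical, and the sample string is a trimmed copy of the output above, keeping only the fields the sketch reads.

```python
import json

def parse_mds_requests(text):
    """Split concatenated JSON objects (as printed by mds_requests) into dicts."""
    decoder = json.JSONDecoder()
    requests, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        requests.append(obj)
    return requests

def awaiting_safe_reply(requests):
    """Requests whose unsafe reply arrived but which are still outstanding."""
    return [r for r in requests if r.get("got_unsafe") == 1]

# Trimmed copy of the dump above (assumption: only these fields matter here).
sample = ('{ "tid": 1219, "op": "setattr", "got_unsafe": 1}'
          '{ "tid": 1220, "op": "create", "got_unsafe": 0}')
reqs = parse_mds_requests(sample)
stuck = awaiting_safe_reply(reqs)
print([(r["tid"], r["op"]) for r in stuck])  # [(1219, 'setattr')]
```

On the dump above this singles out tid 1219, the setattr; the create (tid 1220) has not received any reply yet.
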
Updated by Sam Lang about 11 years ago

Marked #4741 as a duplicate of this bug. It looks like setattr is the culprit. I was able to generate a core file of the mds while it was in this state, and the only request sitting in mds->mdcache->active_requests is the setattr which the client is waiting for (and already has an unsafe reply to). I have the dump of the mds cache as well; all it shows for the inode the setattr is operating on is that it's dirty.

Updated by Sam Lang about 11 years ago

Status changed from New to In Progress
Assignee set to Sam Lang

Updated by Sam Lang almost 11 years ago

File ceph-mds.b-s-a.log.gz added

Attaching the mds log from an mds stuck in clientreplay. It looks like the setattr gets put on the inode waiting list by the locker, after the mds sends the client a caps revoke for pin auth.

Updated by Sam Lang almost 11 years ago

File setattr-ceph-client.0.13000.log.bz2 added
File setattr-ceph-mds.a.log.bz2 added
File setattr-ceph-mds.b-s-a.log.bz2 added
File setattr-mds_requests added
File rename-ceph-client.0.12715.log.bz2 added
File rename-ceph-mds.a.log.bz2 added
File rename-ceph-mds.b-s-a.log.bz2 added
File rename-mds_requests added

Logs for two runs: one is stuck in replay from a setattr, the other from a rename.

Updated by Zheng Yan almost 11 years ago

Looks like a client bug: it may add cap releases to the replayed requests. (encode_cap_releases() should be called when creating the request, not when sending it.)
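
To illustrate the suspected ordering problem, here is a toy model in Python. It is not the Ceph client code: ToyClient, ToyRequest, demo and the cap names "cap-A" and "cap-B" are all hypothetical. It only shows why encoding cap releases at send time can make a replayed request differ from the original, while encoding them at creation time keeps the replay identical.

```python
from dataclasses import dataclass, field

@dataclass
class ToyRequest:
    tid: int
    op: str
    releases: list = field(default_factory=list)

class ToyClient:
    """Toy stand-in for a client; not the real Ceph Client class."""

    def __init__(self, encode_at_send):
        self.encode_at_send = encode_at_send
        self.pending_releases = []  # caps the client currently wants to drop

    def _encode_cap_releases(self, req):
        # Attach whatever releases are pending right now to the request.
        req.releases.extend(self.pending_releases)
        self.pending_releases.clear()

    def create_request(self, tid, op):
        req = ToyRequest(tid, op)
        if not self.encode_at_send:
            self._encode_cap_releases(req)  # proposed ordering: encode once
        return req

    def send(self, req):
        if self.encode_at_send:
            self._encode_cap_releases(req)  # suspected bug: runs on every send
        return (req.tid, req.op, tuple(req.releases))

def demo(encode_at_send):
    client = ToyClient(encode_at_send)
    client.pending_releases.append("cap-A")
    req = client.create_request(1219, "setattr")
    first = client.send(req)
    # Between the original send and the replay, another release queues up.
    client.pending_releases.append("cap-B")
    replay = client.send(req)
    return first, replay

buggy_first, buggy_replay = demo(encode_at_send=True)
fixed_first, fixed_replay = demo(encode_at_send=False)
print(buggy_first != buggy_replay, fixed_first == fixed_replay)  # True True
```

In the send-time variant the replayed setattr carries an extra release that was not in the original message; in the creation-time variant the two messages are identical.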
