Bug #18641

closed

mds: stalled clients apparently due to stale sessions

Added by Patrick Donnelly over 7 years ago. Updated about 5 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Regression: No
Severity: 3 - minor
Labels (FS): multimds

Description

4 of 16 clients building the kernel against a cluster with 2 active MDS daemons blocked on I/O. After digging into the ceph-fuse log, I found that the client attempts to renew its caps but gets no reply from the MDS:

2017-01-23 20:12:16.474388 7f81edbf3700 10 client.4241 renew_caps mds.0
2017-01-23 20:12:16.474396 7f81edbf3700  1 -- 192.168.171.154:0/1201971437 --> 192.168.220.2:6800/3808684069 -- client_session(request_renewcaps seq 4890) v2 -- 0x55cdc1980fc0 con 0

The MDS log repeatedly shows the following (taken from a later renewcaps message, hence the different timestamp and seq):

2017-01-23 20:51:56.825544 7f0f80d13700  3 mds.0.server handle_client_session client_session(request_renewcaps seq 5009) v1 from client.4241
2017-01-23 20:51:56.825548 7f0f80d13700 10 mds.0.server ignoring renewcaps on non open|stale session (closed)
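
To gauge how often this happens and for which clients, something like the following could tally the ignored renewals in an MDS log. This is only a rough sketch: the regular expressions are derived from the two MDS log lines quoted above, the function name `count_ignored_renewcaps` is made up for illustration, and pairing the two lines by the thread ID in the third column is an assumption about how the log is laid out, not something confirmed by the MDS code.

```python
import re
from collections import Counter

# Patterns modelled on the two MDS log lines quoted in this report (assumption:
# other Ceph versions may word these messages differently).
HANDLE_RE = re.compile(
    r"handle_client_session client_session\(request_renewcaps seq (\d+)\) "
    r"v\d+ from (client\.\d+)"
)
IGNORE_RE = re.compile(r"ignoring renewcaps on non open\|stale session \((\w+)\)")


def count_ignored_renewcaps(lines):
    """Count ignored renewcaps requests per (client, session state).

    An "ignoring renewcaps" line is attributed to the client named in the
    most recent handle_client_session line logged by the same thread.
    """
    counts = Counter()
    pending = {}  # thread id -> client from the last handle_client_session line
    for line in lines:
        # Expected layout: date, time, thread id, then the rest of the message.
        fields = line.split(None, 3)
        if len(fields) < 4:
            continue
        thread = fields[2]
        handled = HANDLE_RE.search(line)
        if handled:
            pending[thread] = handled.group(2)
            continue
        ignored = IGNORE_RE.search(line)
        if ignored and thread in pending:
            counts[(pending.pop(thread), ignored.group(1))] += 1
    return counts
```

The function accepts any iterable of text lines, so a compressed log can be passed in as the file object returned by `gzip.open(path, "rt")`. A large count for `("client.4241", "closed")` would match the behaviour described here.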

Unfortunately, I do not have the MDS logs from the time when the client last talked to mds.0. I do have the entire client log.

Last bit of the MDS log: /ceph/cephfs-perf/drop/ceph-mds-0.log.gz
Client log: /ceph/cephfs-perf/drop/ceph-client-4241.log.gz

I have other logs (e.g. mds.1) too if they're needed.

Actions #1

Updated by John Spray almost 7 years ago

  • Status changed from New to Can't reproduce
Actions #2

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (90)
  • Labels (FS) multimds added