Bug #61947

mds: enforce a limit on the size of a session in the sessionmap

Added by Patrick Donnelly 8 months ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee: Venky Shankar
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags: backport_processed
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID: 52944
Crash signature (v1):
Crash signature (v2):

Description

If the session's "completed_requests" vector grows too large, the session can reach a size where the MDS goes read-only because the OSD rejects sessionmap object updates with "Message too long" (EMSGSIZE).

2023-07-10 13:53:30.529 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.529 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.530 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:35.635 7f8fe687e700 -1 mds.0.2679609 unhandled write error (90) Message too long, force readonly...
2023-07-10 13:53:35.635 7f8fe687e700  1 mds.0.cache force file system read-only
2023-07-10 13:53:35.635 7f8fe687e700  0 log_channel(cluster) log [WRN] : force file system read-only

If a session's encoded size exceeds some configurable limit (perhaps 16MB), evict the session.
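
A minimal standalone sketch of the proposed guard, not actual Ceph code: the type and function names, the size estimate, and the 16MB constant below are all hypothetical stand-ins for the real session encoder and a real mds config option.

#include <cstdint>
#include <cstdio>
#include <vector>

struct SessionSketch {
  uint64_t client_id = 0;
  // Completed requests retained because the client has not advanced its
  // oldest_client_tid; this is what grows without bound in this bug.
  std::vector<uint64_t> completed_requests;

  // Rough stand-in for the encoded size the MDS would write into the
  // sessionmap object (each retained entry costs bytes on the wire).
  size_t approx_encoded_size() const {
    return sizeof(*this) + completed_requests.size() * sizeof(uint64_t);
  }
};

// Hypothetical knob mirroring the 16MB limit suggested above.
constexpr size_t kMaxSessionEncodedSize = 16 * 1024 * 1024;

bool should_evict(const SessionSketch& s) {
  return s.approx_encoded_size() > kMaxSessionEncodedSize;
}

int main() {
  SessionSketch s;
  s.client_id = 744507717;
  // Simulate the client from the logs above: ~5.9M retained requests.
  s.completed_requests.resize(5905929);
  if (should_evict(s)) {
    std::printf("client.%llu session ~%zu bytes encoded, evicting\n",
                (unsigned long long)s.client_id, s.approx_encoded_size());
  }
  return 0;
}

The real check would compare the session's actual encoded size against the configured limit before writing the sessionmap object, and evict (or blocklist) the offending client instead of letting the OSD write fail.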


Subtasks

Bug #62257: mds: blocklist clients that are not advancing `oldest_client_tid` New (assigned to Venky Shankar)


Related issues

Related to CephFS - Bug #63364: MDS_CLIENT_OLDEST_TID: 15 clients failing to advance oldest client/flush tid Pending Backport
Copied to CephFS - Backport #62583: reef: mds: enforce a limit on the size of a session in the sessionmap Resolved
Copied to CephFS - Backport #62584: pacific: mds: enforce a limit on the size of a session in the sessionmap Resolved
Copied to CephFS - Backport #62585: quincy: mds: enforce a limit on the size of a session in the sessionmap Resolved

History

#1 Updated by Venky Shankar 8 months ago

This one's interesting. I mentioned in the standup yesterday that I've seen this before, and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One suspicion back then was that SELinux relabeling on the PVC somehow caused lots of unacknowledged ops to build up for ceph-mgr.

Patrick, was the client ceph-mgr in this case?

#2 Updated by Patrick Donnelly 7 months ago

Venky Shankar wrote:

This one's interesting. I mentioned in the standup yesterday that I've seen this before, and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One suspicion back then was that SELinux relabeling on the PVC somehow caused lots of unacknowledged ops to build up for ceph-mgr.

There is a genuine bug somewhere that needs to be tracked down, but the MDS shouldn't fail like this if a client is buggy.

Patrick, was the client ceph-mgr in this case?

No, it was Ganesha.

#3 Updated by Venky Shankar 7 months ago

Patrick Donnelly wrote:

Venky Shankar wrote:

This one's interesting. I mentioned in the standup yesterday that I've seen this before, and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One suspicion back then was that SELinux relabeling on the PVC somehow caused lots of unacknowledged ops to build up for ceph-mgr.

There is a genuine bug somewhere that needs to be tracked down, but the MDS shouldn't fail like this if a client is buggy.

Yeah. For now, maybe blocklist the client if the completed_requests count shoots above a limit.

It could be a buggy client or a bug in the MDS; we've seen reports where the client is ceph-mgr (libcephfs) and even kclient. I suspect a certain code path is missing marking the session as dirty (in LogSegment::touched_sessions).

EDIT: In the sense that it's not just delaying persisting of the session map, but also accumulating it in memory.
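
To illustrate why a stuck oldest_client_tid makes the session grow without bound, here is a minimal standalone sketch of the trim behavior implied by the discussion (hypothetical names and structure, not the actual MDS code): completed requests with tids below the client-reported oldest_client_tid can be dropped, so a client that never advances that tid never lets anything be trimmed.

#include <cstdint>
#include <cstdio>
#include <map>

struct TrimSketch {
  // tid -> retained completion record (hypothetical; the real session
  // keeps more state per completed request).
  std::map<uint64_t, bool> completed_requests;

  // Drop every completed request the client will never retry,
  // i.e. everything with tid < oldest_client_tid.
  void trim_completed_requests(uint64_t oldest_client_tid) {
    completed_requests.erase(
        completed_requests.begin(),
        completed_requests.lower_bound(oldest_client_tid));
  }
};

int main() {
  TrimSketch s;
  for (uint64_t tid = 1; tid <= 10; ++tid)
    s.completed_requests[tid] = true;

  s.trim_completed_requests(8);  // client advanced its oldest tid to 8
  std::printf("retained after trim: %zu\n", s.completed_requests.size());

  s.trim_completed_requests(8);  // tid never advances: nothing trimmed
  std::printf("retained (stuck tid): %zu\n", s.completed_requests.size());
  return 0;
}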

#4 Updated by Leonid Usov 7 months ago

  • Assignee set to Leonid Usov

#5 Updated by Venky Shankar 7 months ago

Leonid, I forgot to update the tracker assignee after our sync. I've already implemented about 50% of the work.

#6 Updated by Venky Shankar 7 months ago

  • Assignee changed from Leonid Usov to Venky Shankar

#7 Updated by Leonid Usov 7 months ago

Sure, no problem.

#8 Updated by Venky Shankar 7 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 52944

#9 Updated by Venky Shankar 6 months ago

  • Status changed from Fix Under Review to Pending Backport

#10 Updated by Backport Bot 6 months ago

  • Copied to Backport #62583: reef: mds: enforce a limit on the size of a session in the sessionmap added

#11 Updated by Backport Bot 6 months ago

  • Copied to Backport #62584: pacific: mds: enforce a limit on the size of a session in the sessionmap added

#12 Updated by Backport Bot 6 months ago

  • Copied to Backport #62585: quincy: mds: enforce a limit on the size of a session in the sessionmap added

#13 Updated by Backport Bot 6 months ago

  • Tags set to backport_processed

#14 Updated by Xiubo Li 4 months ago

  • Related to Bug #63364: MDS_CLIENT_OLDEST_TID: 15 clients failing to advance oldest client/flush tid added

#15 Updated by Konstantin Shalygin 3 months ago

  • Status changed from Pending Backport to Resolved
