Bug #61947 (closed)

mds: enforce a limit on the size of a session in the sessionmap

Added by Patrick Donnelly 10 months ago. Updated about 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags: backport_processed
Backport: reef, quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID: 52944
Crash signature (v1):
Crash signature (v2):

Description

If the session's "completed_requests" vector grows too large, the session can reach a size at which the MDS goes read-only because the OSD rejects the sessionmap object update with "Message too long" (EMSGSIZE).

2023-07-10 13:53:30.529 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.529 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.530 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:30.534 7f8fed08b700  0 log_channel(cluster) log [WRN] : client.744507717 does not advance its oldest_client_tid (3221389957), 5905929 completed requests recorded in session
2023-07-10 13:53:35.635 7f8fe687e700 -1 mds.0.2679609 unhandled write error (90) Message too long, force readonly...
2023-07-10 13:53:35.635 7f8fe687e700  1 mds.0.cache force file system read-only
2023-07-10 13:53:35.635 7f8fe687e700  0 log_channel(cluster) log [WRN] : force file system read-only

If a session exceeds some configurable encoded size (maybe 16MB), then evict it.
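
For illustration, a minimal standalone sketch of that guard: estimate the session's encoded size before persisting the sessionmap and evict the client once it crosses a configurable threshold. The types, the size estimate, and the 16 MiB constant below are simplified stand-ins written for this ticket, not the real MDS data structures or whatever config option the eventual fix uses.

// Standalone illustration only, not Ceph code. "Session", encoded_size(),
// and kSessionSizeThreshold are simplified stand-ins.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

struct Session {
  uint64_t client_id = 0;
  // Completed request tids are retained until the client advances its
  // oldest_client_tid; a stuck client makes this grow without bound.
  std::vector<uint64_t> completed_requests;

  // Rough stand-in for the length of the buffer produced when the session
  // is encoded into the sessionmap object.
  std::size_t encoded_size() const {
    return sizeof(*this) + completed_requests.size() * sizeof(uint64_t);
  }
};

// Hypothetical threshold; the ticket suggests "maybe 16MB".
constexpr std::size_t kSessionSizeThreshold = 16ull * 1024 * 1024;

bool should_evict(const Session& s) {
  return s.encoded_size() > kSessionSizeThreshold;
}

int main() {
  Session s;
  s.client_id = 744507717;
  // Simulate the log above: ~5.9 million completed requests recorded
  // because the client never advanced its oldest_client_tid.
  s.completed_requests.resize(5905929);

  std::cout << "session encoded size ~" << s.encoded_size() << " bytes\n";
  if (should_evict(s)) {
    std::cout << "client." << s.client_id
              << " exceeds the session size threshold; evict it instead of "
                 "letting the sessionmap write fail with EMSGSIZE\n";
  }
  return 0;
}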


Subtasks 1 (1 open, 0 closed)

Bug #62257: mds: blocklist clients that are not advancing `oldest_client_tid` (New, assigned to Venky Shankar)


Related issues 4 (1 open, 3 closed)

Related to CephFS - Bug #63364: MDS_CLIENT_OLDEST_TID: 15 clients failing to advance oldest client/flush tid (Pending Backport, assigned to Xiubo Li)

Copied to CephFS - Backport #62583: reef: mds: enforce a limit on the size of a session in the sessionmap (Resolved, assigned to Venky Shankar)
Copied to CephFS - Backport #62584: pacific: mds: enforce a limit on the size of a session in the sessionmap (Resolved, assigned to Venky Shankar)
Copied to CephFS - Backport #62585: quincy: mds: enforce a limit on the size of a session in the sessionmap (Resolved, assigned to Venky Shankar)
#1

Updated by Venky Shankar 10 months ago

This one's interesting. I did mention in the standup yesterday that I've seen this earlier and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One thing suspected back then was SELinux relabeling on the PVC that somehow caused lots of unacknowledged ops to build up for ceph-mgr.

Patrick, was the client ceph-mgr in this case?

#2

Updated by Patrick Donnelly 10 months ago

Venky Shankar wrote:

> This one's interesting. I did mention in the standup yesterday that I've seen this earlier and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One thing suspected back then was SELinux relabeling on the PVC that somehow caused lots of unacknowledged ops to build up for ceph-mgr.

There is a genuine bug somewhere that needs to be tracked down, but the MDS shouldn't fail like this if a client is buggy.

> Patrick, was the client ceph-mgr in this case?

No, it was Ganesha.

#3

Updated by Venky Shankar 10 months ago

Patrick Donnelly wrote:

> Venky Shankar wrote:
>
> > This one's interesting. I did mention in the standup yesterday that I've seen this earlier and that cluster too had NFS Ganesha. However, I dug up the BZ and, surprisingly, the client was ceph-mgr, which was building up lots of completed_requests, and that resulted in journal I/O affecting performance. One thing suspected back then was SELinux relabeling on the PVC that somehow caused lots of unacknowledged ops to build up for ceph-mgr.
>
> There is a genuine bug somewhere that needs to be tracked down, but the MDS shouldn't fail like this if a client is buggy.

Yeah. For now, maybe blocklist the client if the completed_requests count shoots above a limit.

It could be a buggy client or a bug in the MDS; we've seen reports where the clients are ceph-mgr (libcephfs) and even kclient. I suspect a certain code path fails to mark the session as dirty (in LogSegment::touched_sessions).

EDIT: In the sense that it's not just delaying persisting the session map, but also accumulating it in memory.
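
A minimal standalone sketch of the count-based safeguard suggested above, for illustration only: warn while the backlog grows and blocklist the client once its completed_requests count crosses a cap. The struct, the two caps, and check_session() are hypothetical stand-ins, not actual MDS tunables or code.

// Standalone illustration only, not Ceph code; all names here are hypothetical.
#include <cstddef>
#include <cstdint>
#include <deque>
#include <iostream>

struct SessionState {
  uint64_t client_id = 0;
  uint64_t oldest_client_tid = 0;           // last value reported by the client
  std::deque<uint64_t> completed_requests;  // kept until the client advances
};

// Hypothetical caps; real values would come from the MDS config subsystem.
constexpr std::size_t kWarnCompletedRequests = 100000;    // log a cluster warning
constexpr std::size_t kEvictCompletedRequests = 1000000;  // blocklist the client

enum class Action { None, Warn, Blocklist };

Action check_session(const SessionState& s) {
  if (s.completed_requests.size() >= kEvictCompletedRequests)
    return Action::Blocklist;
  if (s.completed_requests.size() >= kWarnCompletedRequests)
    return Action::Warn;
  return Action::None;
}

int main() {
  SessionState s;
  s.client_id = 744507717;
  s.oldest_client_tid = 3221389957ULL;
  s.completed_requests.resize(5905929);  // the count seen in the log above

  switch (check_session(s)) {
  case Action::Blocklist:
    std::cout << "client." << s.client_id
              << " is not advancing oldest_client_tid; blocklisting it\n";
    break;
  case Action::Warn:
    std::cout << "client." << s.client_id
              << " does not advance its oldest_client_tid; warning only\n";
    break;
  case Action::None:
    break;
  }
  return 0;
}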

#4

Updated by Leonid Usov 9 months ago

  • Assignee set to Leonid Usov
#5

Updated by Venky Shankar 9 months ago

Leonid, I forgot to update the tracker assignee after our sync. I've already implemented 50% of the work.

#6

Updated by Venky Shankar 9 months ago

  • Assignee changed from Leonid Usov to Venky Shankar
#7

Updated by Leonid Usov 9 months ago

Sure, no problem.

#8

Updated by Venky Shankar 9 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 52944
#9

Updated by Venky Shankar 8 months ago

  • Status changed from Fix Under Review to Pending Backport
#10

Updated by Backport Bot 8 months ago

  • Copied to Backport #62583: reef: mds: enforce a limit on the size of a session in the sessionmap added
#11

Updated by Backport Bot 8 months ago

  • Copied to Backport #62584: pacific: mds: enforce a limit on the size of a session in the sessionmap added
#12

Updated by Backport Bot 8 months ago

  • Copied to Backport #62585: quincy: mds: enforce a limit on the size of a session in the sessionmap added
#13

Updated by Backport Bot 8 months ago

  • Tags set to backport_processed
#14

Updated by Xiubo Li 6 months ago

  • Related to Bug #63364: MDS_CLIENT_OLDEST_TID: 15 clients failing to advance oldest client/flush tid added
#15

Updated by Konstantin Shalygin 5 months ago

  • Status changed from Pending Backport to Resolved
#16

Updated by Niklas Hambuechen about 2 months ago

Potentially related issue:
