Bug #63105

open

mds: report clients laggy due laggy OSDs only after checking any OSD is laggy

Added by Dhairya Parmar 8 months ago. Updated 7 months ago.

Status: Pending Backport
Priority: Normal
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags: backport_processed
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Currently, the code in mds/Beacon.cc that reports the health warning about clients that are laggy due to laggy OSDs is buggy, since it reports:

Health check failed: 1 client(s) laggy due to laggy OSDs (MDS_CLIENTS_LAGGY)
MDS health message cleared (mds.?): Client 17676 is laggy; not evicted because some OSD(s) is/are laggy
Health check cleared: MDS_CLIENTS_LAGGY (was: 1 client(s) laggy due to laggy OSDs)
Health check failed: 1 client(s) laggy due to laggy OSDs (MDS_CLIENTS_LAGGY)
MDS health message cleared (mds.?): Client 28931 is laggy; not evicted because some OSD(s) is/are laggy
Health check cleared: MDS_CLIENTS_LAGGY (was: 1 client(s) laggy due to laggy OSDs)

This happens because the current code in Beacon.cc only checks whether the laggy_clients set is non-empty, without first checking whether any OSD is actually laggy. This is erroneous and must be fixed so that:
1) If any OSD is laggy and there are laggy_clients with defer_client_eviction_on_laggy_osds set to true, report: Client X is laggy; not evicted because some OSD(s) is/are laggy
2) If any OSD is laggy and there are laggy_clients but defer_client_eviction_on_laggy_osds is unset, report: Client X is laggy because some OSD(s) is/are laggy

I.e., we will continue to report clients that are laggy due to laggy OSDs, but the message will not mention eviction ("not evicted") when the config option defer_client_eviction_on_laggy_osds is unset/false/off.
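The two cases above can be sketched as a small standalone function. This is only an illustration of the intended decision logic, not the actual code in mds/Beacon.cc: the helper name laggy_client_messages, its parameters, and the use of a plain std::set of client IDs are all assumptions made for the example.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of Ceph): builds the per-client health
// messages described in the two cases above.
std::vector<std::string> laggy_client_messages(
    bool any_osd_laggy,
    bool defer_client_eviction_on_laggy_osds,
    const std::set<std::int64_t>& laggy_clients)
{
  std::vector<std::string> messages;

  // Report nothing unless at least one OSD is actually laggy; a non-empty
  // laggy_clients set alone is not enough.
  if (!any_osd_laggy || laggy_clients.empty()) {
    return messages;
  }

  for (const auto client : laggy_clients) {
    std::string msg = "Client " + std::to_string(client) + " is laggy";
    if (defer_client_eviction_on_laggy_osds) {
      // Case 1: eviction is being deferred, so say so.
      msg += "; not evicted because some OSD(s) is/are laggy";
    } else {
      // Case 2: option unset/false, so do not mention eviction.
      msg += " because some OSD(s) is/are laggy";
    }
    messages.push_back(msg);
  }
  return messages;
}
```

With any_osd_laggy set to false the function returns no messages regardless of the contents of laggy_clients, which is the check the current code is missing.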


Related issues 3 (2 open, 1 closed)

Copied to CephFS - Backport #63269: pacific: mds: report clients laggy due laggy OSDs only after checking any OSD is laggy (Resolved, Dhairya Parmar)
Copied to CephFS - Backport #63270: quincy: mds: report clients laggy due laggy OSDs only after checking any OSD is laggy (In Progress, Dhairya Parmar)
Copied to CephFS - Backport #63271: reef: mds: report clients laggy due laggy OSDs only after checking any OSD is laggy (In Progress, Dhairya Parmar)