Bug #64563

open

mds: enhance laggy clients detections due to laggy OSDs

Added by Dhairya Parmar 3 months ago. Updated 24 days ago.

Status: Triaged
Priority: Urgent
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport: quincy,reef,squid
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Right now the code assumes that if any OSD is laggy and a client becomes laggy, the OSD must be the cause. The code is not able to differentiate between "this client isn't responding" and "this client is slow to release caps". This means that if the client went off the grid AND we have any laggy OSD, the MDS will not evict that client and will instead mark it as "laggy due to laggy OSDs", which is completely wrong. There are five instances where clients are added to the set `laggy_clients` in `Server.cc`. These need to be re-evaluated to make sure that we do consider client eviction in cases like [0], where the last cap renew span is more than the session autoclose duration (i.e. 300 seconds in the default config), which is long enough to conclude that we have lost the client.
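To make the proposed rule concrete, here is a minimal sketch of the decision for a single eviction candidate. The function name and its parameters are hypothetical and are not taken from `Server.cc`. It only illustrates the idea that a cap renew span beyond the autoclose window should override the "laggy OSDs" exemption.

```cpp
// Hypothetical helper, not part of the Ceph MDS code base.
// Returns true when an eviction candidate should be kept and recorded as
// "laggy due to laggy OSDs" instead of being evicted.
bool defer_eviction_for_laggy_osds(double last_cap_renew_span_sec,
                                   double session_autoclose_sec,
                                   bool any_osd_laggy) {
  // A client that has been silent for longer than the autoclose window is
  // treated as lost, whatever the state of the OSDs.
  if (last_cap_renew_span_sec > session_autoclose_sec) {
    return false;
  }
  // Otherwise keep the existing behaviour: laggy OSDs may explain the delay.
  return any_osd_laggy;
}
```

With the default autoclose of 300 seconds, a client whose last cap renewal was 900 seconds ago would be evicted even while OSDs are laggy, whereas a client at 45 seconds would still be deferred.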

In addition, there need to be sane, practical thresholds for deciding when an OSD counts as laggy, i.e. "laggy enough to make a client (or anything) laggy". The current implementation is too naive: it just checks whether any laggy param (`osd_xinfo_t.laggy_interval` or `osd_xinfo_t.laggy_probability`) is non-zero. This makes the MDS not evict clients even when the OSD lagginess is slight and might not be serious at all. In other words, we need to make the helper `OSDMap::any_osd_laggy` smarter and more fine-grained.
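One possible shape for a finer-grained check is sketched below. `OsdLagInfo` is a simplified stand-in that carries only the two fields named above, and `LaggyThresholds` and `any_osd_laggy_enough` are hypothetical names. The threshold values would need to be chosen from real cluster behaviour and are not proposed here.

```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for the two laggy fields of osd_xinfo_t (assumption:
// the real struct has more members and a different layout).
struct OsdLagInfo {
  float laggy_probability;      // estimated probability that the OSD is laggy
  std::uint32_t laggy_interval; // estimated laggy interval, in seconds
};

// Hypothetical tunables; these are not existing Ceph config options.
struct LaggyThresholds {
  float min_probability;
  std::uint32_t min_interval_sec;
};

// Hypothetical replacement for a plain "is any laggy param non-zero" test:
// an OSD counts as laggy only once a parameter reaches its threshold.
bool any_osd_laggy_enough(const std::vector<OsdLagInfo>& osds,
                          const LaggyThresholds& t) {
  for (const auto& osd : osds) {
    if (osd.laggy_probability >= t.min_probability ||
        osd.laggy_interval >= t.min_interval_sec) {
      return true;
    }
  }
  return false;
}
```

The sketch keeps the "either parameter" form of the current check and only raises the bar from non-zero to a threshold. Requiring both parameters to exceed their thresholds would be a stricter alternative.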

[0] https://github.com/ceph/ceph/blob/main/src/mds/Server.cc#L1184-L1190


Related issues 1 (1 open, 0 closed)

Related to CephFS - Fix #58023: mds: do not evict clients if OSDs are laggy (Pending Backport, Dhairya Parmar)
