Bug #46440

closed

mgr: don't update osd stat which is already out

Added by Zhi Zhang almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: ceph-mgr
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport: octopus, nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When our OSDs hang and have slow requests, we try to diagnose the hang and also get the cluster back to normal by setting the noup flag and then marking the OSD down and out. The cluster nevertheless keeps reporting slow requests on this OSD, which is a false alarm.

We found that when the original PG monitor handled a PGSTATS message, it would not update the OSD stat if the OSD was not in the OSD map, but the current MGR has no such check.
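
The missing check can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual ceph-mgr code: `ToyOSDMap`, `ToyStatsStore`, `handle_pg_stats`, and the `slow_requests` field are hypothetical names invented for illustration. It shows a stats report being ignored when the reporting OSD is not marked in.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOSDMap:
    """Hypothetical stand-in for the OSD map: holds the ids of OSDs marked 'in'."""
    in_osds: set = field(default_factory=set)

    def is_in(self, osd_id: int) -> bool:
        return osd_id in self.in_osds


@dataclass
class ToyStatsStore:
    """Hypothetical stand-in for the per-OSD stats kept by the manager."""
    osd_stat: dict = field(default_factory=dict)

    def handle_pg_stats(self, osdmap: ToyOSDMap, osd_id: int, stat: dict) -> bool:
        # Guard described in this report: skip the update when the
        # reporting OSD is not 'in' according to the OSD map.
        if not osdmap.is_in(osd_id):
            return False
        self.osd_stat[osd_id] = stat
        return True


# Usage: OSD 1 is in, OSD 2 has been marked out.
osdmap = ToyOSDMap(in_osds={1})
store = ToyStatsStore()
store.handle_pg_stats(osdmap, 1, {"slow_requests": 0})
store.handle_pg_stats(osdmap, 2, {"slow_requests": 5})  # ignored: OSD 2 is out
```

With such a guard, a hung OSD that has already been marked out can no longer feed slow-request counts into the cluster-wide stats, which is the false alarm described above.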


Related issues 3 (0 open, 3 closed)

Related to RADOS - Bug #48385: nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored (Resolved, Igor Fedotov)
Copied to mgr - Backport #48400: nautilus: mgr: don't update osd stat which is already out (Resolved, Igor Fedotov)
Copied to mgr - Backport #48401: octopus: mgr: don't update osd stat which is already out (Resolved, Igor Fedotov)
