Bug #48221 (open): ceph_osd_numpg reported from ceph-mgr different from ceph osd df tree

Added by Norman Shen over 3 years ago. Updated over 3 years ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are using Ceph 12.2.12 and saw that ceph_osd_numpg differs
from the value reported by ceph osd df tree.

root@stor-mgt01:~# ceph osd df tree | grep osd.79
79 hdd 1.63739 1.00000 1.64TiB 16.9GiB 1.62TiB 1.01 1.60 189 osd.79

root@stor-mgt01:~# curl 10.200.112.68:9283/metrics -s | grep numpg | grep osd.79
ceph_osd_numpg{ceph_daemon="osd.79"} 204.0
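For reference, a minimal sketch (not part of the original report) that runs the same comparison for every OSD, assuming the mgr prometheus endpoint address used above and the column layout shown in the df tree output, where PGS is the field just before the trailing osd.N name:

#!/bin/sh
# Hypothetical comparison loop: print every OSD whose mgr-reported PG
# count disagrees with the PGS column of `ceph osd df tree`.
metrics=$(curl -s 10.200.112.68:9283/metrics)
for osd in $(ceph osd ls); do
    # PGS is the field just before the trailing "osd.N" name.
    df_pgs=$(ceph osd df tree | awk -v d="osd.$osd" '$NF == d {print $(NF-1)}')
    # Metric lines look like: ceph_osd_numpg{ceph_daemon="osd.79"} 204.0
    mgr_pgs=$(printf '%s\n' "$metrics" | \
        grep "^ceph_osd_numpg{ceph_daemon=\"osd.$osd\"}" | awk '{print int($2)}')
    [ "$df_pgs" != "$mgr_pgs" ] && echo "osd.$osd: df_tree=$df_pgs mgr=$mgr_pgs"
done

Any OSD printed by this loop shows the same disagreement as osd.79 above.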

Actions #1

Updated by Neha Ojha over 3 years ago

  • Status changed from New to Need More Info

How consistent is this behavior? Do the numbers eventually match up? Also, was the cluster active+clean when this was seen?

Actions #2

Updated by Norman Shen over 3 years ago

Neha Ojha wrote:

How consistent is this behavior? Do the numbers eventually match up? Also, was the cluster active+clean when this was seen?

Yes, the cluster was HEALTH_OK when this happened. I tried to fix it by restarting ceph-mgr, but that did not work. It was eventually resolved when I restarted the related ceph-osd pod.

Actions #3

Updated by Neha Ojha over 3 years ago

Can you reproduce this on a newer version of Ceph? I see this is on luminous.
One possible explanation for ceph_osd_numpg being higher is that we aren't deleting strays for PGs correctly and therefore see a higher count. We'll try to add a test to verify this behavior.
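As background for the stray theory: the prometheus module exports ceph_osd_numpg from the OSD's numpg perf counter, which also counts stray PGs, while the PGS column of ceph osd df tree is computed from the PG map. A quick way to check this, assuming admin-socket access on the host running osd.79 (the stor-node prompt is hypothetical):

root@stor-node:~# ceph daemon osd.79 perf dump | grep -E '"numpg(_primary|_replica|_stray)?":'

If numpg_stray is nonzero here (15 would exactly account for the 204 vs. 189 gap above), stray PGs that were never deleted would explain the higher metric.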
