Bug #48221
ceph_osd_numpg reported from ceph-mgr different from ceph osd df tree
Description
We are running Ceph 12.2.12 and see that ceph_osd_numpg, as exported by ceph-mgr, differs from
the PG count shown by ceph osd df tree (204 vs. 189 for osd.79 below).
root@stor-mgt01:~# ceph osd df tree | grep osd.79
79 hdd 1.63739 1.00000 1.64TiB 16.9GiB 1.62TiB 1.01 1.60 189 osd.79
root@stor-mgt01:~# curl 10.200.112.68:9283/metrics -s | grep numpg | grep osd.79
ceph_osd_numpg{ceph_daemon="osd.79"} 204.0
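To check whether other OSDs show the same discrepancy, the two outputs above could be compared with a small script. The following is only a rough sketch: the function names are made up for illustration, it assumes the plain-text ceph osd df tree layout matches the sample line (PG count in the second-to-last column, OSD name last), and it assumes the metric line carries only the ceph_daemon label as shown. Other releases may format either output differently.

```python
import re

# Matches one exporter sample such as: ceph_osd_numpg{ceph_daemon="osd.79"} 204.0
# Assumption: ceph_daemon is the only label on the metric, as in the ticket.
NUMPG_RE = re.compile(r'^ceph_osd_numpg\{ceph_daemon="(osd\.\d+)"\}\s+([0-9.eE+-]+)$')


def parse_numpg_metrics(text):
    """Return {osd_name: numpg} from the text of the mgr /metrics endpoint."""
    result = {}
    for line in text.splitlines():
        match = NUMPG_RE.match(line.strip())
        if match:
            result[match.group(1)] = int(float(match.group(2)))
    return result


def parse_osd_df_tree(text):
    """Return {osd_name: pgs} from plain `ceph osd df tree` output.

    Assumption: on OSD rows the last column is the name (osd.N) and the
    column before it is the PG count, as in the sample line in this ticket.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[-1].startswith("osd.") and fields[-2].isdigit():
            result[fields[-1]] = int(fields[-2])
    return result


def numpg_mismatches(df_text, metrics_text):
    """Return {osd_name: (pgs_from_df, numpg_from_mgr)} where the two differ."""
    df = parse_osd_df_tree(df_text)
    metrics = parse_numpg_metrics(metrics_text)
    return {
        osd: (df[osd], metrics[osd])
        for osd in sorted(df.keys() & metrics.keys())
        if df[osd] != metrics[osd]
    }


# Sample lines copied from the description above.
DF_SAMPLE = "79 hdd 1.63739 1.00000 1.64TiB 16.9GiB 1.62TiB 1.01 1.60 189 osd.79"
METRICS_SAMPLE = 'ceph_osd_numpg{ceph_daemon="osd.79"} 204.0'

print(numpg_mismatches(DF_SAMPLE, METRICS_SAMPLE))  # {'osd.79': (189, 204)}
```

In practice the two text inputs would come from the full ceph osd df tree output and the full /metrics response rather than from single grepped lines.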
History
#1 Updated by Neha Ojha about 3 years ago
- Status changed from New to Need More Info
How consistent is this behavior? Do the numbers eventually match up? Also, was the cluster active+clean when this was seen?
#2 Updated by Norman Shen about 3 years ago
Neha Ojha wrote:
How consistent is this behavior? Do the numbers eventually match up? Also, was the cluster active+clean when this was seen?
Yes, the cluster was HEALTH_OK when this happened. I tried to fix it by restarting ceph-mgr, but that did not help. It was eventually resolved when I restarted the related ceph-osd pod.
#3 Updated by Neha Ojha about 3 years ago
Can you reproduce this on a newer version of Ceph? I see this was reported on Luminous.
One possible explanation for ceph_osd_numpg to be higher is that we aren't deleting strays for PGs correctly and therefore see a higher count. We'll try to add a test to verify this behavior.