Support #22520
nearfull threshold is not cleared when OSD really is not nearfull (Closed)
Description
Today one of my OSDs reached the nearfull ratio (mon_osd_nearfull_ratio: '.85'). I increased mon_osd_nearfull_ratio to '0.9'.
I rebalanced data by increasing the weights of other OSDs in this root. While I was looking for the right weights, some other OSDs also reached nearfull. But in the end all of these OSDs should clear the nearfull flag, because their %USE (87.92 at most) is below mon_osd_nearfull_ratio (0.90). The OSDs in this root are used by a pool with size 2, min_size 1 (idontcareaboutmydata).
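The exact weights I ended up with are visible in the tree below; the command form I used to change them was roughly this (osd id and weight here are only an example, not the exact values I used):
[root@ceph-mon0 ceph]# ceph osd crush reweight osd.26 1.1
Current tree: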
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-12 6.29997 - 5716G 4719G 997G 82.56 8.71 - root solid
-14 6.29997 - 5716G 4719G 997G 82.56 8.71 - datacenter xxx_solid
-15 2.09999 - 1905G 1455G 450G 76.37 8.06 - rack rack2-solid
-13 1.00000 - 952G 686G 266G 72.03 7.60 - host ceph-osd0-solid
24 nvme 1.00000 1.00000 952G 686G 266G 72.03 7.60 74 osd.24
-19 1.09999 - 952G 768G 183G 80.70 8.51 - host ceph-osd2-solid
26 nvme 1.09999 1.00000 952G 768G 183G 80.70 8.51 83 osd.26
-16 2.09999 - 1905G 1590G 314G 83.49 8.81 - rack rack3-solid
-20 1.09999 - 952G 775G 177G 81.40 8.59 - host ceph-osd3-solid
30 nvme 1.09999 1.00000 952G 775G 177G 81.40 8.59 84 osd.30
-22 1.00000 - 952G 815G 137G 85.58 9.03 - host ceph-osd5-solid
29 nvme 1.00000 1.00000 952G 815G 137G 85.58 9.03 89 osd.29
-17 2.09999 - 1905G 1673G 232G 87.82 9.27 - rack rack4-solid
-18 1.09999 - 952G 835G 117G 87.72 9.25 - host ceph-osd1-solid
25 nvme 1.09999 1.00000 952G 835G 117G 87.72 9.25 91 osd.25
-21 1.00000 - 952G 837G 115G 87.92 9.28 - host ceph-osd4-solid
28 nvme 1.00000 1.00000 952G 837G 115G 87.92 9.28 91 osd.28
HEALTH:
[root@ceph-mon0 ceph]# ceph health detail
HEALTH_WARN 3 nearfull osd(s); 1 pool(s) nearfull
OSD_NEARFULL 3 nearfull osd(s)
osd.25 is near full
osd.28 is near full
osd.29 is near full
POOL_NEARFULL 1 pool(s) nearfull
pool 'solid_rbd' is nearfull
OSD DF:
[root@ceph-mon0 ceph]# ceph osd df | grep nvme | grep -E '(25|28|29)'
29 nvme 1.00000 1.00000 952G 815G 137G 85.58 9.03 89
25 nvme 1.09999 1.00000 952G 835G 117G 87.72 9.25 91
28 nvme 1.00000 1.00000 952G 837G 115G 87.92 9.28 91
MON settings:
[root@ceph-mon0 ceph]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon0.asok config show | grep nearfull
"mon_osd_nearfull_ratio": "0.900000",
OSD settings:
[root@ceph-osd4 ceph]# ceph daemon osd.28 config get mon_osd_nearfull_ratio
{
"mon_osd_nearfull_ratio": "0.900000"
}
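As far as I understand, the admin socket above only shows the configuration value. If on Luminous the effective thresholds are kept in the OSDMap (this is only my assumption, I have not verified it), they should be visible with something like:
[root@ceph-mon0 ceph]# ceph osd dump | grep -i ratio
which, if I am right, would print full_ratio, backfillfull_ratio and nearfull_ratio as the cluster actually applies them.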
When I found out that 'ceph tell' was not working, I deployed ceph.conf with the new settings:
[root@ceph-osd4 ceph]# grep full ceph.conf
mon_osd_full_ratio = .91
mon_osd_nearfull_ratio = .90
[root@ceph-mon0 ceph]# grep full ceph.conf
mon_osd_full_ratio = .91
mon_osd_nearfull_ratio = .90
And restarted these OSDs, but it did not help.
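If the thresholds really come from the OSDMap and not from the config (again, only my assumption), maybe they have to be changed at runtime with something like:
[root@ceph-mon0 ceph]# ceph osd set-nearfull-ratio 0.90
[root@ceph-mon0 ceph]# ceph osd set-full-ratio 0.91
I have not verified that these commands are the intended way to change the thresholds.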
The (possibly) same behavior was reported on the ceph-users ML: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-December/023397.html
Is this a bug, or do I need to do some magic?