Support #22520
Closed
nearfull threshold is not cleared when osd really is not nearfull.
Description
Today one of my OSDs reached the nearfull ratio (mon_osd_nearfull_ratio: '.85'), so I increased mon_osd_nearfull_ratio to '0.9'.
I rebalanced data by increasing the weights of the other OSDs in this root. While I was looking for the right balance, some other OSDs reached nearfull as well. In the end, though, all of these OSDs should have cleared the nearfull flag, because their USED space % is lower than mon_osd_nearfull_ratio. The OSDs in this root are used by a pool with size 2, min_size 1 (idontcareaboutmydata).
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME
-12 6.29997 - 5716G 4719G 997G 82.56 8.71 - root solid
-14 6.29997 - 5716G 4719G 997G 82.56 8.71 - datacenter xxx_solid
-15 2.09999 - 1905G 1455G 450G 76.37 8.06 - rack rack2-solid
-13 1.00000 - 952G 686G 266G 72.03 7.60 - host ceph-osd0-solid
24 nvme 1.00000 1.00000 952G 686G 266G 72.03 7.60 74 osd.24
-19 1.09999 - 952G 768G 183G 80.70 8.51 - host ceph-osd2-solid
26 nvme 1.09999 1.00000 952G 768G 183G 80.70 8.51 83 osd.26
-16 2.09999 - 1905G 1590G 314G 83.49 8.81 - rack rack3-solid
-20 1.09999 - 952G 775G 177G 81.40 8.59 - host ceph-osd3-solid
30 nvme 1.09999 1.00000 952G 775G 177G 81.40 8.59 84 osd.30
-22 1.00000 - 952G 815G 137G 85.58 9.03 - host ceph-osd5-solid
29 nvme 1.00000 1.00000 952G 815G 137G 85.58 9.03 89 osd.29
-17 2.09999 - 1905G 1673G 232G 87.82 9.27 - rack rack4-solid
-18 1.09999 - 952G 835G 117G 87.72 9.25 - host ceph-osd1-solid
25 nvme 1.09999 1.00000 952G 835G 117G 87.72 9.25 91 osd.25
-21 1.00000 - 952G 837G 115G 87.92 9.28 - host ceph-osd4-solid
28 nvme 1.00000 1.00000 952G 837G 115G 87.92 9.28 91 osd.28
HEALTH:
[root@ceph-mon0 ceph]# ceph health detail
HEALTH_WARN 3 nearfull osd(s); 1 pool(s) nearfull
OSD_NEARFULL 3 nearfull osd(s)
osd.25 is near full
osd.28 is near full
osd.29 is near full
POOL_NEARFULL 1 pool(s) nearfull
pool 'solid_rbd' is nearfull
OSD DF:
[root@ceph-mon0 ceph]# ceph osd df | grep nvme | grep -E '(25|28|29)'
29 nvme 1.00000 1.00000 952G 815G 137G 85.58 9.03 89
25 nvme 1.09999 1.00000 952G 835G 117G 87.72 9.25 91
28 nvme 1.00000 1.00000 952G 837G 115G 87.92 9.28 91
MONs settings:
[root@ceph-mon0 ceph]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon0.asok config show | grep nearfull
"mon_osd_nearfull_ratio": "0.900000",
OSDs settings:
[root@ceph-osd4 ceph]# ceph daemon osd.28 config get mon_osd_nearfull_ratio
{
"mon_osd_nearfull_ratio": "0.900000"
}
When I found out that 'ceph tell' was not working, I deployed ceph.conf with the new settings:
[root@ceph-osd4 ceph]# grep full ceph.conf
mon_osd_full_ratio = .91
mon_osd_nearfull_ratio = .90
[root@ceph-mon0 ceph]# grep full ceph.conf
mon_osd_full_ratio = .91
mon_osd_nearfull_ratio = .90
Then I restarted these OSDs, which did not help.
Possibly the same behavior is described on the ceph-users ML: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-December/023397.html
Is this a bug, or do I need to do some magic?
Updated by Konstantin Shalygin over 6 years ago
When I deleted some data from these OSDs, the nearfull flag was cleared as well.
2017-12-21 18:29:15.653156 [INF] Cluster is now healthy
2017-12-21 18:29:15.653145 [INF] Health check cleared: POOL_NEARFULL (was: 1 pool(s) nearfull)
2017-12-21 18:29:15.653125 [INF] Health check cleared: OSD_NEARFULL (was: 1 nearfull osd(s))
2017-12-21 18:29:11.649585 [WRN] Health check update: 1 nearfull osd(s) (OSD_NEARFULL)
2017-12-21 18:28:52.239743 [WRN] Health check update: 2 nearfull osd(s) (OSD_NEARFULL)
29 nvme 1.00000 1.00000 952G 779G 172G 81.85 8.72 89
25 nvme 1.09999 1.00000 952G 799G 153G 83.90 8.93 91
28 nvme 1.00000 1.00000 952G 801G 151G 84.12 8.96 91
This shows that the OSD nearfull flag cannot be cleared by setting a higher threshold. This could be a big problem if the threshold is accidentally set several times lower than necessary (e.g. 0.2 instead of 0.8).
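The %USE figures quoted in this ticket are consistent with a 0.85 threshold still being in effect. The arithmetic can be sketched in bash as below. The helper name `nearfull_osds` is hypothetical, and treating an OSD as nearfull when its utilisation is strictly greater than the ratio is an assumption made for illustration, not a copy of Ceph's own check.

```shell
# %USE values (id:percent) copied from the two `ceph osd df` outputs in this ticket.
before="29:85.58 25:87.72 28:87.92"
after="29:81.85 25:83.90 28:84.12"

# Hypothetical helper: print the OSDs whose utilisation exceeds the given ratio.
# Assumption: "nearfull" means used/size > ratio.
nearfull_osds() {
    ratio="$1"; shift
    printf '%s\n' "$@" |
        awk -F: -v r="$ratio" '
            $2 / 100 > r { printf "%sosd.%s", sep, $1; sep = " " }
            END          { print "" }'
}

nearfull_osds 0.85 $before   # all three OSDs are above 0.85
nearfull_osds 0.90 $before   # none are above 0.90
nearfull_osds 0.85 $after    # after deleting data, none are above 0.85
```

With the old 0.85 ratio the three OSDs are flagged before the deletion and none after it, which matches the health log; with 0.90 none would have been flagged at any point.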
Updated by Greg Farnum over 6 years ago
- Tracker changed from Bug to Support
- Project changed from Ceph to RADOS
- Category deleted (OSD)
- Status changed from New to Closed
You need to change this in the osd map, not the config. "ceph osd set-nearfull-ratio" or something similar.
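A minimal sketch of that suggestion, assuming a Luminous or later cluster where the ratios are stored in the OSD map and `ceph osd set-nearfull-ratio` is available. The wrapper name `set_nearfull_ratio` is hypothetical, and 0.90 is simply the value the reporter was aiming for.

```shell
# Hypothetical wrapper: show the ratios in the OSD map, raise nearfull, show them again.
# Assumes it runs on a node with an admin keyring.
set_nearfull_ratio() {
    ratio="$1"
    # Ratios currently stored in the OSD map (full, backfillfull, nearfull).
    ceph osd dump | grep 'full_ratio'
    # Change the nearfull threshold in the OSD map rather than in ceph.conf.
    ceph osd set-nearfull-ratio "$ratio"
    # Confirm the new value.
    ceph osd dump | grep 'full_ratio'
}

# Usage on a monitor/admin node:
#   set_nearfull_ratio 0.90
#   ceph health detail
```

If the warning persists after this, comparing `ceph osd df` against the ratio printed by `ceph osd dump` shows which OSDs are still above the threshold.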