Bug #4922
Adding OSD to CRUSH leads to scrubbing while disabled in config
Status: Closed
Description
Hello,
I'm using ceph version:
ceph version 0.56.4 (63b0f854d1cef490624de5d6cf9039735c7de5ca)
I have the following lines in my ceph config:
osd scrub load threshold = 0.1
osd scrub min interval = 1000000
osd scrub max interval = 10000000
Yesterday I added 4 OSDs to the cluster, and now I'm using a script like this
#!/usr/bin/perl
use strict;
use warnings;

my $host = $ARGV[0];
my @args = @ARGV[1 .. $#ARGV];

# Raise the CRUSH weight of each new OSD in 0.2 steps, one hour apart
foreach my $w (qw(0.2 0.4 0.6 0.8 1.0)) {
    foreach my $n (@args) {
        system("ceph osd crush set $n osd.$n $w pool=default host=$host");
        sleep 3600;
    }
}
to fill new OSDs with data.
The load average on every OSD host is above 0.5, and actually near 1.0.
Now I see that deep scrubbing is launched from time to time, and I believe something in the OSD data population leads to this behaviour.
I also want to ask another question about OSD behaviour:
I have OSDs on 500GB drives and OSDs on 2TB drives, and I ran into the following situation after adding a new 2TB OSD: I got a NEAR FULL warning on a small OSD. I would have expected adding a new 2TB OSD to reduce the amount of data on the 500GB OSDs, since some data should be moved to the 2TB drive, but instead I got NEAR FULL where it was HEALTH_OK before.
The question is: should Ceph OSDs with different capacities have different weights, e.g. 1.0 for 2TB and just 0.25 for 500GB, to handle space correctly, or does Ceph take free space into account when doing CRUSH/PG mapping and other operations?
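For reference, the usual convention is to make CRUSH weights proportional to capacity (commonly weight ≈ size in TB); with equal weights, CRUSH places roughly equal amounts of data on each OSD, so a 500GB drive fills about four times faster than a 2TB one. A small illustrative calculation (plain Python arithmetic with a hypothetical helper, not a Ceph API):

```python
# Convention: CRUSH weight proportional to capacity.
# Here weights are expressed relative to a 2TB drive, matching the
# 2TB -> 1.0, 500GB -> 0.25 example from the question above.
def crush_weight(size_gb, gb_per_unit=2000):
    """Weight relative to a 2TB drive (hypothetical helper)."""
    return round(size_gb / gb_per_unit, 2)

print(crush_weight(2000))  # 1.0
print(crush_weight(500))   # 0.25
```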
My OSD tree looks like:
# id    weight  type name       up/down reweight
-1      36.8    pool default
-11     24              datacenter YYY
-10     24                      room YYY-room-1
-8      12                              rack YYY-rack-1
-4      12                                      host ceph-osd-2-1
1       1                                               osd.1   up      1
5       1                                               osd.5   up      1
6       1                                               osd.6   up      1
9       1                                               osd.9   up      1
12      1                                               osd.12  up      1
15      1                                               osd.15  up      1
16      1                                               osd.16  up      1
17      1                                               osd.17  up      1
18      1                                               osd.18  up      1
30      1                                               osd.30  up      1
31      1                                               osd.31  up      1
32      1                                               osd.32  up      1
-9      12                              rack YYY-rack-5
-5      12                                      host ceph-osd-1-1
2       1                                               osd.2   up      1
7       1                                               osd.7   up      1
8       1                                               osd.8   up      1
11      1                                               osd.11  up      1
13      1                                               osd.13  up      1
19      1                                               osd.19  up      1
20      1                                               osd.20  up      1
21      1                                               osd.21  up      1
22      1                                               osd.22  up      1
27      1                                               osd.27  up      1
28      1                                               osd.28  up      1
29      1                                               osd.29  up      1
-7      12.8            datacenter XXX-1
-6      12.8                    room XXX-1-room-1
-3      12.8                            rack XXX-1-rack-1
-2      12.8                                    host ceph-osd-3-1
0       1                                               osd.0   up      1
3       1                                               osd.3   up      1
4       1                                               osd.4   up      1
10      1                                               osd.10  up      1
14      1                                               osd.14  up      1
23      1                                               osd.23  up      1
24      1                                               osd.24  up      1
25      1                                               osd.25  up      1
26      1                                               osd.26  up      1
33      1                                               osd.33  up      1
34      1                                               osd.34  up      1
35      1                                               osd.35  up      1
36      0.8                                             osd.36  up      1
Should I set a smaller weight for OSDs with less space, as mentioned above, to avoid NEAR FULL? I think the answer to this question should be in the documentation.
Best wishes.
Updated by David Zafman almost 11 years ago
- Status changed from New to Won't Fix
In this release to disable scrubbing you'd want much larger values for the scrub intervals. Also, you need to specify the deep scrub interval.
osd scrub min interval = 315360000
osd scrub max interval = 315360000
osd deep scrub interval = 315360000
(315360000 = 60 * 60 * 24 * 365 * 10)
Also, my reading of the code is that each PG could still get a single scrub and/or deep scrub, since the last_scrub_stamp and last_deep_scrub_stamp start at 0.
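The interaction described above can be sketched roughly as follows (a simplified Python illustration with hypothetical names, not Ceph's actual code): a PG is never scrubbed before the min interval, scrubbed regardless of load after the max interval, and scrubbed only on an idle host in between. Since the scrub stamps start at 0, every PG begins "overdue" and gets one initial scrub even with huge intervals.

```python
import time

# Hypothetical simplification of the OSD scrub scheduling decision.
# Option names mirror ceph.conf; everything else is illustrative.
def should_scrub(last_scrub_stamp, load_avg, now=None,
                 scrub_min_interval=1000000,
                 scrub_max_interval=10000000,
                 scrub_load_threshold=0.1):
    now = time.time() if now is None else now
    age = now - last_scrub_stamp
    if age < scrub_min_interval:
        return False                         # too soon: never scrub
    if age > scrub_max_interval:
        return True                          # overdue: scrub regardless of load
    return load_avg < scrub_load_threshold   # in between: only if host is idle

# A PG whose stamp starts at 0 is always past the max interval, so it
# gets one scrub even on a busy host:
print(should_scrub(last_scrub_stamp=0, load_avg=1.0))  # True
```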