Bug #4922
Adding OSD to CRUSH leads to scrubbing while disabled in config
Status: Closed
Description
Hello,
I'm using ceph version:
ceph version 0.56.4 (63b0f854d1cef490624de5d6cf9039735c7de5ca)
I have the following lines in my ceph config:
osd scrub load threshold = 0.1
osd scrub min interval = 1000000
osd scrub max interval = 10000000
Yesterday I added 4 OSDs to the cluster, and now I'm using a script like this
#!/usr/bin/perl
use strict;
use warnings;

my $host = $ARGV[0];
my @args = @ARGV[1 .. $#ARGV];

# Raise the CRUSH weight of each new OSD in 0.2 steps, one hour apart
foreach my $w (qw(0.2 0.4 0.6 0.8 1.0)) {
    foreach my $n (@args) {
        system("ceph osd crush set $n osd.$n $w pool=default host=$host");
        sleep 3600;
    }
}
to fill new OSDs with data.
The load average on every OSD host is above 0.5, and actually near 1.0.
Now I see that deep scrubbing is launched from time to time, and I believe something in the OSD data population leads to this behaviour.
I also want to ask another question about OSD behaviour:
I have OSDs on 500GB drives and OSDs on 2TB drives, and I ran into the following situation after adding a new 2TB OSD: I got a NEAR FULL warning on a small OSD. I would have expected adding a new 2TB OSD to reduce the amount of data on the 500GB OSDs, since some data should be moved to the 2TB drive, but instead I got NEAR FULL where it was HEALTH_OK before.
The question is: should Ceph OSDs with different capacities have different weights, e.g. 1.0 for 2TB and just 0.25 for 500GB, to handle space correctly, or does Ceph take free space into account when doing CRUSH/PG mapping and other operations?
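For reference, the usual convention is to make CRUSH weights proportional to capacity (commonly weight ≈ size in TB); with equal weights, CRUSH places roughly equal amounts of data on each OSD, so a 500GB drive fills about four times faster than a 2TB one. A small illustrative calculation (plain Python arithmetic with a hypothetical helper, not a Ceph API):

```python
# Convention: CRUSH weight proportional to capacity.
# Here weights are expressed relative to a 2TB drive, matching the
# 2TB -> 1.0, 500GB -> 0.25 example from the question above.
def crush_weight(size_gb, gb_per_unit=2000):
    """Weight relative to a 2TB drive (hypothetical helper)."""
    return round(size_gb / gb_per_unit, 2)

print(crush_weight(2000))  # 1.0
print(crush_weight(500))   # 0.25
```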
My OSD tree looks like:
# id    weight  type name       up/down reweight
-1      36.8    pool default
-11     24              datacenter YYY
-10     24                      room YYY-room-1
-8      12                              rack YYY-rack-1
-4      12                                      host ceph-osd-2-1
1       1                                               osd.1   up      1
5       1                                               osd.5   up      1
6       1                                               osd.6   up      1
9       1                                               osd.9   up      1
12      1                                               osd.12  up      1
15      1                                               osd.15  up      1
16      1                                               osd.16  up      1
17      1                                               osd.17  up      1
18      1                                               osd.18  up      1
30      1                                               osd.30  up      1
31      1                                               osd.31  up      1
32      1                                               osd.32  up      1
-9      12                              rack YYY-rack-5
-5      12                                      host ceph-osd-1-1
2       1                                               osd.2   up      1
7       1                                               osd.7   up      1
8       1                                               osd.8   up      1
11      1                                               osd.11  up      1
13      1                                               osd.13  up      1
19      1                                               osd.19  up      1
20      1                                               osd.20  up      1
21      1                                               osd.21  up      1
22      1                                               osd.22  up      1
27      1                                               osd.27  up      1
28      1                                               osd.28  up      1
29      1                                               osd.29  up      1
-7      12.8            datacenter XXX-1
-6      12.8                    room XXX-1-room-1
-3      12.8                            rack XXX-1-rack-1
-2      12.8                                    host ceph-osd-3-1
0       1                                               osd.0   up      1
3       1                                               osd.3   up      1
4       1                                               osd.4   up      1
10      1                                               osd.10  up      1
14      1                                               osd.14  up      1
23      1                                               osd.23  up      1
24      1                                               osd.24  up      1
25      1                                               osd.25  up      1
26      1                                               osd.26  up      1
33      1                                               osd.33  up      1
34      1                                               osd.34  up      1
35      1                                               osd.35  up      1
36      0.8                                             osd.36  up      1
Should I set a smaller weight for OSDs with less space, as mentioned above, to avoid NEAR FULL? I think the answer to this question should be in the documentation.
Best wishes.
Updated by David Zafman almost 11 years ago
- Status changed from New to Won't Fix
In this release to disable scrubbing you'd want much larger values for the scrub intervals. Also, you need to specify the deep scrub interval.
osd scrub min interval = 315360000
osd scrub max interval = 315360000
osd deep scrub interval = 315360000
(315360000 = 60 * 60 * 24 * 365 * 10)
Also, my reading of the code is that each PG could still get a single scrub and/or deep scrub, since the last_scrub_stamp and last_deep_scrub_stamp start at 0.
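The interaction described above can be sketched roughly as follows (a simplified Python illustration with hypothetical names, not Ceph's actual code): a PG is never scrubbed before the min interval, scrubbed regardless of load after the max interval, and scrubbed only on an idle host in between. Since the scrub stamps start at 0, every PG begins "overdue" and gets one initial scrub even with huge intervals.

```python
import time

# Hypothetical simplification of the OSD scrub scheduling decision.
# Option names mirror ceph.conf; everything else is illustrative.
def should_scrub(last_scrub_stamp, load_avg, now=None,
                 scrub_min_interval=1000000,
                 scrub_max_interval=10000000,
                 scrub_load_threshold=0.1):
    now = time.time() if now is None else now
    age = now - last_scrub_stamp
    if age < scrub_min_interval:
        return False                         # too soon: never scrub
    if age > scrub_max_interval:
        return True                          # overdue: scrub regardless of load
    return load_avg < scrub_load_threshold   # in between: only if host is idle

# A PG whose stamp starts at 0 is always past the max interval, so it
# gets one scrub even on a busy host:
print(should_scrub(last_scrub_stamp=0, load_avg=1.0))  # True
```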