Feature #9802

open

When a disk is replaced, the CRUSH weight of the related host changes

Added by Jingjing Zhao over 9 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

In a disk replacement test, after adding a disk into the cluster, the osd tree
looks like below:

-25          52.36            host osd035c005
9            4.63              osd.9      up    1    
77           4.48              osd.77     up    1    
135          5.45              osd.135    up    1    
190          4.79              osd.190    up    1    
253          4.93              osd.253    up    1    
305          4.79              osd.305    up    1    
364          4.52              osd.364    up    1    
428          4.21              osd.428    up    1
479          4.32              osd.479    DNE     
528          4.79              osd.528    up    1    
0            5.45              osd.0      up    1    
-45           0               host osd036c005

The test steps:
1. Remove osd.479 from host osd035c005
2. Add this disk into the cluster again (see the command sketch below)

Expected result:
The new disk takes ID osd.479, and the CRUSH weight of host osd035c005
doesn't change.

Actual result:
The new disk's ID is osd.0 and its weight is 5.45. The CRUSH weight of host
osd035c005 changed.
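
For reference, a removal/re-add sequence along the following lines reproduces
the behaviour. The device path /dev/sdX is a placeholder, and the exact
provisioning commands may differ per deployment:

# Take the failed OSD out and remove it from the cluster entirely,
# including its crushmap entry and its auth key.
ceph osd out osd.479
ceph osd crush remove osd.479
ceph auth del osd.479
ceph osd rm 479

# Provision the replacement disk; it comes back as osd.0, not osd.479.
ceph-disk prepare /dev/sdX
ceph-disk activate /dev/sdX1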

The reason:
At the beginning, some hosts had hardware issues, so we removed all OSD
daemons on them, including their crushmap entries. The crushmap no longer has
these OSD daemons, for example osd.0.
Then we started the disk replacement test. When a disk was added, it took the
ID osd.0, not the expected ID.
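
This matches the ID allocation behaviour: ceph osd create hands out the
lowest unused OSD ID. A minimal sketch, with the printed ID shown for
illustration:

# osd.0 was removed earlier, so it is the lowest free ID and is
# handed out again before 479:
ceph osd create
0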

The effect:
1. We can't minimize the data to be migrated back and forth, because the ID
changes.
2. Since the new OSD's weight is 5.45, this disk will receive more data than
other disks.

This situation could also happen in real production, for example if some OSDs
are taken out of the cluster due to hardware issues and other OSDs later need
disk replacement.

Although this is not a bug, for this launch we need to be careful when doing
disk replacement in the above situation, using appropriate operations to
control data migration to some extent.
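
As a minimal sketch of such an operation, assuming the replacement came up as
osd.0: its CRUSH weight could be pinned to the old disk's weight (osd.479 had
4.32 in the tree above), so the host's total weight stays stable:

# Pin the replacement OSD's CRUSH weight to the old disk's weight so that
# host osd035c005 keeps its previous total weight.
ceph osd crush reweight osd.0 4.32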

In the long term, it would be better if Ceph could support specifying the OSD
ID when adding an OSD.
