Feature #9802
When replacing a disk, the CRUSH weight of the related host changed
Status: open
Description
In the disk replacement test, after adding a disk back into the cluster, the osd tree looked like this:

-25  52.36  host osd035c005
  9   4.63  osd.9    up   1
 77   4.48  osd.77   up   1
135   5.45  osd.135  up   1
190   4.79  osd.190  up   1
253   4.93  osd.253  up   1
305   4.79  osd.305  up   1
364   4.52  osd.364  up   1
428   4.21  osd.428  up   1
479   4.32  osd.479  DNE
528   4.79  osd.528  up   1
  0   5.45  osd.0    up   1
-45   0     host osd036c005

The test steps:
1. Remove osd.479 from host osd035c005.
2. Add this disk into the cluster again.

Expected result:
The new disk takes the ID osd.479, and the CRUSH weight of host osd035c005 does not change.

Actual result:
The new disk's ID is osd.0 and its weight is 5.45, so the CRUSH weight of host osd035c005 changed.
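The actual result follows from how OSD IDs are handed out: a newly created OSD picks up the lowest ID not currently present in the OSD map. A minimal sketch of that allocation rule (the function `lowest_free_id` is illustrative, not Ceph code):

```python
def lowest_free_id(existing_ids):
    """Return the smallest non-negative integer not in existing_ids,
    mirroring how a new OSD picks up the lowest unused ID."""
    used = set(existing_ids)
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate

# IDs still present on host osd035c005 after osd.0 was purged earlier
# and osd.479 was removed in step 1:
remaining = [9, 77, 135, 190, 253, 305, 364, 428, 528]
print(lowest_free_id(remaining))  # 0, not the expected 479
```

Because osd.0 had been purged from the crushmap earlier, ID 0 was the lowest free slot, so the replacement disk took it instead of 479.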
The reason:
At the beginning, some hosts had hardware issues, so we removed all OSD daemons on them, including their crushmap entries. The crushmap therefore no longer contains those OSDs, for example osd.0.
Then we started the disk replacement test. When a disk was added, it took the ID osd.0, not the expected osd.479.
This has two consequences:
1. The data migrated back and forth cannot be minimized, because the ID changed.
2. Since the new OSD's weight is 5.45, this disk will receive more data than the other disks.
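Both consequences can be quantified: CRUSH places data in proportion to weight, and a host's CRUSH weight is the sum of its OSDs' weights, so a 5.45 disk landing in a slot expected to hold 4.32 both shifts the host total and draws a larger share of data. A rough illustration, using the weights from the tree above:

```python
# CRUSH weights of the live OSDs on osd035c005, with the new disk at osd.0
weights = {"osd.9": 4.63, "osd.77": 4.48, "osd.135": 5.45, "osd.190": 4.79,
           "osd.253": 4.93, "osd.305": 4.79, "osd.364": 4.52, "osd.428": 4.21,
           "osd.528": 4.79, "osd.0": 5.45}

host_weight = sum(weights.values())           # host total with the new disk
share_new = weights["osd.0"] / host_weight    # data share of the new disk
# Share the slot would have had if the disk kept the old 4.32 weight:
share_old = 4.32 / (host_weight - 5.45 + 4.32)

print(f"host weight: {host_weight:.2f}")
print(f"new disk share: {share_new:.1%} vs expected {share_old:.1%}")
```

The new disk ends up with a noticeably larger fraction of the host's data than any 4.32-weight disk would, which is why it gets "injected" more data.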
This situation can also happen in real production, for example if some OSDs are taken out of the cluster due to hardware issues and other OSDs later need disk replacement.
Although this is not a bug, for this launch we need to be careful when doing disk replacement in the above situation, using appropriate operations to control the resulting data migration.
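One such operation is to pin the new OSD's CRUSH weight to that of the disk it replaces (e.g. `ceph osd crush reweight osd.0 4.32`), which keeps the host total stable and limits rebalancing. A sketch of the effect on the host weight (values from the tree above; this only models the arithmetic, it does not talk to a cluster):

```python
# Combined weight of the other live OSDs on osd035c005
others = 4.63 + 4.48 + 5.45 + 4.79 + 4.93 + 4.79 + 4.52 + 4.21 + 4.79

print(f"auto-assigned weight: {others + 5.45:.2f}")  # host weight drifts
print(f"pinned to old weight: {others + 4.32:.2f}")  # matches pre-replacement total
```

With the weight pinned, the host total returns to what it was while osd.479 was still in place, so CRUSH moves far less data across the cluster.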
In the long term, it would be better if Ceph could support specifying the OSD ID when adding an OSD.
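If ID selection accepted a requested ID, a replacement disk could keep the slot of the disk it replaces. A hypothetical sketch of such an allocator (`allocate_id` and its `requested` parameter are illustrative, not a Ceph API):

```python
def allocate_id(existing_ids, requested=None):
    """Return requested if it is free; otherwise fall back to the
    lowest unused non-negative ID (today's behavior)."""
    used = set(existing_ids)
    if requested is not None and requested not in used:
        return requested
    candidate = 0
    while candidate in used:
        candidate += 1
    return candidate

remaining = [9, 77, 135, 190, 253, 305, 364, 428, 528]
print(allocate_id(remaining, requested=479))  # 479: the replacement keeps its ID
print(allocate_id(remaining))                 # 0: lowest-free fallback
```

With this behavior, the replacement in the test above would come back as osd.479 and the host's CRUSH layout would be preserved.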