Bug #44446
Closed: osd status reports old crush location after osd moves
Description
Scenario:
Move an OSD disk from host=worker1 to a new node (host=worker0), and on the new node update the CRUSH location accordingly (e.g. host=worker0). After the OSD starts up, `ceph osd status` reports the old location, while `ceph osd tree` reports the new location.
[nwatkins@smash rook]$ kubectl -n rook-ceph exec -it rook-ceph-tools-7cf4cc7568-kz4q6 ceph osd status
+----+---------+-------+-------+--------+---------+--------+---------+-----------+
| id |   host  |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+---------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | worker1 | 1027M | 8188M |    0   |     0   |    0   |     0   | exists,up |
| 1  | worker0 | 1027M | 8188M |    0   |     0   |    0   |     0   | exists,up |
+----+---------+-------+-------+--------+---------+--------+---------+-----------+

[nwatkins@smash rook]$ kubectl -n rook-ceph exec -it rook-ceph-tools-7cf4cc7568-kz4q6 ceph osd tree
ID CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
-1       0.01758 root default
-5       0.01758     host worker0
 0   hdd 0.00879         osd.0        up  1.00000 1.00000
 1   hdd 0.00879         osd.1        up  1.00000 1.00000
After restarting the manager, `ceph osd status` also starts reporting the correct location:
[nwatkins@smash rook]$ kubectl -n rook-ceph exec -it rook-ceph-tools-7cf4cc7568-kz4q6 ceph osd status
+----+---------+-------+-------+--------+---------+--------+---------+-----------+
| id |   host  |  used | avail | wr ops | wr data | rd ops | rd data |   state   |
+----+---------+-------+-------+--------+---------+--------+---------+-----------+
| 0  | worker0 | 1027M | 8188M |    0   |     0   |    0   |     0   | exists,up |
| 1  | worker0 | 1027M | 8188M |    0   |     0   |    0   |     0   | exists,up |
+----+---------+-------+-------+--------+---------+--------+---------+-----------+
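The behaviour above, stale output that self-corrects only on a mgr restart, is what you would see if the mgr served `osd status` from a cached copy of per-OSD metadata that is never invalidated when the CRUSH location changes. A minimal sketch of that failure mode (the `MetadataCache` class and names here are hypothetical illustrations, not Ceph's actual internals):

```python
# Sketch of a daemon-metadata cache that goes stale until the process
# restarts. Hypothetical names; not Ceph's real implementation.

class MetadataCache:
    """Caches per-OSD metadata and never refreshes entries on its own."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: osd_id -> metadata dict
        self._cache = {}

    def get(self, osd_id):
        # Populated once; later CRUSH moves are invisible until restart.
        if osd_id not in self._cache:
            self._cache[osd_id] = self._fetch(osd_id)
        return self._cache[osd_id]

    def restart(self):
        # A mgr restart drops the cache, forcing a re-fetch of everything.
        self._cache.clear()


# Stand-in for the cluster's authoritative view (what `osd tree` reads).
location = {0: {"hostname": "worker1"}, 1: {"hostname": "worker0"}}
cache = MetadataCache(lambda osd_id: dict(location[osd_id]))

cache.get(0)                          # warms the cache with worker1

# The OSD disk moves and the CRUSH location is updated...
location[0] = {"hostname": "worker0"}
print(cache.get(0)["hostname"])       # still "worker1": stale, like `osd status`

cache.restart()
print(cache.get(0)["hostname"])       # "worker0": correct after a mgr restart
```

The fix direction implied by the discussion is to invalidate or refresh such cached metadata when the map changes, rather than relying on a restart.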
related: http://tracker.ceph.com/issues/40011
Updated by Kefu Chai about 4 years ago
- Copied from Bug #40871: osd status reports old crush location after osd moves added
Updated by Kefu Chai about 4 years ago
This issue is copied from #40871. The PR https://github.com/ceph/ceph/pull/30448/files does not address this issue at all; the stale OSD hostname has a different root cause.
Updated by Kefu Chai about 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 33752
Updated by Sage Weil about 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #44522: luminous: osd status reports old crush location after osd moves added
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #44523: mimic: osd status reports old crush location after osd moves added
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #44524: nautilus: osd status reports old crush location after osd moves added
Updated by Nathan Cutler about 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".