Bug #21803
closed
objects degraded higher than 100%
Added by David Zafman over 6 years ago.
Updated over 5 years ago.
Description
Original post:
1. Jewel deployment with filestore.
2. Upgrade to Luminous (including mgr deployment and "ceph osd require-osd-release luminous"), still on filestore.
3. rados bench with subsequent cleanup.
4. All OSDs up, all PGs active+clean.
5. Stop one OSD. Remove it from CRUSH, the auth list, and the OSD map (commands sketched after this list).
6. Reinitialize OSD with bluestore.
7. Start OSD, commencing backfill.
8. Degraded objects above 100%.
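For reference, the removal and reinitialization in steps 5-7 correspond roughly to the following commands. This is a sketch, not taken from the report; osd.3 and /dev/sdb are hypothetical placeholders:

    # Sketch of steps 5-7 for a hypothetical osd.3 on /dev/sdb.
    systemctl stop ceph-osd@3                # stop one OSD
    ceph osd crush remove osd.3              # remove from CRUSH
    ceph auth del osd.3                      # remove from the auth list
    ceph osd rm 3                            # remove from the OSD map
    ceph-disk prepare --bluestore /dev/sdb   # reinitialize with bluestore (Luminous-era tooling)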
I reproduced this with a simpler test (scripted below):
1. ceph osd pool create test 1 1
2. ceph osd pool set test size 1
3. rados -p test bench 10 write --no-cleanup
4. ceph osd pool set test size 3
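The same reproducer as a script (pool name "test" as above; the final grep just surfaces the symptom in the status output):

    # Reproduce degraded > 100%: fill a size-1 pool, then raise size to 3.
    ceph osd pool create test 1 1
    ceph osd pool set test size 1
    rados -p test bench 10 write --no-cleanup
    ceph osd pool set test size 3
    ceph -s | grep degraded    # degraded percentage can exceed 100% here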
- Status changed from New to 7
- Assignee set to David Zafman
- Backport set to luminous, jewel
- Related to Bug #21887: degraded calculation is off during backfill added
- Related to Bug #20059: miscounting degraded objects added
- Status changed from 7 to Pending Backport
- Status changed from Pending Backport to Resolved
- Backport deleted (luminous, jewel)
It turns out this change is completely superseded by #20059. So I'm switching it to resolved.
I've decided that we won't backport to jewel for now either.
- Affected Versions v10.2.9, v12.2.8 added
David Zafman wrote:
It turns out this change is completely superseded by #20059. So I'm switching it to resolved.
I created the "original post" referred to in the description (part of a longer thread on the issue):
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021512.html
We are still seeing this reproducibly on current Luminous (upgraded from latest Jewel). So I don't believe that #20059 fixed this. Is there anything users can do to avoid this issue? It can massively lengthen recovery times, rather unexpectedly.
- Status changed from Resolved to Pending Backport
- Priority changed from Normal to High
- Backport set to luminous
- Status changed from Pending Backport to 4
- Related to Bug #22837: discover_all_missing() not always called during activating added
- Status changed from 4 to Resolved
- Backport deleted (luminous)
This change fixes the internal calculation of degraded objects. The _update_calc_stats() function was rewritten by #20059, so this code cannot be backported.
  cluster:
    health: HEALTH_WARN
            3/1524 objects misplaced (0.197%)
            Degraded data redundancy: 197528/1524 objects degraded (12961.155%), 1057 pgs unclean, 1055 pgs degraded, 3 pgs undersized

  data:
    pools:   1 pools, 2048 pgs
    objects: 508 objects, 1467 MB
    usage:   127 GB used, 35639 GB / 35766 GB avail
    pgs:     197528/1524 objects degraded (12961.155%)
             3/1524 objects misplaced (0.197%)
             1042 active+recovery_wait+degraded
             991  active+clean
             8    active+recovering+degraded
             3    active+undersized+degraded+remapped+backfill_wait
             2    active+recovery_wait+degraded+remapped
             2    active+remapped+backfill_wait

  io:
    recovery: 340 kB/s, 80 objects/s
There are multiple issues reflected by the above status:
- There are still 508 objects present (asynchronous deletes still in progress?). Note that 508 objects at a pool size of 3 would explain the 1524-copy denominator in the degraded ratio, so 197528 degraded copies reads as 12961.155%.
- Deleting an OSD from the CRUSH map may have caused many PGs to move around, requiring lots of recovery:
    Caused 7 PGs to be temporarily remapped (state: remapped)
    Still need to recover 1052 PGs (states: recovery_wait or recovering)
    Need to backfill 5 PGs (state: backfill_wait)
- Master had an additional pull request https://github.com/ceph/ceph/pull/20220 (#22837)
The tracker #22837, which I'm marking for backport, might address some of the high degraded count.
I think there is a procedure for filestore-to-bluestore conversion. That conversion should NOT change the CRUSH map, the OSD should retain its number, and noout should be set so that PGs don't move while the OSD is down; see the sketch below.
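A sketch of such a procedure on Luminous, assuming a hypothetical osd.3 backed by /dev/sdb and a ceph-volume recent enough to honor --osd-id:

    ceph osd set noout                          # keep PGs from moving while the OSD is down
    systemctl stop ceph-osd@3
    ceph osd destroy 3 --yes-i-really-mean-it   # keeps the OSD id and its CRUSH position
    ceph-volume lvm create --bluestore --data /dev/sdb --osd-id 3
    # once the OSD is back up and recovery has finished:
    ceph osd unset noout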