Bug #22837
discover_all_missing() not always called during activating
Start date:
01/30/2018
Due date:
% Done:
0%
Source:
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Description
Sometimes discover_all_missing() isn't called, so we don't get a complete picture of misplaced objects. As a result, the new _update_calc_stats undercounts misplaced objects and reports them as degraded; getting that distinction right is the point of the changes. This race also makes a test case unreliable.
On the run that does get the missing set for osd.1, pg 1.0 goes from [1,0] -> [2,4] -> [2,4,3,5].
A search_for_missing() was triggered during the [2,4] transition,
then discover_all_missing() during the [2,4,3,5] transition:
2018-01-29 20:27:20.963 7f05606aa700 15 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] build_might_have_unfound: built 0,1,3,4,5
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing 200 missing, 200 unfound
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.0: requesting pg_missing_t
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.1: requesting pg_missing_t
2018-01-29 20:27:20.963 7f05606aa700 20 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.4: we already have pg_missing_t
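To compare a run that hits the race with one that doesn't, a small script along these lines could pull the discover_all_missing activity out of an OSD log. This is only a sketch: the function name summarize_discover_all_missing and the regular expressions are assumptions based on the message text in the log excerpt above, not part of Ceph, and the message wording may differ between releases.

```python
import re

# Patterns assumed from the log excerpt above; not taken from the Ceph source.
SUMMARY_RE = re.compile(r"discover_all_missing (\d+) missing, (\d+) unfound")
PEER_RE = re.compile(
    r"discover_all_missing: osd\.(\d+): "
    r"(requesting pg_missing_t|we already have pg_missing_t)"
)


def summarize_discover_all_missing(log_text):
    """Summarize discover_all_missing messages found in a chunk of OSD log text.

    The whole text is scanned at once, so it does not matter whether the
    log entries are on separate lines or run together on one line.
    """
    result = {
        "called": False,      # True if the summary message appears at all
        "missing": None,      # count from "N missing, M unfound"
        "unfound": None,
        "requested": [],      # OSD ids we asked for pg_missing_t
        "already_have": [],   # OSD ids whose pg_missing_t we already hold
    }

    summary = SUMMARY_RE.search(log_text)
    if summary:
        result["called"] = True
        result["missing"] = int(summary.group(1))
        result["unfound"] = int(summary.group(2))

    for match in PEER_RE.finditer(log_text):
        result["called"] = True
        osd_id = int(match.group(1))
        if match.group(2).startswith("requesting"):
            result["requested"].append(osd_id)
        else:
            result["already_have"].append(osd_id)

    return result
```

Feeding it the log from a failing run and checking whether "called" is False, or whether osd.1 is absent from "requested", would show quickly whether that run skipped discover_all_missing.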
Related issues
History
#1 Updated by David Zafman about 1 year ago
- Subject changed from discover_all_missing() not always called during peering to discover_all_missing() not always called during activating
#2 Updated by David Zafman about 1 year ago
- Status changed from New to In Progress
#3 Updated by David Zafman 9 months ago
- Status changed from In Progress to Resolved
#4 Updated by David Zafman 6 months ago
- Related to Bug #21803: objects degraded higher than 100% added
#5 Updated by David Zafman 6 months ago
- Status changed from Resolved to Pending Backport
- Backport set to luminous
Based on information from http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021512.html I'm marking this pending backport to luminous.
I can't say if this will be difficult to backport.
#6 Updated by Nathan Cutler 6 months ago
- Copied to Backport #26992: luminous: discover_all_missing() not always called during activating added
#7 Updated by Nathan Cutler 5 months ago
- Status changed from Pending Backport to Resolved