Bug #22837

closed

discover_all_missing() not always called during activating

Added by David Zafman about 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assignee: David Zafman
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Sometimes discover_all_missing() isn't called, so we don't get a complete picture of misplaced objects. As a result, the new _update_calc_stats undercounts misplaced objects by counting them as degraded, which is exactly what those changes were meant to fix. This race also makes a test case unreliable.

On the run that does get the missing for osd.1, pg 1.0 goes from [1,0] -> [2,4] -> [2,4,3,5].

A search_for_missing() was triggered during the [2,4] transition.
Then discover_all_missing() was called during the [2,4,3,5] transition:

2018-01-29 20:27:20.963 7f05606aa700 15 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] build_might_have_unfound: built 0,1,3,4,5
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing 200 missing, 200 unfound
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.0: requesting pg_missing_t
2018-01-29 20:27:20.963 7f05606aa700 10 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.1: requesting pg_missing_t
2018-01-29 20:27:20.963 7f05606aa700 20 osd.2 pg_epoch: 35 pg[1.0( v 29'200 lc 0'0 (0'0,29'200] local-lis/les=33/35 n=200 ec=26/26 lis/c 31/28 les/c/f 32/29/0 33/33/31) [2,4,3,5] r=0 lpr=33 pi=[28,33)/2 crt=29'200 mlcod 0'0 unknown m=200 u=200] discover_all_missing: osd.4: we already have pg_missing_t
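When comparing a good run against a bad one, it can help to extract from the OSD log which peers discover_all_missing() actually queried. The following is only a rough sketch: the regular expressions are inferred from the log lines quoted above, the sample lines have their pg state prefix abbreviated with "...", and the function name summarize_discover is hypothetical, not part of Ceph.

```python
import re

# Patterns inferred from the log excerpt in this ticket (an assumption,
# not an official log format specification).
BUILT_RE = re.compile(r"build_might_have_unfound: built ([\d,]+)")
REQUEST_RE = re.compile(r"discover_all_missing: osd\.(\d+): requesting pg_missing_t")
HAVE_RE = re.compile(r"discover_all_missing: osd\.(\d+): we already have pg_missing_t")


def summarize_discover(lines):
    """Collect which OSDs were listed, queried, or already known (hypothetical helper)."""
    might_have, requested, already = set(), set(), set()
    for line in lines:
        m = BUILT_RE.search(line)
        if m:
            might_have.update(int(x) for x in m.group(1).split(","))
            continue
        m = REQUEST_RE.search(line)
        if m:
            requested.add(int(m.group(1)))
            continue
        m = HAVE_RE.search(line)
        if m:
            already.add(int(m.group(1)))
    return {
        "might_have_unfound": might_have,
        "requested": requested,
        "already_have": already,
        # OSDs listed by build_might_have_unfound with no matching
        # discover_all_missing line in the given input.
        "unaccounted": might_have - requested - already,
    }


# Lines from the excerpt above, with the pg state abbreviated as "...".
SAMPLE = [
    "osd.2 pg_epoch: 35 pg[1.0(...) [2,4,3,5] ...] build_might_have_unfound: built 0,1,3,4,5",
    "osd.2 pg_epoch: 35 pg[1.0(...) [2,4,3,5] ...] discover_all_missing 200 missing, 200 unfound",
    "osd.2 pg_epoch: 35 pg[1.0(...) [2,4,3,5] ...] discover_all_missing: osd.0: requesting pg_missing_t",
    "osd.2 pg_epoch: 35 pg[1.0(...) [2,4,3,5] ...] discover_all_missing: osd.1: requesting pg_missing_t",
    "osd.2 pg_epoch: 35 pg[1.0(...) [2,4,3,5] ...] discover_all_missing: osd.4: we already have pg_missing_t",
]

summary = summarize_discover(SAMPLE)
print(summary)
```

On a run where the race hits, one would expect no "requesting pg_missing_t" line for osd.1 at all. Note that the excerpt above stops after osd.4, so osd.3 and osd.5 show up as unaccounted here only because their lines were not quoted.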

Related issues 2 (0 open, 2 closed)

Related to Ceph - Bug #21803: objects degraded higher than 100% (Resolved, David Zafman, 10/13/2017)

Copied to RADOS - Backport #26992: luminous: discover_all_missing() not always called during activating (Resolved, Prashant D)
Actions #1

Updated by David Zafman about 6 years ago

  • Subject changed from discover_all_missing() not always called during peering to discover_all_missing() not always called during activating
Actions #2

Updated by David Zafman about 6 years ago

  • Status changed from New to In Progress
Actions #3

Updated by David Zafman almost 6 years ago

  • Status changed from In Progress to Resolved
Actions #4

Updated by David Zafman over 5 years ago

  • Related to Bug #21803: objects degraded higher than 100% added
Actions #5

Updated by David Zafman over 5 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to luminous

Based on information from http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021512.html I'm marking this pending backport to luminous.

I can't say if this will be difficult to backport.

Actions #6

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #26992: luminous: discover_all_missing() not always called during activating added
Actions #7

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved