Bug #22233

open

prime_pg_temp breaks on uncreated pgs

Added by Kefu Chai over 6 years ago. Updated over 4 years ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  1. mon.b instructed osd.3 to create pg 92.4. The up set was [3,6].
  2. osd.3 created pg 92.4 and sent a "created" message to the mon.
  3. But osd.3 was still waiting on an up_thru update from the mon: it wanted its up_thru to be 92 in the osdmap, but it was currently 89.
  4. osd.3 was killed and marked down and out by the thrashosd task.
  5. mon.b primed pg_temp for pg 92.4 and mapped it to [4,6], but osd.4 did not receive an osd_pg_create message.
  6. So the wait_for_clean task timed out and failed.

/a/kchai-2017-11-23_12:43:54-rados-wip-kefu-testing-2017-11-23-1812-distro-basic-mira/1881963
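
The sequence above can be replayed in a toy model. This is only a sketch: `ToyMon` and its methods are hypothetical names, not Ceph's OSDMonitor code, and it assumes a pg can only go clean once its current primary has been sent a create message.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMon:
    """Hypothetical, heavily simplified monitor (not Ceph's OSDMonitor)."""
    up_sets: dict = field(default_factory=dict)       # pgid -> list of OSD ids
    creates_sent: dict = field(default_factory=dict)  # pgid -> OSDs sent a create

    def create_pg(self, pgid, up_set):
        # Step 1: record the mapping and send the create to the primary.
        self.up_sets[pgid] = list(up_set)
        self.creates_sent.setdefault(pgid, set()).add(up_set[0])

    def remap_without_create(self, pgid, new_up_set):
        # Step 5: the mapping changes, but no create goes to the new primary.
        self.up_sets[pgid] = list(new_up_set)

    def pgs_missing_create(self):
        # pgs whose current primary was never told to create them.
        return [pgid for pgid, up in self.up_sets.items()
                if up[0] not in self.creates_sent.get(pgid, set())]


mon = ToyMon()
mon.create_pg("92.4", [3, 6])             # steps 1-2: osd.3 is the primary
mon.remap_without_create("92.4", [4, 6])  # steps 4-5: osd.3 is out, osd.4 takes over
stuck = mon.pgs_missing_create()
print(stuck)
```

In this model pg 92.4 stays in the "missing create" list, which corresponds to wait_for_clean never finishing in step 6.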

Actions #1

Updated by John Spray over 6 years ago

  • Project changed from Ceph to RADOS
Actions #2

Updated by Greg Farnum over 6 years ago

  • Subject changed from "cluster fails to return clean after marking down an OSD the thrashosd test" to "prime_pg_temp breaks on uncreated pgs"
  • Priority changed from Normal to High
Actions #3

Updated by Sage Weil over 6 years ago

/a/sage-2017-12-05_18:31:27-rados-wip-pg-scrub-preempt-distro-basic-smithi/1934001 ?

Actions #4

Updated by Kefu Chai over 6 years ago

  • Status changed from New to In Progress
Actions #5

Updated by Kefu Chai over 6 years ago

  • Category set to Correctness/Safety
  • Status changed from In Progress to Fix Under Review
  • Backport set to luminous
Actions #6

Updated by Sage Weil over 5 years ago

I don't understand why the bug happened (or what the proposed fix is trying to do). Given the description above, the mon should see the pg mapping change from [3,6] to [4,6] and send the create to osd.4.

(Also, haven't been seeing this failure...)

Actions #7

Updated by Greg Farnum over 4 years ago

  • Status changed from Fix Under Review to In Progress
  • Priority changed from High to Normal
Actions #8

Updated by Kefu Chai over 4 years ago

> the mon should see the pg mapping change from [3,6] to [4,6] and send the create to osd.4

Exactly. That's why I am adding inc.new_pg_temp to pending_creatings.pgs in the fix.
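
One possible reading of that idea, as a sketch rather than the actual patch: when priming pg_temp produces a new mapping for a pg that is still being created, re-queue that pg so the create goes to the new primary. The function name, the dict and set shapes, and the `creating` set are all assumptions made for illustration; only `new_pg_temp` and the pending creating pgs come from the comment above.

```python
def requeue_creates(pending_creating, new_pg_temp, creating):
    """Return a copy of pending_creating extended with re-mapped pgs.

    pending_creating: pgid -> OSD that should be sent the create (hypothetical shape)
    new_pg_temp:      pgid -> OSD list chosen while priming pg_temp
    creating:         pgids whose creation is not finished (assumed to be tracked)
    """
    updated = dict(pending_creating)
    for pgid, osds in new_pg_temp.items():
        # Only pgs that are still being created need a fresh create message.
        if pgid in creating and osds:
            updated[pgid] = osds[0]
    return updated


# pg 92.4 is re-mapped to [4,6] while still being created; pg 1.0 already exists.
pending = requeue_creates({}, {"92.4": [4, 6], "1.0": [1, 2]}, {"92.4"})
print(pending)
```

With the inputs from the description, pg 92.4 ends up queued for osd.4, which is the behavior the quoted comment expects from the mon.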
