Bug #22233

prime_pg_temp breaks on uncreated pgs

Added by Kefu Chai about 3 years ago. Updated over 1 year ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature:

Description

  1. mon.b instructed osd.3 to create pg 92.4; the up set was [3,6].
  2. osd.3 created pg 92.4 and sent a "created" message to the mon.
  3. osd.3 was still waiting for an up_thru update from the mon: it wanted its up_thru in the osdmap to be 92, but it was still 89.
  4. osd.3 was killed, then marked down and out by the thrashosd task.
  5. mon.b primed pg_temp for pg 92.4 and mapped it to [4,6], but osd.4 was never sent an osd_pg_create message.
  6. As a result, the wait_for_clean task timed out and failed.

/a/kchai-2017-11-23_12:43:54-rados-wip-kefu-testing-2017-11-23-1812-distro-basic-mira/1881963
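
The six steps above can be replayed in a toy model. This is only a sketch, not Ceph code: `ToyMon` and all of its methods are hypothetical, and it assumes the mon stops tracking a pg as "creating" once the primary reports "created", which is one reading of the steps rather than something the report states.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMon:
    """Toy monitor (hypothetical): only the bookkeeping needed for the replay."""
    up: dict = field(default_factory=dict)       # pgid -> up set
    pg_temp: dict = field(default_factory=dict)  # pgid -> primed pg_temp mapping
    creating: set = field(default_factory=set)   # pgs still tracked as creating
    sent: list = field(default_factory=list)     # (osd, pgid) create messages sent

    def primary(self, pgid):
        # A primed pg_temp mapping overrides the up set.
        return self.pg_temp.get(pgid, self.up[pgid])[0]

    def send_creates(self):
        # Send a create to the current primary of every pg still being created.
        for pgid in sorted(self.creating):
            msg = (self.primary(pgid), pgid)
            if msg not in self.sent:
                self.sent.append(msg)

    def create_pg(self, pgid, up):
        self.up[pgid] = up
        self.creating.add(pgid)
        self.send_creates()

    def handle_created(self, pgid):
        # ASSUMPTION: the mon forgets the pg once "created" arrives.
        self.creating.discard(pgid)

    def prime_pg_temp(self, pgid, mapping):
        self.pg_temp[pgid] = mapping
        self.send_creates()


def replay():
    mon = ToyMon()
    mon.create_pg("92.4", [3, 6])      # step 1
    mon.handle_created("92.4")         # step 2
    # steps 3-4: osd.3 is killed before its up_thru reaches 92
    mon.prime_pg_temp("92.4", [4, 6])  # step 5
    return mon.sent


print(replay())  # [(3, '92.4')]
```

In this model only osd.3 ever receives a create message. osd.4 becomes the primary through pg_temp but is never told to create the pg, which matches the wait_for_clean timeout in step 6.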

History

#1 Updated by John Spray almost 3 years ago

  • Project changed from Ceph to RADOS

#2 Updated by Greg Farnum almost 3 years ago

  • Subject changed from cluster fails to return clean after marking down an OSD the thrashosd test to prime_pg_temp breaks on uncreated pgs
  • Priority changed from Normal to High

#3 Updated by Sage Weil almost 3 years ago

/a/sage-2017-12-05_18:31:27-rados-wip-pg-scrub-preempt-distro-basic-smithi/1934001 ?

#4 Updated by Kefu Chai almost 3 years ago

  • Status changed from New to In Progress

#5 Updated by Kefu Chai almost 3 years ago

  • Category set to Correctness/Safety
  • Status changed from In Progress to Fix Under Review
  • Backport set to luminous

#6 Updated by Sage Weil about 2 years ago

I don't understand why the bug happened (or what the proposed fix is trying to do). Given the description above, the mon should see the pg mapping change from [3,6] to [4,6] and send the create to osd.4.

(Also, haven't been seeing this failure...)

#7 Updated by Greg Farnum over 1 year ago

  • Status changed from Fix Under Review to In Progress
  • Priority changed from High to Normal

#8 Updated by Kefu Chai over 1 year ago

the mon should see the pg mapping change from [3,6] to [4,6] and send the create to osd.4

Exactly. That's why I am adding inc.new_pg_temp to pending_creatings.pgs in the fix.
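
A rough sketch of the idea in that comment, in a toy model rather than the actual patch. Only the names `inc.new_pg_temp` and `pending_creatings.pgs` come from the comment. The function `apply_new_pg_temp`, its parameters, and the use of a plain set and dict as stand-ins are assumptions made for illustration. Whether the real change filters which pgs get re-added is not stated here.

```python
def apply_new_pg_temp(pending_creating_pgs, new_pg_temp, readd=True):
    """Return (tracked pgs, create messages owed) after priming pg_temp.

    pending_creating_pgs: set of pgids, a stand-in for pending_creatings.pgs
    new_pg_temp: dict pgid -> osd list, a stand-in for inc.new_pg_temp
    readd: when True, pgs that received a new pg_temp mapping are tracked
           as creating again (the idea described in the comment)
    """
    tracked = set(pending_creating_pgs)
    if readd:
        tracked |= set(new_pg_temp)
    # Each tracked pg with a new mapping owes a create to its new primary.
    messages = sorted(
        (new_pg_temp[pgid][0], pgid) for pgid in tracked if pgid in new_pg_temp
    )
    return tracked, messages


# pg 92.4 was already reported "created", so it is no longer tracked.
_, without_fix = apply_new_pg_temp(set(), {"92.4": [4, 6]}, readd=False)
_, with_fix = apply_new_pg_temp(set(), {"92.4": [4, 6]}, readd=True)
print(without_fix)  # []
print(with_fix)     # [(4, '92.4')]
```

With the pg re-added, the toy mon owes osd.4 a create message after the mapping changes from [3,6] to [4,6].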
