Bug #37775

some pg_created messages not sent to mon

Added by Sage Weil 8 months ago. Updated 4 months ago.

Status: Pending Backport
Priority: Urgent
Assignee: -
Category: -
Target version: -
Start date: 12/31/2018
Due date:
% Done: 0%
Source:
Tags:
Backport: luminous, mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

mon doesn't get pg_created for two pgs. the CREATING flag is never removed, and the job fails with a final scrub timeout.

/a/sage-2018-12-29_16:59:10-rados-master-distro-basic-smithi/3405637

osd sends them to mon, but a msgr reconnect drops them. there is no retry.

2018-12-29 17:47:31.080 7f949c7a8700  1 -- 172.21.15.110:6812/10945 --> 172.21.15.110:6790/0 -- osd_pg_created(1.0) v1 -- 0x55ecc3c00e00 con 0

Related issues

Duplicated by RADOS - Bug #37752: pool stuck with 'creating' flag set (Duplicate, 12/24/2018)
Copied to RADOS - Backport #37816: mimic: some pg_created messages not sent to mon (Need More Info)
Copied to RADOS - Backport #37817: luminous: some pg_created messages not sent to mon (New)

History

#1 Updated by Sage Weil 8 months ago

how about,
- if pool CREATING flag is set, we queue a 'created' message when the pg peers
- osd tracks pending created messages, resends on mon reset
- prune pgs from the list when the pool flag is cleared

this could easily mean resending some of these messages if it takes a while for the pool's pgs to be created, but the messages are cheap and harmless.
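
A rough sketch of the three steps above, in Python rather than the OSD's actual C++. Every name here (LossyMonLink, PendingCreatedTracker, the on_* hooks, the (pool, seed) tuple for a pgid) is hypothetical and only illustrates the proposal. It is not the code from the eventual fix.

```python
from dataclasses import dataclass, field


@dataclass
class LossyMonLink:
    """Hypothetical stand-in for the OSD-to-mon connection."""
    in_flight: list = field(default_factory=list)
    delivered: set = field(default_factory=set)

    def send(self, pgid):
        self.in_flight.append(pgid)

    def flush(self):
        # Messages still queued at this point reach the mon.
        self.delivered.update(self.in_flight)
        self.in_flight.clear()

    def reset(self):
        # A reconnect discards whatever was queued on the old session.
        self.in_flight.clear()


class PendingCreatedTracker:
    """Hypothetical OSD-side set of pg_created messages that may need a resend."""

    def __init__(self, link):
        self.link = link
        self.pending = set()

    def on_pg_peered(self, pgid, pool_creating):
        # Only queue a 'created' message while the pool's CREATING flag is set.
        if pool_creating:
            self.pending.add(pgid)
            self.link.send(pgid)

    def on_mon_reset(self):
        # Resend everything still tracked; duplicates are cheap and harmless.
        for pgid in sorted(self.pending):
            self.link.send(pgid)

    def on_pool_flag_cleared(self, pool_id):
        # Prune this pool's pgs once the CREATING flag has been cleared.
        self.pending = {p for p in self.pending if p[0] != pool_id}


link = LossyMonLink()
tracker = PendingCreatedTracker(link)
tracker.on_pg_peered((1, 0), pool_creating=True)  # pg 1.0 peers
link.reset()                                      # reconnect drops the message
tracker.on_mon_reset()                            # tracker resends it
link.flush()
tracker.on_pool_flag_cleared(1)                   # pool 1 done, prune its pgs
```

The tracker never waits for a per-message ack: the cleared pool flag is the only signal that resending can stop, which is why some messages may be sent more than once.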

#2 Updated by Sage Weil 8 months ago

  • Status changed from Verified to Need Review

#3 Updated by Kefu Chai 8 months ago

  • Status changed from Need Review to Pending Backport
  • Backport set to luminous, mimic
  • Pull request ID set to 25731

#4 Updated by Greg Farnum 8 months ago

  • Duplicated by Bug #37752: pool stuck with 'creating' flag set added

#5 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #37816: mimic: some pg_created messages not sent to mon added

#6 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #37817: luminous: some pg_created messages not sent to mon added

#7 Updated by Neha Ojha 4 months ago

/a/yuriw-2019-04-04_00:00:53-rados-luminous-distro-basic-smithi/3806121/
