Bug #16127

closed

OSDMonitor: drop pg temps from not the current primary

Added by Samuel Just almost 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel,hammer
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Otherwise, the vagaries of pg->osd->mon->other mon message ordering could result in a pg temp request from a previous interval being processed after a request from the current interval, causing the pg to get stuck.

sjust@teuthology:/a/samuelj-2016-06-01_11:28:31-rados-wip-sam-testing-distro-basic-smithi/228382/remote

<sjusthm> sage: hmm
<sjusthm> pg temps from two osds raced
<sjusthm> on the same osd
<sjusthm> osd.3 sent a request for [0,3] at pg_epoch 16
<sjusthm> which went out at osd epoch 16
<sjusthm> and then again at 17
<sjusthm> (before the pg processed the map)
<sjusthm> however
<sjusthm> 17 also changed the acting set to [0,3]
* kefu has quit (Quit: My Mac has gone to sleep. ZZZzzz…)
<sjusthm> and the new primary requested an empty temp mapping
<sjusthm> also at pg/osd epoch 17
<sjusthm> the mons processed the one from the new primary
<sjusthm> and then the stale one from the old primary
<sjusthm> resulting in the acting set remaining at [0,3] and the pg being stuck
<sjusthm> the osd epoch part seems to be a red herring since it's not used in the OSDMonitor
<sjusthm> I think we need to include with each mapping the interval start epoch
<sjusthm> and remember that in the OSDMonitor
<sjusthm> that would allow us to discard pg temp mappings based on previous intervals
<sjusthm> hmm
<sjusthm> wouldn't actually help here
<sjusthm> since the empty mapping wouldn't be remembered explicitly
* sahid has quit (Quit: Lost terminal)
<sjusthm> maybe we can just ignore if it comes from not the current primary from the mon's point of view?
* rzarzynski has quit (Quit: This computer has gone to sleep)
* dgurtner_ has quit (Ping timeout: 480 seconds)
* swami2 has quit (Ping timeout: 480 seconds)
* rzarzynski (~) has joined #ceph-devel
<sjusthm> I guess that should be safe
<sjusthm> can I do that in preprocess?
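
The race and the proposed check can be replayed with a toy model. This is only a sketch, not the OSDMonitor code: `ToyMonitor`, `handle_pg_temp`, and `replay` are hypothetical names, the model tracks a single pg, and it assumes that osd.0 (the first entry of the acting set [0,3]) is the primary the mon sees at epoch 17.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ToyMonitor:
    """Hypothetical stand-in for the mon's pg temp handling of one pg."""
    current_primary: int                 # primary per the mon's current map (assumed)
    drop_non_primary: bool               # enable the check proposed in the log
    pg_temp: Optional[List[int]] = None  # last accepted request; [] means "clear"

    def handle_pg_temp(self, sender: int, mapping: List[int]) -> bool:
        """Apply one request; return False if it was dropped."""
        if self.drop_non_primary and sender != self.current_primary:
            return False
        self.pg_temp = list(mapping)
        return True


def replay(drop_non_primary: bool) -> Optional[List[int]]:
    # Assumption: at epoch 17 the acting set is [0,3], so osd.0 is primary.
    mon = ToyMonitor(current_primary=0, drop_non_primary=drop_non_primary)
    # Arrival order from the log: the new primary's empty mapping first,
    # then the stale [0,3] request from the old primary, osd.3.
    arrivals: List[Tuple[int, List[int]]] = [(0, []), (3, [0, 3])]
    for sender, mapping in arrivals:
        mon.handle_pg_temp(sender, mapping)
    return mon.pg_temp


print(replay(drop_non_primary=False))  # [0, 3]: the stale request wins
print(replay(drop_non_primary=True))   # []: the clear from osd.0 stands
```

Without the check, the last request processed wins regardless of who sent it, which matches the stuck state described above. With the check, the request from osd.3 is ignored because osd.3 is no longer the primary from the mon's point of view.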

Related issues 2 (0 open, 2 closed)

Copied to Ceph - Backport #16429: jewel: OSDMonitor: drop pg temps from not the current primary (Resolved, Loïc Dachary)
Copied to Ceph - Backport #16430: hammer: OSDMonitor: drop pg temps from not the current primary (Resolved, Wei-Chung Cheng)