Bug #53824

open

Stretch mode: peering can livelock with acting set changes swapping primary back and forth

Added by Greg Farnum over 2 years ago. Updated over 1 year ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: Peering
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From https://bugzilla.redhat.com/show_bug.cgi?id=2025800

We're getting repeated swaps in the acting set, with logging like

calc_replicated_acting_stretch
osd 4 primary accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 4 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 14 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 0 (up) backfill 2.1f( v 59'5 (0'0,59'5] lb MIN local-lis/les=223/224 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 10 (up) accepted 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
up set insufficient, considering remaining osds
acting candidate 17 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 3 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 6 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 16 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
accepting candidate 6

calc_replicated_acting_stretch
osd 4 primary accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 4 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 14 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 0 (up) backfill 2.1f( v 59'5 (0'0,59'5] lb MIN local-lis/les=223/224 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 10 (up) accepted 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
up set insufficient, considering remaining osds
acting candidate 6 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
other candidate 3 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
other candidate 16 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
other candidate 17 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
accepting candidate 17


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #53933: pacific: Stretch mode: peering can livelock with acting set changes swapping primary back and forth (Resolved, Greg Farnum)
Actions #1

Updated by Greg Farnum over 2 years ago

So, why is it accepting the non-acting-set member each time, when they seem to have the same data? There's a clue in the source code: the "acting candidate" and "other candidate" output comes from pushing an OSD representation into the "candidates" list. That representation is a pair formed from the result of get_osd_ord() and the OSD ID, where get_osd_ord() takes as parameters whether the OSD is in the acting set and the OSD's pg_info. Once the list is generated, we sort it.

The very first sort priority is "in the acting set", which is clearly SUPPOSED to prefer OSDs that are already in the acting set [1], and that logic is all correct. It goes wrong because we ALSO move these items, in order, into a mapping from "ancestor crush bucket" (i.e., data center) to the list of relevant candidates, and that placement preserves our sorted order, with acting set OSDs first in each list.

But then when we select members of the set by going through the ancestor->list_of_osds mapping, we pop the OSD off the back of the list, which has the effect of preferring non-acting-set OSD candidates!
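The sort, bucket, and pop sequence described above can be sketched roughly as follows. This is a simplified model and not the actual Ceph code: the Version, Candidate, and Input types, the bucket_candidates() and pop_back_osd() helpers, and the ancestor label strings are all hypothetical names chosen for illustration, and eversion_t is reduced to an (epoch, version) pair.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Simplified stand-in for a last_update value such as 59'5: (epoch, version).
using Version = std::pair<std::uint32_t, std::uint64_t>;
// Sort key in the shape described in the comment:
// <!in_acting_set, last_update, osd_id>.
using Candidate = std::tuple<bool, Version, int>;

// Hypothetical input record for one candidate OSD.
struct Input {
  int osd_id;
  bool in_acting_set;
  Version last_update;
  std::string ancestor;  // ancestor crush bucket, e.g. a data center label
};

// Sort all candidates, then move them in sorted order into per-ancestor
// lists. Each list keeps the sorted order, so acting set members are first.
std::map<std::string, std::deque<Candidate>> bucket_candidates(
    const std::vector<Input>& inputs) {
  std::vector<std::pair<Candidate, std::string>> sorted;
  for (const auto& in : inputs) {
    sorted.emplace_back(
        Candidate{!in.in_acting_set, in.last_update, in.osd_id}, in.ancestor);
  }
  std::sort(sorted.begin(), sorted.end());

  std::map<std::string, std::deque<Candidate>> buckets;
  for (const auto& entry : sorted) {
    buckets[entry.second].push_back(entry.first);
  }
  return buckets;
}

// The problematic selection: taking from the back returns the candidate
// that sorted last, which is never the acting set member when any
// non-acting candidate is present. Precondition: osds is not empty.
int pop_back_osd(std::deque<Candidate>& osds) {
  int id = std::get<2>(osds.back());
  osds.pop_back();
  return id;
}
```

Feeding in the four candidates from the first log excerpt (17 in the acting set; 3, 6, and 16 not), and assuming they all share one ancestor bucket, the list front is OSD 17 but pop_back_osd() returns OSD 6, which matches the "accepting candidate 6" line.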

I think we resolve this by switching to pop_front(), which is just as simple as it sounds.
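To see why the choice of list end matters for the livelock, here is a minimal simulation under stated assumptions. It is not the real peering code: the Osd struct and the choose() and simulate() functions are hypothetical, all candidates are assumed to sit in one ancestor bucket, and each round simply feeds the chosen OSD back in as the acting set member for the next round.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>
#include <tuple>
#include <utility>
#include <vector>

using Version = std::pair<std::uint32_t, std::uint64_t>;  // (epoch, version)
using Candidate = std::tuple<bool, Version, int>;  // <!in_acting, last_update, id>

// Hypothetical record for a candidate OSD outside the up set.
struct Osd {
  int id;
  Version last_update;
};

// One simplified selection round: build the sorted candidate list and take
// either the front (proposed fix) or the back (behaviour described above).
// Precondition: osds is not empty.
int choose(const std::vector<Osd>& osds, int current_acting,
           bool use_pop_front) {
  std::deque<Candidate> candidates;
  for (const auto& osd : osds) {
    // First element is true for OSDs NOT in the acting set, so that
    // acting set members sort first.
    candidates.emplace_back(osd.id != current_acting, osd.last_update, osd.id);
  }
  std::sort(candidates.begin(), candidates.end());
  const Candidate& picked =
      use_pop_front ? candidates.front() : candidates.back();
  return std::get<2>(picked);
}

// Repeat the selection, using each result as the next round's acting member.
std::vector<int> simulate(const std::vector<Osd>& osds, int initial_acting,
                          int rounds, bool use_pop_front) {
  std::vector<int> history;
  int acting = initial_acting;
  for (int i = 0; i < rounds; ++i) {
    acting = choose(osds, acting, use_pop_front);
    history.push_back(acting);
  }
  return history;
}
```

With OSDs 3, 6, and 17 at 59'5, OSD 16 empty, and OSD 17 initially acting, taking from the back yields 6, 17, 6, 17 over four rounds, the same alternation seen in the logs. Taking from the front yields 17 every round in this model.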

[1]: std::sort() sorts into ascending order, and 0 < 1 means false < true, so the sorting tuple is actually formed as <!in_acting_set, info.last_update, osd_id> to make sure acting set members come first.
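The ordering rule in this footnote can be checked with a small sketch. The Version and SortKey aliases and the make_sort_key() and sorts_before() helpers are hypothetical names for illustration (the real code builds its key via get_osd_ord()), and last_update is simplified to an (epoch, version) pair.

```cpp
#include <cstdint>
#include <tuple>
#include <utility>

// Simplified stand-in for a last_update value such as 59'5.
using Version = std::pair<std::uint32_t, std::uint64_t>;

// Key in the shape the footnote gives: <!in_acting_set, last_update, osd_id>.
using SortKey = std::tuple<bool, Version, int>;

// Negate the flag when building the key: false < true, so an OSD that IS
// in the acting set gets `false` and compares lower than one that is not.
SortKey make_sort_key(bool in_acting_set, Version last_update, int osd_id) {
  return SortKey{!in_acting_set, last_update, osd_id};
}

// True if std::sort() would place key a before key b. std::tuple compares
// lexicographically, so the acting set flag dominates, then last_update,
// then the OSD ID as the final tie-breaker.
bool sorts_before(const SortKey& a, const SortKey& b) { return a < b; }
```

For example, an acting set member at 59'5 sorts before any non-acting OSD regardless of ID, and among non-acting OSDs an empty one (0'0) sorts before one at 59'5.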

Actions #2

Updated by Greg Farnum over 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 44518
Actions #3

Updated by Greg Farnum over 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #4

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53933: pacific: Stretch mode: peering can livelock with acting set changes swapping primary back and forth added
Actions #5

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed