Bug #53824

open

Stretch mode: peering can livelock with acting set changes swapping primary back and forth

Added by Greg Farnum over 2 years ago. Updated almost 2 years ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: Peering
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From https://bugzilla.redhat.com/show_bug.cgi?id=2025800

We're getting repeated swaps in the acting set, with logging like the following (the accepted candidate alternates between osd 6 and osd 17 in consecutive rounds):

calc_replicated_acting_stretch
osd 4 primary accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 4 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 14 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 0 (up) backfill 2.1f( v 59'5 (0'0,59'5] lb MIN local-lis/les=223/224 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
osd 10 (up) accepted 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
up set insufficient, considering remaining osds
acting candidate 17 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 3 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 6 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2315)
other candidate 16 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
accepting candidate 6

calc_replicated_acting_stretch
osd 4 primary accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 4 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 14 (up) accepted 2.1f( empty local-lis/les=0/0 n=0 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 0 (up) backfill 2.1f( v 59'5 (0'0,59'5] lb MIN local-lis/les=223/224 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
osd 10 (up) accepted 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
up set insufficient, considering remaining osds
acting candidate 6 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
other candidate 3 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
other candidate 16 2.1f( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0)
other candidate 17 2.1f( v 59'5 (0'0,59'5] local-lis/les=229/230 n=2 ec=43/43 lis/c=229/229 les/c/f=230/230/0 sis=2316)
accepting candidate 17
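
To check a longer OSD log for the same pattern, a small script along these lines may help. It is only a sketch: it assumes the log contains `accepting candidate N` lines formatted exactly as in the excerpt above, and the helper names (`accepted_candidates`, `is_flip_flopping`) are hypothetical, not part of Ceph.

```python
import re

# Matches lines like "accepting candidate 6" as they appear in the excerpt above.
# Assumption: real logs may carry timestamps or prefixes, so search rather than
# anchor at the start of the line.
ACCEPT_RE = re.compile(r"accepting candidate (\d+)\s*$")


def accepted_candidates(log_text):
    """Return the OSD ids from 'accepting candidate N' lines, in log order."""
    ids = []
    for line in log_text.splitlines():
        match = ACCEPT_RE.search(line)
        if match:
            ids.append(int(match.group(1)))
    return ids


def is_flip_flopping(candidates, min_rounds=2):
    """Heuristic: True if the rounds alternate between exactly two OSDs.

    Every consecutive pair must differ and only two distinct ids may appear.
    A larger min_rounds gives stronger evidence of a livelock.
    """
    if len(candidates) < min_rounds:
        return False
    if len(set(candidates)) != 2:
        return False
    return all(a != b for a, b in zip(candidates, candidates[1:]))


if __name__ == "__main__":
    sample = """calc_replicated_acting_stretch
up set insufficient, considering remaining osds
accepting candidate 6

calc_replicated_acting_stretch
up set insufficient, considering remaining osds
accepting candidate 17
"""
    ids = accepted_candidates(sample)
    print(ids, is_flip_flopping(ids))
```

Two rounds, as in the excerpt, are weak evidence on their own; raising `min_rounds` when scanning a full log makes the heuristic more conservative.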


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #53933: pacific: Stretch mode: peering can livelock with acting set changes swapping primary back and forth | Resolved | Greg Farnum