Bug #57852: osd: unhealthy osd cannot be marked down in time - RADOS - Ceph

Actions

Copy link

Bug #57852

open

osd: unhealthy osd cannot be marked down in time

Added by wencong wan over 1 year ago. Updated about 1 year ago.

Status:

Need More Info

Priority:

Normal

Assignee:

Prashant D

Category:

Target version:

% Done:

Source:

Tags:

Backport:

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(RADOS):

OSD

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

Before an unhealthy osd is marked down by mon, other osd may choose it as
heartbeat peer and then report an incorrect failure time(first_tx) to mon.

reproduce:
Shutdown cluster_network and public_network of an osd node several times.

Files

Download all files

p1.png (63.1 KB) p1.png	ifdown net at 13:10	wencong wan, 10/12/2022 02:13 AM
p2.png (246 KB) p2.png	after 10 minutes,unhealthy osd still keep up status	wencong wan, 10/12/2022 02:15 AM

History
Notes
Property changes

Actions

Copy link

Also available in: Atom PDF

Project

General

Profile

Ceph » RADOS

Custom queries

Bug #57852

osd: unhealthy osd cannot be marked down in time

Updated by Radoslaw Zarzynski over 1 year ago

Updated by wencong wan over 1 year ago

Updated by Radoslaw Zarzynski over 1 year ago

Updated by Radoslaw Zarzynski over 1 year ago

Updated by Prashant D over 1 year ago

Updated by Radoslaw Zarzynski over 1 year ago

Updated by Prashant D about 1 year ago