Bug #1100

closed

osd: marking peers down

Added by Sage Weil almost 13 years ago. Updated almost 13 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: OSD
% Done: 0%

Description

I'm reliably seeing peers mark each other down when they shouldn't on benjamin. There are ~21 osds across 3 nodes, and simply restarting them all starts a storm. Something is broken in the heartbeat exchanges.

The workaround is to temporarily increase osd heartbeat grace until everything is up and then lower it again.
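As a rough illustration of that workaround, the grace period can be raised in ceph.conf while the OSDs restart and then restored afterwards. The value 60 is an arbitrary placeholder, not a figure from this report, and putting the option under [global] assumes that every daemon involved in failure detection should read it; check both against the Ceph version in use.

```ini
; Sketch only: temporary setting while all OSDs are restarting.
; 60 (seconds) is a placeholder value, not a recommendation from this report.
[global]
    osd heartbeat grace = 60
```

Once every OSD is back up and no longer being marked down, remove the line or set it back to its previous value so that real failures are still detected promptly.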

#1

Updated by Sage Weil almost 13 years ago

  • Status changed from New to Resolved