Bug #1100

osd: marking peers down

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
OSD
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

I'm reliably seeing peers mark each other down when they shouldn't on benjamin. There are ~21 OSDs across 3 nodes, and simply restarting them all sets off a storm of down markings. Something is broken in the heartbeat exchanges.

The workaround is to temporarily increase the osd heartbeat grace setting until all OSDs are up, and then lower it again.
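To illustrate why raising the grace period quiets the storm, here is a toy model of grace-based failure detection. It is only a sketch, not the OSD's actual heartbeat code: the HeartbeatTracker class, the peer names, and the 20 s and 60 s values are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatTracker:
    """Toy model (hypothetical): a peer counts as down once it has been
    silent for longer than `grace` seconds."""
    grace: float
    last_seen: dict = field(default_factory=dict)

    def record(self, peer: str, now: float) -> None:
        # Remember when we last heard a heartbeat from this peer.
        self.last_seen[peer] = now

    def down_peers(self, now: float) -> list:
        # Peers whose silence exceeds the grace period.
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.grace)


# Illustrative restart scenario: osd.1 is slow to resume heartbeats.
tracker = HeartbeatTracker(grace=20.0)   # illustrative value, not from the report
tracker.record("osd.0", now=0.0)
tracker.record("osd.1", now=0.0)
tracker.record("osd.0", now=25.0)        # osd.0 keeps reporting; osd.1 does not

short_grace = tracker.down_peers(now=30.0)   # osd.1 silent for 30 s > 20 s

tracker.grace = 60.0                         # temporarily raised grace
long_grace = tracker.down_peers(now=30.0)    # 30 s of silence is now tolerated

print(short_grace, long_grace)
```

With the shorter grace the slow peer is reported down; with the longer one it is not. This only hides the symptom, which matches the report's framing of it as a workaround rather than a fix.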

History

#1 Updated by Sage Weil over 9 years ago

  • Status changed from New to Resolved
