Bug #16365

closed

Better network partition detection

Added by Peter Sabaini almost 8 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Regression: No
Severity: 2 - major

Description

We had a situation where, due to hardware issues, the network became lossy for one OSD node without Ceph's monitoring or heartbeating detecting it. The loss came with a twist: the loss rate rose with packet size. With a payload of only a few bytes, loss was hardly noticeable, but the loss rate rose roughly proportionally with payload size, reaching 60-80% at payloads of up to 7 kB (we're using jumbo frames with a max. MTU of 8k).
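To put rough numbers on this, here is a minimal sketch of a linear loss model. It is only an illustration: the 7000-byte ceiling and the 70% peak loss are assumed values picked from the ranges above, and `estimated_loss_rate` is a made-up helper, not anything from Ceph.

```python
def estimated_loss_rate(payload_bytes, max_payload_bytes=7000, max_loss=0.7):
    """Estimate packet loss under a simple linear model.

    Assumption: loss grows proportionally with payload size and reaches
    max_loss at max_payload_bytes. Both defaults are illustrative values
    taken from the ranges in the report, not measurements.
    """
    if payload_bytes < 0:
        raise ValueError("payload_bytes must be non-negative")
    # Clamp at the largest observed payload so the estimate never exceeds max_loss.
    fraction = min(payload_bytes, max_payload_bytes) / max_payload_bytes
    return max_loss * fraction


if __name__ == "__main__":
    for size in (100, 1000, 7000):
        rate = estimated_loss_rate(size)
        print(f"{size:>5} B payload: ~{rate:.0%} estimated loss")
```

Under these assumptions a message of about 100 bytes sees roughly 1% loss while a 7 kB payload sees 70%, which is the gap between what a small probe experiences and what bulk reads experience.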

With loss rates that high, most reads from the OSDs basically stalled, yet the MONs happily kept reporting the OSDs as up/in. From my reading of the docs and code (my C++ is weak++, sadly), I can see why the heartbeating wouldn't catch this: it sends relatively small MOSDPing messages on separate channels to detect network partitions, and those would have passed largely unimpeded.

Still, the pathological state of those OSDs could have been detected by the MON (packets queueing up, TCP timeouts). From the user's point of view, it would have been hugely desirable to just mark those OSDs as down, or at least to warn about unresponsive OSDs.
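As a rough sketch of the kind of check I have in mind: probe each peer with several payload sizes and flag it when large payloads are being lost even though small ones get through. Everything here is hypothetical (`ProbeStats`, `classify_peer`, and both thresholds are invented for illustration); it is not how Ceph's heartbeating works and not a proposed implementation.

```python
from dataclasses import dataclass


@dataclass
class ProbeStats:
    """Hypothetical per-payload-size probe counters for one peer."""
    payload_bytes: int
    sent: int
    received: int

    @property
    def loss_rate(self):
        # No probes sent means no evidence of loss.
        if self.sent == 0:
            return 0.0
        return 1.0 - self.received / self.sent


def classify_peer(stats, warn_threshold=0.2, down_threshold=0.9):
    """Classify a peer as "ok", "degraded" or "down".

    Thresholds are illustrative assumptions, not tuned values.
    """
    if not stats:
        raise ValueError("need at least one ProbeStats entry")
    ordered = sorted(stats, key=lambda s: s.payload_bytes)
    # If even the smallest probes are mostly lost, treat the peer as unreachable.
    if ordered[0].loss_rate >= down_threshold:
        return "down"
    # Small probes pass, but some payload size is lossy: size-dependent loss.
    if any(s.loss_rate >= warn_threshold for s in ordered):
        return "degraded"
    return "ok"


if __name__ == "__main__":
    observed = [ProbeStats(64, 100, 99), ProbeStats(7000, 100, 30)]
    print(classify_peer(observed))
```

With counters like the ones in the usage example (small probes nearly all answered, 7 kB probes mostly lost), the peer would be reported as degraded rather than healthy, which is the warning that was missing in our case.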

#1

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Status changed from New to Resolved

We're switching to 2KB heartbeat packets now for other reasons. I don't think there's much else we can do here, practically speaking.
