Bug #16365: Better network partition detection - RADOS - Ceph

Added by Peter Sabaini almost 8 years ago. Updated almost 7 years ago.

Status: Resolved
Priority: Normal
Source: other
Severity: 2 - major
Affected Versions: Ceph - v10.2.0

Description

Due to hardware issues, our network became lossy for one OSD node, and the Ceph monitoring/heartbeating did not detect this. The loss came with a twist: the loss rate rose with packet size. With only a few bytes of payload, loss was hardly noticeable, but the loss rate rose quasi-proportionally, reaching 60-80% at payload sizes of up to 7 kB (we're using jumbo frames with a max. MTU of 8k).
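A size-dependent loss pattern like the one described above can be checked with pings of increasing payload size. The following is a minimal sketch, not the method used in the original report: it assumes a Linux host with the iputils `ping` (flags `-q`, `-c`, `-s`, `-M do`), the function names are made up for illustration, and the host name in the usage comment is hypothetical. ICMP probes only approximate what Ceph's TCP traffic experiences.

```python
import re
import subprocess

# Matches the summary line of iputils ping, e.g. "..., 35% packet loss, time 19020ms"
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def build_ping_cmd(host, payload_bytes, count=20):
    """Build an iputils ping command for one payload size.

    -M do sets the don't-fragment bit, so each probe travels as a single
    frame (payloads must stay below the path MTU minus 28 header bytes).
    """
    return ["ping", "-q", "-c", str(count),
            "-s", str(payload_bytes), "-M", "do", host]


def parse_loss_percent(ping_output):
    """Extract the packet loss percentage from ping's summary, or None."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def probe(host, sizes=(56, 1400, 4000, 7000), count=20):
    """Return {payload_size: loss_percent} for each probed size."""
    results = {}
    for size in sizes:
        proc = subprocess.run(build_ping_cmd(host, size, count),
                              capture_output=True, text=True)
        results[size] = parse_loss_percent(proc.stdout)
    return results


# Usage (hypothetical host name):
#   for size, loss in probe("osd-node-01").items():
#       print(f"{size:>5} B payload: {loss}% loss")
```

If the loss percentage climbs steadily with payload size, the link is degraded in a way that small probes alone would not reveal.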

With those high loss rates, most reads from OSDs would basically stall, yet the MONs would still happily report the OSDs as up/in. From my reading of the docs/code (my C++ is weak++, sadly), I can see why the heartbeating wouldn't catch this: it sends relatively small MOSDPing messages on separate channels to detect network partitions, and those would have passed largely unimpeded.

Still, the pathological state of those OSDs could have been detected by the MON (packets queueing up, TCP timeouts). From the user's POV, it would have been hugely desirable to just mark those OSDs as down, or at least to warn about unresponsive OSDs.

Updated by Greg Farnum almost 7 years ago

Project changed from Ceph to RADOS
Status changed from New to Resolved

We're switching to 2KB heartbeat packets now for other reasons. I don't think there's much else we can do here, practically speaking.
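For a rough sense of how heartbeat size interacts with the loss pattern from the report, here is a back-of-the-envelope sketch. It is an illustration under stated assumptions, not Ceph's behaviour: the linear loss curve (0% at 0 bytes, 70% at 7000 bytes) is an assumed reading of "quasi-proportionally up to 60-80%", the 100-byte size for a small ping is a guess, losses are treated as independent, and TCP retransmission is ignored.

```python
def loss_rate(payload_bytes, max_payload=7000, max_loss=0.7):
    """Assumed linear loss model: 0 at 0 bytes, max_loss at max_payload."""
    clamped = min(max(payload_bytes, 0), max_payload)
    return max_loss * clamped / max_payload


def p_all_lost(payload_bytes, attempts):
    """Probability that `attempts` consecutive messages of this size are
    all lost, assuming independent losses."""
    return loss_rate(payload_bytes) ** attempts


if __name__ == "__main__":
    # 100 B is an assumed small-ping size; 2048 B is the 2 KB heartbeat;
    # 7000 B is the large payload from the report.
    for size in (100, 2048, 7000):
        print(f"{size:>5} B: loss per message {loss_rate(size):.1%}, "
              f"3 in a row lost {p_all_lost(size, 3):.2%}")
```

Under these assumptions a 2 KB heartbeat is lost about 20% of the time, against about 1% for a 100-byte one, but three consecutive losses remain rare (under 1%). Larger heartbeats would therefore see more of the degradation, yet still much less than the roughly 70% loss that large data transfers suffer.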
