Feature #18438
Configurable OSD Heartbeat packet size (MTU)
Status: open
Description
Hi,
Firstly, apologies, as my understanding here is fairly high-level.
During maintenance of our Ceph cluster, we moved one of our OSD nodes to a new rack. When this node came back online, the whole cluster came to a halt.
This turned out to be because jumbo frames were not enabled along part of the path to the new rack's connection back to the previous rack (and other infrastructure), so it was essentially a networking issue. (For the moment we only run jumbo frames on our replication network, due to client-side issues with the public network.)
However, I would have thought that if the OSDs were unable to communicate properly, they would take themselves out of the CRUSH map or stop the service.
They stayed online, apparently because the heartbeat packets are under 1500 bytes and so are not fragmented or dropped the way a larger frame would be. This means that the rest of the cluster, still attempting to sync, grinds to a halt trying to talk to the node with large frames (which are dropped).
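To illustrate the behaviour described above, here is a toy model in Python. It is only a sketch: the heartbeat size of 1000 bytes and the jumbo MTU of 9000 are assumed values for illustration, the function names are hypothetical, and it assumes the misconfigured hop silently drops anything larger than its MTU.

```python
# Toy model (not Ceph code): a path silently drops any packet
# that is larger than the smallest MTU along the way.
PATH_MTU = 1500    # the hop where jumbo frames are not enabled
JUMBO_MTU = 9000   # assumed MTU configured on the OSD interfaces


def delivered(packet_bytes: int, path_mtu: int = PATH_MTU) -> bool:
    """Return True if a packet of this size makes it across the path."""
    return packet_bytes <= path_mtu


def probe_detects_fault(probe_bytes: int, path_mtu: int = PATH_MTU) -> bool:
    """A probe only reveals the MTU fault if it is too big to get through."""
    return not delivered(probe_bytes, path_mtu)


small_heartbeat = 1000       # assumed size, under 1500 bytes
jumbo_heartbeat = JUMBO_MTU  # a heartbeat padded up to the interface MTU

print(probe_detects_fault(small_heartbeat))  # False: OSD still looks healthy
print(probe_detects_fault(jumbo_heartbeat))  # True: fault would be noticed
```

In this model the small heartbeat always gets through, so the peer is never marked down even though full-size replication traffic cannot reach it.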
It's fair to say that this isn't really a bug but more of a feature request: make Ceph more resilient to network issues by allowing the heartbeat packet size to be configured, so that larger heartbeats can be sent.
To reiterate, a single misconfigured switch port could cause a very big Ceph outage!
We used 'ping <ip> -M do -s 1450' (and then with -s 1650) to diagnose the MTU fault.
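For anyone repeating this check, the arithmetic behind the payload sizes can be sketched as follows. This assumes IPv4 without IP options (20-byte header) plus the 8-byte ICMP header, and the Linux iputils ping, where -M do forbids fragmentation and -s sets the ICMP payload size. The helper names and the address 192.0.2.10 are placeholders, not anything from our cluster.

```python
# Sketch: work out the largest ping payload that fits a given MTU.
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes


def max_ping_payload(mtu: int) -> int:
    """Largest value for 'ping -s' that still fits in one frame of this MTU."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def fits(payload: int, mtu: int) -> bool:
    """True if a ping with this payload fits the MTU without fragmenting."""
    return payload + IPV4_HEADER + ICMP_HEADER <= mtu


def ping_command(host: str, mtu: int) -> str:
    """Build a ping command that probes the path at exactly this MTU."""
    return f"ping -M do -s {max_ping_payload(mtu)} -c 3 {host}"


print(max_ping_payload(1500))  # 1472
print(max_ping_payload(9000))  # 8972
print(fits(1450, 1500))        # True: this ping got through
print(fits(1650, 1500))        # False: this ping was dropped
print(ping_command("192.0.2.10", 9000))
```

The two sizes we tried sit on either side of a 1500-byte MTU, which is why the first ping succeeded and the second did not.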
Appreciate any feedback!
Thanks.
Ross
Updated by Ross Martyn over 7 years ago
Tried to remove the 'Target Version'... Not able to!
Updated by Greg Farnum almost 7 years ago
- Has duplicate Feature #20087: OSD: Add heartbeat message for Jumbo Frames(MTU 9000) added