Bug #21813

closed

OSD bind to IPv6 link-local address

Added by Wido den Hollander over 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Category:
OSD
Target version:
-
% Done:
0%

Source:
Tags:
messenger,luminous,osd
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Just observed this behavior on a cluster when upgrading to Luminous:

osd.2 up   in  weight 1 up_from 175547 up_thru 175711 down_at 175546 last_clean_interval [175531,175545) [2a04:XXX:1:5:ec4:7aff:fe1e:44c8]:6808/2302 [fe80::ec4:7aff:fe1e:44c8%bond0.204]:6828/1002302 [2a04:XXX:1:5:ec4:7aff:fe1e:44c8]:6828/1002302 [2a04:XXX:1:5:ec4:7aff:fe1e:44c8]:6829/1002302 exists,up 7bdbcb99-fd7f-4880-859a-9e54d26c96da
osd.5 up   in  weight 1 up_from 175700 up_thru 175712 down_at 175699 last_clean_interval [175527,175698) [fe80::ec4:7aff:fe1e:44c8%bond0.204]:6800/1658 [2a04:XXX:1:5:ec4:7aff:fe1e:44c8]:6809/1001658 [fe80::ec4:7aff:fe1e:44c8%bond0.204]:6809/1001658 [fe80::ec4:7aff:fe1e:44c8%bond0.204]:6810/1001658 exists,up c3e13f69-43b6-4441-922b-aef5d2bfe262
osd.30 up   in  weight 1 up_from 175677 up_thru 175845 down_at 175676 last_clean_interval [175665,175675) [fe80::ec4:7aff:fe1e:3f3c%bond0.204]:6800/1662 [2a04:XXX:1:5:ec4:7aff:fe1e:3f3c]:6808/1001662 [fe80::ec4:7aff:fe1e:3f3c%bond0.204]:6808/1001662 [fe80::ec4:7aff:fe1e:3f3c%bond0.204]:6809/1001662 exists,up 3c1aeb5b-0ace-49cf-84f8-c85dfedd7c2f

In this case, OSDs 2, 5 and 30 bound to a link-local IPv6 address (fe80::/10) after they booted.
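To spot affected OSDs quickly, the dump above can be scanned for link-local endpoints. This is only a rough sketch: the function names and the regular expression are my own, and it assumes the `[address%zone]:port/nonce` layout shown in the lines above. Redacted addresses (such as 2a04:XXX:...) fail to parse and are treated as not link-local.

```python
import ipaddress
import re

# Matches endpoints such as [fe80::1%bond0.204]:6828/1002302.
# Group 1 is the address without the optional %zone suffix.
ENDPOINT_RE = re.compile(r"\[([^\]\[%\s]+)(?:%[^\]\s]+)?\]:\d+/\d+")


def is_link_local(addr):
    """Return True if addr parses as an IPv6 address inside fe80::/10."""
    try:
        return ipaddress.IPv6Address(addr).is_link_local
    except ValueError:
        # Redacted or malformed addresses cannot be classified.
        return False


def link_local_osds(osd_dump):
    """Map each OSD name to the link-local endpoints found on its line."""
    flagged = {}
    for line in osd_dump.splitlines():
        fields = line.split()
        if not fields or not fields[0].startswith("osd."):
            continue
        hits = [
            m.group(0)
            for m in ENDPOINT_RE.finditer(line)
            if is_link_local(m.group(1))
        ]
        if hits:
            flagged[fields[0]] = hits
    return flagged
```

Feeding it the text of the dump returns a dictionary keyed by OSD name, so an empty result means no OSD advertised a link-local endpoint.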

This is probably a race condition: the OSDs start before the global unicast 2a04:X address is online.

These fe80 addresses should, however, not qualify as addresses to bind to: they can't be routed, so binding to them breaks traffic.


Related issues 1 (0 open, 1 closed)

Copied to Ceph - Backport #23501: luminous: OSD bind to IPv6 link-local address (Resolved, Wido den Hollander)
Actions #1

Updated by Sage Weil over 6 years ago

  • Status changed from New to 4

We do allow binding to 127.0.0.1 (and do that frequently for vstart.sh for devs). Is it okay to only allow loopback testing on ipv4 and not on ipv6?

Actions #2

Updated by Wido den Hollander over 6 years ago

Sage Weil wrote:

We do allow binding to 127.0.0.1 (and do that frequently for vstart.sh for devs). Is it okay to only allow loopback testing on ipv4 and not on ipv6?

Yes, you can allow localhost for IPv6; that would be ::1.

But link-local addresses are only valid on the local link (the Layer 2 segment) and can't be routed. They should not be selected. It could be made configurable, but it shouldn't be the default.

fe80::/10 is reserved for link-local with IPv6.
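As a rough illustration of the rule proposed here (loopback stays allowed, fe80::/10 is rejected), something like the following would do. It is a Python sketch, not Messenger code, and `may_bind` is a made-up name.

```python
import ipaddress

# fe80::/10 covers fe80:: through febf:ffff:...:ffff.
LINK_LOCAL_V6 = ipaddress.ip_network("fe80::/10")


def may_bind(addr):
    """Return False only for IPv6 link-local addresses.

    Loopback (127.0.0.1 and ::1) stays allowed so that local
    development setups keep working.
    """
    # Drop an optional zone suffix such as %bond0.204 before parsing.
    ip = ipaddress.ip_address(addr.split("%", 1)[0])
    if ip.version == 6 and ip in LINK_LOCAL_V6:
        return False
    return True
```

Note that the check is on the /10 prefix, so febf::1 is rejected as well, while fec0::1 is outside the range and passes.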

I've been looking to patch this, but I can't find the loop where the Messenger selects the available addresses.

Actions #3

Updated by Wido den Hollander about 6 years ago

I just noticed this again on a cluster which is running with IPv6 and Jewel:

Feb  2 13:42:19 ceph04 ceph-osd[3704]: 2018-02-02 13:42:19.287735 7f8ea9949700 -1 log_channel(cluster) log [ERR] : map e15776 had wrong cluster addr ([fe80::a236:9fff:fed8:54c0%enp5s0f1]:6801/3704 != my [2a05:XXXX:ff01:1:a236:9fff:fed8:54c0]:6801/3704)
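For anyone who wants to grep their logs for the same symptom, here is a small sketch that pulls both addresses out of a line like the one above. The pattern is derived only from this single log line, and `parse_wrong_addr` is a hypothetical helper name, so treat it as a starting point.

```python
import re

# Derived from the single log line quoted above; other variants of the
# message may need a different pattern.
WRONG_ADDR_RE = re.compile(r"had wrong (.+?) addr \((\S+) != my (\S+)\)")


def parse_wrong_addr(log_line):
    """Return (kind, address in map, address the OSD expects), or None."""
    match = WRONG_ADDR_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)
```

If the address taken from the map starts with `[fe80:`, the OSD registered with its link-local address at some earlier boot.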

The cluster in this case uses Stateless Address Autoconfiguration (SLAAC), and the global addresses are not yet online when the OSDs start after boot, so the OSDs only find their link-local address.

We should simply ignore fe80::/10 addresses when selecting an IPv6 address.
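To make the idea concrete, here is a sketch of the selection logic in Python. It is not the actual Ceph patch; `pick_ipv6_address`, `wait_for_ipv6_address` and the `list_addresses` callback are hypothetical names. The point is that when only a link-local address is present, the selection returns nothing instead of falling back to fe80, and the caller can retry until SLAAC has configured a routable address.

```python
import ipaddress
import time


def pick_ipv6_address(candidates):
    """Return the first IPv6 candidate that is not link-local, else None."""
    for raw in candidates:
        # Drop an optional zone suffix such as %enp5s0f1 before parsing.
        bare = raw.split("%", 1)[0]
        ip = ipaddress.ip_address(bare)
        if ip.version == 6 and not ip.is_link_local:
            return bare
    return None


def wait_for_ipv6_address(list_addresses, attempts=5, delay=1.0,
                          sleep=time.sleep):
    """Poll list_addresses() until a usable IPv6 address shows up.

    list_addresses is a caller-supplied function returning the addresses
    currently configured on the host. Returns None if no usable address
    appears within the given number of attempts.
    """
    for attempt in range(attempts):
        chosen = pick_ipv6_address(list_addresses())
        if chosen is not None:
            return chosen
        if attempt + 1 < attempts:
            sleep(delay)
    return None
```

Whether to wait, fail, or log and continue when no address qualifies is a design choice; the sketch only shows that link-local addresses never win the selection.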

Actions #4

Updated by Kefu Chai about 6 years ago

  • Status changed from 4 to Fix Under Review
  • Assignee set to Wido den Hollander
  • Backport set to luminous
Actions #5

Updated by Kefu Chai about 6 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Nathan Cutler about 6 years ago

  • Copied to Backport #23501: luminous: OSD bind to IPv6 link-local address added
Actions #8

Updated by Nathan Cutler about 6 years ago

  • Status changed from Pending Backport to Resolved