Feature #10029

closed

Retry binding on IPv6 address if not available

Added by Wido den Hollander over 9 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
ipv6
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

On systems using IPv6, the IPv6 address may not yet be available when a MON or OSD boots.

This can have multiple causes:
  • DAD still in progress (Duplicate Address Detection)
  • SLAAC is still in progress (Stateless Autoconfiguration)

When an interface comes up it can take up to a couple of seconds before IPv6 connectivity is available or even an address is assigned to the interface.
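
One way to see whether DAD is still running on a Linux host is to look for addresses the kernel still marks as tentative. The following is a minimal sketch, assuming a Linux system that exposes /proc/net/if_inet6; the function names are hypothetical and not part of Ceph.

```python
import ipaddress

IFA_F_TENTATIVE = 0x40  # Linux address flag: DAD has not completed yet


def tentative_addresses(if_inet6_text):
    """Return (interface, address) pairs that are still tentative.

    Each line of /proc/net/if_inet6 has six whitespace-separated fields:
    address (32 hex digits), ifindex, prefix length, scope, flags, name.
    """
    result = []
    for line in if_inet6_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        flags = int(fields[4], 16)
        if flags & IFA_F_TENTATIVE:
            addr = ipaddress.IPv6Address(int(fields[0], 16))
            result.append((fields[5], str(addr)))
    return result


def read_tentative(path="/proc/net/if_inet6"):
    """Read the kernel's address table and return the tentative entries."""
    with open(path) as handle:
        return tentative_addresses(handle.read())
```

An empty list from read_tentative() only means that no address is waiting on DAD at that moment; it does not show that SLAAC has already assigned an address.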

systemd/upstart/sysvinit will start the daemons as soon as they think the network is ready, but it might be that IPv6 is not configured yet.

Monitors and OSDs then fail to start: they cannot bind to the IPv6 address, so they exit.

It would be useful if the daemons retried the bind after a couple of seconds:

1. Try to bind
2. If it fails, wait 5 seconds
3. Try to bind again

We might add a short loop here with a configurable delay and number of retries; that would make it flexible and useful for most situations.

This only applies to IPv6 though, so only when 'ms_bind_ipv6' is set to true.
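
The bind, wait, retry steps above can be sketched as follows. This is an illustration in Python rather than the Ceph C++ code; the function name and the `retries`, `delay`, and `sleep` parameters are hypothetical and are not actual Ceph options. It retries only on EADDRNOTAVAIL, which on Linux is error 99 (Cannot assign requested address), and passes any other error through immediately.

```python
import errno
import socket
import time


def bind_with_retry(sock, address, retries=3, delay=5.0, sleep=time.sleep):
    """Bind sock to address, retrying while the address is not yet available.

    retries is the number of extra attempts after the first one and delay is
    the wait in seconds between attempts. Returns the number of attempts made.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            sock.bind(address)
            return attempts
        except OSError as exc:
            # Only the "address not available" case is worth waiting for;
            # anything else (for example EADDRINUSE) is raised right away.
            if exc.errno != errno.EADDRNOTAVAIL or attempts > retries:
                raise
            sleep(delay)


def open_ipv6_listener(host, port, retries=3, delay=5.0):
    """Create an IPv6 TCP socket and bind it with the retry loop above."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    try:
        bind_with_retry(sock, (host, port), retries=retries, delay=delay)
    except OSError:
        sock.close()
        raise
    return sock
```

With the defaults shown here, a daemon would give up after four attempts spread over roughly 15 seconds, which covers the couple of seconds that DAD or SLAAC typically needs.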

Actions #1

Updated by Wido den Hollander over 9 years ago

I started playing with this a bit (no commits yet). I simply loop in SimpleMessenger's Accepter.cc and retry the bind a couple of times before giving up.

For IPv4 there is the net.ipv4.ip_nonlocal_bind sysctl, but that does not exist for IPv6.

A work-around would be to disable DAD on the interfaces, but that isn't the best way imho.

There are many posts online from people running into this issue. It is not limited to Ceph; the same goes for Nginx, for example.

Actions #2

Updated by Wido den Hollander over 9 years ago

Logs I'm seeing on a monitor when it boots:

2014-12-08 13:04:16.291838 7f1fd75ef7c0  0 ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3), process ceph-mon, pid 1897
2014-12-08 13:04:16.473408 7f1fd75ef7c0  0 starting mon.srv-51d5-11 rank 1 at [XXXX:XXXX:1:1:ec4:7aff:fe1e:390e]:6789/0 mon_data /var/lib/ceph/mon/ceph-srv-51d5-11 fsid ada2c7ae-2483-4428-a159-1a20fe2a579d
2014-12-08 13:04:16.473445 7f1fd75ef7c0 -1 accepter.accepter.bind unable to bind to [XXXX:XXXX:1:1:ec4:7aff:fe1e:390e]:6789: (99) Cannot assign requested address
2014-12-08 13:04:16.473457 7f1fd75ef7c0 -1 unable to bind monitor to [XXXX:XXXX:1:1:ec4:7aff:fe1e:390e]:6789/0
Actions #3

Updated by Samuel Just over 9 years ago

  • Status changed from New to Resolved
Actions #4

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)