Bug #6380

closed

Monitor tries to bind on IPv6 address while not available yet after boot

Added by Wido den Hollander over 10 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Normal
Category: Monitor
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Not sure yet what the problem is, but I'm creating an issue so it's documented.

When running IPv6-only and binding the monitor to an IPv6 address, the monitor doesn't come online after boot.

This is what the logs show:

2013-09-23 11:46:08.037723 7f7110b13780 -1 accepter.accepter.bind unable to bind to [2a00:XXX:XXX:200::6789:2]:6789: Cannot assign requested address

I think this has something to do with IPv6 DAD (Duplicate Address Detection), where it takes a couple of seconds for the address to actually become available on the NIC.

I'll try to find the root cause.

Actions #1

Updated by Sage Weil over 10 years ago

We've seen similar issues on Fedora with IPv4; there it was just a problem with the init/systemd ordering (NetworkManager vs normal network service). This is Ubuntu, I assume? I would expect the IP to be fully configured by the time the network service finishes and/or we get to runlevel 2 (where ceph-all starts)...

Actions #2

Updated by Wido den Hollander over 10 years ago

Sage Weil wrote:

We've seen similar issues on Fedora with IPv4; there it was just a problem with the init/systemd ordering (NetworkManager vs normal network service). This is Ubuntu, I assume? I would expect the IP to be fully configured by the time the network service finishes and/or we get to runlevel 2 (where ceph-all starts)...

This is indeed Ubuntu, but with IPv6 there is something different.

When you add an IPv6 address to the NIC it is attached immediately, but it takes about 3 to 5 seconds before it becomes functional (and bind-able) due to the Duplicate Address Detection the kernel performs.

So the network service exits, booting continues, and within those 3 to 5 seconds the monitor starts and tries to bind.

That is my educated guess for now.
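
One way to check this guess on Linux is to look for addresses that the kernel still flags as tentative, meaning DAD has not finished for them. The sketch below is not part of Ceph. It parses text in the format of /proc/net/if_inet6 and assumes that the flags column is hexadecimal and that IFA_F_TENTATIVE is 0x40, as defined in linux/if_addr.h. The function name tentative_addresses is made up for this illustration.

```python
import ipaddress

# Assumption: value of IFA_F_TENTATIVE taken from linux/if_addr.h.
IFA_F_TENTATIVE = 0x40


def tentative_addresses(if_inet6_text):
    """Return (address, interface) pairs that are still marked tentative.

    Expects text in the /proc/net/if_inet6 layout, one address per line:
    address, interface index, prefix length, scope, flags, interface name.
    """
    result = []
    for line in if_inet6_text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        raw_addr, _ifindex, _prefixlen, _scope, flags, ifname = fields
        if int(flags, 16) & IFA_F_TENTATIVE:
            # The address column is 32 hex digits without colons.
            addr = ipaddress.IPv6Address(int(raw_addr, 16))
            result.append((str(addr), ifname))
    return result
```

Feeding it the contents of /proc/net/if_inet6 right after the network service exits should list the monitor address if DAD is still running at that moment.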

Actions #3

Updated by Wido den Hollander over 10 years ago

So I did a small test:

[mon]
    pre start command = "sleep 3"

That works: the mon now starts cleanly and binds to the IPv6 address.

So it seems that when the monitor tries to start, the IPv6 address isn't available yet, and this short sleep fixes that. 1 second might work, but I only tried with 3 seconds.
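
A fixed sleep is a guess at how long DAD takes. A possible alternative is to poll until a bind on the address succeeds, with an upper bound on the wait. The following is only a sketch of that idea and was not tested as part of this issue. The names try_bind and wait_until_bindable are hypothetical, and wiring such a helper into the pre start command hook is an assumption.

```python
import errno
import socket
import time


def try_bind(host, port):
    """Return True if a TCP socket can bind to (host, port) over IPv6.

    Returns False only for EADDRNOTAVAIL ("Cannot assign requested
    address"); any other error, such as EADDRINUSE, is re-raised.
    """
    with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRNOTAVAIL:
                return False
            raise
    return True


def wait_until_bindable(host, port, timeout=10.0, interval=0.5,
                        probe=try_bind, sleep=time.sleep):
    """Poll probe(host, port) until it succeeds or timeout seconds pass.

    probe and sleep are injectable so the loop can be exercised without
    a real network interface.
    """
    deadline = time.monotonic() + timeout
    while True:
        if probe(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

The probe socket is closed before the function returns, so the port is free again when the monitor itself starts. This only makes sense as a pre-start check: if the monitor is already running on that port, try_bind raises instead of returning.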

Actions #4

Updated by Wido den Hollander about 8 years ago

  • Status changed from New to Resolved