Bug #46845

closed

Newly orchestrated OSD fails with "unable to find any IPv4 address in networks '2001:db8:11d::/120'" with ms_bind_ipv6=true

Added by Daniël Vos almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: Matthew Oliver
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I just started deploying 60 OSDs to my new 15.2.4 Octopus IPv6 cephadm cluster. I applied the spec for the OSDs and the orchestrator started creating them. Unfortunately, all 60 OSDs crashed at startup with the following message: "unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''"

ms_bind_ipv6 is set to true.

-- The job identifier is 14258.
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-22
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-24e819f4-9089-48ae-b817-014a29addf23/osd-data-0ccc10ee-018d-43e8-8350-6ea1dd67102e --path /var/lib/ceph/osd/ceph-22 --no-mon-config
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/ln -snf /dev/ceph-24e819f4-9089-48ae-b817-014a29addf23/osd-data-0ccc10ee-018d-43e8-8350-6ea1dd67102e /var/lib/ceph/osd/ceph-22/block
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-22/block
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/chown -R ceph:ceph /dev/mapper/ceph--24e819f4--9089--48ae--b817--014a29addf23-osd--data--0ccc10ee--018d--43e8--8350--6ea1dd67102e
Aug 06 09:21:01 node3.example.net bash[64671]: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-22
Aug 06 09:21:01 node3.example.net bash[64671]: --> ceph-volume lvm activate successful for osd ID: 22
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.465+0000 7fee3e813f40  0 set uid:gid to 167:167 (ceph:ceph)
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.465+0000 7fee3e813f40  0 ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable), process ceph-osd, pid 1
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.465+0000 7fee3e813f40  0 pidfile_write: ignore empty --pid-file
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev create path /var/lib/ceph/osd/ceph-22/block type kernel
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev(0x562f2f600000 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev(0x562f2f600000 /var/lib/ceph/osd/ceph-22/block) open size 1000203091968 (0xe8e0c00000, 932 GiB) block_size 4096 (4 KiB) non-rotational discard supported
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bluestore(/var/lib/ceph/osd/ceph-22) _set_cache_sizes cache_size 3221225472 meta 0.4 kv 0.4 data 0.2
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev create path /var/lib/ceph/osd/ceph-22/block type kernel
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev(0x562f2f600700 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev(0x562f2f600700 /var/lib/ceph/osd/ceph-22/block) open size 1000203091968 (0xe8e0c00000, 932 GiB) block_size 4096 (4 KiB) non-rotational discard supported
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bluefs add_block_device bdev 1 path /var/lib/ceph/osd/ceph-22/block size 932 GiB
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.469+0000 7fee3e813f40  1 bdev(0x562f2f600700 /var/lib/ceph/osd/ceph-22/block) close
Aug 06 09:21:01 node3.example.net bash[64907]: debug 2020-08-06T07:21:01.773+0000 7fee3e813f40  1 bdev(0x562f2f600000 /var/lib/ceph/osd/ceph-22/block) close
Aug 06 09:21:02 node3.example.net bash[64907]: debug 2020-08-06T07:21:02.037+0000 7fee3e813f40  1  objectstore numa_node 0
Aug 06 09:21:02 node3.example.net bash[64907]: debug 2020-08-06T07:21:02.037+0000 7fee3e813f40  0 starting osd.22 osd_data /var/lib/ceph/osd/ceph-22 /var/lib/ceph/osd/ceph-22/journal
Aug 06 09:21:02 node3.example.net bash[64907]: debug 2020-08-06T07:21:02.037+0000 7fee3e813f40 -1 unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''
Aug 06 09:21:02 node3.example.net bash[64907]: debug 2020-08-06T07:21:02.037+0000 7fee3e813f40 -1 unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''
Aug 06 09:21:02 node3.example.net bash[64907]: debug 2020-08-06T07:21:02.037+0000 7fee3e813f40 -1 Failed to pick public address.
Aug 06 09:21:02 node3.example.net systemd[1]: ceph-d77f7c4a-d656-11ea-95cb-531234b0f844@osd.22.service: Main process exited, code=exited, status=1/FAILURE

I double-checked that ms_bind_ipv6 was set to true, and it is.

While searching for ms_bind, I noticed that an ms_bind_ipv4 option also exists and that it was set to true (the default). When I set it to false, the OSDs boot. When I switch ms_bind_ipv4 back to the default (true), the OSDs cannot start.

ms_bind_ipv4 set to false (for OSD only):

Aug 06 09:55:22 node3.example.net bash[66959]: debug 2020-08-06T07:55:22.013+0000 7f54b86daf40  0 starting osd.22 osd_data /var/lib/ceph/osd/ceph-22 /var/lib/ceph/osd/ceph-22/journal
Aug 06 09:55:22 node3.example.net bash[66959]: debug 2020-08-06T07:55:22.033+0000 7f54b86daf40  0 load: jerasure load: lrc load: isa
Aug 06 09:55:22 node3.example.net bash[66959]: debug 2020-08-06T07:55:22.033+0000 7f54b86daf40  1 bdev create path /var/lib/ceph/osd/ceph-22/block type kernel
Aug 06 09:55:22 node3.example.net bash[66959]: debug 2020-08-06T07:55:22.033+0000 7f54b86daf40  1 bdev(0x55c143d6a000 /var/lib/ceph/osd/ceph-22/block) open path /var/lib/ceph/osd/ceph-22/block
...snip...
Aug 06 09:55:25 node3.example.net bash[66959]: debug 2020-08-06T07:55:25.628+0000 7f54a21c7700  1 osd.22 88 state: booting -> active

With ms_bind_ipv4 back at the default value (true), the OSD fails to start again:

Aug 06 10:10:43 node3.example.net bash[70455]: debug 2020-08-06T08:10:43.617+0000 7f78b53d3f40  0 starting osd.22 osd_data /var/lib/ceph/osd/ceph-22 /var/lib/ceph/osd/ceph-22/journal
Aug 06 10:10:43 node3.example.net bash[70455]: debug 2020-08-06T08:10:43.617+0000 7f78b53d3f40 -1 unable to find any IPv4 address in networks '2001:db8:11d::/120' interfaces ''

To be sure this was the only thing in the way, I tried it two more times. I can confirm that with ms_bind_ipv4 set to false, my OSDs boot. With ms_bind_ipv4 at the default (true), my OSDs fail to boot.

If you need any more information, I'd be happy to supply it.
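
For illustration, the behaviour observed above can be modelled with a short Python sketch. This is not Ceph's actual address-picking code. It is a simplified model that assumes one address is required for every enabled address family, which matches the symptoms in the logs. The function name `pick_address`, its parameters, and the local address `2001:db8:11d::16` are hypothetical. Only the network `2001:db8:11d::/120` comes from this report.

```python
import ipaddress


def pick_address(networks, local_addrs, want_ipv4, want_ipv6):
    """Hypothetical model: return one local address per enabled family.

    Raises LookupError when an enabled family has no local address inside
    any of the configured networks (assumed behaviour, inferred from logs).
    """
    nets = [ipaddress.ip_network(n) for n in networks]
    addrs = [ipaddress.ip_address(a) for a in local_addrs]
    picked = []
    for version, wanted in ((4, want_ipv4), (6, want_ipv6)):
        if not wanted:
            continue  # family disabled, nothing to look for
        match = next(
            (
                a
                for a in addrs
                if a.version == version
                and any(n.version == version and a in n for n in nets)
            ),
            None,
        )
        if match is None:
            raise LookupError(
                f"unable to find any IPv{version} address in networks {networks}"
            )
        picked.append(str(match))
    return picked


NETWORKS = ["2001:db8:11d::/120"]   # network from this report
LOCAL_ADDRS = ["2001:db8:11d::16"]  # hypothetical host address

# ms_bind_ipv4=false, ms_bind_ipv6=true: an IPv6 address is found.
print(pick_address(NETWORKS, LOCAL_ADDRS, want_ipv4=False, want_ipv6=True))

# ms_bind_ipv4=true (default), ms_bind_ipv6=true: the IPv4 lookup fails.
try:
    pick_address(NETWORKS, LOCAL_ADDRS, want_ipv4=True, want_ipv6=True)
except LookupError as err:
    print(err)
```

In this model, an IPv6-only network can never satisfy the IPv4 lookup, so the failure would disappear only when the IPv4 family is disabled, which is consistent with what I see when toggling ms_bind_ipv4.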


Related issues 2 (1 open, 1 closed)

Related to RADOS - Bug #52867: pick_address.cc prints: unable to find any IPv4 address in networks 'fd00:fd00:fd00:3000::/64' interfaces (New)

Has duplicate Messengers - Bug #39711: "unable to find any IPv4 address in networks <ipv6-network>" after upgrade to nautilus on osd and mds (Duplicate)