Bug #49938

closed

daemons bind to loopback iface

Added by Dan van der Ster about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: pacific, octopus, nautilus
Regression: Yes
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There seems to be a regression in 14.2.18 whereby, in some environments, OSDs bind to 127.0.0.1.

E.g. https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/3Z5J7MYZIPM3ZUTNU4LTWADXOSZVK27R/

This was probably introduced in https://github.com/ceph/ceph/commit/89321762ad4cfdd1a68cae467181bdd1a501f14d

I don't think ifa_name contains a colon. On my machine I tested the example code at https://man7.org/linux/man-pages/man3/getifaddrs.3.html and it outputs just `lo`:

# ./a.out
lo       AF_PACKET (17)
                tx_packets = 1683333517; rx_packets = 1683333517
                tx_bytes   = 1685898949; rx_bytes   = 1685898949
eno1     AF_PACKET (17)
                tx_packets =          0; rx_packets =          0
                tx_bytes   =          0; rx_bytes   =          0
ens785f0 AF_PACKET (17)
                tx_packets = 3787675362; rx_packets = 4154015233
                tx_bytes   = 3146993958; rx_bytes   = 1004572644
ens785f1 AF_PACKET (17)
                tx_packets =          0; rx_packets =          0
                tx_bytes   =          0; rx_bytes   =          0
eno2     AF_PACKET (17)
                tx_packets =          0; rx_packets =          0
                tx_bytes   =          0; rx_bytes   =          0
lo       AF_INET (2)
                address: <127.0.0.1>
ens785f0 AF_INET (2)
                address: <10.116.6.8>
lo       AF_INET6 (10)
                address: <::1>
ens785f0 AF_INET6 (10)
                address: <fd01:1458:e00:1e::100:5>
ens785f0 AF_INET6 (10)
                address: <fe80::bdbd:76be:63fd:a4c2%ens785f0>

So we also need to explicitly skip the interface when its name is exactly 'lo'.
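A minimal sketch of such a check, for illustration only. It assumes the existing test matches only colon-style names such as `lo:0`, as the description above implies. The helper name `is_loopback_name` is hypothetical and is not taken from the Ceph source; the actual fix may look different.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not the actual Ceph code.
 * Returns true when an interface name, as found in ifa_name,
 * should be treated as the loopback interface and skipped. */
static bool is_loopback_name(const char *name)
{
    if (name == NULL)
        return false;

    /* Exact match: the plain loopback interface, which is what
     * getifaddrs() reported in the output above. */
    if (strcmp(name, "lo") == 0)
        return true;

    /* Prefix match: colon-style labels such as "lo:0". */
    return strncmp(name, "lo:", 3) == 0;
}
```

With the names from the output above, `is_loopback_name("lo")` is true, while `eno1` and `ens785f0` are not matched and so would remain candidates for binding.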

Marking this as critical because it can take down entire clusters if operators run `yum update`.


Related issues 6 (1 open, 5 closed)

Related to RADOS - Bug #50012: Ceph-osd refuses to bind on an IP on the local loopback lo (again) (Fix Under Review, Kefu Chai)
Related to Ceph - Bug #43417: Since the local loopback address is set to a virtual IP,OSD can't restart . (Resolved)
Related to Ceph - Bug #48893: Ceph-osd refuses to bind on an IP on the local loopback lo (Resolved)
Copied to Ceph - Backport #49995: octopus: daemons bind to loopback iface (Resolved, Konstantin Shalygin)
Copied to Ceph - Backport #49996: nautilus: daemons bind to loopback iface (Resolved, Konstantin Shalygin)
Copied to Ceph - Backport #49997: pacific: daemons bind to loopback iface (Resolved, Konstantin Shalygin)
