daemons bind to loopback iface
There seems to be a regression in 14.2.18 whereby, in some environments, OSDs bind to 127.0.0.1.
This was probably introduced in https://github.com/ceph/ceph/commit/89321762ad4cfdd1a68cae467181bdd1a501f14d
I don't think `ifa_name` contains a colon. On my machine I tested the example code at https://man7.org/linux/man-pages/man3/getifaddrs.3.html and it outputs just `lo`:
    # ./a.out
    lo AF_PACKET (17)
        tx_packets = 1683333517; rx_packets = 1683333517
        tx_bytes = 1685898949; rx_bytes = 1685898949
    eno1 AF_PACKET (17)
        tx_packets = 0; rx_packets = 0
        tx_bytes = 0; rx_bytes = 0
    ens785f0 AF_PACKET (17)
        tx_packets = 3787675362; rx_packets = 4154015233
        tx_bytes = 3146993958; rx_bytes = 1004572644
    ens785f1 AF_PACKET (17)
        tx_packets = 0; rx_packets = 0
        tx_bytes = 0; rx_bytes = 0
    eno2 AF_PACKET (17)
        tx_packets = 0; rx_packets = 0
        tx_bytes = 0; rx_bytes = 0
    lo AF_INET (2)
        address: <127.0.0.1>
    ens785f0 AF_INET (2)
        address: <10.116.6.8>
    lo AF_INET6 (10)
        address: <::1>
    ens785f0 AF_INET6 (10)
        address: <fd01:1458:e00:1e::100:5>
    ens785f0 AF_INET6 (10)
        address: <fe80::bdbd:76be:63fd:a4c2%ens785f0>
So we need to also explicitly skip when the iface name is exactly 'lo'.
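A rough sketch of the check proposed above, for illustration only. It is not the code from the linked commit: `is_loopback_name` and `first_non_loopback_ipv4` are hypothetical names, and the `lo:` prefix test is an assumption based on the colon remark, since the commit's actual condition is not shown here. The point is only that the name `lo` has to be matched exactly in addition to any alias-style `lo:` prefix.

```cpp
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cstring>
#include <string>

// Hypothetical helper (not Ceph's code): treat an interface as loopback if its
// name is exactly "lo" or starts with "lo:" (alias-style name, assumed here).
bool is_loopback_name(const char *ifa_name) {
  if (ifa_name == nullptr) {
    return false;
  }
  return std::strcmp(ifa_name, "lo") == 0 ||
         std::strncmp(ifa_name, "lo:", 3) == 0;
}

// Hypothetical helper: walk getifaddrs() and return the first IPv4 address
// found on an interface whose name does not look like loopback.
// Returns an empty string if there is no such address.
std::string first_non_loopback_ipv4() {
  struct ifaddrs *list = nullptr;
  if (getifaddrs(&list) != 0) {
    return "";
  }
  std::string result;
  for (const struct ifaddrs *ifa = list; ifa != nullptr; ifa = ifa->ifa_next) {
    // Entries without an address, or with a non-IPv4 address, are skipped.
    if (ifa->ifa_addr == nullptr || ifa->ifa_addr->sa_family != AF_INET) {
      continue;
    }
    if (is_loopback_name(ifa->ifa_name)) {
      continue;
    }
    char buf[INET_ADDRSTRLEN];
    const auto *sin =
        reinterpret_cast<const struct sockaddr_in *>(ifa->ifa_addr);
    if (inet_ntop(AF_INET, &sin->sin_addr, buf, sizeof(buf)) != nullptr) {
      result = buf;
      break;
    }
  }
  freeifaddrs(list);
  return result;
}
```

With the interface list shown earlier, the name test would skip `lo` and leave `ens785f0` as the only IPv4 candidate.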
Marking this as critical because it can take down entire clusters if operators run `yum update`.
#2 Updated by Dan van der Ster 23 days ago
I suppose this will re-break the use-case described in #48893.
I would argue that, out of the box, ceph should do the right thing on the most common deployments. But if we also want to support this BGP-to-the-host use case out of the box, the heuristic for picking addresses needs to be improved further.
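One possible direction for such a heuristic, sketched under assumptions and not taken from Ceph: prefer any matching address on a non-loopback interface, and fall back to a loopback address only when nothing else matches. `Candidate`, `looks_like_loopback`, and `pick_address` are hypothetical names, and the input is assumed to be already filtered to the configured subnet.

```cpp
#include <string>
#include <vector>

// Hypothetical record: one address that already matched the configured subnet.
struct Candidate {
  std::string ifname;  // interface name as reported in ifa_name
  std::string addr;    // textual address
};

// Name-based loopback test: exactly "lo", or an alias-style "lo:" prefix
// (the prefix form is an assumption, as in the discussion above).
bool looks_like_loopback(const std::string &name) {
  return name == "lo" || name.compare(0, 3, "lo:") == 0;
}

// Two-tier choice: return the first candidate on a non-loopback interface;
// if there is none, fall back to the first loopback candidate.
// Returns an empty string when the list is empty.
std::string pick_address(const std::vector<Candidate> &in_subnet) {
  const Candidate *fallback = nullptr;
  for (const auto &c : in_subnet) {
    if (!looks_like_loopback(c.ifname)) {
      return c.addr;
    }
    if (fallback == nullptr) {
      fallback = &c;
    }
  }
  return fallback != nullptr ? fallback->addr : std::string();
}
```

In this sketch a common deployment never ends up on `lo` as long as some other interface carries a matching address, while a host whose only matching address lives on `lo` can still bind to it. Whether that ordering suits the use case in #48893 is an open question here.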
#3 Updated by Dan van der Ster 21 days ago
All daemons are impacted by this, not just OSDs: https://firstname.lastname@example.org/thread/7IAGFUXMRZU77M4KYS5NW5MZ6YJ7YN4G/