Bug #53496
cephadm: list-networks swallows /128 networks, breaking the orchestrator ("Filtered out host mon1: does not belong to mon public_network")
Description
Commit 1897d1cd15af ("mgr/cephadm: update list-networks to report interface names too", backported to Pacific as 3237e485ef5f) made cephadm skip over route entries without a mask (implicit /128) in the `ip -6 route ls` output.
This breaks setups that route to the host, where the IP addresses are configured on the loopback interface with a /128 mask. Once the mgr picks up a cephadm binary containing that commit, it no longer considers the mon hosts to belong to the configured public network:
log_channel(cephadm) log [INF] : Filtered out host mon1: does not belong to mon public_network (2001:67c:295c:1001:d191:923a:8db5:46ce/128,2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128,2001:67c:295c:1001:f591:a8bc:fbb5:c8a1/128)
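To illustrate why a missing /128 entry leads to this message, here is a minimal sketch of a membership check built on Python's `ipaddress` module. It is not cephadm's actual code: the function name `host_in_public_network` and the two input lists are hypothetical, and only the `public_network` value is taken from the log line above.

```python
import ipaddress
from typing import Iterable


def host_in_public_network(host_networks: Iterable[str], public_networks: str) -> bool:
    """Return True if any network reported for the host overlaps a configured one.

    Hypothetical helper for illustration only, not the mgr's real filter.
    """
    configured = [ipaddress.ip_network(n.strip()) for n in public_networks.split(',')]
    reported = [ipaddress.ip_network(n) for n in host_networks]
    return any(
        r.version == c.version and r.overlaps(c)
        for r in reported
        for c in configured
    )


# public_network value as shown in the log line of this report.
PUBLIC_NETWORK = (
    '2001:67c:295c:1001:d191:923a:8db5:46ce/128,'
    '2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128,'
    '2001:67c:295c:1001:f591:a8bc:fbb5:c8a1/128'
)

# Assumed list-networks result when the maskless loopback route is skipped.
without_loopback_route = ['fe80::/64']
# Assumed result when the route is reported as an explicit /128.
with_loopback_route = ['fe80::/64', '2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128']

print(host_in_public_network(without_loopback_route, PUBLIC_NETWORK))
print(host_in_public_network(with_loopback_route, PUBLIC_NETWORK))
```

In this sketch the first call finds no match because only the link-local network is left, while the second call matches on the /128 entry.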
Example `ip -6 route ls` output from affected systems looks like the following:
::1 dev lo proto kernel metric 256 pref medium
2001:67c:295c:1001:20ef:f1ea:ea76:55a8 dev lo proto kernel metric 256 pref medium
fe80::/64 dev enp3s0f0 proto kernel metric 100 pref medium
fe80::/64 dev enp3s0f1 proto kernel metric 101 pref medium
default proto bgp metric 20 pref medium
    nexthop via fe80::1e34:daff:fe29:bc53 dev enp3s0f1 weight 1
    nexthop via fe80::1e34:daff:fe29:c053 dev enp3s0f0 weight 1
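The two `lo` entries in this output carry no prefix length, which is what the report means by an implicit /128. As a small check of that equivalence, Python's standard `ipaddress` module treats a bare IPv6 address the same way when asked to build a network from it:

```python
import ipaddress

# Destination from the route output, once as printed and once with the mask spelled out.
bare = ipaddress.ip_network('2001:67c:295c:1001:20ef:f1ea:ea76:55a8')
explicit = ipaddress.ip_network('2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128')

print(bare.prefixlen)    # 128
print(bare == explicit)  # True
```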
Updated by Lars Seipel over 2 years ago
Applying the following small modification to cephadm and pointing the mgr at the patched binary (by setting the cephadm_path config option) works around the issue, but it obviously requires additional care around future updates.
--- a/src/cephadm/cephadm
+++ b/src/cephadm/cephadm
@@ -4926,7 +4926,7 @@ def _parse_ipv6_route(routes: str, ips: str) -> Dict[str, Dict[str, Set[str]]]:
             continue
         net = m[0][0]
         if '/' not in net:  # only consider networks with a mask
-            continue
+            net += '/128'
         iface = m[0][1]
         if net not in r:
             r[net] = {}
I'm not completely sure whether this has ill effects on other systems, but probably not. It does break cephadm's list-networks tests, though.
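The difference between the two behaviours can be sketched with a simplified stand-alone parser. This is not the real `_parse_ipv6_route`: the regular expression, the function name `parse_ipv6_routes`, the `keep_host_routes` flag and the flat return type are all assumptions made for illustration, and the `ips` argument of the real function is left out entirely. The sample input is the route output from the description.

```python
import re
from typing import Dict, Set

# Hypothetical, simplified pattern: a destination immediately followed by "dev <iface>".
ROUTE_RE = re.compile(r'^(\S+) dev (\S+)')


def parse_ipv6_routes(routes: str, keep_host_routes: bool) -> Dict[str, Set[str]]:
    """Map each network to the interfaces it is routed over (illustrative only)."""
    result: Dict[str, Set[str]] = {}
    for line in routes.splitlines():
        m = ROUTE_RE.match(line.strip())
        if not m or m.group(1).lower() == 'default':
            continue
        net, iface = m.group(1), m.group(2)
        if '/' not in net:
            if not keep_host_routes:
                continue      # behaviour described in the report: entry is dropped
            net += '/128'     # behaviour of the workaround: make the mask explicit
        result.setdefault(net, set()).add(iface)
    return result


SAMPLE = """\
::1 dev lo proto kernel metric 256 pref medium
2001:67c:295c:1001:20ef:f1ea:ea76:55a8 dev lo proto kernel metric 256 pref medium
fe80::/64 dev enp3s0f0 proto kernel metric 100 pref medium
fe80::/64 dev enp3s0f1 proto kernel metric 101 pref medium
default proto bgp metric 20 pref medium
    nexthop via fe80::1e34:daff:fe29:bc53 dev enp3s0f1 weight 1
    nexthop via fe80::1e34:daff:fe29:c053 dev enp3s0f0 weight 1
"""

skipped = parse_ipv6_routes(SAMPLE, keep_host_routes=False)
patched = parse_ipv6_routes(SAMPLE, keep_host_routes=True)

print(sorted(skipped))
print(sorted(patched))
```

In this sketch the workaround variant reports the global address as a /128 network on `lo`, but it also surfaces `::1/128`. Extra entries of that kind may be one reason existing test expectations no longer match, although the report does not say which tests fail or why.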
Updated by Sebastian Wagner over 2 years ago
Want to make a PR? If yes, please add your command outputs to https://github.com/ceph/ceph/blob/8c54a705e293682a8bbbd50f579a983e103c3020/src/cephadm/tests/test_networks.py#L192 so that we can avoid future regressions.
Updated by Lars Seipel over 2 years ago
Sebastian Wagner wrote:
Want to make a PR? If yes, please add your command outputs to https://github.com/ceph/ceph/blob/8c54a705e293682a8bbbd50f579a983e103c3020/src/cephadm/tests/test_networks.py#L192 so that we can avoid future regressions.
Will do later this week.
Updated by Redouane Kachach Elhichou about 2 years ago
- Related to Bug #51257: mgr/cephadm: Cannot add managed (ceph apply) mon daemons on different subnets added
Updated by Redouane Kachach Elhichou almost 2 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 46202
Updated by Redouane Kachach Elhichou almost 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Redouane Kachach Elhichou almost 2 years ago
- Backport set to quincy,pacific
Updated by Redouane Kachach Elhichou almost 2 years ago
- Status changed from Pending Backport to Resolved