Bug #53496

closed

cephadm: list-networks swallows /128 networks, breaking the orchestrator ("Filtered out host mon1: does not belong to mon public_network")

Added by Lars Seipel over 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy,pacific
Regression: Yes
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Commit 1897d1cd15af ("mgr/cephadm: update list-networks to report interface names too", backported to Pacific as 3237e485ef5f) made cephadm skip over route entries without a mask (implicit /128) in the `ip -6 route ls` output.

This breaks setups that route to the host, with IP addresses configured on the loopback interface (with a /128 mask). Once the mgr picks up a cephadm binary containing that commit, it no longer considers mon hosts as belonging to the configured public network:

log_channel(cephadm) log [INF] : Filtered out host mon1: does not belong to mon public_network (2001:67c:295c:1001:d191:923a:8db5:46ce/128,2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128,2001:67c:295c:1001:f591:a8bc:fbb5:c8a1/128)
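
To illustrate why the host gets filtered out, here is a minimal sketch of a membership check. It is not the mgr's actual matching code; `host_in_public_network` is a hypothetical helper, and the assumption is simply that a host counts as belonging when one of the networks reported for it overlaps one of the configured public networks. If the /128 entry is missing from the reported networks, nothing overlaps.

```python
import ipaddress
from typing import Iterable


def host_in_public_network(host_networks: Iterable[str],
                           public_networks: Iterable[str]) -> bool:
    """Hypothetical check: does any reported host network overlap a public network?"""
    hosts = [ipaddress.ip_network(n) for n in host_networks]
    publics = [ipaddress.ip_network(n) for n in public_networks]
    return any(
        # Compare only networks of the same IP version.
        h.version == p.version and h.overlaps(p)
        for h in hosts
        for p in publics
    )


public = ["2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128"]

# Networks as reported when maskless routes are skipped: only the link-local /64.
without_loopback = ["fe80::/64"]
# Networks as reported when maskless routes are treated as /128.
with_loopback = ["fe80::/64", "2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128"]
```

Under these assumptions, `without_loopback` does not match the public network while `with_loopback` does.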

Example `ip -6 route ls` output from affected systems looks like the following:

::1 dev lo proto kernel metric 256 pref medium
2001:67c:295c:1001:20ef:f1ea:ea76:55a8 dev lo proto kernel metric 256 pref medium
fe80::/64 dev enp3s0f0 proto kernel metric 100 pref medium
fe80::/64 dev enp3s0f1 proto kernel metric 101 pref medium
default proto bgp metric 20 pref medium
    nexthop via fe80::1e34:daff:fe29:bc53 dev enp3s0f1 weight 1 
    nexthop via fe80::1e34:daff:fe29:c053 dev enp3s0f0 weight 1

Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Bug #51257: mgr/cephadm: Cannot add managed (ceph apply) mon daemons on different subnets (Resolved, Redouane Kachach Elhichou)

Actions #1

Updated by Lars Seipel over 2 years ago

Applying the following tiny modification to cephadm and telling the mgr to use the patched binary (by setting the cephadm_path config option) works around the issue, but it obviously requires additional care around future updates.

--- a/src/cephadm/cephadm
+++ b/src/cephadm/cephadm
@@ -4926,7 +4926,7 @@ def _parse_ipv6_route(routes: str, ips: str) -> Dict[str, Dict[str, Set[str]]]:
             continue
         net = m[0][0]
         if '/' not in net:  # only consider networks with a mask
-            continue
+            net += '/128'
         iface = m[0][1]
         if net not in r:
             r[net] = {}

Not completely sure whether that might have ill effects on other systems but probably not. It breaks cephadm's list-networks tests though.
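
The effect of the change can be sketched in isolation. This is a simplified stand-in, not cephadm's `_parse_ipv6_route`: the regex, the function name `parse_ipv6_routes`, and the flat return type are assumptions made for illustration. It only shows the idea of treating a destination without a mask as a /128 instead of skipping it.

```python
import re
from typing import Dict, Set

# Hypothetical, simplified pattern: "<destination> dev <iface> ..." at line start.
# Lines such as "default proto bgp ..." and indented "nexthop" lines do not match.
ROUTE_RE = re.compile(r'^(\S+) dev (\S+)')


def parse_ipv6_routes(routes: str) -> Dict[str, Set[str]]:
    """Map each destination network to the interfaces it is routed on."""
    result: Dict[str, Set[str]] = {}
    for line in routes.splitlines():
        m = ROUTE_RE.match(line)
        if not m:
            continue
        net, iface = m.group(1), m.group(2)
        if net == 'default':
            continue
        if '/' not in net:
            # A destination without a mask is a host route: implicit /128.
            net += '/128'
        result.setdefault(net, set()).add(iface)
    return result


# Route table taken from the issue description.
SAMPLE = """\
::1 dev lo proto kernel metric 256 pref medium
2001:67c:295c:1001:20ef:f1ea:ea76:55a8 dev lo proto kernel metric 256 pref medium
fe80::/64 dev enp3s0f0 proto kernel metric 100 pref medium
fe80::/64 dev enp3s0f1 proto kernel metric 101 pref medium
default proto bgp metric 20 pref medium
    nexthop via fe80::1e34:daff:fe29:bc53 dev enp3s0f1 weight 1
    nexthop via fe80::1e34:daff:fe29:c053 dev enp3s0f0 weight 1
"""

networks = parse_ipv6_routes(SAMPLE)
```

With the sample above, the loopback address shows up as `2001:67c:295c:1001:20ef:f1ea:ea76:55a8/128` on `lo`. Note that `::1/128` is reported as well, which may or may not be desirable in the real parser.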

Actions #2

Updated by Sebastian Wagner over 2 years ago

Want to make a PR? If yes, please add your command outputs to https://github.com/ceph/ceph/blob/8c54a705e293682a8bbbd50f579a983e103c3020/src/cephadm/tests/test_networks.py#L192 so that we can avoid future regressions.

Actions #3

Updated by Lars Seipel over 2 years ago

Sebastian Wagner wrote:

Want to make a PR? If yes, please add your command outputs to https://github.com/ceph/ceph/blob/8c54a705e293682a8bbbd50f579a983e103c3020/src/cephadm/tests/test_networks.py#L192 so that we can avoid future regressions.

Will do later this week.

Actions #4

Updated by Redouane Kachach Elhichou about 2 years ago

  • Related to Bug #51257: mgr/cephadm: Cannot add managed (ceph apply) mon daemons on different subnets added
Actions #5

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 46202
Actions #6

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Backport set to quincy,pacific
Actions #8

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from Pending Backport to Resolved