Bug #54235
Filtered out host ceph03: does not belong to mon public_network
Status: closed
Description
We are upgrading a test cluster from Octopus to Pacific via "ceph orch upgrade start quay.io/ceph/ceph:v16.2.7".
In the output of "ceph -W cephadm" we notice these lines:
2022-02-09T15:48:18.851098+0100 mgr.ceph03.wdkzcv [INF] inventory: adjusted host ceph01 addr 'ceph01' -> '10.24.4.128'
2022-02-09T15:48:18.852908+0100 mgr.ceph03.wdkzcv [INF] inventory: adjusted host ceph02 addr 'ceph02' -> '10.24.4.129'
2022-02-09T15:48:18.857070+0100 mgr.ceph03.wdkzcv [INF] inventory: adjusted host ceph04 addr 'ceph04' -> '10.24.4.131'
2022-02-09T15:48:29.687798+0100 mgr.ceph03.wdkzcv [INF] Filtered out host ceph01: does not belong to mon public_network (10.24.4.0/24)
2022-02-09T15:48:29.688702+0100 mgr.ceph03.wdkzcv [INF] Filtered out host ceph02: does not belong to mon public_network (10.24.4.0/24)
2022-02-09T15:48:29.689774+0100 mgr.ceph03.wdkzcv [INF] Filtered out host ceph03: does not belong to mon public_network (10.24.4.0/24)
10.24.4.128 is obviously in the public_network 10.24.4.0/24. Why does cephadm complain about it?
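To double-check the claim that the adjusted host addresses fall inside the configured public_network, a minimal sketch using Python's standard ipaddress module is shown here. The helper name in_public_network is hypothetical and is not part of cephadm; the addresses and the CIDR are taken from the log lines above. It only tests the address arithmetic and does not reproduce how the mgr gathers a host's network list.

```python
import ipaddress


def in_public_network(addr: str, public_network: str) -> bool:
    """Return True if addr lies inside any CIDR of a comma-separated list.

    Hypothetical helper for illustration; not cephadm's own check.
    """
    ip = ipaddress.ip_address(addr)
    return any(
        ip in ipaddress.ip_network(net.strip(), strict=False)
        for net in public_network.split(",")
        if net.strip()
    )


# Addresses as reported by the "inventory: adjusted host" log lines.
hosts = {
    "ceph01": "10.24.4.128",
    "ceph02": "10.24.4.129",
    "ceph04": "10.24.4.131",
}

for name, addr in hosts.items():
    print(name, addr, in_public_network(addr, "10.24.4.0/24"))
```

If every host prints True, the addresses themselves are fine, and the "Filtered out" message more likely stems from the network list the mgr holds for each host than from the subnet configuration.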
Shortly after that this happens:
2022-02-09T15:55:21.086293+0100 mgr.ceph01.faewwr [INF] Safe to remove mon.ceph03: new quorum should be ['ceph01', 'ceph02'] (from ['ceph01', 'ceph02'])
2022-02-09T15:55:21.086455+0100 mgr.ceph01.faewwr [INF] Removing monitor ceph03 from monmap...
2022-02-09T15:55:21.111598+0100 mgr.ceph01.faewwr [INF] Removing daemon mon.ceph03 from ceph03
2022-02-09T15:55:58.438551+0100 mgr.ceph01.faewwr [INF] Filtered out host ceph03: does not belong to mon public_network (10.24.4.0/24)
And now the cluster runs with only two MONs.
Updated by Christoph Glaubitz about 2 years ago
Maybe this is somehow related:
I saw similar behavior when creating a new cluster for testing purposes. The first bootstrap mon worked fine, but for every node I added I also got `does not belong to mon public_network`. I had previously run a cluster on the same nodes, which I had deleted.
It turned out that, because there were leftover /var/lib/ceph/OLD_FS-folders on my nodes, cephadm was not able to infer the FSID on those nodes. As a result, the cephadm mgr always thought the network list of those nodes was empty, so the nodes were filtered out.
The solution for me was just to clean up /var/lib/ceph.
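A rough way to spot such leftovers before cleaning up is sketched below. It assumes that the leftover folders are named after the old cluster's FSID (a UUID), which the comment above does not state explicitly; the function name fsid_dirs is hypothetical and not part of cephadm.

```python
import uuid
from pathlib import Path
from typing import List


def fsid_dirs(base: str = "/var/lib/ceph") -> List[str]:
    """List subdirectories of base whose names parse as UUIDs.

    Assumption: each cluster keeps its data in a directory named
    after its FSID, so more than one match hints at leftovers.
    """
    root = Path(base)
    if not root.is_dir():
        return []
    found = []
    for entry in root.iterdir():
        if not entry.is_dir():
            continue
        try:
            uuid.UUID(entry.name)  # raises ValueError for non-UUID names
        except ValueError:
            continue
        found.append(entry.name)
    return sorted(found)
```

Calling fsid_dirs() on each host and comparing the result with the output of "ceph fsid" shows whether a directory from a deleted cluster is still present. Whether removing it is safe depends on the host, so treat the result as a hint only.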
Updated by Redouane Kachach Elhichou over 1 year ago
- Status changed from New to Duplicate
Updated by Redouane Kachach Elhichou over 1 year ago
- Related to Bug #57060: cephadm won't deploy mon service, reports wrongly filtered out added
Updated by Redouane Kachach Elhichou over 1 year ago
- Status changed from Duplicate to Closed
Closing, as this issue is a duplicate of an already solved bug. Feel free to reopen if you think it is not.
Updated by Redouane Kachach Elhichou over 1 year ago
- Pull request ID set to 47882