Bug #45016

mgr: `ceph tell mgr mgr_status` hangs

Added by Sebastian Wagner 4 months ago. Updated 19 days ago.

Status: Resolved
Priority: Normal
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

cephadm bootstrap hangs:

root@buster:/cephadm# ./cephadm --image quay.io/ceph-ci/ceph:master bootstrap --mon-ip '[::1]' --skip-mon-network
INFO:cephadm:Verifying podman|docker is present...
INFO:cephadm:Verifying lvm2 is present...
INFO:cephadm:Verifying time synchronization is in place...
INFO:cephadm:Unit systemd-timesyncd.service is enabled and running
INFO:cephadm:Repeating the final host check...
INFO:cephadm:podman|docker (/usr/bin/podman) is present
INFO:cephadm:systemctl is present
INFO:cephadm:lvcreate is present
INFO:cephadm:Unit systemd-timesyncd.service is enabled and running
INFO:cephadm:Host looks OK
INFO:root:Cluster fsid: 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Verifying IP [::1] port 3300 ...
INFO:cephadm:Verifying IP [::1] port 6789 ...
INFO:cephadm:Pulling latest quay.io/ceph-ci/ceph:master container...
INFO:cephadm:Extracting ceph user uid/gid from container image...
INFO:cephadm:Creating initial keys...
INFO:cephadm:Creating initial monmap...
INFO:cephadm:Creating mon...
INFO:cephadm:Waiting for mon to start...
INFO:cephadm:Waiting for mon...
INFO:cephadm:Assimilating anything we can from ceph.conf...
INFO:cephadm:Generating new minimal ceph.conf...
INFO:cephadm:Restarting the monitor...
INFO:cephadm:Creating mgr...
INFO:cephadm:Wrote keyring to /etc/ceph/ceph.client.admin.keyring
INFO:cephadm:Wrote config to /etc/ceph/ceph.conf
INFO:cephadm:Waiting for mgr to start...
INFO:cephadm:Waiting for mgr...
INFO:cephadm:mgr not available, waiting (1/10)...
INFO:cephadm:mgr not available, waiting (2/10)...
INFO:cephadm:mgr not available, waiting (3/10)...
INFO:cephadm:mgr not available, waiting (4/10)...
INFO:cephadm:Enabling cephadm module...
INFO:cephadm:Waiting for the mgr to restart...
INFO:cephadm:Waiting for mgr epoch 5...
^C

While `mgr dump` works:

root@buster:/cephadm# ./cephadm shell -- ceph mgr dump | jq .epoch
INFO:cephadm:Inferring fsid 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Using recent ceph image quay.io/ceph-ci/ceph:master
8

But `ceph tell mgr mgr_status` hangs:

root@buster:/cephadm# ./cephadm shell -- ceph tell mgr mgr_status
INFO:cephadm:Inferring fsid 160e9ea8-7a60-11ea-b487-525400e3bceb
INFO:cephadm:Using recent ceph image quay.io/ceph-ci/ceph:master
^CInterrupted

And neither the mon nor the mgr logs show a trace of the attempted call.

root@buster:/cephadm# ./cephadm shell -- ceph config generate-minimal-conf
# minimal ceph.conf for 160e9ea8-7a60-11ea-b487-525400e3bceb
[global]
        fsid = 160e9ea8-7a60-11ea-b487-525400e3bceb
        mon_host = [v2:[::1]:3300/0,v1:[::1]:6789/0]

Any idea where to look for clues?


Related issues

Related to Orchestrator - Bug #43816: cephadm: Unable to use IPv6 on "cephadm bootstrap" (Resolved)

History

#1 Updated by Sebastian Wagner 4 months ago

  • Related to Bug #43816: cephadm: Unable to use IPv6 on "cephadm bootstrap" added

#2 Updated by Jan Fajerski 4 months ago

Quick check in vstart suggests the mgr works fine (current octopus).

Guess a next step could be to run this with debug logs enabled? See if anything happens in the shell container?

#3 Updated by Matthew Oliver about 2 months ago

  • Assignee set to Matthew Oliver

I'll have a poke around and see if I can get this unblocked so we can continue your IPv6 adventure :)

#4 Updated by Matthew Oliver about 2 months ago

Aha! Solved it. We bind the mon to IPv6 (::1), or more precisely its messenger is bound to ::1. The mgr, however, is still binding to IPv4 (0.0.0.0), which is the messenger default.

If I use this conf:

$ cat /tmp/ceph.conf 
[global]
ms bind ipv6 = true

which tells the messenger to bind to IPv6, and then run:

$ sudo ./cephadm --image quay.io/ceph-ci/ceph:master -v bootstrap --mon-ip '[::1]' --skip-mon-network -c /tmp/ceph.conf

It all works.

So I can see two ways of fixing this. Either have `cephadm bootstrap` inspect the `--mon-ip` value and add `ms bind ipv6 = true` when it is an IPv6 address or, maybe cleaner, add a new option to bootstrap to set this, something like `--ipv6`.

I'll write up a patch for the latter to start with.
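
For comparison, the first option (inspecting `--mon-ip`) could look roughly like the sketch below. This is only an illustration, not the patch from the pull request: the helper names `mon_ip_is_ipv6` and `extra_conf_for` are hypothetical and do not exist in cephadm, and the sketch assumes `--mon-ip` is given as a bare or bracketed address without a port.

```python
import ipaddress


def mon_ip_is_ipv6(mon_ip: str) -> bool:
    """Return True if the --mon-ip value is an IPv6 address.

    Accepts both the bracketed form ('[::1]') and the bare form ('::1').
    Hypothetical helper, not part of cephadm.
    """
    addr = mon_ip.strip()
    # Strip the surrounding brackets used for IPv6 literals, if present.
    if addr.startswith('[') and addr.endswith(']'):
        addr = addr[1:-1]
    try:
        return ipaddress.ip_address(addr).version == 6
    except ValueError:
        # Not a plain IP address (hostname, ip:port, ...): make no assumption.
        return False


def extra_conf_for(mon_ip: str) -> str:
    """Build the ceph.conf fragment to merge in at bootstrap (hypothetical)."""
    if mon_ip_is_ipv6(mon_ip):
        return "[global]\nms bind ipv6 = true\n"
    return ""


if __name__ == "__main__":
    # Prints the same fragment as the /tmp/ceph.conf used above.
    print(extra_conf_for("[::1]"), end="")
```

The trade-off is that auto-detection needs no extra flag but guesses the operator's intent, whereas an explicit option such as `--ipv6` keeps the behaviour opt-in.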

#5 Updated by Sebastian Wagner about 1 month ago

  • Status changed from New to Pending Backport
  • Pull request ID set to 35633

#6 Updated by Sebastian Wagner 19 days ago

  • Status changed from Pending Backport to Resolved
  • Target version set to v15.2.5
