Bug #48019

closed

cephadm: `ceph daemon <daemon-name> ...` is broken

Added by Nathan Cutler over 3 years ago. Updated about 2 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
cephadm (binary)
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

$ ceph daemon mgr.ceph03 config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
$ ceph --admin-daemon /var/run/ceph/mgr.ceph03 config show
No such file or directory

It seems like a bug that we are requiring users to enter a running container just to exercise a standard ceph feature.


Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Documentation #45564: cephadm: document workaround for accessing the admin socket by entering running container (Duplicate)

Actions #1

Updated by Nathan Cutler over 3 years ago

  • Related to Documentation #45564: cephadm: document workaround for accessing the admin socket by entering running container added
Actions #2

Updated by Sebastian Wagner over 3 years ago

I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy

Actions #3

Updated by Nathan Cutler over 3 years ago

Sebastian Wagner wrote:

I think this actually works already. The mgr daemons have a random identifier, like mgr.<hostname>.xyzuizxy

Does not seem to be the case:

master:~ # ceph -s
  cluster:
    id:     3d58c86e-2990-11eb-8e8e-525400cd9699
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node3,node2 (age 9h)
    mgr: node2.ankmgz(active, since 9h), standbys: node3.ddjhkx, node1.jptrux

OK, the active mgr's name is "mgr.node2.ankmgz". I ssh over to node2 to check it out.

node2:~ # ceph daemon mgr.node2.ankmgz config show
admin_socket: exception getting command descriptions: [Errno 2] No such file or directory

Nope. If I know how to find the asok file, and go to the trouble of doing it, then this does work:

node2:~ # ceph --admin-daemon /var/run/ceph/3d58c86e-2990-11eb-8e8e-525400cd9699/ceph-mgr.node2.ankmgz.asok config show
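Finding the asok by hand can be scripted. A minimal sketch, simulated in a temporary directory so it runs anywhere; on a real cephadm host the sockets live under /var/run/ceph/<fsid>/, and the fsid, daemon name, and paths below are illustrative:

```shell
# Simulate the host layout cephadm creates; values are illustrative.
root=$(mktemp -d)
fsid="3d58c86e-2990-11eb-8e8e-525400cd9699"
mkdir -p "$root/var/run/ceph/$fsid"
touch "$root/var/run/ceph/$fsid/ceph-mgr.node2.ankmgz.asok"

# Locate the mgr socket with a glob, without knowing the fsid in advance.
# On a real host the result could be passed to:
#   ceph --admin-daemon "$sock" config show
sock=$(find "$root/var/run/ceph" -name 'ceph-mgr.*.asok' | head -n 1)
echo "$sock"
rm -rf "$root"
```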
Actions #4

Updated by Nathan Cutler over 3 years ago

Could it be that the admin_socket code has not been updated to reflect the fact that the asok files have moved?

This error seems to indicate it's still looking for the asok under /var/log/ceph:

admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
Actions #5

Updated by Sebastian Wagner over 3 years ago

Hm. Seems that when calling

ceph daemon mgr.x.y

the ceph binary calls something like

ceph-conf --show-config-value admin_socket

which is determined by

https://github.com/ceph/ceph/blob/07cba31a03a3a311940a5338944f11ad2c87b641/src/common/options.cc#L476-L488

which means it won't find the admin socket, because the socket's location on the host differs from its location within the container. Thus it is actually important that the admin socket is in the exact same location within the container and outside of it.

So, the path that is set within the container and in options.cc is:

/var/run/ceph/ceph-$daemon-name.asok

but we're mounting the path on the host to

/var/run/ceph/$fsid/ceph-$daemon-name.asok

That's a bummer.
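The mismatch described above can be made concrete by writing out both paths. A minimal sketch with an illustrative fsid and daemon name (on a real cluster the fsid comes from `ceph -s`):

```shell
# Illustrative values only.
fsid="3d58c86e-2990-11eb-8e8e-525400cd9699"
daemon="mgr.node2.ankmgz"

# Default admin_socket path from options.cc -- what `ceph daemon` resolves,
# and where the socket actually sits *inside* the container:
echo "/var/run/ceph/ceph-${daemon}.asok"

# Where cephadm places the socket on the host, namespaced by fsid:
echo "/var/run/ceph/${fsid}/ceph-${daemon}.asok"
```

Because `ceph daemon` run on the host resolves the first path while the socket is at the second, the lookup fails with ENOENT.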

Actions #6

Updated by Sebastian Wagner over 3 years ago

  • Subject changed from cephadm: not possible to access the admin socket from outside container to cephadm: `ceph daemon <daemon-name> ...` is broken
Actions #7

Updated by Sebastian Wagner almost 3 years ago

  • Category set to cephadm (binary)
Actions #8

Updated by Redouane Kachach Elhichou about 2 years ago

Closing because I can't reproduce the issue on the latest version of Ceph. If you think the issue still exists, please re-open and provide detailed instructions on how to reproduce it on the latest Ceph version.

[root@ceph-node-0 ~]# cephadm shell
Inferring fsid 35e99e04-aeab-11ec-a051-525400283b7b
Using recent ceph image registry.hub.docker.com/rkachach/ceph@sha256:565277a8bcb13d2e123d444c4e5433cd28b221ec48fb97d99e4fcaae2233cf38
[ceph: root@ceph-node-0 /]# 
[ceph: root@ceph-node-0 /]# ceph --version
ceph version 17.0.0-10469-g29e1fc17 (29e1fc1722aa5915b44828a5ad02ec45ce760aa3) quincy (dev)
[ceph: root@ceph-node-0 /]# 
[ceph: root@ceph-node-0 /]# ceph orch ps | grep mgr
mgr.ceph-node-0.scljdt          ceph-node-0  *:9283       running (19h)     4m ago  19h     501M        -  17.0.0-10469-g29e1fc17  8bc723507631  38b25a1d5b12  
mgr.ceph-node-1.ndtirr          ceph-node-1  *:9283       running (19h)     4m ago  19h     426M        -  17.0.0-10469-g29e1fc17  8bc723507631  f153a85e7708  
[ceph: root@ceph-node-0 /]# 
[ceph: root@ceph-node-0 /]# ceph daemon mgr.ceph-node-0.scljdt config show | head -10
{
    "name": "mgr.ceph-node-0.scljdt",
    "cluster": "ceph",
    "admin_socket": "/var/run/ceph/ceph-mgr.ceph-node-0.scljdt.asok",
    "admin_socket_mode": "",
    "auth_allow_insecure_global_id_reclaim": "true",
    "auth_client_required": "cephx, none",
    "auth_cluster_required": "cephx",
    "auth_debug": "false",
    "auth_expose_insecure_global_id_reclaim": "true",
admin_socket: [Errno 32] Broken pipe
[ceph: root@ceph-node-0 /]# 
[ceph: root@ceph-node-0 /]# 

Actions #9

Updated by Redouane Kachach Elhichou about 2 years ago

  • Status changed from New to Can't reproduce