Bug #55334

closed

mgr/cephadm: socket path too long for some daemons

Added by Avan Thakkar about 2 years ago. Updated 10 months ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Admin socket commands fail for daemons such as rgw, rbd-mirror, and cephfs-mirror with the error:
"admin_socket: exception getting command descriptions: AF_UNIX path too long".

This is probably because the socket path is too long. An example path for rbd-mirror:
/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok

116 chars

This exceeds sizeof(sockaddr_un.sun_path), which is currently 108:

struct sockaddr_un
  {
    __SOCKADDR_COMMON (sun_);
    char sun_path[108];   /* Path name.  */
  };
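
The limit described above can be checked with a small Python sketch. It assumes Linux, where sun_path is 108 bytes, and assumes that a terminating NUL byte must also fit, so a regular path may be at most 107 bytes. The helper name fits_sun_path is hypothetical and not part of Ceph. For the example path it reports 115 bytes, which presumably matches the 116 quoted above once a trailing newline or NUL is counted.

```python
import os

# Assumption: Linux, where sizeof(sockaddr_un.sun_path) is 108 bytes.
SUN_PATH_SIZE = 108


def fits_sun_path(path, size=SUN_PATH_SIZE):
    """Return True if `path` plus a terminating NUL byte fits in sun_path."""
    return len(os.fsencode(path)) + 1 <= size


# Example path taken from this report.
path = (
    "/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/"
    "ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok"
)

print(len(os.fsencode(path)), fits_sun_path(path))
```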

Actions #1

Updated by Avan Thakkar about 2 years ago

Avan Thakkar wrote:

Admin socket commands fail for daemons such as rgw, rbd-mirror, and cephfs-mirror with the error:
"admin_socket: exception getting command descriptions: AF_UNIX path too long".

This is probably because the socket path is too long. An example path for rbd-mirror:
/var/run/ceph/e3f41acc-ba6a-11ec-9629-525400c43ed6/ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok

116 chars

This exceeds sizeof(sockaddr_un.sun_path), which is currently 108:

struct sockaddr_un
  {
    __SOCKADDR_COMMON (sun_);
    char sun_path[108];   /* Path name.  */
  };

Actions #2

Updated by Avan Thakkar about 2 years ago

OK, I see that in src/cephadm the names for these containers are stored in the form
client.rbd-mirror.{daemon_id}

However, the asok files stored under /var/run/ceph/<fsid> have names such as:
ceph-client.rbd-mirror.ceph-node-00.fqbmns.2.94595577574976.asok

So where is the extra 2.94595577574976 coming from? I couldn't find it in the rbd-mirror daemon spec either.

Actions #3

Updated by Redouane Kachach Elhichou almost 2 years ago

I'm afraid cephadm has nothing to do with this issue. cephadm's responsibility ends at daemon name generation and the like. The admin socket is created and managed by the RGW code. I guess the 94595577574976 part corresponds to $cctid (from sample.ceph.conf):

    # The Ceph admin socket allows you to query a daemon via a socket interface
    # From a client perspective this can be a virtual machine using librbd
    # Type: String
    # Required: No
    ;admin socket                       = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok

I'm not sure whether this could be changed (the RGW team can probably help here), but AFAIK this is something that cephadm cannot control.
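
To make the template above concrete, here is a rough Python sketch that expands it with string.Template. This only illustrates the naming pattern and is not how Ceph itself expands metavariables. The values are read off the file name quoted in this thread, and mapping the "2" to $pid is an assumption.

```python
from string import Template

# Template copied from sample.ceph.conf as quoted above.
ADMIN_SOCKET_TEMPLATE = "/var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok"


def expand_admin_socket(template, **meta):
    """Substitute $name placeholders; raises KeyError if one is missing."""
    return Template(template).substitute(meta)


name = expand_admin_socket(
    ADMIN_SOCKET_TEMPLATE,
    cluster="ceph",
    type="client",
    id="rbd-mirror.ceph-node-00.fqbmns",
    pid="2",                 # assumption: the "2" in the file name is $pid
    cctid="94595577574976",  # guessed above to be $cctid
)
print(name)
```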

Actions #4

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from New to Rejected

Please, feel free to re-open if you still think it's a cephadm issue. Otherwise you can move this issue to the corresponding component.

Actions #5

Updated by Sebastian Wagner almost 2 years ago

This issue already popped up a while ago. I think there is an internal rhbz about it. $cctid was added years ago to keep multiple RBD volumes (each with its own librados instance, and thus its own socket) from conflicting with each other. Back then there was no easy way to solve it. Neha might help?

I don't think that simply closing this as "Rejected" is going to make this issue go away any time soon.

Actions #6

Updated by Matt Benjamin 12 months ago

Avan Thakkar,

A global requirement to make all socket paths less than 108 characters seems generally unreasonable, in an era when PATH_MAX is 4K (and generally NAME_MAX has been 255 for many many years). Obviously, that's not under your control.

However, I notice that

len(ceph-client.rbd-mirror.ceph-node-00.dpqslq.2.93914410882624.asok)

is in fact 64--well within the limit. So another view of 55334 is that, due to the use of long-ish socket file names, the Ceph tooling should not be using a full path.
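
One way to read the suggestion above is to change into the socket's directory and connect by file name only, so that only the file name has to fit into sun_path. A minimal Python sketch of that idea follows. The helper names are hypothetical, this is not what the Ceph tooling currently does, and os.chdir affects the whole process, so it is not safe to use from multiple threads.

```python
import contextlib
import os
import socket

# Assumption: Linux, where sizeof(sockaddr_un.sun_path) is 108 bytes.
SUN_PATH_SIZE = 108


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the process working directory."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def connect_by_basename(full_path):
    """Connect to a Unix socket using only its file name.

    The directory part is handled by chdir, so the absolute path may be
    longer than sun_path as long as the file name itself fits.
    """
    directory, name = os.path.split(full_path)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        with working_directory(directory):
            sock.connect(name)
    except OSError:
        sock.close()
        raise
    return sock
```

A caller would pass the full asok path, for example connect_by_basename("/var/run/ceph/<fsid>/<name>.asok"), and get back a connected socket.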
Actions #7

Updated by Redouane Kachach Elhichou 10 months ago

  • Status changed from Rejected to New
Actions #8

Updated by Redouane Kachach Elhichou 10 months ago

Reopening just to confirm whether this is a real issue or not.

I created an rgw daemon on my test cluster, and the length of the admin socket (asok) path is as follows:

(running from inside a cephadm shell):

[ceph: root@ceph-node-0 /]# echo "/var/run/ceph/ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.94169732110424.asok" | wc
      1       1      77

Running an admin socket command on the rgw daemon:

[ceph: root@ceph-node-0 /]# ceph --admin-daemon /var/run/ceph/ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.94169732110424.asok help
{
    "config diff": "dump diff of current config and default config",
    "config diff get": "dump diff get <field>: dump diff of current and default config setting <field>",
    "config get": "config get <field>: get the config value",
    "config help": "get config setting schema and descriptions",

So in summary, when working from inside a cephadm shell there is no issue, as the path '/var/run/ceph/ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.94169732110424.asok' is shorter than the maximum of 108 characters.

It could only be an issue if somebody tries to use the admin socket from outside the cephadm shell. In that case the whole path is longer than 108 characters: '/var/run/ceph/994d5594-060c-11ee-b1a9-525400a90a1b/ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.93938707486808.asok'

[root@ceph-node-0 ~]# echo "/var/run/ceph/994d5594-060c-11ee-b1a9-525400a90a1b/ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.93938707486808.asok" | wc
      1       1     114

However, I'm not sure whether the last use case (outside the cephadm shell) makes sense, or when such usage would be needed.
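
The two measurements above can be reproduced without echo and wc. Since echo appends a newline, the byte counts printed by wc (77 and 114) are each one more than the path itself. The sketch below is a rough check that assumes the 108-byte Linux limit and a terminating NUL byte. The report helper is hypothetical, and both paths are copied from the commands above.

```python
import os

# Assumption: Linux, where sizeof(sockaddr_un.sun_path) is 108 bytes.
SUN_PATH_SIZE = 108

paths = {
    "inside cephadm shell": (
        "/var/run/ceph/"
        "ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.94169732110424.asok"
    ),
    "on the host": (
        "/var/run/ceph/994d5594-060c-11ee-b1a9-525400a90a1b/"
        "ceph-client.rgw.rgw.1.ceph-node-0.rrzylk.2.93938707486808.asok"
    ),
}


def report(paths, size=SUN_PATH_SIZE):
    """Map each label to (path length in bytes, fits with a NUL terminator)."""
    rows = {}
    for label, path in paths.items():
        n = len(os.fsencode(path))
        rows[label] = (n, n + 1 <= size)
    return rows


for label, (n, ok) in report(paths).items():
    print(f"{label}: {n} bytes, fits={ok}")
```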

Actions #9

Updated by Redouane Kachach Elhichou 10 months ago

  • Status changed from New to Rejected

After an offline discussion with Avan, we agreed that this is in fact not a valid bug, because it doesn't make sense to run the commands from outside the cephadm shell. If you have a different opinion, feel free to reopen the bug.
