Bug #49223

unrecognized arguments: --container-init

Added by Sebastian Wagner 2 months ago. Updated 8 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
cephadm (binary)
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
39423
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.127+0000 7f130910d700  0 [cephadm DEBUG cephadm.serve] _run_cephadm : command = gather-facts
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.127+0000 7f130910d700  0 [cephadm DEBUG cephadm.serve] _run_cephadm : args = []
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.127+0000 7f130910d700  0 [cephadm DEBUG root] Have connection to ubuntu
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.127+0000 7f130910d700  0 [cephadm DEBUG cephadm.serve] args: gather-facts --container-init
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.371+0000 7f131165d700  0 [restful DEBUG root] Unhandled notification type 'service_map'
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.375+0000 7f12fe8f8700  0 [rbd_support DEBUG root] PerfHandler: tick
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.375+0000 7f12fe0f7700  0 [rbd_support DEBUG root] TaskHandler: tick
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.507+0000 7f130910d700  0 [cephadm DEBUG cephadm.serve] code: 2
Feb 09 10:29:57 ubuntu conmon[22007]: debug 2021-02-09T09:29:57.507+0000 7f130910d700  0 [cephadm DEBUG cephadm.serve] err: usage:  [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
Feb 09 10:29:57 ubuntu conmon[22007]:         [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
Feb 09 10:29:57 ubuntu conmon[22007]:         [--unit-dir UNIT_DIR] [--verbose] [--timeout TIMEOUT] [--retry RETRY]
Feb 09 10:29:57 ubuntu conmon[22007]:         [--env ENV]
Feb 09 10:29:57 ubuntu conmon[22007]:         {version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,unit,logs,bootstrap,deploy,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts,exporter,host-maintenance,verify-prereqs}
Feb 09 10:29:57 ubuntu conmon[22007]:         ...
Feb 09 10:29:57 ubuntu conmon[22007]: : error: unrecognized arguments: --container-init

using

    {
        "style": "cephadm:v1",
        "name": "mgr.ubuntu.micfpd",
        "fsid": "943f28ea-6ab7-11eb-923e-0242b47faa5c",
        "systemd_unit": "ceph-943f28ea-6ab7-11eb-923e-0242b47faa5c@mgr.ubuntu.micfpd",
        "enabled": true,
        "state": "running",
        "container_id": "38547b1f4168a81bfa2dc8fd831286045e1acd12197e7b2638b61c27d96a8ba9",
        "container_image_name": "docker.io/ceph/daemon-base:latest-master-devel",
        "container_image_id": "7146f2bd66bd219e642f5ac73b1371f3c169477afcfa92fe097a7e923fd397cc",
        "container_image_digests": [
            "docker.io/ceph/daemon-base@sha256:2f08b03807623cf4702f489659ddfef224fd3bb6aeb83b317f69128b0b782749" 
        ],
        "version": "17.0.0-389-gcced65aa",
        "started": "2021-02-09T09:17:16.708559Z",
        "created": "2021-02-09T09:17:17.013180Z",
        "deployed": "2021-02-09T09:17:16.045155Z",
        "configured": "2021-02-09T09:17:17.013180Z" 
    },

Looks like this is the old cephadm binary and the old mgr/cephadm module, but with --container-init enabled for all commands, even though they don't support it.

$  sudo ./cephadm enter --name mgr.ubuntu.micfpd
[sudo] password for sebastian: 
Inferring fsid 943f28ea-6ab7-11eb-923e-0242b47faa5c
[ceph: root@ubuntu /]# cephadm gather-facts --container-init
usage: cephadm [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
               [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
               [--unit-dir UNIT_DIR] [--verbose] [--timeout TIMEOUT]
               [--retry RETRY] [--env ENV]
               {version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,unit,logs,bootstrap,deploy,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts,exporter,host-maintenance,verify-prereqs}
               ...
cephadm: error: unrecognized arguments: --container-init
[ceph: root@ubuntu /]# cd /usr/share/ceph/mgr/cephadm
[ceph: root@ubuntu cephadm]# grep -C 3 container-init serve.py 
                final_args += ['--fsid', self.mgr._cluster_fsid]

            if self.mgr.container_init:
                final_args += ['--container-init']

            final_args += args
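
The failure above can be reproduced outside of Ceph with plain argparse. The following is a minimal sketch, not cephadm's actual parser: the program name `cephadm-sketch` and the helper names are hypothetical, and only two subcommands are modelled. It shows that a flag registered on one subcommand is rejected with exit code 2 when passed to another, which matches the "code: 2" in the mgr log.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with two subcommands (hypothetical stand-in for cephadm)."""
    parser = argparse.ArgumentParser(prog="cephadm-sketch")
    sub = parser.add_subparsers(dest="command")
    # gather-facts takes no extra options in this sketch
    sub.add_parser("gather-facts")
    # only deploy knows about --container-init
    deploy = sub.add_parser("deploy")
    deploy.add_argument("--container-init", action="store_true")
    return parser


def exit_code(argv: List[str]) -> int:
    """Return the exit code argparse would produce for argv (0 on success)."""
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse calls sys.exit(2) after printing a usage error to stderr
        return int(exc.code)
    return 0


if __name__ == "__main__":
    print(exit_code(["gather-facts", "--container-init"]))
    print(exit_code(["deploy", "--container-init"]))
```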

Look, deploy does support container-init:

usage: cephadm deploy [-h] --name NAME --fsid FSID [--config CONFIG]
                      [--config-json CONFIG_JSON] [--keyring KEYRING]
                      [--key KEY] [--osd-fsid OSD_FSID] [--skip-firewalld]
                      [--tcp-ports TCP_PORTS] [--reconfig] [--allow-ptrace]
                      [--container-init]
cephadm deploy: error: the following arguments are required: --name, --fsid
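
Since only deploy accepts the flag, one possible guard on the mgr side might look like the sketch below. This is an illustration under stated assumptions, not the fix that was eventually merged: `build_final_args` and `COMMANDS_ACCEPTING_CONTAINER_INIT` are hypothetical names, and the set of commands is taken only from the usage output quoted in this report.

```python
from typing import List, Optional

# Assumption: per the usage output in this report, only `deploy` accepts the flag.
COMMANDS_ACCEPTING_CONTAINER_INIT = {"deploy"}


def build_final_args(command: str,
                     args: List[str],
                     fsid: Optional[str] = None,
                     container_init: bool = False) -> List[str]:
    """Assemble the arguments that follow the subcommand (hypothetical helper)."""
    final_args: List[str] = []
    if fsid:
        final_args += ["--fsid", fsid]
    # Only append the flag for subcommands that are known to accept it.
    if container_init and command in COMMANDS_ACCEPTING_CONTAINER_INIT:
        final_args += ["--container-init"]
    final_args += args
    return final_args


if __name__ == "__main__":
    print(build_final_args("gather-facts", [], container_init=True))
    print(build_final_args("deploy", ["--name", "mgr.x"], container_init=True))
```

With a guard of this kind, gather-facts would be invoked without the flag even when mgr/cephadm/container_init is true, while deploy would still receive it.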

workaround:

sudo ./cephadm shell --fsid 943f28ea-6ab7-11eb-923e-0242b47faa5c -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring ceph config set mgr mgr/cephadm/container_init false

History

#1 Updated by Sebastian Wagner 2 months ago

  • Description updated (diff)

#2 Updated by Sebastian Wagner 2 months ago

  • Description updated (diff)

#3 Updated by Sebastian Wagner 2 months ago

  • Description updated (diff)

#4 Updated by Sebastian Wagner 2 months ago

  • Assignee deleted (Michael Fritch)
  • Priority changed from Urgent to Normal

Looks like https://github.com/ceph/ceph/pull/36822 was broken back then and https://github.com/ceph/ceph/pull/37648 never went in. Sigh.

#5 Updated by Nathan Cutler 2 months ago

If I remember correctly, the "--container-init" saga went about like so:

1. in general, there is a need for containerized daemons to produce coredumps, but containerized Ceph daemons do not produce them because the container does not have an init process running inside it (?)
2. to address this issue, a new option "--container-init" was added. Apparently, the idea was that, with "--container-init", coredumps would work. When this option is not given, coredumps would continue not working - in other words, an explicit option must be given to get sane behavior (?!).
3. it turned out that the first --container-init PR did not actually provide working coredumps
4. more patches were added (downstream SUSE only) to get working coredumps. These have not been upstreamed yet.
5. working coredumps (downstream SUSE) still require --container-init to be explicitly specified, but this is hidden inside ceph-salt so downstream users are not required to know this

Now, I'm not sure if that summary is completely correct - my memory is not so good, so maybe it is wrong in some respect? Assuming it is not wrong, then in my mind it would make sense to:

6. fix coredumps so they "just work" (regardless of whether --container-init is given or not)
7. deprecate the --container-init option (i.e., make it an option that does nothing, but is left in for some time to ensure backwards compatibility)

#6 Updated by Sebastian Wagner 2 months ago

  • Pull request ID set to 37648

#7 Updated by Michael Fritch 2 months ago

  • Backport set to 39423

#8 Updated by Michael Fritch about 1 month ago

  • Pull request ID changed from 37648 to 39914

#9 Updated by Sebastian Wagner about 1 month ago

  • Status changed from New to Fix Under Review

#11 Updated by Sebastian Wagner 8 days ago

  • Status changed from Fix Under Review to Resolved
