Ceph : Issues - https://tracker.ceph.com/ - updated 2021-11-15T14:30:27Z
Orchestrator - Bug #53269 (Resolved): store container registry credentials in config-key - https://tracker.ceph.com/issues/53269 - 2021-11-15T14:30:27Z - Daniel Pivonka
<p>Storing the container registry credentials in config-key provides a more restricted level of access.</p>

Orchestrator - Bug #52866 (Resolved): removal of iscsi causes mgr module to fail - https://tracker.ceph.com/issues/52866 - 2021-10-07T20:46:18Z - Daniel Pivonka
<p>This does not happen all the time, maybe 10% of the time. It could potentially affect any daemon type that has post actions.</p>
<p>health: HEALTH_ERR<br /> Module 'cephadm' has failed: dashboard iscsi-gateway-rm failed: iSCSI gateway 'ceph-pnataraj-7ypsv7-node3' does not exist retval: -2</p>
<p>sequence of events: <a class="external" href="https://pastebin.com/EA5peQVt">https://pastebin.com/EA5peQVt</a></p>
<p>traceback: <a class="external" href="https://pastebin.com/hfjvK5m7">https://pastebin.com/hfjvK5m7</a></p>
<p>what happens is 3 iscsi daemons are made and the iscsi type is added to self.mgr.requires_post_actions <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1128">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1128</a></p>
<p>then _check_daemons is run and gets the list of daemons from the cache <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L920">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L920</a></p>
<p>it is possible that the cache does not contain all 3 new iscsi daemons</p>
<p>so when the post daemon actions are run, only 2 DaemonDescriptions are passed instead of 3 <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1006">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1006</a></p>
<p>the dashboard iscsi-gateway list then only contains 2 of the 3 daemons <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/iscsi.py#L120">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/iscsi.py#L120</a></p>
<p>and eventually when the service is removed it will try to remove 3 entries from the dashboard iscsi-gateway list but only 2 exist and it crashes <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/iscsi.py#L162">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/services/iscsi.py#L162</a></p>
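<p>A defensive check during removal would avoid the crash even when the dashboard list is incomplete. A minimal sketch of that idea, assuming stand-in callables for the dashboard list/remove operations driven from iscsi.py (these names are illustrative, not the real API):</p>
<pre>
# sketch only: gateway_list() / gateway_rm() stand in for the dashboard
# iscsi-gateway-list / iscsi-gateway-rm calls made from iscsi.py
def remove_gateways(gateway_hosts, gateway_list, gateway_rm):
    existing = set(gateway_list())
    for host in gateway_hosts:
        if host not in existing:
            # this gateway missed the post-deploy action (the cache race
            # described above) and was never added to the dashboard list;
            # skip it instead of crashing with retval -2
            continue
        gateway_rm(host)
</pre>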
<p>so the cause of the crash is the dashboard iscsi-gateway list not being set up correctly when the daemons are originally deployed, because they are not all in the cache when the post actions are run</p>

Orchestrator - Bug #52692 (Resolved): Iscsi gateways are not showing "UP" in dashboard - https://tracker.ceph.com/issues/52692 - 2021-09-21T20:06:19Z - Daniel Pivonka
<p>trusted_ip_list in iscsi-gateway.cfg is not populated with mgr ips by default like it was in ceph-ansible</p>
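<p>A minimal sketch of the default this asks for, assuming the mgr IPs are available to the iscsi service code when it renders iscsi-gateway.cfg (the function and argument names are illustrative):</p>
<pre>
# sketch: always include the mgr IPs in trusted_ip_list, as ceph-ansible did
def build_trusted_ip_list(user_ips, mgr_ips):
    ips = list(user_ips)
    ips += [ip for ip in mgr_ips if ip not in ips]
    return ','.join(ips)  # trusted_ip_list is a comma-separated string

# build_trusted_ip_list(['10.0.0.5'], ['192.168.122.220', '192.168.122.221'])
# -> '10.0.0.5,192.168.122.220,192.168.122.221'
</pre>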
<p><a class="external" href="https://github.com/ceph/ceph-ansible/commit/d050391cbbe8a56d1cf44e744a02e4aa3f0583e5">https://github.com/ceph/ceph-ansible/commit/d050391cbbe8a56d1cf44e744a02e4aa3f0583e5</a></p> sepia - Support #52061 (Resolved): Sepia Lab Access Requesthttps://tracker.ceph.com/issues/520612021-08-04T17:25:45ZDaniel Pivonka
<p>1) Do you just need VPN access or will you also be running teuthology jobs?<br />both</p>
<p>2) Desired Username: <br />dpiovnka</p>
<p>3) Alternate e-mail address(es) we can reach you at: <br /><a class="email" href="mailto:dpivonka@redhat.com">dpivonka@redhat.com</a></p>
<p>4) If you don't already have an established history of code contributions to Ceph, is there an existing community or core developer you've worked with who has reviewed your work and can vouch for your access request?<br />i contribute to ceph</p>
<p style="padding-left:2em;">If you answered "No" to # 4, please answer the following (paste directly below the question to keep indentation):</p>
<p style="padding-left:2em;">4a) Paste a link to a Blueprint or planning doc of yours that was reviewed at a Ceph Developer Monthly.</p>
<p style="padding-left:2em;">4b) Paste a link to an accepted pull request for a major patch or feature.</p>
<p style="padding-left:2em;">4c) If applicable, include a link to the current project (planning doc, dev branch, or pull request) that you are looking to test.</p>
<p>5) Paste your SSH public key(s) between the <code>pre</code> tags<br /><pre>ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDg7xW5+Mk7pG5MjLxOc7GTvDTCPBJj2L3MEQZIapmmo6t8uPODwDewXncSzrKDH6zFVE4TTTD9fJdu+BhyT+jCFgXrEOMMabj7vJsg0wO0SWqr+4NECFPhD9y4ewfYiewBeXisclA5Bto91v93AtYpZBCdmPYSi0BEcPVc6S3iBUD0PzIrL62I/kBAtm0f/bScRiT7XvQJXvkbEObW2yS2XhUv2qgq1VlFvO1jl3B85KMTHVb8+LrKpQWOKfRfxJ74c51y6tnEjZLFO6Xn0JevqOhd8gPQIXueFZOM0g0pNp1mbtGaaBlEIibbrzo2p9kzsdvJeHMq1XuMIi1/Qyj9 dpivonka@dhcp-41-144.bos.redhat.com</pre></p>
<p>6) Paste your hashed VPN credentials between the <code>pre</code> tags (Format: <code>user@hostname 22CharacterSalt 65CharacterHashedPassword</code>)<br /><pre>dpivonka@thinkpad 6vBnL22Dfp8d9grCNtSoHw c8270453301dbec76f8070809078087ff892c74360b80a82002d58c50af19ec4</pre></p>

Orchestrator - Bug #51733 (Resolved): offline host hangs serve loop for 15 mins - https://tracker.ceph.com/issues/51733 - 2021-07-19T20:43:39Z - Daniel Pivonka
<p>When a host in your cluster goes offline, the next time the serve loop starts _refresh_hosts_and_daemons() will be called, and eventually _run_cephadm(gather-facts) will be called because cephadm doesn't know the host is offline yet.</p>
<p>in _run_cephadm() _remote_connection() will be called to get a connection to the host.<br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1166">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1166</a></p>
<p>_remote_connection() calls _get_connection(), which will return the current connection if it has one or will open a new connection. If it can't make a connection, it then marks the host as offline.<br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1347">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L1347</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/64dbe17fdbb27abd89755c61ef01744da5d683cc/src/pybind/mgr/cephadm/module.py#L1301">https://github.com/ceph/ceph/blob/64dbe17fdbb27abd89755c61ef01744da5d683cc/src/pybind/mgr/cephadm/module.py#L1301</a></p>
<p>Unfortunately it returns an old cached connection to the host that is actually offline and tries to run gather-facts on the host through that connection.<br />It then takes 15 mins to error out, because that connection is not going to work since the host is actually offline. During that time the serve loop is stuck.</p>
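<p>One way to avoid the hang is to probe a cached connection before reusing it and to reconnect with a short timeout when it is dead. A rough sketch of that idea; is_alive() and open_connection() are illustrative stand-ins, not the real remoto/cephadm API:</p>
<pre>
# sketch: don't hand back a cached connection without checking it first
def get_connection(host, cache, open_connection, is_alive):
    conn = cache.get(host)
    if conn is not None:
        if is_alive(conn):
            return conn          # cached connection is still usable
        cache.pop(host, None)    # stale connection to an offline host
    # reconnect with a short timeout; if this fails the caller can mark the
    # host offline right away instead of blocking the serve loop for 15 mins
    return open_connection(host, timeout=10)
</pre>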
<p>Once it errors out, the next time the serve loop starts, the host is marked as offline correctly.</p>
<p>I've attached a log of this happening. vm-03 is the offline host.</p>

Orchestrator - Bug #51446 (Resolved): Module 'dashboard' has failed: Timeout('Port 8443 not bound... - https://tracker.ceph.com/issues/51446 - 2021-06-30T13:53:11Z - Daniel Pivonka
<p>The standby dashboard is binding to 127.0.1.1 because get_mgr_ip() always returns the hostname for a standby mgr, and with podman's /etc/hosts file the hostname resolves to 127.0.1.1. When failing over to the standby mgr it crashes, because CherryPy is still bound to 127.0.1.1 when it is expected to be bound to the mgr IP.</p>
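<p>To illustrate the failure mode (this is not the mgr code itself): inside the container, resolving the short hostname goes through podman's /etc/hosts entry and yields 127.0.1.1, which is what the standby ends up binding to:</p>
<pre>
import socket

# roughly what get_mgr_ip() amounts to for a standby mgr: a hostname lookup,
# which podman's /etc/hosts entry resolves to 127.0.1.1
def naive_mgr_ip():
    return socket.gethostbyname(socket.gethostname())
</pre>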
Orchestrator - Bug #51056 (Resolved): manage cephadm log with logrotated - https://tracker.ceph.com/issues/51056 - 2021-06-01T19:16:34Z - Daniel Pivonka
<p>Other ceph services' logs are managed by logrotated when logging to file. Make the cephadm binary log the same way.</p>

Orchestrator - Bug #50444 (Resolved): host labels order is random - https://tracker.ceph.com/issues/50444 - 2021-04-20T19:44:31Z - Daniel Pivonka
<p>Host labels are not stored in the order entered or in a logical order like alphabetical; they are stored in a randomized order.</p>
<pre>
[ceph: root@vm-00 /]# ceph orch host add vm-01 --labels=a,b,c,d,c
Added host 'vm-01'
[ceph: root@vm-00 /]# ceph orch host ls
HOST ADDR LABELS STATUS
vm-00 192.168.122.220
vm-01 vm-01 b a d c
vm-02 vm-02
[ceph: root@vm-00 /]#
</pre>
<p>the list of labels is temporarily stored as a set to remove duplicates here <a class="external" href="https://github.com/ceph/ceph/blob/master/src/python-common/ceph/deployment/hostspec.py#L48,#L57">https://github.com/ceph/ceph/blob/master/src/python-common/ceph/deployment/hostspec.py#L48,#L57</a></p>
<p>This is the side effect of using a set to remove duplicates: it randomizes the order.</p>
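<p>Deduplicating while preserving insertion order would avoid this; a minimal sketch (not the actual hostspec.py code):</p>
<pre>
# sketch: drop duplicate labels but keep the order they were entered in,
# e.g. ['a', 'b', 'c', 'd', 'c'] -> ['a', 'b', 'c', 'd']
def dedup_labels(labels):
    return list(dict.fromkeys(labels))  # dicts preserve insertion order
</pre>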
Orchestrator - Documentation #50362 (Duplicate): pacific curl-based-installation docs link to oct... - https://tracker.ceph.com/issues/50362 - 2021-04-14T17:39:33Z - Daniel Pivonka
<p>the link in the curl command here <a class="external" href="https://docs.ceph.com/en/pacific/cephadm/install/#curl-based-installation">https://docs.ceph.com/en/pacific/cephadm/install/#curl-based-installation</a> currently points to <a class="external" href="https://github.com/ceph/ceph/raw/****octopus****/src/cephadm/cephadm">https://github.com/ceph/ceph/raw/****octopus****/src/cephadm/cephadm</a></p>
<p>should be <a class="external" href="https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm">https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm</a></p>

Orchestrator - Documentation #50273 (Resolved): remove keepalived_user from haproxy docs - https://tracker.ceph.com/issues/50273 - 2021-04-09T21:07:03Z - Daniel Pivonka
<p>keepalived_user is not used and not required</p>
<p>putting it in the spec results in an error</p>

Orchestrator - Bug #50267 (Resolved): rgw service can be deployed with realm and no zone or vice versa - https://tracker.ceph.com/issues/50267 - 2021-04-09T14:32:36Z - Daniel Pivonka
<p>--realm and --zone both need to be supplied when doing 'orch apply rgw'</p>
<p>if just --realm is supplied the rgw service starts in default mode (non multisite)</p>
<pre>
[ceph: root@vm-00 /]# ceph orch apply rgw service_id --realm=test_realm
Scheduled rgw.service_id update...
[ceph: root@vm-00 /]#
[ceph: root@vm-00 /]# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE VERSION IMAGE ID CONTAINER ID
alertmanager.vm-00 vm-00 *:9093 *:9094 running (2m) 1s ago 5m 0.20.0 0881eb8f169f 9f31e4f3f365
crash.vm-00 vm-00 - running (5m) 1s ago 5m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 06f3f7547dc8
crash.vm-01 vm-01 - running (3m) 5s ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 e6b72ff098fd
crash.vm-02 vm-02 - running (3m) 57s ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 31875a4606a2
grafana.vm-00 vm-00 *:3000 running (2m) 1s ago 4m 6.7.4 80728b29ad3f 4f708b3f3f87
mgr.vm-00.tlevml vm-00 *:9283 running (5m) 1s ago 5m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 25cc9cc842f5
mgr.vm-01.opwgkn vm-01 *:8443 *:9283 running (3m) 5s ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 b5679848bfd0
mon.vm-00 vm-00 - running (5m) 1s ago 5m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 a46de779161b
mon.vm-01 vm-01 - running (2m) 5s ago 2m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 48734e63d1ff
mon.vm-02 vm-02 - running (2m) 57s ago 2m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 5788bb55eb46
node-exporter.vm-00 vm-00 *:9100 running (4m) 1s ago 4m 0.18.1 e5a616e4b9cf 2c3f608f9121
node-exporter.vm-01 vm-01 *:9100 running (2m) 5s ago 2m 0.18.1 e5a616e4b9cf ad6c66b445b5
node-exporter.vm-02 vm-02 *:9100 running (2m) 57s ago 2m 0.18.1 e5a616e4b9cf 9e71ade67df0
osd.0 vm-01 - running (2m) 5s ago 2m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 de25e84657f2
osd.1 vm-00 - running (2m) 1s ago 2m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 3f83cb9b125d
osd.2 vm-02 - running (2m) 57s ago 2m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 8d38622bf238
prometheus.vm-00 vm-00 *:9095 running (2m) 1s ago 4m 2.18.1 de242295e225 97d3b9ebbdc3
rgw.service_id.vm-00.tanryt vm-00 *:80 running (8s) 1s ago 7s 17.0.0-2956-g9d7a42e7 2a85b08e9f32 4a3195cfb42b
rgw.service_id.vm-01.poxqal vm-01 *:80 running (10s) 5s ago 10s 17.0.0-2956-g9d7a42e7 2a85b08e9f32 7bdcf373611a
[ceph: root@vm-00 /]#
[ceph: root@vm-00 /]# ceph osd pool ls
device_health_metrics
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
[ceph: root@vm-00 /]#
</pre>
<p>you can see it created the default rgw pools</p>
<p>alternatively if only --zone is supplied the service fails</p>
<pre>
[ceph: root@vm-00 /]# ceph orch apply rgw service_id --zone=test_zone
Scheduled rgw.service_id update...
[ceph: root@vm-00 /]#
[ceph: root@vm-00 /]# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE VERSION IMAGE ID CONTAINER ID
alertmanager.vm-00 vm-00 *:9093 *:9094 running (3m) 7s ago 6m 0.20.0 0881eb8f169f 9f31e4f3f365
crash.vm-00 vm-00 - running (6m) 7s ago 6m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 06f3f7547dc8
crash.vm-01 vm-01 - running (4m) 8s ago 4m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 e6b72ff098fd
crash.vm-02 vm-02 - running (4m) 2m ago 4m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 31875a4606a2
grafana.vm-00 vm-00 *:3000 running (3m) 7s ago 5m 6.7.4 80728b29ad3f 4f708b3f3f87
mgr.vm-00.tlevml vm-00 *:9283 running (7m) 7s ago 7m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 25cc9cc842f5
mgr.vm-01.opwgkn vm-01 *:8443 *:9283 running (4m) 8s ago 4m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 b5679848bfd0
mon.vm-00 vm-00 - running (7m) 7s ago 7m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 a46de779161b
mon.vm-01 vm-01 - running (4m) 8s ago 4m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 48734e63d1ff
mon.vm-02 vm-02 - running (4m) 2m ago 4m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 5788bb55eb46
node-exporter.vm-00 vm-00 *:9100 running (5m) 7s ago 5m 0.18.1 e5a616e4b9cf 2c3f608f9121
node-exporter.vm-01 vm-01 *:9100 running (3m) 8s ago 3m 0.18.1 e5a616e4b9cf ad6c66b445b5
node-exporter.vm-02 vm-02 *:9100 running (3m) 2m ago 3m 0.18.1 e5a616e4b9cf 9e71ade67df0
osd.0 vm-01 - running (3m) 8s ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 de25e84657f2
osd.1 vm-00 - running (3m) 7s ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 3f83cb9b125d
osd.2 vm-02 - running (3m) 2m ago 3m 17.0.0-2956-g9d7a42e7 2a85b08e9f32 8d38622bf238
prometheus.vm-00 vm-00 *:9095 running (3m) 7s ago 5m 2.18.1 de242295e225 97d3b9ebbdc3
rgw.service_id.vm-00.gilldo vm-00 *:80 unknown 7s ago 10s <unknown> <unknown> <unknown>
rgw.service_id.vm-01.ctthxk vm-01 *:80 unknown 8s ago 12s <unknown> <unknown> <unknown>
[ceph: root@vm-00 /]#
</pre>
<p>journald logs from the rgw container show "cannot find zone id= (name=test_zone)"</p>
<p>Both of these flags are required; using only one or the other causes unexpected behavior or failure.</p>
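<p>A minimal sketch of the validation this suggests, assuming it would run somewhere in the 'orch apply rgw' / RGW spec path (the function name and placement are illustrative):</p>
<pre>
# sketch: require realm and zone to be supplied together
def check_realm_zone(rgw_realm, rgw_zone):
    if bool(rgw_realm) != bool(rgw_zone):
        raise ValueError(
            'specify both --realm and --zone (or neither); got realm=%r, zone=%r'
            % (rgw_realm, rgw_zone))
</pre>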
Orchestrator - Bug #50248 (Resolved): rgw-nfs daemons marked as stray - https://tracker.ceph.com/issues/50248 - 2021-04-08T18:15:39Z - Daniel Pivonka
<pre>
[ceph: root@vm-00 /]# ceph -s
cluster:
id: ede3e474-9890-11eb-9175-5254007bd8c8
health: HEALTH_WARN
1 stray daemon(s) not managed by cephadm
services:
mon: 3 daemons, quorum vm-00,vm-02,vm-01 (age 5m)
mgr: vm-00.gpoxjw(active, since 8m), standbys: vm-02.epkyvb
osd: 3 osds: 3 up (since 5m), 3 in (since 5m)
rgw: 2 daemons active (2 hosts, 1 zones)
rgw-nfs: 1 daemon active (1 hosts, 1 zones)
data:
pools: 7 pools, 193 pgs
objects: 349 objects, 28 KiB
usage: 40 MiB used, 450 GiB / 450 GiB avail
pgs: 193 active+clean
io:
client: 824 B/s rd, 1 op/s rd, 0 op/s wr
[ceph: root@vm-00 /]# ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon 14523 on host vm-01 not managed by cephadm
[ceph: root@vm-00 /]# ceph service status
{
"rgw": {
"14469": {
"status_stamp": "2021-04-08T17:46:34.515799+0000",
"last_beacon": "2021-04-08T17:46:34.515799+0000",
"status": {
"current_sync": "[]"
}
},
"24272": {
"status_stamp": "2021-04-08T17:46:29.569272+0000",
"last_beacon": "2021-04-08T17:46:34.569691+0000",
"status": {
"current_sync": "[]"
}
}
},
"rgw-nfs": {
"14523": {
"status_stamp": "2021-04-08T17:46:31.862084+0000",
"last_beacon": "2021-04-08T17:46:36.862415+0000",
"status": {
"current_sync": "[]"
}
}
}
}
[ceph: root@vm-00 /]# ceph orch ps
NAME HOST PORTS STATUS REFRESHED AGE VERSION IMAGE ID CONTAINER ID
alertmanager.vm-00 vm-00 *:9093 *:9094 running (6m) 2m ago 9m 0.20.0 0881eb8f169f eab0ec8a7e2e
crash.vm-00 vm-00 - running (9m) 2m ago 9m 17.0.0-2394-gc553763e 6376430f9659 df94ec196887
crash.vm-01 vm-01 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 8b34f031b436
crash.vm-02 vm-02 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 b8d161b39f56
grafana.vm-00 vm-00 *:3000 running (5m) 2m ago 8m 6.7.4 80728b29ad3f 3049ae9860bc
mgr.vm-00.gpoxjw vm-00 *:9283 running (10m) 2m ago 10m 17.0.0-2394-gc553763e 6376430f9659 67319da0d51b
mgr.vm-02.epkyvb vm-02 *:8443 *:9283 running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 07017c8df192
mon.vm-00 vm-00 - running (10m) 2m ago 10m 17.0.0-2394-gc553763e 6376430f9659 0cd485da81bd
mon.vm-01 vm-01 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 0dd0ac6ef48e
mon.vm-02 vm-02 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 9f2ec5545c33
nfs.foo.vm-01 vm-01 *:2049 running (3m) 3m ago 3m 3.5 6376430f9659 4645204b975e
node-exporter.vm-00 vm-00 *:9100 running (8m) 2m ago 8m 0.18.1 e5a616e4b9cf b6b1b299f964
node-exporter.vm-01 vm-01 *:9100 running (6m) 3m ago 6m 0.18.1 e5a616e4b9cf 77b15460b6b6
node-exporter.vm-02 vm-02 *:9100 running (6m) 3m ago 6m 0.18.1 e5a616e4b9cf 0c00946ad0b0
osd.0 vm-02 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 6851928922cf
osd.1 vm-01 - running (6m) 3m ago 6m 17.0.0-2394-gc553763e 6376430f9659 4d9ea63707f8
osd.2 vm-00 - running (6m) 2m ago 6m 17.0.0-2394-gc553763e 6376430f9659 454b4f605360
prometheus.vm-00 vm-00 *:9095 running (6m) 2m ago 8m 2.18.1 de242295e225 e8644f0802f7
rgw.example_service_id.vm-00.dlfbug vm-00 *:80 running (3m) 2m ago 3m 17.0.0-2394-gc553763e 6376430f9659 e27b6ee4b0fe
rgw.example_service_id.vm-01.oqigli vm-01 *:80 running (3m) 3m ago 3m 17.0.0-2394-gc553763e 6376430f9659 16f7c564ab0f
[ceph: root@vm-00 /]#
</pre>
<p>looks like the logic here is not working <a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L413-#L427">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/serve.py#L413-#L427</a></p>
<p>rgw-nfs needs to be included in the daemon_id to metadata['id'] conversion logic</p>
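<p>A rough sketch of what that conversion could look like, based on the description above (illustrative only, not the actual serve.py logic):</p>
<pre>
# sketch: the servicemap reports rgw and rgw-nfs daemons by the numeric id in
# metadata['id'], so build the set of cephadm-managed names using that id too
def managed_names(daemons, get_metadata):
    names = set()
    for dd in daemons:
        names.add('%s.%s' % (dd.daemon_type, dd.daemon_id))
        if dd.daemon_type in ('rgw', 'rgw-nfs'):
            md = get_metadata(dd) or {}
            if 'id' in md:
                names.add('%s.%s' % (dd.daemon_type, md['id']))
    return names
</pre>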
<p>related prs: <a class="external" href="https://github.com/ceph/ceph/pull/40220">https://github.com/ceph/ceph/pull/40220</a> <a class="external" href="https://github.com/ceph/ceph/pull/37397">https://github.com/ceph/ceph/pull/37397</a></p>

Orchestrator - Bug #50102 (Resolved): spec jsons that expect a list in a field don't verify that a... - https://tracker.ceph.com/issues/50102 - 2021-04-01T18:08:00Z - Daniel Pivonka
<p>Examples: labels in a hostspec, networks in a servicespec.</p>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# cat spec.yaml <br />service_type: host<br />addr: 192.168.0.123 <br />hostname: vm-00<br />labels: MON</p>
</blockquote></blockquote></blockquote>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch apply -i spec.yaml <br />Added host 'vm-00'</p>
</blockquote></blockquote></blockquote>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch host ls -f yaml<br />---<br />addr: 192.168.0.123<br />hostname: vm-00<br />labels:<br />- N<br />- O<br />- M<br />status: ''</p>
</blockquote></blockquote></blockquote>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# cat spec.yaml <br />service_type: rgw<br />service_id: india<br />placement:<br />hosts:<br />- vm-00<br />rgw_frontend_port: 8083<br />networks: 10.0.208.0/22</p>
</blockquote></blockquote></blockquote>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch apply -i spec.yaml <br />Error EINVAL: Cannot parse network 1: '1' does not appear to be an IPv4 or IPv6 network</p>
</blockquote></blockquote></blockquote>
<p>In both cases a list was expected but a string was given. In the case of the hostspec, no error was given and the data was stored incorrectly; in the case of the servicespec, a misleading error was given.</p>
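<p>A minimal sketch of the type check this asks for (illustrative, not the actual spec-parsing code):</p>
<pre>
# sketch: reject a bare string where a list of strings is expected, instead of
# silently treating it as an iterable of characters (labels: MON became
# N, O, M in the example above)
def require_str_list(field, value):
    if isinstance(value, str) or not isinstance(value, list) \
            or not all(isinstance(v, str) for v in value):
        raise ValueError('%s must be a list of strings, got %r' % (field, value))
    return value
</pre>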
Orchestrator - Bug #50062 (Resolved): orch host add with multiple labels and no addr - https://tracker.ceph.com/issues/50062 - 2021-03-30T20:17:28Z - Daniel Pivonka
<p>The host add operation throws an error about a connection issue, as below, if the operation is executed with labels but the address is skipped.</p>
When no host address is provided, it takes 'mon' as the addr, but the error doesn't really tell you that's the problem<br />-----------------------------
<ol>
<li>ceph orch host add <host> [<labels>]</li>
</ol>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch host add vm-01 mon osd<br />Error EINVAL: Failed to connect to vm-01 (mon).<br />Please make sure that the host is reachable and accepts connections using the cephadm SSH key</p>
<p>To add the cephadm SSH key to the host:</p>
<blockquote>
<p>ceph cephadm get-pub-key > ~/ceph.pub<br />ssh-copy-id -f -i ~/ceph.pub root@vm-01</p>
</blockquote>
<p>To check that the host is reachable:</p>
<blockquote>
<p>ceph cephadm get-ssh-config > ssh_config<br />ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key<br />ssh -F ssh_config -i ~/cephadm_private_key root@vm-01</p>
</blockquote>
</blockquote></blockquote></blockquote>
<p>Worked when address is provided:<br />--------------------------------</p>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch host add vm-01 192.168.0.123 mon osd<br />Added host 'vm-01'</p>
</blockquote></blockquote></blockquote>
<p>Worked when an empty address string is provided:<br />-----------------------------------------------------</p>
<blockquote><blockquote><blockquote>
<p>[ceph: root@vm-00 /]# ceph orch host add vm-01 '' mon osd<br />Added host 'vm-01'</p>
</blockquote></blockquote></blockquote>
<p>The node was reachable, which is proven by adding it to the cluster when the addr positional argument is filled in.<br />But host add with label(s) and no address positional argument fails by throwing an ssh connection error message; here the error message is misleading.</p>
<p>the orch host add command can not add a host with multiple labels and no addr without using a empty string in the addr positional argument <br />ceph host add vm-01 --labels=label1,label2 saves a single label "label1,label2"</p> Orchestrator - Bug #50041 (Resolved): cephadm bootstrap with apply-spec anmd ssh-user option fail...https://tracker.ceph.com/issues/500412021-03-29T17:57:29ZDaniel Pivonka
Orchestrator - Bug #50041 (Resolved): cephadm bootstrap with apply-spec and ssh-user option fail... - https://tracker.ceph.com/issues/50041 - 2021-03-29T17:57:29Z - Daniel Pivonka
<p>ssh-copy-id is being run as the root user because cephadm requires sudo,<br />so it is trying to use the root user's ssh keys to copy the cephadm ssh key to the hosts in the spec.<br />The user will be prompted for the password of the host being added if the root user does not have passwordless ssh to that host.</p>
<p>ssh-copy-id needs to use the ssh keys of the user passed in by --ssh-user</p>
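<p>A hedged sketch of that direction: run ssh-copy-id as the given --ssh-user so the connection authenticates with that user's keys while installing the cephadm public key (paths and the exact invocation are illustrative; the real bootstrap code may differ):</p>
<pre>
import subprocess

# sketch: copy the cephadm ssh key to a host as the bootstrap --ssh-user
# rather than relying on root's own ssh identity
def copy_cephadm_key(host, ssh_user, cephadm_pub_key='/etc/ceph/ceph.pub'):
    subprocess.run(
        ['sudo', '-u', ssh_user, 'ssh-copy-id', '-f', '-i', cephadm_pub_key,
         '%s@%s' % (ssh_user, host)],
        check=True)
</pre>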