Ceph : Issues
https://tracker.ceph.com/
2022-01-24T15:39:53Z
Orchestrator - Cleanup #54000 (New): cephadm: upgrade commands should return yaml
https://tracker.ceph.com/issues/54000
2022-01-24T15:39:53Z
Sebastian Wagner
<p>Right now, commands like</p>
<ul>
<li>ceph orch upgrade ls</li>
<li>ceph orch upgrade status</li>
</ul>
<p>are returning JSON. YAML is much more readable, so let's return YAML instead.</p>
<p><strong>Note</strong> that this now requires a <strong>--format=...</strong> argument.</p>
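<p>A minimal sketch of the proposed behavior, assuming a hypothetical <code>format_upgrade_status</code> helper and illustrative field names (not the actual orchestrator API):</p>

```python
# Sketch: render orchestrator output as JSON or YAML based on --format.
# Field names below are illustrative, not cephadm's real status schema.
import json

import yaml  # PyYAML


def format_upgrade_status(status: dict, fmt: str = "yaml") -> str:
    """Return the upgrade status in the requested format."""
    if fmt == "json":
        return json.dumps(status, indent=2, sort_keys=True)
    if fmt == "yaml":
        # default_flow_style=False yields the readable block style
        return yaml.dump(status, default_flow_style=False, sort_keys=True)
    raise ValueError(f"unknown format: {fmt}")


status = {"target_image": "quay.io/ceph/ceph:v17", "in_progress": True}
print(format_upgrade_status(status))
```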
Orchestrator - Cleanup #53999 (New): orch interface: cephadm contains a lot of special apply methods
https://tracker.ceph.com/issues/53999
2022-01-24T15:28:49Z
Sebastian Wagner
<pre><code class="python syntaxhl"><span class="CodeRay"> <span class="decorator">@handle_orch_error</span>
<span class="keyword">def</span> <span class="function">apply_rgw</span>(<span class="predefined-constant">self</span>, spec: ServiceSpec) -> <span class="predefined">str</span>:
<span class="keyword">return</span> <span class="predefined-constant">self</span>._apply(spec)
</span></code></pre>
<p>Let's remove them; the number of these wrappers has gotten out of hand. Just use <strong>apply</strong> for everything.</p>
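<p>A minimal sketch of the single generic entry point this proposes, with a simplified stand-in <code>ServiceSpec</code> and return message (not cephadm's actual implementation):</p>

```python
# Sketch: one generic apply() replacing apply_rgw, apply_nfs, apply_mon, ...
# ServiceSpec and the message format are simplified stand-ins.
from dataclasses import dataclass


@dataclass
class ServiceSpec:
    service_type: str
    service_id: str = ""


class Orchestrator:
    KNOWN_TYPES = {"mon", "mgr", "osd", "rgw", "nfs"}

    def apply(self, spec: ServiceSpec) -> str:
        # Dispatch on spec.service_type instead of a per-service method
        if spec.service_type not in self.KNOWN_TYPES:
            raise ValueError(f"unknown service type: {spec.service_type}")
        return f"Scheduled {spec.service_type} update..."


print(Orchestrator().apply(ServiceSpec("rgw", "myrealm.myzone")))
```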
Orchestrator - Feature #53562 (New): cephadm doesn't support osd crush_location_hook
https://tracker.ceph.com/issues/53562
2021-12-09T11:59:53Z
Sebastian Wagner
<p>crush_location_hook is the path to an executable that is run in order to determine the current OSD's crush location. It is invoked like so:</p>
<pre>
$crush_location_hook --cluster {cluster-name} --id {ID} --type {daemon-type}
</pre>
<p>and prints the current crush location to stdout.</p>
<p>Workarounds:</p>
<ul>
<li>For a per-host based location, we have: <a class="external" href="https://docs.ceph.com/en/latest/cephadm/host-management/#setting-the-initial-crush-location-of-host">https://docs.ceph.com/en/latest/cephadm/host-management/#setting-the-initial-crush-location-of-host</a> which should cover a lot of use cases.</li>
<li>Build a new container image locally and add the crush_location_hook executable to it. Then set the config option to the file path within the container.</li>
</ul>
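<p>A hedged sketch of what such a hook could look like; the printed location values are purely illustrative, and a real hook would inspect the host or OSD to decide its placement:</p>

```python
#!/usr/bin/env python3
# Sketch of a crush_location_hook executable: it accepts the
# --cluster/--id/--type arguments shown above and prints a crush
# location string on stdout. The location below is illustrative only.
import argparse


def main(argv=None) -> str:
    parser = argparse.ArgumentParser()
    parser.add_argument("--cluster")
    parser.add_argument("--id")
    parser.add_argument("--type")
    args = parser.parse_args(argv)
    # A real hook would derive this from the host, OSD id, or hardware.
    location = "root=default rack=rack1"
    print(location)
    return location


if __name__ == "__main__":
    main()
```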
Orchestrator - Feature #53539 (New): ceph orch exposes no way to see what’s queued or remove work...
https://tracker.ceph.com/issues/53539
2021-12-08T13:50:27Z
Sebastian Wagner
<p>ceph orch exposes no way to see what’s queued or to remove work from the queue.</p>
<p>We should probably expose the queue of scheduled ops in <strong>ceph orch status</strong>.</p>
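<p>A minimal sketch of the kind of queue view this feature could expose; the class and the output format are hypothetical:</p>

```python
# Sketch: a scheduled-ops queue with inspect and cancel operations,
# the two capabilities the tracker says are missing. Names are illustrative.
from collections import deque


class OpQueue:
    def __init__(self) -> None:
        self._ops = deque()

    def enqueue(self, op: str) -> None:
        self._ops.append(op)

    def cancel(self, op: str) -> None:
        # Remove queued work before it runs
        self._ops.remove(op)

    def status_lines(self) -> list:
        # What `ceph orch status` could print
        return [f"queued: {op}" for op in self._ops]


q = OpQueue()
q.enqueue("apply osd.hybrid")
q.enqueue("upgrade to v17")
q.cancel("apply osd.hybrid")
print("\n".join(q.status_lines()))
```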
ceph-volume - Bug #53524 (New): CEPHADM_APPLY_SPEC_FAIL is very verbose
https://tracker.ceph.com/issues/53524
2021-12-08T11:02:11Z
Sebastian Wagner
<pre>
root@service-01-08020:~# ceph health detail
HEALTH_WARN Failed to apply 1 service(s): osd.hybrid; OSD count 0 < osd_pool_default_size 3
[WRN] CEPHADM_APPLY_SPEC_FAIL: Failed to apply 1 service(s): osd.hybrid
osd.hybrid: cephadm exited with an error code: 1, stderr:Non-zero exit code 2 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d -e NODE_NAME=storage-01-08002 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=hybrid -v /var/run/ceph/fsid:/var/run/ceph:z -v /var/log/ceph/fsid:/var/log/ceph:z -v /var/lib/ceph/fsid/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpfteczv3s:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp6_nx8uhw:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy --db-devices /dev/nvme0n1 /dev/nvme1n1 --yes --no-systemd
/usr/bin/docker: stderr usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
/usr/bin/docker: stderr [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
/usr/bin/docker: stderr [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
/usr/bin/docker: stderr [--auto] [--no-auto] [--bluestore] [--filestore]
/usr/bin/docker: stderr [--report] [--yes]
/usr/bin/docker: stderr [--format {json,json-pretty,pretty}] [--dmcrypt]
/usr/bin/docker: stderr [--crush-device-class CRUSH_DEVICE_CLASS]
/usr/bin/docker: stderr [--no-systemd]
/usr/bin/docker: stderr [--osds-per-device OSDS_PER_DEVICE]
/usr/bin/docker: stderr [--data-slots DATA_SLOTS]
/usr/bin/docker: stderr [--data-allocate-fraction DATA_ALLOCATE_FRACTION]
/usr/bin/docker: stderr [--block-db-size BLOCK_DB_SIZE]
/usr/bin/docker: stderr [--block-db-slots BLOCK_DB_SLOTS]
/usr/bin/docker: stderr [--block-wal-size BLOCK_WAL_SIZE]
/usr/bin/docker: stderr [--block-wal-slots BLOCK_WAL_SLOTS]
/usr/bin/docker: stderr [--journal-size JOURNAL_SIZE]
/usr/bin/docker: stderr [--journal-slots JOURNAL_SLOTS] [--prepare]
/usr/bin/docker: stderr [--osd-ids [OSD_IDS [OSD_IDS ...]]]
/usr/bin/docker: stderr [DEVICES [DEVICES ...]]
/usr/bin/docker: stderr ceph-volume lvm batch: error: GPT headers found, they must be removed on: /dev/sda
Traceback (most recent call last):
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 8331, in <module>
main()
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 8319, in main
r = ctx.func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1735, in _infer_config
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1676, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1763, in _infer_image
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1663, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 5285, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1465, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d -e NODE_NAME=storage-01-08002 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=hybrid -v /var/run/ceph/fsid:/var/run/ceph:z -v /var/log/ceph/fsid:/var/log/ceph:z -v /var/lib/ceph/fsid/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpfteczv3s:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp6_nx8uhw:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy --db-devices /dev/nvme0n1 /dev/nvme1n1 --yes --no-systemd
[WRN] TOO_FEW_OSDS: OSD count 0 < osd_pool_default_size 3
</pre>
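<p>One possible mitigation (a sketch, not actual cephadm code): truncate the per-service error to its first line and a bounded length before surfacing it in <code>ceph health detail</code>, keeping the full output in the logs:</p>

```python
# Sketch: summarize a long cephadm error for the health warning,
# keeping only the first line, capped at a character limit.
def summarize_error(err: str, limit: int = 120) -> str:
    first_line = err.splitlines()[0] if err else ""
    if len(first_line) <= limit:
        return first_line
    return first_line[: limit - 3] + "..."


long_err = "cephadm exited with an error code: 1, stderr: " + "x" * 500
print(summarize_error(long_err))
```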
Orchestrator - Bug #53422 (New): tasks.cephfs.test_nfs.TestNFS.test_export_create_with_non_existi...
https://tracker.ceph.com/issues/53422
2021-11-29T08:47:58Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-11-26_13:52:15-orch:cephadm-wip-swagner2-testing-2021-11-26-1129-distro-default-smithi/6528237">https://pulpito.ceph.com/swagner-2021-11-26_13:52:15-orch:cephadm-wip-swagner2-testing-2021-11-26-1129-distro-default-smithi/6528237</a></p>
<pre>
teuthology.orchestra.run.smithi145:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph orch ps --service_name=nfs.test
teuthology.orchestra.run.smithi145.stdout:No daemons reported
test_export_create_with_non_existing_fsname (tasks.cephfs.test_nfs.TestNFS) ... FAIL
======================================================================
FAIL: test_export_create_with_non_existing_fsname (tasks.cephfs.test_nfs.TestNFS)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 412, in test_export_create_with_non_existing_fsname
self._test_create_cluster()
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 125, in _test_create_cluster
self._check_nfs_cluster_status('running', 'NFS Ganesha cluster deployment failed')
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 89, in _check_nfs_cluster_status
self.fail(fail_msg)
AssertionError: NFS Ganesha cluster deployment failed
</pre>
<p>Looking at the log, we see the <strong>nfs.test</strong> cluster being created successfully multiple times, but it was never created right before the last test case:</p>
<pre>
2021-11-26T16:46:09: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:46:09: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.phdkqk
2021-11-26T16:46:09: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:46:09: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:46:09: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.phdkqk-rgw
2021-11-26T16:46:09: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.phdkqk on smithi145
2021-11-26T16:46:41: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:46:41: cephadm [INF] Remove service nfs.test
2021-11-26T16:46:41: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.phdkqk...
2021-11-26T16:46:41: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.phdkqk from smithi145
2021-11-26T16:46:45: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.phdkqk
2021-11-26T16:46:45: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.phdkqk-rgw
2021-11-26T16:46:45: cephadm [INF] Purge service nfs.test
2021-11-26T16:46:45: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:46:57: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:46:57: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.vgdnuw
2021-11-26T16:46:57: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:46:57: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:46:57: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.vgdnuw-rgw
2021-11-26T16:46:57: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.vgdnuw on smithi145
2021-11-26T16:47:51: cephadm [INF] Restart service nfs.test
2021-11-26T16:48:22: cephadm [INF] Restart service nfs.test
2021-11-26T16:49:02: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:49:02: cephadm [INF] Remove service nfs.test
2021-11-26T16:49:15: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.vgdnuw...
2021-11-26T16:49:15: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.vgdnuw from smithi145
2021-11-26T16:49:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.vgdnuw
2021-11-26T16:49:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.vgdnuw-rgw
2021-11-26T16:49:19: cephadm [INF] Purge service nfs.test
2021-11-26T16:49:19: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:49:47: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:49:47: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.heyhit
2021-11-26T16:49:47: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:49:47: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:49:47: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.heyhit-rgw
2021-11-26T16:49:47: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.heyhit on smithi145
2021-11-26T16:50:18: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:50:18: cephadm [INF] Remove service nfs.test
2021-11-26T16:50:18: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.heyhit...
2021-11-26T16:50:18: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.heyhit from smithi145
2021-11-26T16:50:22: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.heyhit
2021-11-26T16:50:22: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.heyhit-rgw
2021-11-26T16:50:22: cephadm [INF] Purge service nfs.test
2021-11-26T16:50:22: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:50:32: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:50:32: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.woegpi
2021-11-26T16:50:32: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:50:32: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:50:32: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.woegpi-rgw
2021-11-26T16:50:32: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.woegpi on smithi145
2021-11-26T16:51:19: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:51:19: cephadm [INF] Remove service nfs.test
2021-11-26T16:51:19: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.woegpi...
2021-11-26T16:51:19: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.woegpi from smithi145
2021-11-26T16:51:34: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.woegpi
2021-11-26T16:51:34: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.woegpi-rgw
2021-11-26T16:51:34: cephadm [INF] Purge service nfs.test
2021-11-26T16:51:34: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:51:56: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:51:56: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.uwhrnz
2021-11-26T16:51:56: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:51:56: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:51:56: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.uwhrnz-rgw
2021-11-26T16:51:56: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.uwhrnz on smithi145
2021-11-26T16:52:27: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:52:27: cephadm [INF] Remove service nfs.test
2021-11-26T16:52:27: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.uwhrnz...
2021-11-26T16:52:27: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.uwhrnz from smithi145
2021-11-26T16:52:32: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.uwhrnz
2021-11-26T16:52:32: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.uwhrnz-rgw
2021-11-26T16:52:32: cephadm [INF] Purge service nfs.test
2021-11-26T16:52:32: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:52:41: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:52:41: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.ccuxek
2021-11-26T16:52:41: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:52:41: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:52:41: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.ccuxek-rgw
2021-11-26T16:52:41: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.ccuxek on smithi145
2021-11-26T16:53:15: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:53:15: cephadm [INF] Remove service nfs.test
2021-11-26T16:53:15: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.ccuxek...
2021-11-26T16:53:15: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.ccuxek from smithi145
2021-11-26T16:53:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.ccuxek
2021-11-26T16:53:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.ccuxek-rgw
2021-11-26T16:53:19: cephadm [INF] Purge service nfs.test
2021-11-26T16:53:19: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:53:28: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:53:28: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.dduovn
2021-11-26T16:53:28: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:53:28: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:53:28: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.dduovn-rgw
2021-11-26T16:53:28: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.dduovn on smithi145
2021-11-26T16:54:10: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:54:10: cephadm [INF] Remove service nfs.test
2021-11-26T16:54:10: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.dduovn...
2021-11-26T16:54:10: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.dduovn from smithi145
2021-11-26T16:54:14: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.dduovn
2021-11-26T16:54:14: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.dduovn-rgw
2021-11-26T16:54:14: cephadm [INF] Purge service nfs.test
2021-11-26T16:54:14: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:54:23: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.dduovn...
2021-11-26T16:54:23: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.dduovn from smithi145
2021-11-26T16:54:24: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:54:25: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.mcpayf
2021-11-26T16:54:25: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:54:25: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:54:25: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.mcpayf-rgw
2021-11-26T16:54:25: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.mcpayf on smithi145
2021-11-26T16:55:00: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:55:00: cephadm [INF] Remove service nfs.test
2021-11-26T16:55:00: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.mcpayf...
2021-11-26T16:55:00: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.mcpayf from smithi145
2021-11-26T16:55:04: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.mcpayf
2021-11-26T16:55:04: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.mcpayf-rgw
2021-11-26T16:55:04: cephadm [INF] Purge service nfs.test
2021-11-26T16:55:04: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:55:17: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.mcpayf...
2021-11-26T16:55:17: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.mcpayf from smithi145
</pre>
Orchestrator - Bug #53154 (New): t8y: cephadm: error: unrecognized arguments: --keep-logs
https://tracker.ceph.com/issues/53154
2021-11-04T09:38:54Z
Sebastian Wagner
<pre>
2021-11-03T13:15:09.452 DEBUG:teuthology.orchestra.run.smithi191:> sudo /home/ubuntu/cephtest/cephadm rm-cluster --fsid f2abfd4e-3ca4-11ec-8c28-001a4aab830c --force --keep-logs
2021-11-03T13:15:09.584 INFO:teuthology.orchestra.run.smithi191.stderr:usage: cephadm [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
2021-11-03T13:15:09.584 INFO:teuthology.orchestra.run.smithi191.stderr: [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: [--unit-dir UNIT_DIR] [--verbose] [--timeout TIMEOUT]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: [--retry RETRY] [--env ENV] [--no-container-init]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: {version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,unit,logs,bootstrap,deploy,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts}
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: ...
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr:cephadm: error: unrecognized arguments: --keep-logs
2021-11-03T13:15:09.595 DEBUG:teuthology.orchestra.run:got remote process result: 2
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-11-03_11:47:26-orch:cephadm-wip-swagner-testing-2021-11-03-0958-distro-basic-smithi/6481219">https://pulpito.ceph.com/swagner-2021-11-03_11:47:26-orch:cephadm-wip-swagner-testing-2021-11-03-0958-distro-basic-smithi/6481219</a></p>
Orchestrator - Bug #51361 (New): KillMode=none is deprecated
https://tracker.ceph.com/issues/51361
2021-06-25T09:05:39Z
Sebastian Wagner
<p>We changed the systemd unit file's KillMode to none in <a class="external" href="https://github.com/ceph/ceph/pull/33162#issuecomment-584183316">https://github.com/ceph/ceph/pull/33162#issuecomment-584183316</a></p>
<p>Now we're getting a new warning:</p>
<pre>
Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
</pre>
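<p>The deprecation notice suggests moving to a safer mode. A minimal sketch of the unit-file change; whether 'mixed' (or 'control-group') is actually safe for Ceph daemon shutdown would need to be verified against the original reason KillMode=none was introduced:</p>

```ini
# Sketch: drop the deprecated setting in the cephadm unit template
# (e.g. ceph-<fsid>@.service); the chosen mode is an assumption.
[Service]
KillMode=mixed
```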
Orchestrator - Bug #49287 (New): podman: setting cgroup config for procHooks process caused: Unit...
https://tracker.ceph.com/issues/49287
2021-02-13T00:54:57Z
Sebastian Wagner
<pre>
2021-02-12T16:27:55.195 INFO:teuthology.orchestra.run.smithi014.stderr:Non-zero exit code 127 from /bin/podman run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph:52fc503cf18cf3bb446b840ba00be073017b8373 -e NODE_NAME=smithi014 quay.ceph.io/ceph-ci/ceph:52fc503cf18cf3bb446b840ba00be073017b8373 -c %u %g /var/lib/ceph
2021-02-12T16:27:55.195 INFO:teuthology.orchestra.run.smithi014.stderr:stat: stderr Error: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: process_linux.go:422: setting cgroup config for procHooks process caused: Unit libpod-056038e1126191fba41d8a037275136f2d7aeec9710b9eeff792c06d8544b983.scope not found.: OCI runtime error
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr:Traceback (most recent call last):
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 7697, in <module>
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: main()
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 7686, in main
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: r = ctx.func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1566, in _infer_fsid
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1603, in _infer_config
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1650, in _infer_image
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 4128, in command_shell
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: make_log_dir(ctx, ctx.fsid)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1752, in make_log_dir
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: uid, gid = extract_uid_gid(ctx)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 2428, in extract_uid_gid
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: raise RuntimeError('uid/gid not found')
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr:RuntimeError: uid/gid not found
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-02-11_11:00:52-rados:cephadm-wip-swagner3-testing-2021-02-10-1322-distro-basic-smithi/5874630">https://pulpito.ceph.com/swagner-2021-02-11_11:00:52-rados:cephadm-wip-swagner3-testing-2021-02-10-1322-distro-basic-smithi/5874630</a></p>
Orchestrator - Bug #49233 (New): cephadm shell: TLS handshake timeout
https://tracker.ceph.com/issues/49233
2021-02-10T11:26:38Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-02-09_10:28:14-rados:cephadm-wip-swagner2-testing-2021-02-08-1109-pacific-distro-basic-smithi/5871391">https://pulpito.ceph.com/swagner-2021-02-09_10:28:14-rados:cephadm-wip-swagner2-testing-2021-02-08-1109-pacific-distro-basic-smithi/5871391</a></p>
<pre>
2021-02-10T06:43:35.452 INFO:teuthology.orchestra.run.smithi132.stderr:Non-zero exit code 125 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint stat -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc -e NODE_NAME=smithi132 quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc -c %u %g /var/lib/ceph
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr Unable to find image 'quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc' locally
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr /usr/bin/docker: Error response from daemon: Get https://quay.ceph.io/v2/ceph-ci/ceph/manifests/282cc83b9d6c73ac8a35502bb969bc4e36afefcc: net/http: TLS handshake timeout.
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr See '/usr/bin/docker run --help'.
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr:Traceback (most recent call last):
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 7639, in <module>
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: main()
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 7628, in main
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: r = ctx.func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1616, in _infer_fsid
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1653, in _infer_config
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1700, in _infer_image
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 4115, in command_shell
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: make_log_dir(ctx, ctx.fsid)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1802, in make_log_dir
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: uid, gid = extract_uid_gid(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 2469, in extract_uid_gid
2021-02-10T06:43:35.467 INFO:teuthology.orchestra.run.smithi132.stderr: raise RuntimeError('uid/gid not found')
2021-02-10T06:43:35.467 INFO:teuthology.orchestra.run.smithi132.stderr:RuntimeError: uid/gid not found
2
</pre>
<p>Looks like we need to use the ignore list from</p>
<p><a class="external" href="https://github.com/ceph/ceph/blob/40d5c37930f9b7f883c3b6da57be481f1fe6fb6c/src/cephadm/cephadm#L3078-L3086">https://github.com/ceph/ceph/blob/40d5c37930f9b7f883c3b6da57be481f1fe6fb6c/src/cephadm/cephadm#L3078-L3086</a></p>
<p>also for command_shell().</p>
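<p>One possible shape of that fix (a sketch only; the function names mirror the traceback above but the wrapper and its fallback value are illustrative, not the actual cephadm code): when the image cannot be pulled, treat the uid/gid lookup as best-effort instead of aborting <strong>cephadm shell</strong>.</p>

```python
def extract_uid_gid(ctx):
    # Stand-in for cephadm's real helper, which inspects the container
    # image; here it simulates the failure mode from the log above
    # (image not local, registry unreachable due to TLS timeout).
    raise RuntimeError('uid/gid not found')

def extract_uid_gid_safe(ctx, fallback=(167, 167)):
    """Best-effort wrapper: return the ceph uid/gid from the image, or a
    fallback instead of crashing `cephadm shell` when the lookup fails."""
    try:
        return extract_uid_gid(ctx)
    except RuntimeError:
        # Skip chown-ing the log dir rather than failing the whole command.
        return fallback
```

<p>The fallback uid/gid of 167 is an assumption (the ceph user on most container images); the point is only that <strong>command_shell()</strong> should degrade gracefully like the linked code does.</p>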
rbd - Bug #46875 (New): TestLibRBD.TestPendingAio: test_librbd.cc:4539: Failure or SIGSEGV
https://tracker.ceph.com/issues/46875
2020-08-10T01:03:45Z
Sebastian Wagner
<pre>
[ RUN ] TestLibRBD.TestPendingAio
using new format!
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/test_librbd.cc:4539: Failure
Expected equality of these values:
1
rbd_aio_is_complete(comps[i])
Which is: 0
[ FAILED ] TestLibRBD.TestPendingAio (68 ms)
</pre>
<p><a class="external" href="https://jenkins.ceph.com/job/ceph-pull-requests/57209/consoleFull#-361705261e840cee4-f4a4-4183-81dd-42855615f2c1">https://jenkins.ceph.com/job/ceph-pull-requests/57209/consoleFull#-361705261e840cee4-f4a4-4183-81dd-42855615f2c1</a></p>
Orchestrator - Feature #45876 (New): cephadm: handle port conflicts gracefully
https://tracker.ceph.com/issues/45876
2020-06-04T10:10:45Z
Sebastian Wagner
<pre>
INFO:cephadm:Verifying port 9100 ...
WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
ERROR: TCP Port(s) '9100' required for node-exporter is already in use
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/module.py", line 1638, in _run_cephadm
code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying daemon node-exporter.ceph-mon ...
INFO:cephadm:Verifying port 9100 ...
WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
ERROR: TCP Port(s) '9100' required for node-exporter is already in use
2020-05-15T13:33:46.966159+0000 mgr.ceph-mgr.dixgvy (mgr.14161) 678 : cephadm [WRN] Failed to apply node-exporter spec ServiceSpec(
{'placement': PlacementSpec(host_pattern='*'), 'service_type': 'node-exporter', 'service_id': None, 'unmanaged': False}
): cephadm exited with an error code: 1, stderr:INFO:cephadm:Deploying daemon node-exporter.ceph-mon ...
INFO:cephadm:Verifying port 9100 ...
WARNING:cephadm:Cannot bind to IP 0.0.0.0 port 9100: [Errno 98] Address already in use
ERROR: TCP Port(s) '9100' required for node-exporter is already in use
</pre>
<p>Important bits are:</p>
<ul>
<li><strong>We already know which services want which ports.</strong></li>
<li>We can easily prevent port conflicts for known daemons.</li>
<li>Open question: how to handle unknown daemons (e.g. a pre-existing node exporter)?</li>
</ul>
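<p>A pre-check along these lines could let the scheduler skip a host instead of failing the whole apply (a minimal sketch, not the real cephadm implementation; function names are made up for illustration):</p>

```python
import socket

def port_is_free(port, host='0.0.0.0'):
    """Return True if we can bind the TCP port, mirroring the
    'Verifying port ...' check from the log above."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:  # e.g. [Errno 98] Address already in use
            return False

def pick_conflicts(wanted_ports, host='0.0.0.0'):
    """Given the ports a service wants, return those already taken,
    so placement can avoid the host rather than raise a RuntimeError."""
    return [p for p in wanted_ports if not port_is_free(p, host)]
```

<p>This only covers conflicts with daemons that hold the port at check time; the open question above (ports reserved for known daemons that are not currently bound) still needs the service-to-port mapping the orchestrator already has.</p>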
RADOS - Feature #45079 (New): HEALTH_WARN, if require-osd-release is < mimic and OSD wants to joi...
https://tracker.ceph.com/issues/45079
2020-04-14T10:25:33Z
Sebastian Wagner
<p>When upgrading a cluster to octopus, users should get a warning if require-osd-release is < mimic, as this prevents OSDs from joining the cluster.</p>
<p>Right now, we only get an INF message in the logs:</p>
<pre>
cluster [INF] disallowing boot of octopus+ OSD osd.1 v1:172.16.1.25:6800/3051821808 because require_osd
</pre>
<p>This should be a HEALTH_WARN instead.</p>
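<p>The desired behaviour could look roughly like this (a sketch only; the check name, message text, and release list are assumptions, and the real change would live in the monitor's health-check code):</p>

```python
# Ordered list of releases so versions can be compared by index.
RELEASES = ['luminous', 'mimic', 'nautilus', 'octopus']

def health_checks(require_osd_release, blocked_osds):
    """Return a HEALTH_WARN entry when require-osd-release is older than
    mimic and octopus+ OSDs are being refused at boot (as in the log)."""
    checks = {}
    too_old = RELEASES.index(require_osd_release) < RELEASES.index('mimic')
    if too_old and blocked_osds:
        checks['OSD_UPGRADE_BLOCKED'] = {
            'severity': 'HEALTH_WARN',
            'summary': '%d octopus+ OSD(s) cannot boot because '
                       'require-osd-release < mimic' % len(blocked_osds),
        }
    return checks
```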
Orchestrator - Feature #44864 (New): cephadm: garbage collect old container images
https://tracker.ceph.com/issues/44864
2020-03-31T15:45:58Z
Sebastian Wagner
<p>cephadm should garbage-collect old container images once no daemon on the host references them anymore.</p>
Orchestrator - Cleanup #43710 (New): Refactor k8sevents module
https://tracker.ceph.com/issues/43710
2020-01-20T15:40:36Z
Sebastian Wagner
<ul>
<li>Unify Rook env vars with mgr/rook</li>
<li>Unify events watcher with mgr/rook</li>
<li>Add to CI</li>
<li>Add Documentation</li>
</ul>