Ceph : Issues
https://tracker.ceph.com/
2022-01-21T15:53:15Z
Orchestrator - Bug #53965 (New): cephadm: RGW container is crashing at 'rados_nobjects_list_next2...
https://tracker.ceph.com/issues/53965
2022-01-21T15:53:15Z
Sebastian Wagner
<p>when upgrading from ceph-ansible, we're getting:</p>
<pre>
Jan 12 18:30:14 host conmon[12897]: debug 2022-01-12T17:30:14.112+0000 0 starting handler: beast
Jan 12 18:30:14 host conmon[12897]: ceph version 16.2.0-146.x (sha) pacific (stable)
Jan 12 18:30:14 host conmon[12897]: 1: /lib64/libpthread.so.0(+0x12c20) [0x7fb4b1e86c20]
Jan 12 18:30:14 host conmon[12897]: 2: gsignal()
Jan 12 18:30:14 host conmon[12897]: 3: abort()
Jan 12 18:30:14 host conmon[12897]: 4: /lib64/libstdc++.so.6(+0x9009b) [0x7fb4b0e7a09b]
Jan 12 18:30:14 host conmon[12897]: 5: /lib64/libstdc++.so.6(+0x9653c) [0x7fb4b0e8053c]
Jan 12 18:30:14 host conmon[12897]: 6: /lib64/libstdc++.so.6(+0x96597) [0x7fb4b0e80597]
Jan 12 18:30:14 host conmon[12897]: 7: /lib64/libstdc++.so.6(+0x967f8) [0x7fb4b0e807f8]
Jan 12 18:30:14 host conmon[12897]: 8: /lib64/librados.so.2(+0x3a4c0) [0x7fb4bc3f74c0]
Jan 12 18:30:14 host conmon[12897]: 9: /lib64/librados.so.2(+0x809d2) [0x7fb4bc43d9d2]
Jan 12 18:30:14 host conmon[12897]: 10: (librados::v14_2_0::IoCtx::nobjects_begin(librados::v14_2_0::ObjectCursor const&, ceph::buffer::v15_2_0::list const&)+0x5c) [0x7fb4bc43da5c]
Jan 12 18:30:14 host conmon[12897]: 11: (RGWSI_RADOS::Pool::List::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWAccessListFilter*)+0x115) [0x7fb4bd26d8e5]
Jan 12 18:30:14 host conmon[12897]: 12: (RGWSI_SysObj_Core::pool_list_objects_init(rgw_pool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWSI_SysObj::Pool::ListCtx*)+0x24b) [0x7fb4bcd9b0bb]
Jan 12 18:30:14 host conmon[12897]: 13: (RGWSI_MetaBackend_SObj::list_init(RGWSI_MetaBackend::Context*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1e0) [0x7fb4bd2609d0]
Jan 12 18:30:14 host conmon[12897]: 14: (RGWMetadataHandler_GenericMetaBE::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x40) [0x7fb4bcead940]
Jan 12 18:30:14 host conmon[12897]: 15: (RGWMetadataManager::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x73) [0x7fb4bceaff13]
Jan 12 18:30:14 host conmon[12897]: 16: (RGWMetadataManager::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x3e) [0x7fb4bceaffce]
Jan 12 18:30:14 host conmon[12897]: 17: (RGWUserStatsCache::sync_all_users(optional_yield)+0x71) [0x7fb4bd052361]
Jan 12 18:30:14 host conmon[12897]: 18: (RGWUserStatsCache::UserSyncThread::entry()+0x10f) [0x7fb4bd05984f]
Jan 12 18:30:14 host conmon[12897]: 19: /lib64/libpthread.so.0(+0x817a) [0x7fb4b1e7c17a]
Jan 12 18:30:14 host conmon[12897]: 20: clone()
</pre>
<p>One possibility is that the RGW pools are missing the rgw application tag.</p>
<p><strong>possible workaround</strong></p>
<p>Add the rgw application tag to all RGW pools.</p>
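<p>A minimal sketch of what that could look like (the pool names below are examples; use the actual RGW pools of the affected zone):</p>
<pre>
# enable the rgw application tag on each RGW pool (pool names are examples)
ceph osd pool application enable .rgw.root rgw
ceph osd pool application enable default.rgw.log rgw
ceph osd pool application enable default.rgw.control rgw
ceph osd pool application enable default.rgw.meta rgw
</pre>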
<p><strong>known workaround</strong></p>
<p>change the cephx caps of all affected daemons from</p>
<pre>
mon allow *
mgr allow rw
osd allow rwx tag rgw *=*
</pre>
<p>to</p>
<pre>
mon allow *
mgr allow rw
osd allow rwx
</pre>
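<p>For example, the caps can be changed with <code>ceph auth caps</code> (the entity name below is an assumption; use the cephx user of the affected RGW daemon):</p>
<pre>
ceph auth caps client.rgw.myhost.rgw0 mon 'allow *' mgr 'allow rw' osd 'allow rwx'
</pre>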
Orchestrator - Bug #53531 (New): cephadm: how agent finds the active mgr after mgr failover
https://tracker.ceph.com/issues/53531
2021-12-08T11:32:23Z
Sebastian Wagner
<p>how agent finds the active mgr after mgr failover</p>
<p>1) new mgr pokes old agents to provide new mgr endpoint<br />could be parallelized even<br />no cephx key needed</p>
<p>2) agent asks the mon for the active mgr if it hasn't successfully reported in the last N seconds<br />'ceph mgr services' or similar<br />would need a cephx key, something like mon 'allow r'<br />agent would need a ceph.conf, and have it updated on ceph.conf changes, etc.</p>
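<p>For reference, option (2) would roughly amount to the agent running something like the following against the cluster (sketch; how the agent would obtain the key and config is exactly the open question):</p>
<pre>
# which mgr is currently active
ceph mgr stat
# endpoints advertised by the active mgr's modules
ceph mgr services
</pre>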
<p>3) agent knows all known MGRs and standby modules redirect to the active MGR<br />agent just tries a different MGR and detects if it gets a redirect to the newly active MGR<br />needs hostname + port for each MGR</p>
<p>Let's go with <strong>(1)</strong>?</p>
Orchestrator - Bug #53529 (New): ceph orch apply ... --dry-run: Table not properly formatted
https://tracker.ceph.com/issues/53529
2021-12-08T11:28:20Z
Sebastian Wagner
<pre>
root@service-01-08020:~# ceph orch apply -i cadvisor.yaml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any on these conditions changes, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.
####################
SERVICESPEC PREVIEWS
####################
+-----------+--------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
|SERVICE |NAME |ADD_TO |REMOVE_FROM |
+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +-------------+
|container |container.cadvisor |hosta08016 hosta08014 hosta08006 hosta08005 hosta08004 hosta08007 hosta08009 hosta08010 hosta08008 hosta08015 hosta08003 hosta08013 hosta08012 hosta08011 hosta08002 hostb08035 hostb08033 hostb08030 hostb08026 hostb08024 hostb08025 hostb08023 hostb08032 hostb08036 hostb08029 hostb08031 hostb08027 hostb08028 hostb08022 hostb08056 hostb08055 hostb08053 hostb08051 hostb08050 hostb08048 hostb08047 hostb08045 hostb08043 hostb08044 hostb08042 hostb08052 hostb08049 hostd08076 hostd08075 hostd08074 hostd08073 hostd08072 hostd08071 hostd08070 hostd08069 hostd08068 hostd08066 hostd08067 hostd08065 hostd08064 hostd08063 hostd08062 hoste08096 hoste08092 hoste08091 hoste08090 hoste08095 hoste08094 hoste08093 hoste08087 hoste08085 hoste08084 hoste08089 hoste08088 hoste08086 hoste08082 hoste08083 hostf08116 hostf08115 hostf08112 hostf08114 hostf08113 hostf08111 hostf08110 hostf08109 hostf08108 hostf08106 hostf08107 hostf08104 hostf08105 hostf08103 hostf08102 hostg08135 hostg08136 hostg08124 hostg08123 hostg08134 hostg08133 hostg08132 hostg08131 hostg08130 hostg08129 hostg08128 hostg08122 hostg08126 hostg08125 hostg08127 hosth08153 hosth08155 hosth08154 hosth08151 hosth08149 hosth08146 hosth08148 hosth08147 hosth08145 hosth08143 hosth08156 hosth08142 hosth08150 hosth08144 hosth08152 hosti08173 hosti08170 hosti08169 hosti08168 hosti08166 hosti08164 hosti08163 hosti08165 hosti08175 hosti08171 hosti08167 hosti08172 hosti08162 hostk08192 hostk08184 hostk08191 hostk08196 hostk08193 hostk08194 hostk08195 hostk08188 hostk08186 hostk08189 hostk08190 hostk08183 hostk08187 hostk08182 hostk08185 hostm08216 hostm08214 hostm08215 hostm08213 hostm08206 hostm08209 hostm08211 hostm08212 hostm08210 hostm08208 hostm08207 hostm08205 hostm08204 hostm08203 hostm08202 hostn08224 hostn08236 hostn08233 hostn08232 hostn08234 hostn08230 hostn08231 hostn08235 hostn08229 hostn08227 hostn08226 hostn08228 hostn08225 hostn08222 hostn08223 | |
+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +-------------+
################
OSDSPEC PREVIEWS
################
+---------+------+------+------+----+-----+
|SERVICE |NAME |HOST |DATA |DB |WAL |
+---------+------+------+------+----+-----+
+---------+------+------+------+----+-----+
</pre>
Orchestrator - Bug #53525 (New): In case a configcheck alert is firing, you cannot suppress it
https://tracker.ceph.com/issues/53525
2021-12-08T11:04:57Z
Sebastian Wagner
<p>In case the alert “CephNodeInconsistentMTU” is firing all the time, it becomes a nuisance alert that customers keep seeing. The only way to suppress it is with a silence (which has an end date). In addition, the monitoring window does not filter by active alerts by default, so you still see all the suppressed alerts!</p>
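<p>For the record, the only suppression available today is an Alertmanager silence with an expiry, roughly like this (URL and duration are placeholders):</p>
<pre>
amtool --alertmanager.url=http://<alertmanager-host>:9093 silence add \
    alertname=CephNodeInconsistentMTU --duration=24h --comment="known nuisance alert"
</pre>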
ceph-volume - Bug #53524 (New): CEPHADM_APPLY_SPEC_FAIL is very verbose
https://tracker.ceph.com/issues/53524
2021-12-08T11:02:11Z
Sebastian Wagner
<pre>
root@service-01-08020:~# ceph health detail
HEALTH_WARN Failed to apply 1 service(s): osd.hybrid; OSD count 0 < osd_pool_default_size 3
[WRN] CEPHADM_APPLY_SPEC_FAIL: Failed to apply 1 service(s): osd.hybrid
osd.hybrid: cephadm exited with an error code: 1, stderr:Non-zero exit code 2 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d -e NODE_NAME=storage-01-08002 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=hybrid -v /var/run/ceph/fsid:/var/run/ceph:z -v /var/log/ceph/fsid:/var/log/ceph:z -v /var/lib/ceph/fsid/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpfteczv3s:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp6_nx8uhw:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy --db-devices /dev/nvme0n1 /dev/nvme1n1 --yes --no-systemd
/usr/bin/docker: stderr usage: ceph-volume lvm batch [-h] [--db-devices [DB_DEVICES [DB_DEVICES ...]]]
/usr/bin/docker: stderr [--wal-devices [WAL_DEVICES [WAL_DEVICES ...]]]
/usr/bin/docker: stderr [--journal-devices [JOURNAL_DEVICES [JOURNAL_DEVICES ...]]]
/usr/bin/docker: stderr [--auto] [--no-auto] [--bluestore] [--filestore]
/usr/bin/docker: stderr [--report] [--yes]
/usr/bin/docker: stderr [--format {json,json-pretty,pretty}] [--dmcrypt]
/usr/bin/docker: stderr [--crush-device-class CRUSH_DEVICE_CLASS]
/usr/bin/docker: stderr [--no-systemd]
/usr/bin/docker: stderr [--osds-per-device OSDS_PER_DEVICE]
/usr/bin/docker: stderr [--data-slots DATA_SLOTS]
/usr/bin/docker: stderr [--data-allocate-fraction DATA_ALLOCATE_FRACTION]
/usr/bin/docker: stderr [--block-db-size BLOCK_DB_SIZE]
/usr/bin/docker: stderr [--block-db-slots BLOCK_DB_SLOTS]
/usr/bin/docker: stderr [--block-wal-size BLOCK_WAL_SIZE]
/usr/bin/docker: stderr [--block-wal-slots BLOCK_WAL_SLOTS]
/usr/bin/docker: stderr [--journal-size JOURNAL_SIZE]
/usr/bin/docker: stderr [--journal-slots JOURNAL_SLOTS] [--prepare]
/usr/bin/docker: stderr [--osd-ids [OSD_IDS [OSD_IDS ...]]]
/usr/bin/docker: stderr [DEVICES [DEVICES ...]]
/usr/bin/docker: stderr ceph-volume lvm batch: error: GPT headers found, they must be removed on: /dev/sda
Traceback (most recent call last):
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 8331, in <module>
main()
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 8319, in main
r = ctx.func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1735, in _infer_config
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1676, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1763, in _infer_image
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1663, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 5285, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/fsid/cephadm.f28d191a6bebefc928858ad3a798620c4b2dcb13f2c9f454ba758f88d7664da6", line 1465, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d -e NODE_NAME=storage-01-08002 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=hybrid -v /var/run/ceph/fsid:/var/run/ceph:z -v /var/log/ceph/fsid:/var/log/ceph:z -v /var/lib/ceph/fsid/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmpfteczv3s:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmp6_nx8uhw:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.ceph.io/ceph-ci/ceph@sha256:94aec5086f8d9581e861925e04d6ca74dd1397e6c721e0576a3defcf0a25377d lvm batch --no-auto /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx /dev/sdy --db-devices /dev/nvme0n1 /dev/nvme1n1 --yes --no-systemd
[WRN] TOO_FEW_OSDS: OSD count 0 < osd_pool_default_size 3
</pre>
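<p>Independent of the verbosity problem, the underlying ceph-volume error ("GPT headers found, they must be removed on: /dev/sda") can usually be cleared by wiping the partition table before re-applying the spec, e.g. (destructive; device name taken from the error message):</p>
<pre>
# remove GPT data structures from the device named in the error
sgdisk --zap-all /dev/sda
# or wipe all filesystem/partition-table signatures
wipefs --all /dev/sda
</pre>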
Orchestrator - Bug #53422 (New): tasks.cephfs.test_nfs.TestNFS.test_export_create_with_non_existi...
https://tracker.ceph.com/issues/53422
2021-11-29T08:47:58Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-11-26_13:52:15-orch:cephadm-wip-swagner2-testing-2021-11-26-1129-distro-default-smithi/6528237">https://pulpito.ceph.com/swagner-2021-11-26_13:52:15-orch:cephadm-wip-swagner2-testing-2021-11-26-1129-distro-default-smithi/6528237</a></p>
<pre>
teuthology.orchestra.run.smithi145:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph orch ps --service_name=nfs.test
teuthology.orchestra.run.smithi145.stdout:No daemons reported
test_export_create_with_non_existing_fsname (tasks.cephfs.test_nfs.TestNFS) ... FAIL
======================================================================
FAIL: test_export_create_with_non_existing_fsname (tasks.cephfs.test_nfs.TestNFS)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 412, in test_export_create_with_non_existing_fsname
self._test_create_cluster()
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 125, in _test_create_cluster
self._check_nfs_cluster_status('running', 'NFS Ganesha cluster deployment failed')
File "/home/teuthworker/src/git.ceph.com_ceph-c_c9fd0972675c568f5d517be49820a12e77fab497/qa/tasks/cephfs/test_nfs.py", line 89, in _check_nfs_cluster_status
self.fail(fail_msg)
AssertionError: NFS Ganesha cluster deployment failed
</pre>
<p>Looking at the log, we see the <strong>nfs.test</strong> cluster being created successfully multiple times, but it was not created right before the last test case:</p>
<pre>
2021-11-26T16:46:09: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:46:09: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.phdkqk
2021-11-26T16:46:09: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:46:09: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:46:09: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.phdkqk-rgw
2021-11-26T16:46:09: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.phdkqk on smithi145
2021-11-26T16:46:41: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:46:41: cephadm [INF] Remove service nfs.test
2021-11-26T16:46:41: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.phdkqk...
2021-11-26T16:46:41: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.phdkqk from smithi145
2021-11-26T16:46:45: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.phdkqk
2021-11-26T16:46:45: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.phdkqk-rgw
2021-11-26T16:46:45: cephadm [INF] Purge service nfs.test
2021-11-26T16:46:45: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:46:57: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:46:57: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.vgdnuw
2021-11-26T16:46:57: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:46:57: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:46:57: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.vgdnuw-rgw
2021-11-26T16:46:57: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.vgdnuw on smithi145
2021-11-26T16:47:51: cephadm [INF] Restart service nfs.test
2021-11-26T16:48:22: cephadm [INF] Restart service nfs.test
2021-11-26T16:49:02: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:49:02: cephadm [INF] Remove service nfs.test
2021-11-26T16:49:15: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.vgdnuw...
2021-11-26T16:49:15: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.vgdnuw from smithi145
2021-11-26T16:49:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.vgdnuw
2021-11-26T16:49:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.vgdnuw-rgw
2021-11-26T16:49:19: cephadm [INF] Purge service nfs.test
2021-11-26T16:49:19: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:49:47: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:49:47: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.heyhit
2021-11-26T16:49:47: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:49:47: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:49:47: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.heyhit-rgw
2021-11-26T16:49:47: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.heyhit on smithi145
2021-11-26T16:50:18: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:50:18: cephadm [INF] Remove service nfs.test
2021-11-26T16:50:18: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.heyhit...
2021-11-26T16:50:18: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.heyhit from smithi145
2021-11-26T16:50:22: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.heyhit
2021-11-26T16:50:22: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.heyhit-rgw
2021-11-26T16:50:22: cephadm [INF] Purge service nfs.test
2021-11-26T16:50:22: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:50:32: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:50:32: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.woegpi
2021-11-26T16:50:32: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:50:32: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:50:32: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.woegpi-rgw
2021-11-26T16:50:32: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.woegpi on smithi145
2021-11-26T16:51:19: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:51:19: cephadm [INF] Remove service nfs.test
2021-11-26T16:51:19: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.woegpi...
2021-11-26T16:51:19: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.woegpi from smithi145
2021-11-26T16:51:34: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.woegpi
2021-11-26T16:51:34: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.woegpi-rgw
2021-11-26T16:51:34: cephadm [INF] Purge service nfs.test
2021-11-26T16:51:34: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:51:56: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:51:56: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.uwhrnz
2021-11-26T16:51:56: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:51:56: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:51:56: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.uwhrnz-rgw
2021-11-26T16:51:56: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.uwhrnz on smithi145
2021-11-26T16:52:27: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:52:27: cephadm [INF] Remove service nfs.test
2021-11-26T16:52:27: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.uwhrnz...
2021-11-26T16:52:27: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.uwhrnz from smithi145
2021-11-26T16:52:32: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.uwhrnz
2021-11-26T16:52:32: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.uwhrnz-rgw
2021-11-26T16:52:32: cephadm [INF] Purge service nfs.test
2021-11-26T16:52:32: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:52:41: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:52:41: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.ccuxek
2021-11-26T16:52:41: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:52:41: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:52:41: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.ccuxek-rgw
2021-11-26T16:52:41: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.ccuxek on smithi145
2021-11-26T16:53:15: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:53:15: cephadm [INF] Remove service nfs.test
2021-11-26T16:53:15: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.ccuxek...
2021-11-26T16:53:15: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.ccuxek from smithi145
2021-11-26T16:53:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.ccuxek
2021-11-26T16:53:19: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.ccuxek-rgw
2021-11-26T16:53:19: cephadm [INF] Purge service nfs.test
2021-11-26T16:53:19: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:53:28: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:53:28: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.dduovn
2021-11-26T16:53:28: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:53:28: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:53:28: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.dduovn-rgw
2021-11-26T16:53:28: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.dduovn on smithi145
2021-11-26T16:54:10: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:54:10: cephadm [INF] Remove service nfs.test
2021-11-26T16:54:10: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.dduovn...
2021-11-26T16:54:10: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.dduovn from smithi145
2021-11-26T16:54:14: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.dduovn
2021-11-26T16:54:14: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.dduovn-rgw
2021-11-26T16:54:14: cephadm [INF] Purge service nfs.test
2021-11-26T16:54:14: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:54:23: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.dduovn...
2021-11-26T16:54:23: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.dduovn from smithi145
2021-11-26T16:54:24: cephadm [INF] Saving service nfs.test spec with placement count:1
2021-11-26T16:54:25: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.mcpayf
2021-11-26T16:54:25: cephadm [INF] Ensuring nfs.test.0 is in the ganesha grace table
2021-11-26T16:54:25: cephadm [INF] Rados config object exists: conf-nfs.test
2021-11-26T16:54:25: cephadm [INF] Creating key for client.nfs.test.0.0.smithi145.mcpayf-rgw
2021-11-26T16:54:25: cephadm [INF] Deploying daemon nfs.test.0.0.smithi145.mcpayf on smithi145
2021-11-26T16:55:00: cephadm [INF] Remove service ingress.nfs.test
2021-11-26T16:55:00: cephadm [INF] Remove service nfs.test
2021-11-26T16:55:00: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.mcpayf...
2021-11-26T16:55:00: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.mcpayf from smithi145
2021-11-26T16:55:04: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.mcpayf
2021-11-26T16:55:04: cephadm [INF] Removing key for client.nfs.test.0.0.smithi145.mcpayf-rgw
2021-11-26T16:55:04: cephadm [INF] Purge service nfs.test
2021-11-26T16:55:04: cephadm [INF] Removing grace file for nfs.test
2021-11-26T16:55:17: cephadm [INF] Removing orphan daemon nfs.test.0.0.smithi145.mcpayf...
2021-11-26T16:55:17: cephadm [INF] Removing daemon nfs.test.0.0.smithi145.mcpayf from smithi145
</pre>
Orchestrator - Bug #53154 (New): t8y: cephadm: error: unrecognized arguments: --keep-logs
https://tracker.ceph.com/issues/53154
2021-11-04T09:38:54Z
Sebastian Wagner
<pre>
2021-11-03T13:15:09.452 DEBUG:teuthology.orchestra.run.smithi191:> sudo /home/ubuntu/cephtest/cephadm rm-cluster --fsid f2abfd4e-3ca4-11ec-8c28-001a4aab830c --force --keep-logs
2021-11-03T13:15:09.584 INFO:teuthology.orchestra.run.smithi191.stderr:usage: cephadm [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
2021-11-03T13:15:09.584 INFO:teuthology.orchestra.run.smithi191.stderr: [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: [--unit-dir UNIT_DIR] [--verbose] [--timeout TIMEOUT]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: [--retry RETRY] [--env ENV] [--no-container-init]
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr:              {version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,unit,logs,bootstrap,deploy,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts}
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr: ...
2021-11-03T13:15:09.585 INFO:teuthology.orchestra.run.smithi191.stderr:cephadm: error: unrecognized arguments: --keep-logs
2021-11-03T13:15:09.595 DEBUG:teuthology.orchestra.run:got remote process result: 2
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-11-03_11:47:26-orch:cephadm-wip-swagner-testing-2021-11-03-0958-distro-basic-smithi/6481219">https://pulpito.ceph.com/swagner-2021-11-03_11:47:26-orch:cephadm-wip-swagner-testing-2021-11-03-0958-distro-basic-smithi/6481219</a></p>
Orchestrator - Bug #51366 (New): cephadm: Super hard to use loopback devices for OSDs
https://tracker.ceph.com/issues/51366
2021-06-25T10:48:09Z
Sebastian Wagner
<a name="Bootstrap-the-clusterh3-losetup"></a>
<h3 >Bootstrap the cluster<br />h3. losetup<a href="#Bootstrap-the-clusterh3-losetup" class="wiki-anchor">¶</a></h3>
<pre>
# fallocate -l 6G disk1.img
# fallocate -l 6G disk2.img
# fallocate -l 6G disk3.img
# sudo losetup $(sudo losetup -f) disk1.img
# sudo losetup $(sudo losetup -f) disk2.img
# sudo losetup $(sudo losetup -f) disk3.img
# sudo wipefs -a /dev/loop0
# sudo wipefs -a /dev/loop1
# sudo wipefs -a /dev/loop2
# lsblk
</pre>
<a name="start-ceph-volume-container"></a>
<h3 >start ceph-volume container<a href="#start-ceph-volume-container" class="wiki-anchor">¶</a></h3>
<p>Run</p>
<pre>
sudo ./cephadm ceph-volume lvm create --data /dev/loop0
</pre>
<p>This won't work. Instead, copy the podman command that cephadm prints and:</p>
<ul>
<li>Remove <strong>--entrypoint /usr/sbin/ceph-volume</strong></li>
<li>Add <strong>-it</strong></li>
<li>Add <strong>-v /etc/ceph:/etc/ceph</strong></li>
</ul>
<p>Then execute the modified podman command.</p>
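<p>A sketch of what the modified command could look like (image, fsid and the exact mount list are assumptions; take them from the command cephadm actually prints):</p>
<pre>
sudo podman run --rm -it --privileged --group-add=disk --init \
    --ipc=host --net=host \
    -e CONTAINER_IMAGE=quay.io/ceph/ceph:v16 \
    -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys \
    -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm \
    -v /var/lib/ceph/<fsid>/crash:/var/lib/ceph/crash:z \
    -v /var/log/ceph/<fsid>:/var/log/ceph:z \
    -v /etc/ceph:/etc/ceph \
    quay.io/ceph/ceph:v16 /bin/bash
</pre>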
<a name="patch-ceph-volume"></a>
<h3 >patch ceph-volume<a href="#patch-ceph-volume" class="wiki-anchor">¶</a></h3>
<p>In file <strong>/usr/lib/python3.6/site-packages/ceph_volume/util/disk.py</strong>, add <strong>loop</strong> to the accepted device types:</p>
<pre>
[root@sebastians-laptop /]# grep -C 5 loop /usr/lib/python3.6/site-packages/ceph_volume/util/disk.py
if not os.path.exists(dev):
return False
# use lsblk first, fall back to using stat
TYPE = lsblk(dev).get('TYPE')
if TYPE:
return TYPE in ['disk', 'mpath', 'loop']
</pre>
<a name="ceph-volume-lvm-create"></a>
<h3 >ceph-volume lvm create<a href="#ceph-volume-lvm-create" class="wiki-anchor">¶</a></h3>
<pre>
# mkdir -p /var/lib/ceph/bootstrap-osd/
# ceph auth get client.bootstrap-osd > /var/lib/ceph/bootstrap-osd/ceph.keyring
# CEPH_VOLUME_DEBUG=true ceph-volume lvm create --data /dev/loop0 --no-systemd
</pre>
<a name="activate-doesnt-work"></a>
<h3 >activate doesn't work:<a href="#activate-doesnt-work" class="wiki-anchor">¶</a></h3>
<pre>
[root@sebastians-laptop /]# ceph cephadm osd activate sebastians-laptop
Created osd(s) 0 on host 'sebastians-laptop'
</pre>
<p>Instead, you need to follow the manual steps: <a class="external" href="https://tracker.ceph.com/issues/46691#note-1">https://tracker.ceph.com/issues/46691#note-1</a></p>
Orchestrator - Bug #51361 (New): KillMode=none is deprecated
https://tracker.ceph.com/issues/51361
2021-06-25T09:05:39Z
Sebastian Wagner
<p>We changed the systemd unit file KillMode to none in <a class="external" href="https://github.com/ceph/ceph/pull/33162#issuecomment-584183316">https://github.com/ceph/ceph/pull/33162#issuecomment-584183316</a></p>
<p>Now we're getting a new warning:</p>
<pre>
Unit configured to use KillMode=none. This is unsafe, as it disables systemd's process lifecycle management for the service. Please update your service to use a safer KillMode=, such as 'mixed' or 'control-group'. Support for KillMode=none is deprecated and will eventually be removed.
</pre>
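<p>Until the unit template is changed, a per-unit override would look roughly like this (unit name is an example; whether 'mixed' is actually safe for the podman-wrapped daemons is the open question of this ticket):</p>
<pre>
# create a drop-in for the affected unit (example unit name)
sudo systemctl edit ceph-<fsid>@osd.0.service
# and set in the drop-in:
#   [Service]
#   KillMode=mixed
</pre>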
Orchestrator - Bug #50776 (New): cephadm: CRUSH uses bare host names
https://tracker.ceph.com/issues/50776
2021-05-12T15:01:00Z
Sebastian Wagner
<p><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L1411">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L1411</a></p>
<p><a class="external" href="https://github.com/ceph/ceph/blob/04d9194fd36551b0b45abc06816449c1c0ff248b/src/pybind/mgr/cephadm/services/osd.py#L325">https://github.com/ceph/ceph/blob/04d9194fd36551b0b45abc06816449c1c0ff248b/src/pybind/mgr/cephadm/services/osd.py#L325</a></p>
<p><a class="external" href="https://github.com/ceph/ceph/blob/04d9194fd36551b0b45abc06816449c1c0ff248b/src/pybind/mgr/cephadm/services/osd.py#L50">https://github.com/ceph/ceph/blob/04d9194fd36551b0b45abc06816449c1c0ff248b/src/pybind/mgr/cephadm/services/osd.py#L50</a></p>
<pre>
def bare_name_to_hostname(bare):
    ...
</pre>
Orchestrator - Bug #49287 (New): podman: setting cgroup config for procHooks process caused: Unit...
https://tracker.ceph.com/issues/49287
2021-02-13T00:54:57Z
Sebastian Wagner
<pre>
2021-02-12T16:27:55.195 INFO:teuthology.orchestra.run.smithi014.stderr:Non-zero exit code 127 from /bin/podman run --rm --ipc=host --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph:52fc503cf18cf3bb446b840ba00be073017b8373 -e NODE_NAME=smithi014 quay.ceph.io/ceph-ci/ceph:52fc503cf18cf3bb446b840ba00be073017b8373 -c %u %g /var/lib/ceph
2021-02-12T16:27:55.195 INFO:teuthology.orchestra.run.smithi014.stderr:stat: stderr Error: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: process_linux.go:422: setting cgroup config for procHooks process caused: Unit libpod-056038e1126191fba41d8a037275136f2d7aeec9710b9eeff792c06d8544b983.scope not found.: OCI runtime error
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr:Traceback (most recent call last):
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 7697, in <module>
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: main()
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 7686, in main
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: r = ctx.func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1566, in _infer_fsid
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1603, in _infer_config
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1650, in _infer_image
2021-02-12T16:27:55.201 INFO:teuthology.orchestra.run.smithi014.stderr: return func(ctx)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 4128, in command_shell
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: make_log_dir(ctx, ctx.fsid)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 1752, in make_log_dir
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: uid, gid = extract_uid_gid(ctx)
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: File "/home/ubuntu/cephtest/cephadm", line 2428, in extract_uid_gid
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr: raise RuntimeError('uid/gid not found')
2021-02-12T16:27:55.202 INFO:teuthology.orchestra.run.smithi014.stderr:RuntimeError: uid/gid not found
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-02-11_11:00:52-rados:cephadm-wip-swagner3-testing-2021-02-10-1322-distro-basic-smithi/5874630">https://pulpito.ceph.com/swagner-2021-02-11_11:00:52-rados:cephadm-wip-swagner3-testing-2021-02-10-1322-distro-basic-smithi/5874630</a></p>
Orchestrator - Bug #49233 (New): cephadm shell: TLS handshake timeout
https://tracker.ceph.com/issues/49233
2021-02-10T11:26:38Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2021-02-09_10:28:14-rados:cephadm-wip-swagner2-testing-2021-02-08-1109-pacific-distro-basic-smithi/5871391">https://pulpito.ceph.com/swagner-2021-02-09_10:28:14-rados:cephadm-wip-swagner2-testing-2021-02-08-1109-pacific-distro-basic-smithi/5871391</a></p>
<pre>
2021-02-10T06:43:35.452 INFO:teuthology.orchestra.run.smithi132.stderr:Non-zero exit code 125 from /usr/bin/docker run --rm --ipc=host --net=host --entrypoint stat -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc -e NODE_NAME=smithi132 quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc -c %u %g /var/lib/ceph
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr Unable to find image 'quay.ceph.io/ceph-ci/ceph:282cc83b9d6c73ac8a35502bb969bc4e36afefcc' locally
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr /usr/bin/docker: Error response from daemon: Get https://quay.ceph.io/v2/ceph-ci/ceph/manifests/282cc83b9d6c73ac8a35502bb969bc4e36afefcc: net/http: TLS handshake timeout.
2021-02-10T06:43:35.453 INFO:teuthology.orchestra.run.smithi132.stderr:stat: stderr See '/usr/bin/docker run --help'.
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr:Traceback (most recent call last):
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 7639, in <module>
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: main()
2021-02-10T06:43:35.465 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 7628, in main
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: r = ctx.func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1616, in _infer_fsid
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1653, in _infer_config
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1700, in _infer_image
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: return func(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 4115, in command_shell
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: make_log_dir(ctx, ctx.fsid)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 1802, in make_log_dir
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: uid, gid = extract_uid_gid(ctx)
2021-02-10T06:43:35.466 INFO:teuthology.orchestra.run.smithi132.stderr: File "/usr/sbin/cephadm", line 2469, in extract_uid_gid
2021-02-10T06:43:35.467 INFO:teuthology.orchestra.run.smithi132.stderr: raise RuntimeError('uid/gid not found')
2021-02-10T06:43:35.467 INFO:teuthology.orchestra.run.smithi132.stderr:RuntimeError: uid/gid not found
2
</pre>
<p>Looks like we need to use the ignore list from</p>
<p><a class="external" href="https://github.com/ceph/ceph/blob/40d5c37930f9b7f883c3b6da57be481f1fe6fb6c/src/cephadm/cephadm#L3078-L3086">https://github.com/ceph/ceph/blob/40d5c37930f9b7f883c3b6da57be481f1fe6fb6c/src/cephadm/cephadm#L3078-L3086</a></p>
<p>also for command_shell()</p>
Orchestrator - Bug #48742 (New): cephadm bootstrap --docker doesn't actually work. It's just that...
https://tracker.ceph.com/issues/48742
2021-01-04T13:56:35Z
Sebastian Wagner
<p>If you run</p>
<pre>
cephadm bootstrap --docker
</pre>
<p>we're going to use docker for the bootstrap host. But we're <strong>not storing this persistently</strong>. Thus<br />we're actually using <strong>podman on all other hosts</strong>.</p>
<p>To fix this, I see two options:</p>
<ol>
<li>persistently store the <code>docker</code> flag as a module option</li>
<li>drop <code>--docker</code> from <code>cephadm bootstrap</code></li>
</ol>
<p>Both options are IMO valid.</p>
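<p>Option (1) would boil down to something along these lines (hypothetical option name; no such mgr/cephadm setting exists today):</p>
<pre>
# hypothetical: persist the container engine choice cluster-wide
ceph config set mgr mgr/cephadm/container_engine docker
</pre>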
Dashboard - Bug #47231 (New): ERROR: setUpClass (tasks.mgr.dashboard.test_cephfs.CephfsTest)
https://tracker.ceph.com/issues/47231
2020-09-01T10:42:27Z
Sebastian Wagner
<p><a class="external" href="https://jenkins.ceph.com/job/ceph-api/2144/">https://jenkins.ceph.com/job/ceph-api/2144/</a><br /><a class="external" href="https://jenkins.ceph.com/job/ceph-api/3204/">https://jenkins.ceph.com/job/ceph-api/3204/</a></p>
<pre>
2020-09-01 09:18:35,239.239 INFO:__main__:----------------------------------------------------------------------
2020-09-01 09:18:35,239.239 INFO:__main__:Traceback (most recent call last):
2020-09-01 09:18:35,240.240 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/dashboard/helper.py", line 149, in setUpClass
2020-09-01 09:18:35,240.240 INFO:__main__: cls._assign_ports("dashboard", "ssl_server_port")
2020-09-01 09:18:35,240.240 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/mgr/mgr_test_case.py", line 218, in _assign_ports
2020-09-01 09:18:35,240.240 INFO:__main__: cls.wait_until_true(is_available, timeout=30)
2020-09-01 09:18:35,240.240 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-api/qa/tasks/ceph_test_case.py", line 196, in wait_until_true
2020-09-01 09:18:35,240.240 INFO:__main__: raise TestTimeoutError("Timed out after {0}s".format(elapsed))
2020-09-01 09:18:35,240.240 INFO:__main__:tasks.ceph_test_case.TestTimeoutError: Timed out after 30s
2020-09-01 09:18:35,241.241 INFO:__main__:
2020-09-01 09:18:35,241.241 INFO:__main__:----------------------------------------------------------------------
2020-09-01 09:18:35,241.241 INFO:__main__:Ran 14 tests in 1278.060s
2020-09-01 09:18:35,241.241 INFO:__main__:
2020-09-01 09:18:35,241.241 INFO:__main__:
</pre>
rbd - Bug #46875 (New): TestLibRBD.TestPendingAio: test_librbd.cc:4539: Failure or SIGSEGV
https://tracker.ceph.com/issues/46875
2020-08-10T01:03:45Z
Sebastian Wagner
<pre>
[ RUN ] TestLibRBD.TestPendingAio
using new format!
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/librbd/test_librbd.cc:4539: Failure
Expected equality of these values:
1
rbd_aio_is_complete(comps[i])
Which is: 0
[ FAILED ] TestLibRBD.TestPendingAio (68 ms)
</pre>
<p><a class="external" href="https://jenkins.ceph.com/job/ceph-pull-requests/57209/consoleFull#-361705261e840cee4-f4a4-4183-81dd-42855615f2c1">https://jenkins.ceph.com/job/ceph-pull-requests/57209/consoleFull#-361705261e840cee4-f4a4-4183-81dd-42855615f2c1</a></p>