Ceph : Issues
https://tracker.ceph.com/
2022-01-21T15:53:15Z
Orchestrator - Bug #53965 (New): cephadm: RGW container is crashing at 'rados_nobjects_list_next2...
https://tracker.ceph.com/issues/53965
2022-01-21T15:53:15Z
Sebastian Wagner
<p>When upgrading from ceph-ansible, we're getting:</p>
<pre>
Jan 12 18:30:14 host conmon[12897]: debug 2022-01-12T17:30:14.112+0000 0 starting handler: beast
Jan 12 18:30:14 host conmon[12897]: ceph version 16.2.0-146.x (sha) pacific (stable)
Jan 12 18:30:14 host conmon[12897]: 1: /lib64/libpthread.so.0(+0x12c20) [0x7fb4b1e86c20]
Jan 12 18:30:14 host conmon[12897]: 2: gsignal()
Jan 12 18:30:14 host conmon[12897]: 3: abort()
Jan 12 18:30:14 host conmon[12897]: 4: /lib64/libstdc++.so.6(+0x9009b) [0x7fb4b0e7a09b]
Jan 12 18:30:14 host conmon[12897]: 5: /lib64/libstdc++.so.6(+0x9653c) [0x7fb4b0e8053c]
Jan 12 18:30:14 host conmon[12897]: 6: /lib64/libstdc++.so.6(+0x96597) [0x7fb4b0e80597]
Jan 12 18:30:14 host conmon[12897]: 7: /lib64/libstdc++.so.6(+0x967f8) [0x7fb4b0e807f8]
Jan 12 18:30:14 host conmon[12897]: 8: /lib64/librados.so.2(+0x3a4c0) [0x7fb4bc3f74c0]
Jan 12 18:30:14 host conmon[12897]: 9: /lib64/librados.so.2(+0x809d2) [0x7fb4bc43d9d2]
Jan 12 18:30:14 host conmon[12897]: 10: (librados::v14_2_0::IoCtx::nobjects_begin(librados::v14_2_0::ObjectCursor const&, ceph::buffer::v15_2_0::list const&)+0x5c) [0x7fb4bc43da5c]
Jan 12 18:30:14 host conmon[12897]: 11: (RGWSI_RADOS::Pool::List::init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWAccessListFilter*)+0x115) [0x7fb4bd26d8e5]
Jan 12 18:30:14 host conmon[12897]: 12: (RGWSI_SysObj_Core::pool_list_objects_init(rgw_pool const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, RGWSI_SysObj::Pool::ListCtx*)+0x24b) [0x7fb4bcd9b0bb]
Jan 12 18:30:14 host conmon[12897]: 13: (RGWSI_MetaBackend_SObj::list_init(RGWSI_MetaBackend::Context*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1e0) [0x7fb4bd2609d0]
Jan 12 18:30:14 host conmon[12897]: 14: (RGWMetadataHandler_GenericMetaBE::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x40) [0x7fb4bcead940]
Jan 12 18:30:14 host conmon[12897]: 15: (RGWMetadataManager::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x73) [0x7fb4bceaff13]
Jan 12 18:30:14 host conmon[12897]: 16: (RGWMetadataManager::list_keys_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, void**)+0x3e) [0x7fb4bceaffce]
Jan 12 18:30:14 host conmon[12897]: 17: (RGWUserStatsCache::sync_all_users(optional_yield)+0x71) [0x7fb4bd052361]
Jan 12 18:30:14 host conmon[12897]: 18: (RGWUserStatsCache::UserSyncThread::entry()+0x10f) [0x7fb4bd05984f]
Jan 12 18:30:14 host conmon[12897]: 19: /lib64/libpthread.so.0(+0x817a) [0x7fb4b1e7c17a]
Jan 12 18:30:14 host conmon[12897]: 20: clone()
</pre>
<p>One possibility is that the pools might not have the rgw tag enabled.</p>
<p><strong>possible workaround</strong></p>
<p>Add the rgw tag to all the RGW pools.</p>
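<p>For illustration, the tag can be added with <strong>ceph osd pool application enable</strong>; the pool names below are the usual RGW defaults and may differ in a given deployment:</p>
<pre>
# Tag each RGW pool with the 'rgw' application so the
# 'osd allow rwx tag rgw *=*' cap matches it (pool names are examples).
ceph osd pool application enable .rgw.root rgw
ceph osd pool application enable default.rgw.log rgw
ceph osd pool application enable default.rgw.control rgw
ceph osd pool application enable default.rgw.meta rgw
</pre>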
<p><strong>known workaround</strong></p>
<p>Change the cephx caps of all affected daemons from</p>
<pre>
mon allow *
mgr allow rw
osd allow rwx tag rgw *=*
</pre>
<p>to</p>
<pre>
mon allow *
mgr allow rw
osd allow rwx
</pre>
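<p>As a sketch, the caps can be changed with <strong>ceph auth caps</strong>; the entity name below is an example and must match the affected daemon's key (see <strong>ceph auth ls</strong>):</p>
<pre>
# Entity name is an example; find the real one with 'ceph auth ls | grep rgw'.
ceph auth caps client.rgw.foo.host1.abcdef \
    mon 'allow *' \
    mgr 'allow rw' \
    osd 'allow rwx'
</pre>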
Orchestrator - Bug #53939 (Resolved): ceph-nfs-upgrade, pacific: Upgrade Paused due to UPGRADE_RE...
https://tracker.ceph.com/issues/53939
2022-01-19T16:07:48Z
Sebastian Wagner
<pre>
mon[102341]: : cluster [WRN] Health check failed: Upgrading daemon osd.0 on host smithi103 failed. (UPGRADE_REDEPLOY_DAEMON)
mon[66897]: cephadm 2022-01-18T16:27:48.439275+0000 mgr.smithi103.wyeocw (mgr.14712) 129 : cephadm [ERR] cephadm exited with an error code: 1, stderr:Redeploy daemon osd.0 ...
mon[66897]: Non-zero exit code 1 from systemctl start ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0
mon[66897]: systemctl: stderr Job for ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service failed because a timeout was exceeded.
mon[66897]: systemctl: stderr See "systemctl status ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service" and "journalctl -xe" for details.
mon[66897]: Traceback (most recent call last):
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 8615, in <module>
mon[66897]: main()
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 8603, in main
mon[66897]: r = ctx.func(ctx)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 1790, in _default_image
mon[66897]: return func(ctx)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 4603, in command_deploy
mon[66897]: ports=daemon_ports)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 2715, in deploy_daemon
mon[66897]: c, osd_fsid=osd_fsid, ports=ports)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 2960, in deploy_daemon_units
mon[66897]: call_throws(ctx, ['systemctl', 'start', unit_name])
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 1469, in call_throws
mon[66897]: raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
mon[66897]: RuntimeError: Failed command: systemctl start ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0: Job for ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service failed because a timeout was exceeded.
mon[66897]: See "systemctl status ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service" and "journalctl -xe" for details.
mon[66897]: Traceback (most recent call last):
mon[66897]: File "/usr/share/ceph/mgr/cephadm/serve.py", line 1402, in _remote_connection
mon[66897]: yield (conn, connr)
mon[66897]: File "/usr/share/ceph/mgr/cephadm/serve.py", line 1295, in _run_cephadm
mon[66897]: code, '\n'.join(err)))
mon[66897]: orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Redeploy daemon osd.0 ...
mon[66897]: Non-zero exit code 1 from systemctl start ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0
mon[66897]: systemctl: stderr Job for ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service failed because a timeout was exceeded.
mon[66897]: systemctl: stderr See "systemctl status ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service" and "journalctl -xe" for details.
mon[66897]: Traceback (most recent call last):
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 8615, in <module>
mon[66897]: main()
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 8603, in main
mon[66897]: r = ctx.func(ctx)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 1790, in _default_image
mon[66897]: return func(ctx)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 4603, in command_deploy
mon[66897]: ports=daemon_ports)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 2715, in deploy_daemon
mon[66897]: c, osd_fsid=osd_fsid, ports=ports)
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 2960, in deploy_daemon_units
mon[66897]: call_throws(ctx, ['systemctl', 'start', unit_name])
mon[66897]: File "/var/lib/ceph/e287ac0e-7879-11ec-8c34-001a4aab830c/cephadm.c659ab77cc705b8440c5bb10bf729dd981addbc618204d30ac82f427ecc4779d", line 1469, in call_throws
mon[66897]: raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
mon[66897]: RuntimeError: Failed command: systemctl start ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0: Job for ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service failed because a timeout was exceeded.
mon[66897]: See "systemctl status ceph-e287ac0e-7879-11ec-8c34-001a4aab830c@osd.0.service" and "journalctl -xe" for details.
...
cephadm 2022-01-18T16:27:48.439412+0000 mgr.smithi103.wyeocw (mgr.14712) 130 : cephadm [ERR] Upgrade: Paused due to UPGRADE_REDEPLOY_DAEMON: Upgrading daemon osd.0 on host smithi103 failed.
</pre>
<p><a class="external" href="https://pulpito.ceph.com/swagner-2022-01-18_15:34:53-rados:cephadm-wip-swagner2-testing-2022-01-18-1242-pacific-distro-default-smithi/6624255">https://pulpito.ceph.com/swagner-2022-01-18_15:34:53-rados:cephadm-wip-swagner2-testing-2022-01-18-1242-pacific-distro-default-smithi/6624255</a></p>
Orchestrator - Bug #53904 (Duplicate): cephadm: ingress jobs stuck
https://tracker.ceph.com/issues/53904
2022-01-17T16:07:38Z
Sebastian Wagner
<p><a class="external" href="https://pulpito.ceph.com/swagner-2022-01-17_12:42:04-orch:cephadm-wip-swagner-testing-2022-01-17-1014-distro-default-smithi/">https://pulpito.ceph.com/swagner-2022-01-17_12:42:04-orch:cephadm-wip-swagner-testing-2022-01-17-1014-distro-default-smithi/</a></p>
<pre>
2022-01-17T13:17:17.053 DEBUG:teuthology.orchestra.run.smithi155:> sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:1cdf02ebbbdd98a055173cbac4d0171328a564dc shell -c /etc/ceph/ceph.conf -k />
2022-01-17T13:17:17.054 DEBUG:teuthology.orchestra.run.smithi155:> for haproxy in `ceph orch ps | grep ^haproxy.nfs.foo. | awk '"'"'{print $1}'"'"'`; do
2022-01-17T13:17:17.054 DEBUG:teuthology.orchestra.run.smithi155:> ceph orch daemon stop $haproxy
2022-01-17T13:17:17.054 DEBUG:teuthology.orchestra.run.smithi155:> while ! ceph orch ps | grep $haproxy | grep stopped; do sleep 1 ; done
2022-01-17T13:17:17.055 DEBUG:teuthology.orchestra.run.smithi155:> cat /mnt/foo/testfile
2022-01-17T13:17:17.055 DEBUG:teuthology.orchestra.run.smithi155:> echo $haproxy > /mnt/foo/testfile
2022-01-17T13:17:17.055 DEBUG:teuthology.orchestra.run.smithi155:> sync
2022-01-17T13:17:17.055 DEBUG:teuthology.orchestra.run.smithi155:> ceph orch daemon start $haproxy
2022-01-17T13:17:17.056 DEBUG:teuthology.orchestra.run.smithi155:> while ! ceph orch ps | grep $haproxy | grep running; do sleep 1 ; done
2022-01-17T13:17:17.056 DEBUG:teuthology.orchestra.run.smithi155:> done
2022-01-17T13:17:17.056 DEBUG:teuthology.orchestra.run.smithi155:> '
</pre><br />...snip...<br /><pre>
2022-01-17T13:17:20.571 INFO:teuthology.orchestra.run.smithi155.stdout:Check with each haproxy down in turn...
2022-01-17T13:17:21.281 INFO:teuthology.orchestra.run.smithi155.stdout:Scheduled to stop haproxy.nfs.foo.smithi155.xhswck on host 'smithi155'
</pre><br />...snip...
<pre>
2022-01-17T13:17:36.893 INFO:teuthology.orchestra.run.smithi155.stdout:haproxy.nfs.foo.smithi155.xhswck smithi155 *:2049,9002 stopped 0s ago 79s - - <unknown> <un>
2022-01-17T13:17:36.898 INFO:teuthology.orchestra.run.smithi155.stdout:test
2022-01-17T13:17:37.528 INFO:teuthology.orchestra.run.smithi155.stdout:Scheduled to start haproxy.nfs.foo.smithi155.xhswck on host 'smithi155'
</pre><br />...snip...<br /><pre>
2022-01-17T13:17:53.182 INFO:teuthology.orchestra.run.smithi155.stdout:haproxy.nfs.foo.smithi155.xhswck smithi155 *:2049,9002 running (5s) 0s ago 95s - - 2.3.17-d1c9119 14b>
2022-01-17T13:17:53.519 INFO:teuthology.orchestra.run.smithi155.stdout:Scheduled to stop haproxy.nfs.foo.smithi162.mahcqs on host 'smithi162'
</pre><br />...snip...<br /><pre>
2022-01-17T13:18:07.810 INFO:teuthology.orchestra.run.smithi155.stdout:haproxy.nfs.foo.smithi162.mahcqs smithi162 *:2049,9002 stopped 0s ago 102s - - <unknown> <unk>
</pre><br />...snip..<br /><pre>
h[14066]: cephadm 2022-01-17T13:17:53.516345+0000 mgr.smithi155.uoijyc (mgr.14206) 339 : cephadm [INF] Schedule stop daemon haproxy.nfs.foo.smithi162.mahcqs
</pre>
<p>But I never see a start of haproxy.nfs.foo.smithi162.mahcqs again.</p>
Orchestrator - Bug #53706 (Resolved): cephadm: Module 'cephadm' has failed: dashboard iscsi-gatew...
https://tracker.ceph.com/issues/53706
2021-12-22T20:20:30Z
Sebastian Wagner
<pre>
MGR_MODULE_ERROR: Module 'cephadm' has failed: dashboard iscsi-gateway-rm failed: iSCSI gateway 'iscsi-gw' does not exist retval: -2
</pre>
<p>This is a new one actually.</p>
<p><strong>workaround 1</strong></p>
<p>1. create a new iscsi gateway "iscsi-gw" in the dashboard<br />2. fail over the mgr (ceph mgr fail <currently active mgr>)</p>
<p><strong>workaround 2</strong></p>
<p>1. <strong>ceph config-key dump mgr/cephadm</strong> and look for the daemon iscsi-gw in the JSON output. <br />2. <strong>ceph config-key set <key> -i data.json</strong> and apply the JSON without the daemon (just remove the daemon from the JSON data). <br />3. Fail over the mgr (ceph mgr fail <currently active mgr>).</p>
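<p>Roughly, workaround 2 looks like the following; the config-key name is only an example and must be taken from the dump output:</p>
<pre>
# 1. Dump the cephadm state and locate the stale iscsi-gw daemon entry.
ceph config-key dump mgr/cephadm > dump.json
# 2. Edit the value of the relevant key (remove the iscsi-gw daemon from the
#    JSON), save it as data.json, and write it back. The key name below is
#    an example; use the one found in the dump.
ceph config-key set mgr/cephadm/host.<hostname> -i data.json
# 3. Fail over the mgr so the cephadm module reloads its state.
ceph mgr fail
</pre>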
<p><strong>workaround 3</strong></p>
<p>Temporarily disable the dashboard during the removal operation.</p>
Orchestrator - Bug #53652 (Closed): cephadm "Verifying IP <ip> port 3300" ... -> "OSError: [Errno...
https://tracker.ceph.com/issues/53652
2021-12-17T11:21:27Z
Sebastian Wagner
<p>We have to return better error messages here:</p>
<pre>
ubuntu@admin:~$ sudo cephadm --docker --image "admin:5000/ceph/ceph:v16" bootstrap --skip-monitoring-stack --mon-ip 192.168.122.51
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chrony.service is enabled and running
Repeating the final host check...
systemctl is present
lvcreate is present
Unit chrony.service is enabled and running
Host looks OK
Cluster fsid: fsid
Verifying IP <ip> port 3300 ...
Traceback (most recent call last):
File "/usr/sbin/cephadm", line 6242, in <module>
r = args.func()
File "/usr/sbin/cephadm", line 1451, in _default_image
return func()
File "/usr/sbin/cephadm", line 2923, in command_bootstrap
check_ip_port(args.mon_ip, 3300)
File "/usr/sbin/cephadm", line 771, in check_ip_port
attempt_bind(s, ip, port)
File "/usr/sbin/cephadm", line 731, in attempt_bind
raise e
File "/usr/sbin/cephadm", line 724, in attempt_bind
s.bind((address, port))
OSError: [Errno 99] Cannot assign requested address
</pre>
<ol>
<li>This is <strong>not</strong> helpful; users need better guidance, e.g. a clearer error message with more details.</li>
<li>Why port 3300? </li>
<li>Also, we shouldn't print a Traceback here.</li>
</ol>
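<p>A minimal sketch of a friendlier check, keeping the socket-based probe from the traceback above (function and argument names follow that traceback; the exact wording is only a suggestion):</p>
<pre>
# Sketch: turn the raw OSError into an actionable message instead of a traceback.
import socket

def check_ip_port(ip: str, port: int) -> None:
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.bind((ip, port))
    except OSError as e:
        raise SystemExit(
            f"Cannot bind to {ip}:{port} ({e.strerror}). "
            "--mon-ip must be an address that exists on this host; "
            "check 'ip addr' and pass an IP owned by this machine."
        )
</pre>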
Orchestrator - Bug #53624 (Resolved): cephadm agent: set_store mon returned -27: error: entry siz...
https://tracker.ceph.com/issues/53624
2021-12-16T09:21:34Z
Sebastian Wagner
<pre>
+p08UJQl6kAB3NtutBnSLaukwDQYJKoZIhvcNAQEL\nBQAwFzEVMBMGA1UEAwwMY2VwaGFkbS1yb290MB4XDTIxMTIxNjAyMjAzOFoXDTMx\nMTIxNzAyMjAzOFowFzEVMBMGA1UEAwwMY2VwaGFkbS1yb290MIICIjANBgkqhkiG\n9w0BAQEFAAOCAg8AMIICCgKCAgEAzqwANUb3
zFx4981X8YjxlvEPjLITotzyfgu0\neX6FdQTEfEiJjpv8Ikgq7QSKj0mf43J0uy/Y/+OENuHIdPSvtE3/ICQ7qX8HeOAS\noYS41vpAb8la9AODkohMlZa7OogPeoUpnU2Da2E4gUkxMcduMw3CsbF6Url8tiMY\nAqJyNUjQyZN9ah2NUsU4Bqhd2bXBSObDuKqA8M1M/nGwouOyQ
eVZmn1A4yQ3AOS8\nYb+fA/EWQtHZsdyNNBIU1bI68BaSSEtx00Fc1TFPiYf9JTZo5hT2sO3mBUKI/6uP\nwlIr3w23R2vP80WhbpLIZSSbZuQ0uyKc9XWqGvC4rX6VgWgs8xXW2FQGPfhzSWoL\nK4t9tafT8mLlokP5uelA4Da/JyrMf+U+ntEHbKjt/Jns8cR3dEEQQgzKh1197I
NZ\nq3lMHKOwVrkko2rwXaR4BTiv/9YDX1G/10Wc38U7V4Nyuc1MgShaBoy2GyjKl3Cy\nmiz3Jg0pehvqOjFTAfhqsbbocESyxWqZTZRK688NH7+EDOZT6gCEcCmhYJoNFhi1\nc2MJ2Zb4WgAanRcOto6LEYZ0kQQqZhce94RXKkJ265VZOFKcFqymGKAoxBhuBIAK\nJrmYvd1oE
GYROpHVL16tOzlyXxiNR+8zEMUWcRz9JmyE1ZwUhGqB4I2y9xGSsVlC\nK3ckXsECAwEAAaMkMCIwDwYDVR0RBAgwBocErBUCZjAPBgNVHRMBAf8EBTADAQH/\nMA0GCSqGSIb3DQEBCwUAA4ICAQAee7nTsfD5oQMYiZQA0qjeiAYDpG1bPFtyHxZL\n/TY96maIvBxqEx+haOmKvV
RTrV080NIo9VsftiLoFYj5M19bufkMlM764viIKOPE\n0fIHqCDvIresYuOIwB7WGqwiIzp4vQv/7QPqZdXRRW5WUN2fh2EhsSPlxBWoOwMf\nBNqsRwrYOxBzcXjeLJGGO4wmRa5AsFBwZC8cho+bc1W7FJUpXkZVnXmUPXx6ZzA8\nxvXBZfJgQ/2ya0uDBvXXwkxX1/7FHK4IsI7
gCGWHv95b5N2nM8A0Y8p5YvDDHnX2\nhnBmr97LdIfqgzScjblnNeQfNQWJj2kcuY0hjtHxb1Pg/t24N8JsJS1Jp5vshcbS\n8jWWK8wpUIkihDQCezY/5hUfhvQz4RozrAXgQKxAvGaXqWw2RXkwf7cbcXxIlS6A\nrt4L2Vg3xWS/2DIhAFwk8W53F0du3MfFOLIgVuBTOgbghapo
wqppo0qYtvH3nb6Y\nXhK12rDLzQvueQyxZZkjQ6UtYk1++eL+PojYMWFYYFesSayTs3xGi/6wvouVQb6v\njW8GeO2cf1mDJFCNDxAmtDoPCOJfT/j8dy8gCvkT0ypzdm69HqjO+sf8+Ik2VlrR\n3rJbNDd1WwP0o2CYLSIe2KOcQUvX505ojDFznTiLeUJ728Yd2qsnHTIOOK7NJ0x9\nd8hCvw==\n-----END CERTIFICATE-----\n", "172.21.2.102", "7150", "False"], "last_config": "2021-12-16T02:21:06.723847Z"}, "osd.1024": {"deps": [], "last_config": "2021-12-12T22:53:04.829290Z"}, "osd.1025": {"deps": [], "last_config": "2021-12-12T22:54:43.028992Z"}, "osd.1026": {"deps": [], "last_config": "2021-12-12T22:55:48.651847Z"}, "osd.1027": {"deps": [], "last_config": "2021-12-12T22:57:10.880016Z"}, "osd.1028": {"deps": [], "last_config": "2021-12-12T23:00:35.405950Z"}}, "last_daemon_update": "2021-12-16T02:27:35.437721Z", "last_device_update": "2021-12-12T22:52:34.049019Z", "last_network_update": "2021-12-16T02:27:15.196650Z", "last_device_change": "2021-12-12T22:44:01.089290Z", "networks_and_interfaces": {"172.21.0.0/20": {"enp1s0f0": ["172.21.2.145"]}, "fe80::/64": {"enp1s0f0": ["fe80::3eec:efff:fe3d:91b4"]}}, "last_host_check": "2021-12-16T02:22:44.926479Z", "last_client_files": {}, "scheduled_daemon_actions": {}, "agent_counter": 1, "agent_keys": "[client.agent.gibba045]\n\tkey = AQDuerZhDQN2GhAAMFdgQncZtlLlmjT/7IgaYA==\n"$` failed: (27) File too large
Dec 16 02:27:35 gibba002 ceph-mgr[421345]: mgr set_store mon returned -27: error: entry size limited to 65536 bytes. Use 'mon config key max entry size' to manually adjust
</pre>
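<p>The message points at the mon-side limit; it can be inspected and, as a stop-gap, raised (the value below is only an example), though the real fix is to stop writing such large entries:</p>
<pre>
# Current limit (default 65536 bytes)
ceph config get mon mon_config_key_max_entry_size
# Temporary workaround: raise the limit (example value)
ceph config set mon mon_config_key_max_entry_size 131072
</pre>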
Orchestrator - Bug #53594 (Resolved): mgr/cephadm/upgrade.py: normalize_image_digest has a hard c...
https://tracker.ceph.com/issues/53594
2021-12-13T10:14:08Z
Sebastian Wagner
<p><a class="external" href="https://github.com/ceph/ceph/blob/84f88eaec44103edd377817e264d5d376df8c554/src/pybind/mgr/cephadm/upgrade.py#L34">https://github.com/ceph/ceph/blob/84f88eaec44103edd377817e264d5d376df8c554/src/pybind/mgr/cephadm/upgrade.py#L34</a></p>
<p>It's clearly wrong, as this depends on the search-registries setting of the hosts and is not a constant.</p>
<p>Can we drop this "normalizing" step altogether?</p>
<p>Still, we have to avoid creating a regression to <a class="external" href="https://github.com/ceph/ceph/pull/40577">https://github.com/ceph/ceph/pull/40577</a></p>
Orchestrator - Bug #53541 (Resolved): permissions too open on the cephadm agent files (644) - inc...
https://tracker.ceph.com/issues/53541
2021-12-08T14:24:38Z
Sebastian Wagner
<p>Permissions are too open on the cephadm agent files (644); this includes certs and config.</p>
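<p>Until this is fixed, a possible interim mitigation is to tighten the modes by hand; the paths below are placeholders for the agent's files under the cluster directory:</p>
<pre>
# Placeholder paths; the exact layout of the agent directory may differ.
chmod 600 /var/lib/ceph/<fsid>/agent.<hostname>/keyring
chmod 600 /var/lib/ceph/<fsid>/agent.<hostname>/listener.key
</pre>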
mgr - Bug #53538 (Resolved): mgr/stats: ZeroDivisionError
https://tracker.ceph.com/issues/53538
2021-12-08T13:37:49Z
Sebastian Wagner
<pre>
root@service-01-08020:~# ceph osd status storage-01-08002
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1623, in _handle_command
return CLICommand.COMMANDS[cmd['prefix']].call(self, cmd, inbuf)
File "/usr/share/ceph/mgr/mgr_module.py", line 416, in call
return self.func(mgr, **kwargs)
File "/usr/share/ceph/mgr/status/module.py", line 338, in handle_osd_status
wr_ops_rate = (self.get_rate("osd", osd_id.__str__(), "osd.op_w") +
File "/usr/share/ceph/mgr/status/module.py", line 28, in get_rate
return (data[-1][1] - data[-2][1]) // int(data[-1][0] - data[-2][0])
ZeroDivisionError: integer division or modulo by zero
</pre>
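<p>The offending division is visible at line 28 of status/module.py in the traceback; a minimal guard (a sketch, not necessarily the merged fix; the data-fetching call is assumed to match the existing module) would be:</p>
<pre>
# Sketch of a guard for get_rate() in mgr/status/module.py: return 0 when
# there are fewer than two samples or both samples share the same timestamp.
def get_rate(self, daemon_type, daemon_name, stat):
    data = self.get_counter(daemon_type, daemon_name, stat)[stat]
    if len(data) < 2 or int(data[-1][0] - data[-2][0]) == 0:
        return 0
    return (data[-1][1] - data[-2][1]) // int(data[-1][0] - data[-2][0])
</pre>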
<p>Since those PRs:</p>
<ul>
<li><a class="external" href="https://github.com/ceph/ceph/pull/25337">https://github.com/ceph/ceph/pull/25337</a></li>
<li><a class="external" href="https://github.com/ceph/ceph/pull/26270">https://github.com/ceph/ceph/pull/26270</a></li>
<li><a class="external" href="https://github.com/ceph/ceph/pull/26270/files#diff-dc6485f717f4dce4863733896375af75963412ebb2abc4b62fcd1f5233eee07dR44">https://github.com/ceph/ceph/pull/26270/files#diff-dc6485f717f4dce4863733896375af75963412ebb2abc4b62fcd1f5233eee07dR44</a></li>
<li><a class="external" href="https://github.com/ceph/ceph/pull/28603">https://github.com/ceph/ceph/pull/28603</a> </li>
<li><a class="external" href="https://tracker.ceph.com/issues/43224#note-11">https://tracker.ceph.com/issues/43224#note-11</a></li>
</ul>
<p>no one had the patience to look into this all over again.</p>
Orchestrator - Bug #53531 (New): cephadm: how agent finds the active mgr after mgr failover
https://tracker.ceph.com/issues/53531
2021-12-08T11:32:23Z
Sebastian Wagner
<p>How the agent finds the active mgr after an mgr failover.</p>
<p>1) The new mgr pokes the old agents to provide the new mgr endpoint<br />could even be parallelized<br />no cephx key needed</p>
<p>2) The agent asks the mon for the active mgr if it hasn't successfully reported in the last N seconds<br />'ceph mgr services' or similar<br />would need a cephx key, something like mon 'allow r'<br />the agent would need a ceph.conf, and have it updated on ceph.conf changes, etc.</p>
<p>3) The agent knows all known MGRs, and standby modules redirect to the active MGR<br />the agent just tries a different MGR and detects whether it gets a redirect to the newly active MGR<br />needs hostname + port for each MGR</p>
<p>Let's go with <strong>(1)</strong>?</p>
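<p>For reference, option (2) would build on information the mon already exposes; the output below is illustrative:</p>
<pre>
# Needs a key with at least mon 'allow r'; reports the active mgr's service endpoints.
ceph mgr services
# e.g. {"dashboard": "https://host1:8443/", "prometheus": "http://host1:9283/"}
</pre>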
Orchestrator - Bug #53530 (Closed): centos’s cephadm shebang does not work on ubuntu
https://tracker.ceph.com/issues/53530
2021-12-08T11:30:36Z
Sebastian Wagner
<pre>
root@storage-01-08002:/var/lib/ceph/f148c330-47c9-11ec-9f19-1dfe2cdc6a6d# ./cephadm.4fd955d0dbf0cdd56104823ef3e950293acaef76608980c02d37021d4c51ee67
-bash: ./cephadm.4fd955d0dbf0cdd56104823ef3e950293acaef76608980c02d37021d4c51ee67: /usr/libexec/platform-python: bad interpreter: No such file or directory
</pre>
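<p>A possible workaround on such hosts is to bypass the RHEL-specific shebang and invoke the interpreter explicitly (sketch only):</p>
<pre>
# Run the same script through python3 instead of /usr/libexec/platform-python.
python3 ./cephadm.4fd955d0dbf0cdd56104823ef3e950293acaef76608980c02d37021d4c51ee67 --help
</pre>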
Orchestrator - Bug #53529 (New): ceph orch apply ... --dry-run: Table not properly formatted
https://tracker.ceph.com/issues/53529
2021-12-08T11:28:20Z
Sebastian Wagner
<pre>
root@service-01-08020:~# ceph orch apply -i cadvisor.yaml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound
to the current inventory setup. If any on these conditions changes, the
preview will be invalid. Please make sure to have a minimal
timeframe between planning and applying the specs.
####################
SERVICESPEC PREVIEWS
####################
+-----------+--------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------+
|SERVICE |NAME |ADD_TO |REMOVE_FROM |
+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +-------------+
|container |container.cadvisor |hosta08016 hosta08014 hosta08006 hosta08005 hosta08004 hosta08007 hosta08009 hosta08010 hosta08008 hosta08015 hosta08003 hosta08013 hosta08012 hosta08011 hosta08002 hostb08035 hostb08033 hostb08030 hostb08026 hostb08024 hostb08025 hostb08023 hostb08032 hostb08036 hostb08029 hostb08031 hostb08027 hostb08028 hostb08022 hostb08056 hostb08055 hostb08053 hostb08051 hostb08050 hostb08048 hostb08047 hostb08045 hostb08043 hostb08044 hostb08042 hostb08052 hostb08049 hostd08076 hostd08075 hostd08074 hostd08073 hostd08072 hostd08071 hostd08070 hostd08069 hostd08068 hostd08066 hostd08067 hostd08065 hostd08064 hostd08063 hostd08062 hoste08096 hoste08092 hoste08091 hoste08090 hoste08095 hoste08094 hoste08093 hoste08087 hoste08085 hoste08084 hoste08089 hoste08088 hoste08086 hoste08082 hoste08083 hostf08116 hostf08115 hostf08112 hostf08114 hostf08113 hostf08111 hostf08110 hostf08109 hostf08108 hostf08106 hostf08107 hostf08104 hostf08105 hostf08103 hostf08102 hostg08135 hostg08136 hostg08124 hostg08123 hostg08134 hostg08133 hostg08132 hostg08131 hostg08130 hostg08129 hostg08128 hostg08122 hostg08126 hostg08125 hostg08127 hosth08153 hosth08155 hosth08154 hosth08151 hosth08149 hosth08146 hosth08148 hosth08147 hosth08145 hosth08143 hosth08156 hosth08142 hosth08150 hosth08144 hosth08152 hosti08173 hosti08170 hosti08169 hosti08168 hosti08166 hosti08164 hosti08163 hosti08165 hosti08175 hosti08171 hosti08167 hosti08172 hosti08162 hostk08192 hostk08184 hostk08191 hostk08196 hostk08193 hostk08194 hostk08195 hostk08188 hostk08186 hostk08189 hostk08190 hostk08183 hostk08187 hostk08182 hostk08185 hostm08216 hostm08214 hostm08215 hostm08213 hostm08206 hostm08209 hostm08211 hostm08212 hostm08210 hostm08208 hostm08207 hostm08205 hostm08204 hostm08203 hostm08202 hostn08224 hostn08236 hostn08233 hostn08232 hostn08234 hostn08230 hostn08231 hostn08235 hostn08229 hostn08227 hostn08226 hostn08228 hostn08225 hostn08222 hostn08223 | |
+-----------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- +-------------+
################
OSDSPEC PREVIEWS
################
+---------+------+------+------+----+-----+
|SERVICE |NAME |HOST |DATA |DB |WAL |
+---------+------+------+------+----+-----+
+---------+------+------+------+----+-----+
</pre>
Orchestrator - Bug #53528 (Resolved): loopback devices are showing in gather-facts
https://tracker.ceph.com/issues/53528
2021-12-08T11:19:31Z
Sebastian Wagner
<p>Loopback devices are showing up in gather-facts and therefore in the GUI. This is a side effect of using snapd on Ubuntu, where software is deployed as snaps (each snap is backed by a loop device).</p>
<p>See also <a class="external" href="https://github.com/ceph/ceph/pull/43628">https://github.com/ceph/ceph/pull/43628</a></p>
Orchestrator - Bug #53527 (Resolved): cephadm: orch upgrade ls: shows outdated major versions
https://tracker.ceph.com/issues/53527
2021-12-08T11:17:35Z
Sebastian Wagner
<p>orch upgrade ls … why does this show older versions 16.x down to 14.2? Isn’t that asking for trouble?</p>
Dashboard - Bug #53526 (Triaged): mgr/dashboard: dashboard: offline hosts showing UI bug
https://tracker.ceph.com/issues/53526
2021-12-08T11:14:59Z
Sebastian Wagner
<p>NAN - undefined?<br />Status - doesn’t fit in the column</p>
<p><img src="https://tracker.ceph.com/attachments/download/5806/unnamed.png" alt="" /></p>