Ceph : Issues
https://tracker.ceph.com/
Dashboard - Bug #46757 (New): mgr/dashboard: Only show identify action if inventory device can blink
https://tracker.ceph.com/issues/46757 (Stephan Müller, 2020-07-29T14:03:08Z)
If a device can't blink but is managed by cephadm, the "Identify" action is still shown on the inventory page. The problem is that the command doesn't surface an error in the dashboard when it fails. I observed the following error by running `ceph -W cephadm` in parallel with the execution.
<pre>
2020-07-29T08:53:30.649950-0500 mgr.x [ERR] executing blink(([DeviceLightLoc(host='osd0', dev='/dev/vdb', path='/dev/vdb')],)) failed.
Traceback (most recent call last):
File "/ceph/src/pybind/mgr/cephadm/utils.py", line 67, in do_work
return f(*arg)
File "/ceph/src/pybind/mgr/cephadm/module.py", line 1591, in blink
raise OrchestratorError(
orchestrator._interface.OrchestratorError: Unable to affect ident light for osd0:/dev/vdb. Command: lsmcli local-disk-ident-led-on --path /dev/vdb
2020-07-29T08:53:30.653157-0500 mgr.x [ERR] _Promise failed
Traceback (most recent call last):
File "/ceph/src/pybind/mgr/orchestrator/_interface.py", line 292, in _finalize
next_result = self._on_complete(self._value)
File "/ceph/src/pybind/mgr/cephadm/module.py", line 102, in <lambda>
return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/ceph/src/pybind/mgr/cephadm/module.py", line 1599, in blink_device_light
return blink(locs)
File "/ceph/src/pybind/mgr/cephadm/utils.py", line 73, in forall_hosts_wrapper
return CephadmOrchestrator.instance._worker_pool.map(do_work, vals)
File "/usr/lib64/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib64/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
File "/usr/lib64/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib64/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/ceph/src/pybind/mgr/cephadm/utils.py", line 67, in do_work
return f(*arg)
File "/ceph/src/pybind/mgr/cephadm/module.py", line 1591, in blink
raise OrchestratorError(
orchestrator._interface.OrchestratorError: Unable to affect ident light for osd0:/dev/vdb. Command: lsmcli local-disk-ident-led-on --path /dev/vdb
</pre>
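A minimal sketch of the direction the title suggests, assuming the inventory exposes libstoragemgmt data per device; the `lsm_data`/`ledSupport` field names are assumptions for illustration, not the actual schema:

<pre>
# Hypothetical helper for the dashboard backend: only offer the
# "Identify" action when the inventory device reports a drivable
# ident LED. 'lsm_data' and 'ledSupport' are assumed field names.
def can_blink(device: dict) -> bool:
    led = (device.get('lsm_data') or {}).get('ledSupport') or {}
    return led.get('IDENTsupport') == 'Supported'
</pre>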
Dashboard - Bug #46303 (Resolved): mgr/dashboard: ExpressionChangedAfterItHasBeenCheckedError in ...
https://tracker.ceph.com/issues/46303 (Stephan Müller, 2020-07-01T14:22:14Z)

ExpressionChangedAfterItHasBeenCheckedError in the device selection modal of the OSD creation form. It looks like it has something to do with the modal switch #45759 (https://tracker.ceph.com/issues/45759).

Orchestrator - Bug #45724 (Resolved): check-host should not fail using fqdn or not that hard
https://tracker.ceph.com/issues/45724 (Stephan Müller, 2020-05-27T09:26:51Z)
I would suggest either detecting that the given name is an FQDN of a managed host, or answering with "Host not found. Use 'ceph orch host ls' to see all managed hosts."
<pre>
# ceph cephadm check-host node3.ses7.com
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
return self.handle_command(inbuf, cmd)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
return self.func(mgr, **kwargs)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
return func(*args, **kwargs)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1482, in check_host
error_ok=True, no_fsid=True)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1569, in _run_cephadm
conn, connr = self._get_connection(addr)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1521, in _get_connection
n = self.ssh_user + '@' + host
TypeError: must be str, not NoneType
</pre>
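A hedged sketch of the suggested behavior, assuming `check_host` can consult the set of managed host names before opening an SSH connection; `self.inventory` as a membership check is an assumption for illustration:

<pre>
# Hypothetical guard in cephadm's check_host: fail with a clear
# message instead of a TypeError when the host is unknown.
from orchestrator import OrchestratorError  # assumed import path

def check_host(self, host):
    if host not in self.inventory:  # assumed set of managed hosts
        raise OrchestratorError(
            "Host '%s' not found. Use 'ceph orch host ls' "
            "to see all managed hosts." % host)
    # ... proceed with the existing check-host logic ...
</pre>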
Dashboard - Bug #44620 (Resolved): mgr/dashboard: Pool form max size
https://tracker.ceph.com/issues/44620 (Stephan Müller, 2020-03-16T12:07:27Z)

Currently the pool form's maximum size is determined by the "max_size" of the selected rule or by the number of available OSDs. That number can be wrong if the failure domain of the rule is not OSD.
I'm also currently not sure if "max_size" and "min_size" are useful values to show; at least pools created on a vstart cluster always show the same min and max size values. Please make sure that those values can still be used safely.
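A minimal sketch of deriving the maximum size from the rule's failure domain instead of the raw OSD count; the CRUSH map layout follows the shape of `ceph osd crush dump` output, but treat the field names as assumptions:

<pre>
# Hedged sketch: the usable maximum replica count is bounded by
# the number of CRUSH buckets of the rule's failure-domain type
# (e.g. 'host'), not by the number of OSDs.
def max_pool_size(crush_map: dict, failure_domain: str) -> int:
    if failure_domain == 'osd':
        return len(crush_map.get('devices', []))
    return sum(1 for b in crush_map.get('buckets', [])
               if b.get('type_name') == failure_domain)
</pre>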
Dashboard - Bug #44223 (Duplicate): mgr/dashboard: Timeouts for rbd.py calls
https://tracker.ceph.com/issues/44223 (Stephan Müller, 2020-02-20T10:05:49Z)

As the corner cases are not handled in many rbd methods, they can fail without a response on a specific pool (mostly bad pools).
If this is implemented, remove the workaround that was added to fix #43765 (https://tracker.ceph.com/issues/43765).

For details on which known issues exist, see #43771 (https://tracker.ceph.com/issues/43771).

For details about the discussion, look at the PR that fixed #43765 (https://tracker.ceph.com/issues/43765).

Make sure that #43771 (https://tracker.ceph.com/issues/43771) is still unaddressed before starting with this issue.
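A minimal sketch of what such a timeout could look like, running the librbd call on a worker thread so a hanging call no longer blocks the caller; this is an illustration, not the dashboard's actual implementation:

<pre>
# Hedged sketch: bound a potentially hanging rbd call with a
# timeout. Note that the worker thread itself keeps running
# after the timeout; this only unblocks the caller.
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_executor = ThreadPoolExecutor(max_workers=4)

def call_with_timeout(fn, *args, timeout=10.0, **kwargs):
    future = _executor.submit(fn, *args, **kwargs)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        raise RuntimeError('%s timed out after %ss'
                           % (getattr(fn, '__name__', fn), timeout))
</pre>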
Dashboard - Bug #43938 (Duplicate): mgr/dashboard: Test failure in test_safe_to_destroy in tasks....
https://tracker.ceph.com/issues/43938 (Stephan Müller, 2020-01-31T15:14:52Z)

There is an API test failure in master; not sure where it came from.
<pre>
2020-01-29 15:05:17,789.789 INFO:__main__:Stopped test: test_safe_to_destroy (tasks.mgr.dashboard.test_osd.OsdTest) in 1.567141s
2020-01-29 15:05:17,789.789 INFO:__main__:
2020-01-29 15:05:17,790.790 INFO:__main__:======================================================================
2020-01-29 15:05:17,790.790 INFO:__main__:FAIL: test_safe_to_destroy (tasks.mgr.dashboard.test_osd.OsdTest)
2020-01-29 15:05:17,790.790 INFO:__main__:----------------------------------------------------------------------
2020-01-29 15:05:17,790.790 INFO:__main__:Traceback (most recent call last):
2020-01-29 15:05:17,790.790 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/test_osd.py", line 113, in test_safe_to_destroy
2020-01-29 15:05:17,790.790 INFO:__main__: 'stored_pgs': [],
2020-01-29 15:05:17,790.790 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/helper.py", line 343, in assertJsonBody
2020-01-29 15:05:17,790.790 INFO:__main__: self.assertEqual(body, data)
2020-01-29 15:05:17,790.790 INFO:__main__:AssertionError: {u'is_safe_to_destroy': False, u'message': u'[errno -11] 3 pgs have unknown stat [truncated]... != {'active': [], 'is_safe_to_destroy': True, 'stored_pgs': [], 'safe_to_destroy': [truncated]...
2020-01-29 15:05:17,790.790 INFO:__main__:+ {'active': [],
2020-01-29 15:05:17,791.791 INFO:__main__:- {u'is_safe_to_destroy': False,
2020-01-29 15:05:17,791.791 INFO:__main__:? ^^ ^^^^
2020-01-29 15:05:17,791.791 INFO:__main__:
2020-01-29 15:05:17,791.791 INFO:__main__:+ 'is_safe_to_destroy': True,
2020-01-29 15:05:17,791.791 INFO:__main__:? ^ ^^^
2020-01-29 15:05:17,791.791 INFO:__main__:
2020-01-29 15:05:17,791.791 INFO:__main__:- u'message': u'[errno -11] 3 pgs have unknown state; cannot draw any conclusions'}
2020-01-29 15:05:17,791.791 INFO:__main__:+ 'missing_stats': [],
2020-01-29 15:05:17,791.791 INFO:__main__:+ 'safe_to_destroy': [13],
2020-01-29 15:05:17,791.791 INFO:__main__:+ 'stored_pgs': []}
2020-01-29 15:05:17,791.791 INFO:__main__:
2020-01-29 15:05:17,792.792 INFO:__main__:----------------------------------------------------------------------
2020-01-29 15:05:17,792.792 INFO:__main__:Ran 93 tests in 1125.270s
2020-01-29 15:05:17,792.792 INFO:__main__:
2020-01-29 15:05:17,792.792 INFO:__main__:FAILED (failures=1)
2020-01-29 15:05:17,792.792 INFO:__main__:
2020-01-29 15:05:17,792.792 INFO:__main__:======================================================================
2020-01-29 15:05:17,792.792 INFO:__main__:FAIL: test_safe_to_destroy (tasks.mgr.dashboard.test_osd.OsdTest)
2020-01-29 15:05:17,792.792 INFO:__main__:----------------------------------------------------------------------
2020-01-29 15:05:17,792.792 INFO:__main__:Traceback (most recent call last):
2020-01-29 15:05:17,792.792 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/test_osd.py", line 113, in test_safe_to_destroy
2020-01-29 15:05:17,793.793 INFO:__main__: 'stored_pgs': [],
2020-01-29 15:05:17,793.793 INFO:__main__: File "/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa/tasks/mgr/dashboard/helper.py", line 343, in assertJsonBody
2020-01-29 15:05:17,793.793 INFO:__main__: self.assertEqual(body, data)
2020-01-29 15:05:17,793.793 INFO:__main__:AssertionError: {u'is_safe_to_destroy': False, u'message': u'[errno -11] 3 pgs have unknown stat [truncated]... != {'active': [], 'is_safe_to_destroy': True, 'stored_pgs': [], 'safe_to_destroy': [truncated]...
2020-01-29 15:05:17,793.793 INFO:__main__:+ {'active': [],
2020-01-29 15:05:17,793.793 INFO:__main__:- {u'is_safe_to_destroy': False,
2020-01-29 15:05:17,793.793 INFO:__main__:? ^^ ^^^^
2020-01-29 15:05:17,793.793 INFO:__main__:
2020-01-29 15:05:17,793.793 INFO:__main__:+ 'is_safe_to_destroy': True,
2020-01-29 15:05:17,793.793 INFO:__main__:? ^ ^^^
2020-01-29 15:05:17,793.793 INFO:__main__:
2020-01-29 15:05:17,794.794 INFO:__main__:- u'message': u'[errno -11] 3 pgs have unknown state; cannot draw any conclusions'}
2020-01-29 15:05:17,794.794 INFO:__main__:+ 'missing_stats': [],
2020-01-29 15:05:17,794.794 INFO:__main__:+ 'safe_to_destroy': [13],
2020-01-29 15:05:17,794.794 INFO:__main__:+ 'stored_pgs': []}
2020-01-29 15:05:17,794.794 INFO:__main__:
Using guessed paths /home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/build/lib/ ['/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/qa', '/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/build/lib/cython_modules/lib.3', '/home/jenkins-build/build/workspace/ceph-dashboard-pr-backend/src/pybind']
</pre>

rbd - Bug #43771 (Rejected): pybind/rbd: config_list hangs if given a pool with a bad pg state
https://tracker.ceph.com/issues/43771 (Stephan Müller, 2020-01-23T16:53:11Z)
If the dashboard tries to get the RBD configuration on a per-pool basis and the pool is in the PG state 'creating+incomplete', it will hang waiting for a response from `config_list` in `rbd.pyx`.

The PG state 'creating+incomplete' is an edge case, as it only appears if one creates a pool that needs more buckets than the cluster can provide. The current workaround in the dashboard is to omit the call if a pool is in this state (see the sketch below).
Here is the manual stack trace found by debugging:
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/controllers/pool.py#L206
https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/services/rbd.py#L104
https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2215
https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2935
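A minimal sketch of that workaround, assuming the dashboard's pool objects carry a `pg_status` mapping of PG-state strings to counts (`pg_status` is an assumed field name):

<pre>
# Hedged sketch: detect the bad state up front so the per-pool
# rbd config_list call can be omitted instead of hanging.
def should_skip_rbd_config(pool: dict) -> bool:
    return 'creating+incomplete' in (pool.get('pg_status') or {})
</pre>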
Dashboard - Bug #43765 (Resolved): mgr/dashboard: Dashboard breaks on the selection of a bad pool
https://tracker.ceph.com/issues/43765 (Stephan Müller, 2020-01-23T13:34:10Z)

If a pool is created with a wrong rule that is oversized for the cluster, it gets into the "creating+incomplete" state. This won't break the dashboard yet, but when clicking on the pool to get its details, every API request is left unanswered. The first unanswered call is "/api/pool/$badPool/configuration".
To reproduce this, you can create an erasure code profile with a greater value for k than you have OSDs.

Dashboard - Bug #43384 (Duplicate): mgr/dashboard: Pool size is calculated with data with differe...
https://tracker.ceph.com/issues/43384 (Stephan Müller, 2019-12-19T14:01:20Z)
Currently we use the following calculation in the dashboard to compute the usage (pool-list.component.ts:253-254):

<pre>
const avail = stats.bytes_used.latest + stats.max_avail.latest;
pool['usage'] = avail > 0 ? stats.bytes_used.latest / avail : avail;
</pre>
The problem is that "max_avail" is calculated in a non-obvious way: it does not report the raw available space, because the available space is divided by (at least) the number of replicas for a replicated pool and by (k+m)/k for an EC pool.
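To make the overhead factor explicit, here is a small sketch with assumed pool fields (`size`, `k`, `m`):

<pre>
# Hedged sketch: approximate the raw free capacity from the
# reported 'max_avail', which already has the redundancy
# overhead divided out. 'size', 'k' and 'm' are assumed fields.
def raw_avail(max_avail: float, pool: dict) -> float:
    if pool['type'] == 'replicated':
        return max_avail * pool['size']   # replica count
    k, m = pool['k'], pool['m']           # erasure-coded pool
    return max_avail * (k + m) / k
</pre>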
To look the calculation up, go to the following locations:
https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6033
https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6040
https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6054

https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L909
https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L920
https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L924
https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L947
This problem was reported by a user who noticed the percentage mismatch; here is the link to the mail:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-December/037680.html
Also found this thread on the new mailing list regarding the used-percentage and max_avail topic:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NH2LMMX5KVRWCURI3BARRUAETKE2T2QN/#JDHXOQKWF6NZLQMOGEPAQCLI44KB54A3

Dashboard - Bug #42624 (Resolved): mgr/dashboard: Remove compression unset mode
https://tracker.ceph.com/issues/42624 (Stephan Müller, 2019-11-04T17:53:47Z)
The pool form currently shows the compression options 'none' and 'unset'.

The final behavior doesn't differ between the two options, but if 'unset' is selected, the whole compression sub-menu is shown.

Please remove the 'unset' option, as it doesn't seem to be used.

Dashboard - Bug #42243 (Duplicate): mgr/dashboard: Fix unit test that is failing in a negative ti...
https://tracker.ceph.com/issues/42243 (Stephan Müller, 2019-10-09T08:54:36Z)
Found a test that is failing in a negative timezone (Washington, -05:00):
<pre>
● RbdSnapshotListComponent › snapshot modal dialog › should display suggested snapshot name
expect(received).toMatch(expected)
Expected pattern: /^image01_[\d-]+T[\d.:]+\+[\d:]+$/
Received string: "image01_2019-10-09T03:40:18.664-05:00"
204 | it('should display suggested snapshot name', () => {
205 | component.openCreateSnapshotModal();
> 206 | expect(component.modalRef.content.snapName).toMatch(
| ^
207 | RegExp(`^${component.rbdName}_[\\d-]+T[\\d.:]+\\+[\\d:]+\$`)
208 | );
209 | });
at src/app/ceph/block/rbd-snapshot-list/rbd-snapshot-list.component.spec.ts:206:51
at ZoneDelegate.Object.<anonymous>.ZoneDelegate.invoke (node_modules/zone.js/dist/zone.js:391:26)
at ProxyZoneSpec.Object.<anonymous>.ProxyZoneSpec.onInvoke (node_modules/zone.js/dist/proxy.js:129:39)
at ZoneDelegate.Object.<anonymous>.ZoneDelegate.invoke (node_modules/zone.js/dist/zone.js:390:52)
at Zone.Object.<anonymous>.Zone.run (node_modules/zone.js/dist/zone.js:150:43)
at Object.testBody.length (node_modules/jest-preset-angular/zone-patch/index.js:52:27)
</pre>
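The expected pattern only allows a '+' before the UTC offset, which is why the '-05:00' suffix fails to match. A hedged sketch of the corrected pattern, transposed to Python for illustration:

<pre>
# Accept both positive and negative UTC offsets in the expected
# snapshot-name pattern ('[+-]' instead of a literal '\+').
import re

pattern = re.compile(r'^image01_[\d-]+T[\d.:]+[+-][\d:]+$')
assert pattern.match('image01_2019-10-09T03:40:18.664-05:00')
assert pattern.match('image01_2019-10-09T10:40:18.664+02:00')
</pre>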
mgr - Bug #41795 (New): mgr: Time series data of pool decreases itself when reducing the amount o...
https://tracker.ceph.com/issues/41795 (Stephan Müller, 2019-09-12T14:18:54Z)

The time series data of a pool decreases when reducing the number of PGs of the pool.

Time series data should only increase, never decrease.

(I'm not sure if this is the right place for this bug.)

Dashboard - Bug #41166 (Resolved): mgr/dashboard: Cephfs chart wasn't displayed
https://tracker.ceph.com/issues/41166 (Stephan Müller, 2019-08-08T13:50:03Z)
On my system the CephFS chart wasn't displayed, even though all the data was available.

Dashboard - Bug #41165 (Resolved): mgr/dashboard: Switching to cephfs client tab impossible
https://tracker.ceph.com/issues/41165 (Stephan Müller, 2019-08-08T13:46:21Z)
Currently it does not seem to be possible to switch to the Clients tab inside the CephFS (Filesystems) detail view.

Dashboard - Bug #38932 (Resolved): mgr/dashboard: Fix tooltip behavior in RGW user form
https://tracker.ceph.com/issues/38932 (Stephan Müller, 2019-03-25T11:28:30Z)
The problem is that if you hover over a subuser and reach a button with a tooltip, the rounded corner disappears and is replaced by a straight line.