Ceph : Issues (https://tracker.ceph.com/, 2020-10-15T05:10:09Z)

Dashboard - Feature #47865 (New): mgr/dashboard: check client user capabilities for NFS exports
https://tracker.ceph.com/issues/47865 (2020-10-15T05:10:09Z, Kiefer Chang)
<p>When the end user creates an export via the Dashboard, we let them select a Ceph client user from a list.<br />It would be great if the Dashboard could check whether that client user has sufficient capabilities on the exported filesystem.</p>
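<p>A minimal sketch of what such a check could look like on the backend (the helper names and the rough cap matching are assumptions for illustration, not the Dashboard's actual API); it reads the client's MDS caps via `ceph auth get`:</p>
<pre>
import json
import subprocess


def get_caps(entity):
    """Return the caps dict of an entity, e.g. 'client.nfs'."""
    out = subprocess.check_output(['ceph', 'auth', 'get', entity, '--format=json'])
    # 'ceph auth get' with JSON output returns a list with one item per entity.
    return json.loads(out)[0].get('caps', {})


def has_rw_on_fs(entity, fs_name):
    """Very rough check: the MDS cap must allow rw, either globally or on fs_name."""
    mds_cap = get_caps(entity).get('mds', '')
    if 'allow *' in mds_cap:
        return True
    return 'allow rw' in mds_cap and ('fsname=' not in mds_cap or 'fsname=' + fs_name in mds_cap)


print(has_rw_on_fs('client.nfs', 'a'))
</pre>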
<p><img src="https://tracker.ceph.com/attachments/download/5187/auth.png" alt="" /></p> Dashboard - Cleanup #47595 (New): mgr/dashboard: move Device health pane to upper levelhttps://tracker.ceph.com/issues/475952020-09-23T03:13:09ZKiefer Chang
<p>A suggestion was proposed in this <a href="https://github.com/ceph/ceph/pull/37275#issuecomment-696917846" class="external">PR</a> to move the Device health pane (it currently contains SMART data) up one level,<br />so users can see the information without navigating too deep.</p>
<p><img src="https://tracker.ceph.com/attachments/download/5153/93925033-cfc29d80-fd15-11ea-9b50-deffd38d5aa8.png" alt="" /></p> Dashboard - Bug #47510 (New): mgr/dashboard: container ID truncates in daemons table when using R...https://tracker.ceph.com/issues/475102020-09-17T06:18:11ZKiefer Chang
<p>We truncate the ID in the frontend to shorten the hash, but this rule doesn't apply to Rook containers.</p>
<p>Rook:<br /><img src="https://tracker.ceph.com/attachments/download/5142/rook.png" alt="" /></p>
<p>Cephadm:<br /><img src="https://tracker.ceph.com/attachments/download/5143/cephadm.png" alt="" /></p>
<p>cephadm now returns shortened IDs, which means we can remove the hack from the frontend.</p>

Dashboard - Bug #46652 (New): mgr/dashboard: exception raised when collapsing OSD detail
https://tracker.ceph.com/issues/46652 (2020-07-21T10:11:37Z, Kiefer Chang)
<p>An exception is raised when collapsing the OSD detail pane if the backend API call is too slow.</p>
<pre>
Uncaught TypeError: this.osd is undefined
refresh osd-details.component.ts:44
__tryOrUnsub Subscriber.ts:265
next Subscriber.ts:207
_next Subscriber.ts:139
next Subscriber.ts:99
observe Notification.ts:47
dispatch delay.ts:100
_execute AsyncAction.ts:122
execute AsyncAction.ts:97
flush AsyncScheduler.ts:58
</pre>
<p>The cause: when the detail pane is collapsed, `this.osd` becomes undefined, but the delayed API response callback still tries to assign to it.</p>
<p><img src="https://tracker.ceph.com/attachments/download/4999/osd_collapse.gif" alt="" /></p>
<p>It's not easy to reproduce if the API call is fast, but it can be triggered artificially by applying the following patch (collapse the detail pane when `start refresh` appears in the console):<br /><pre>
diff --git a/src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.ts b/src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.ts
index 2ed5e0fe1f..f5c47d4a27 100644
--- a/src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.ts
+++ b/src/pybind/mgr/dashboard/frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.ts
@@ -38,7 +38,9 @@ export class OsdDetailsComponent implements OnChanges {
}
refresh() {
+ console.log('start refresh');
this.osdService.getDetails(this.osd.id).subscribe((data) => {
+ console.log('done refresh');
this.osd.details = data;
this.osd.histogram_failed = '';
if (!_.isObject(data.histogram)) {
diff --git a/src/pybind/mgr/dashboard/frontend/src/app/shared/api/osd.service.ts b/src/pybind/mgr/dashboard/frontend/src/app/shared/api/osd.service.ts
index cc088d0e95..db6851b054 100644
--- a/src/pybind/mgr/dashboard/frontend/src/app/shared/api/osd.service.ts
+++ b/src/pybind/mgr/dashboard/frontend/src/app/shared/api/osd.service.ts
@@ -2,7 +2,7 @@ import { HttpClient } from '@angular/common/http';
import { Injectable } from '@angular/core';
import * as _ from 'lodash';
-import { map } from 'rxjs/operators';
+import { map, delay } from 'rxjs/operators';
import { CdDevice } from '../models/devices';
import { SmartDataResponseV1 } from '../models/smart';
@@ -81,7 +81,7 @@ export class OsdService {
histogram: { [key: string]: object };
smart: { [device_identifier: string]: any };
}
- return this.http.get<OsdData>(`${this.path}/${id}`);
+ return this.http.get<OsdData>(`${this.path}/${id}`).pipe(delay(4000));
}
/**
</pre></p>

Dashboard - Bug #46147 (New): mgr/dashboard: table actions and column headers are not displayed i...
https://tracker.ceph.com/issues/46147 (2020-06-23T08:31:17Z, Kiefer Chang)
<ul>
<li>Switch to a language other than English.</li>
<li>The table actions and column headers are still in English.</li>
</ul>
<p>For example:<br /><img src="https://tracker.ceph.com/attachments/download/4930/Screenshot_20200623_160603.png" alt="" /></p>
<p>It works in Octopus:</p>
<p><img src="https://tracker.ceph.com/attachments/download/4931/Screenshot_20200623_162244.png" alt="" /></p> Dashboard - Feature #45718 (New): mgr/dashboard: display more information about services in Servi...https://tracker.ceph.com/issues/457182020-05-27T02:17:39ZKiefer Chang
<p>In the Services page, we display services and their daemons.</p>
<p><img src="https://tracker.ceph.com/attachments/download/4888/Screenshot_20200527_101641.png" alt="" /></p>
<p>The Orchestrator provides more information about services (see the sketch after this list):</p>
<ul>
<li>Placement</li>
<li>Service-specific parameters, for example the iSCSI service's `api_user` and `api_password` parameters</li>
<li>Management state (`unmanaged` flag)</li>
</ul>
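<p>A minimal sketch of the extra per-service fields a row could surface, assuming the `ceph.deployment` spec classes (attribute names may differ between releases; the values are illustrative):</p>
<pre>
from ceph.deployment.service_spec import IscsiServiceSpec, PlacementSpec

# Example spec as the orchestrator might report it (illustrative values only).
spec = IscsiServiceSpec(
    service_id='iscsi',
    pool='iscsi-pool',
    api_user='admin',                                   # service-specific parameter
    api_password='secret',                              # service-specific parameter
    placement=PlacementSpec(hosts=['node1', 'node2']),  # placement
    unmanaged=False,                                    # management state
)

# Fields the Services table could show in addition to the daemon list:
row = {
    'placement': str(spec.placement),
    'unmanaged': spec.unmanaged,
    'api_user': spec.api_user,
}
print(row)
</pre>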
<p>We should display this information to users.</p>

Dashboard - Bug #45526 (New): mgr/dashboard: race conditions might occur in NFSGaneshaExports con...
https://tracker.ceph.com/issues/45526 (2020-05-13T07:25:05Z, Kiefer Chang)
Here are some observations:
<ul>
<li>CherryPy spawns a pool of 10 threads by default to serve requests; resources shared between threads must be handled carefully (with locks; see the sketch after this list).</li>
<li>There is a shared resource between backend threads: `GaneshaConf.exports`. For each new request, the currently existing exports are loaded from RADOS objects. If thread A is manipulating exports while thread B is spawned to serve another request, the exports thread B sees might not reflect thread A's changes yet, which can cause problems.</li>
<li>(Guesswork) Problems are probably easier to hit when the system is under heavy load, because loading and saving exports/configs involves I/O to RADOS.</li>
</ul>
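<p>A minimal, self-contained sketch of the locking idea (the class and function names below are illustrative stand-ins, not the actual controller code): serialize every load/modify/save cycle on the shared export state so concurrent worker threads cannot interleave.</p>
<pre>
import threading
from contextlib import contextmanager


class ExportStore:
    """Stand-in for the shared GaneshaConf-like state kept in RADOS objects."""

    def __init__(self):
        self.exports = {}          # export_id -> export definition

    def next_export_id(self):
        return max(self.exports, default=0) + 1


_store = ExportStore()
_store_lock = threading.Lock()     # one lock shared by all request threads


@contextmanager
def exclusive_store():
    """All read-modify-write cycles on the store happen under the lock."""
    with _store_lock:
        yield _store


def create_export(definition):
    # Without the lock, two threads could both compute export ID 1 and
    # overwrite each other's object, as in the first example below.
    with exclusive_store() as store:
        export_id = store.next_export_id()
        store.exports[export_id] = definition
        return export_id
</pre>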
Here are some race condition examples in the implementation (they can be reproduced artificially):
<ul>
<li>Creating multiple exports simultaneously<br /> Say we have no exports and two create requests are received. Two threads are spawned and each sees no existing exports. Each thread then generates export ID 1 for its new export, and the final result might be that only one export is created (both write to `cephfs_data/EXPORT-1`).</li>
<li>Deleting an export and creating one at the same time<br /> If the delete takes longer, the result might be that the new export is created but the daemon doesn't reference it, because the delete call erases the reference.</li>
</ul>

Dashboard - Feature #44865 (New): mgr/dashboard: support zapping devices
https://tracker.ceph.com/issues/44865 (2020-03-31T15:56:44Z, Kiefer Chang)
<p>The orchestrator supports zapping devices via:</p>
<pre>
ceph orch device zap <hostname> <path>
</pre>
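<p>A minimal sketch (illustrative only; the helper name is hypothetical) of what a backend action or a post-OSD-removal step could invoke, wrapping the CLI command above:</p>
<pre>
import subprocess


def zap_after_osd_removal(hostname, device_path):
    """Post step after removing an OSD: zap its device so it can be reused
    (runs the 'ceph orch device zap' command shown above)."""
    subprocess.run(['ceph', 'orch', 'device', 'zap', hostname, device_path],
                   check=True)


zap_after_osd_removal('node1', '/dev/sdb')
</pre>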
<p>We can support this as a new action on the inventory page, or as a post step after removing OSDs.</p>

Dashboard - Bug #44808 (New): mgr/dashboard: Allow users to specify an unmanaged ServiceSpec when...
https://tracker.ceph.com/issues/44808 (2020-03-30T09:57:43Z, Kiefer Chang)
<p>The ServiceSpec (DriveGroup) sent from the Dashboard is always managed, which means the spec is continuously applied by Orchestrator backends.<br />In some cases this might not be what users want: for example, new/spare disks get enrolled automatically.<br />We should allow the user to specify an unmanaged ServiceSpec.</p>
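<p>A sketch of what an unmanaged spec could look like, assuming the `ceph.deployment` DriveGroupSpec/DeviceSelection classes and their `unmanaged` flag (field names may differ between releases):</p>
<pre>
from ceph.deployment.drive_group import DriveGroupSpec, DeviceSelection
from ceph.deployment.service_spec import PlacementSpec

spec = DriveGroupSpec(
    service_id='dashboard-created',
    placement=PlacementSpec(host_pattern='*'),
    data_devices=DeviceSelection(all=True),
    unmanaged=True,  # apply once; do not keep enrolling new/spare disks
)
</pre>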
<p>One idea is to add a new checkbox and make the drive group unmanaged by default.</p>

Dashboard - Feature #44016 (New): mgr/dashboard: support device_id filter when creating OSD
https://tracker.ceph.com/issues/44016 (2020-02-06T14:18:26Z, Kiefer Chang)
<p>We should allow the user to filter for a specific device when creating an OSD.<br />The current proposal is to add a `device_id` attribute in the orchestrator.</p>

Dashboard - Feature #42454 (New): mgr/dashboard: add flexible size filter for devices
https://tracker.ceph.com/issues/42454 (2019-10-24T07:13:49Z, Kiefer Chang)
<p>This is a follow-up item of <a href="https://tracker.ceph.com/issues/40335">#40335</a> (mgr/dashboard: Create OSD on spare disks).</p>
<p>Changes in <a href="https://tracker.ceph.com/issues/40035">#40035</a> allow filtering devices by a specified size:<br /><img src="https://tracker.ceph.com/attachments/download/4514/size_filter.png" alt="" /></p>
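<p>A sketch of the size expressions the filter would need to cover, mirroring `DeviceSelection.size` as described in the documentation linked below (the exact accepted forms may vary by release):</p>
<pre>
from ceph.deployment.drive_group import DeviceSelection

exact    = DeviceSelection(size='10G')      # exactly 10G
at_least = DeviceSelection(size='20G:')     # lower bound only
at_most  = DeviceSelection(size=':1T')      # upper bound only
in_range = DeviceSelection(size='10G:40G')  # range, what a range slider would map to
</pre>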
<p>The size filter in the <strong>Inventory Devices table</strong> should support filtering sizes in a range.<br />See <a class="external" href="https://docs.ceph.com/docs/master/mgr/orchestrator_modules/#ceph.deployment.drive_group.DeviceSelection.size">https://docs.ceph.com/docs/master/mgr/orchestrator_modules/#ceph.deployment.drive_group.DeviceSelection.size</a> for more information.<br />One idea is to use a range slider as in <a class="external" href="https://jqueryui.com/slider/#range">https://jqueryui.com/slider/#range</a>.</p>

mgr - Bug #41549 (New): mgr: ActivePyModules::list_servers_python() returns mds with empty hostname
https://tracker.ceph.com/issues/41549 (2019-08-28T07:46:24Z, Kiefer Chang)
<ul>
<li>Use vstart.sh to create a testing cluster (tested on master e81ef76cae66d95af4725cdd81743f68f2e0593d); the following is the cluster status:<br /><pre>
bin/ceph -s
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2019-08-28T07:31:23.492+0000 7fbf1a5b2700 -1 WARNING: all dangerous and experimental features are enabled.
2019-08-28T07:31:23.496+0000 7fbf1a5b2700 -1 WARNING: all dangerous and experimental features are enabled.
  cluster:
    id:     b7d4c8ca-9a64-4721-852e-587960f1a475
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,b,c (age 12m)
    mgr: x(active, since 12m)
    mds: a:1 {0=a=up:active} 2 up:standby
    osd: 3 osds: 3 up (since 11m), 3 in (since 11m)
    rgw: 1 daemon active (8000)

  task status:
    scrub status:
        mds.0: idle

  data:
    pools:   6 pools, 56 pgs
    objects: 225 objects, 6.4 KiB
    usage:   6.0 GiB used, 3.0 TiB / 3.0 TiB avail
    pgs:     56 active+clean
</pre></li>
<li>Load an MGR module, e.g. prometheus (or disable and re-enable the dashboard module), to make the MGR daemon respawn.<br /><pre>
bin/ceph mgr module enable prometheus
</pre></li>
<li>On the Dashboard's <strong>Cluster->Hosts</strong> page, we can see a row with an empty hostname whose service is <strong>mds.0</strong><br /><img src="https://tracker.ceph.com/attachments/download/4381/mds_empty_hostname_01.png" alt="" /></li>
</ul>
Some notes:
<ul>
<li>This happens when the mgr daemon is respawned.</li>
<li><strong>mds.0</strong> is a task for mds scrubbing (it seems to have been introduced in this <a href="https://github.com/ceph/ceph/pull/28855" class="external">change</a>)</li>
<li>In this test, we have one active mds daemon, <strong>mds.a</strong>, and two standby daemons, <strong>mds.b</strong> and <strong>mds.c</strong>. They are not reported by <strong>list_servers_python()</strong>. Occasionally, I can see <strong>mds.b</strong> and <strong>mds.c</strong> reported, but not <strong>mds.a</strong>.</li>
<li>There was an <a href="https://tracker.ceph.com/issues/23286" class="external">issue</a> some time ago about <strong>list_servers_python()</strong> reporting mgr with an empty hostname.</li>
<li>In the mgr daemon's log, there are some unhandled messages (these messages can be seen without restarting the MGR daemon):<br /><pre>
2019-08-28T07:43:47.620+0000 7f7d65da4700 0 ms_deliver_dispatch: unhandled message 0x558705430e00 mgrreport(mds.b +0-0 packed 6) v8 from mds.? v2:192.168.15.191:6828/1605008353
2019-08-28T07:43:47.620+0000 7f7d65da4700 0 ms_deliver_dispatch: unhandled message 0x558705441880 mgrreport(mds.c +0-0 packed 6) v8 from mds.? v2:192.168.15.191:6830/2039538527
2019-08-28T07:43:48.420+0000 7f7d64da2700 0 log_channel(cluster) log [DBG] : pgmap v313: 56 pgs: 56 active+clean; 7.5 KiB data, 3.0 GiB used, 3.0 TiB / 3.0 TiB avail
2019-08-28T07:43:50.420+0000 7f7d64da2700 0 log_channel(cluster) log [DBG] : pgmap v314: 56 pgs: 56 active+clean; 7.5 KiB data, 3.0 GiB used, 3.0 TiB / 3.0 TiB avail
2019-08-28T07:43:52.424+0000 7f7d64da2700 0 log_channel(cluster) log [DBG] : pgmap v315: 56 pgs: 56 active+clean; 7.5 KiB data, 3.0 GiB used, 3.0 TiB / 3.0 TiB avail
2019-08-28T07:43:52.624+0000 7f7d65da4700 0 ms_deliver_dispatch: unhandled message 0x558705520000 mgrreport(mds.b +0-0 packed 6) v8 from mds.? v2:192.168.15.191:6828/1605008353
2019-08-28T07:43:52.624+0000 7f7d65da4700 0 ms_deliver_dispatch: unhandled message 0x55870534bc00 mgrreport(mds.c +0-0 packed 6) v8 from mds.? v2:192.168.15.191:6830/2039538527
2019-08-28T07:43:54.424+0000 7f7d64da2700 0 log_channel(cluster) log [DBG] : pgmap v316: 56 pgs: 56 active+clean; 7.5 KiB data, 3.0 GiB used, 3.0 TiB / 3.0 TiB avail
</pre></li>
</ul>

Orchestrator - Feature #41239 (New): mgr/rook: support creating OSDs on Persistent Volumes
https://tracker.ceph.com/issues/41239 (2019-08-14T04:55:26Z, Kiefer Chang)
<p>The Rook backend now supports <a href="https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#pvc-based-cluster" class="external">creating OSDs with PVs.</a></p>
<p>The orchestrator and orchestrator_cli modules can support this.</p>

Dashboard - Feature #41237 (New): mgr/dashboard: Create OSDs on Persistent Volumes
https://tracker.ceph.com/issues/41237 (2019-08-14T04:43:25Z, Kiefer Chang)
<p>The Rook backend now supports <a href="https://github.com/rook/rook/blob/master/Documentation/ceph-cluster-crd.md#pvc-based-cluster" class="external">creating OSDs with PVs.</a><br />The Dashboard might need to support this too.</p>
<p>Related to <a href="https://tracker.ceph.com/issues/40335">#40335</a> (mgr/dashboard: Create OSD on spare disks).</p>

Dashboard - Feature #40556 (New): Replace MDS counter chart with Grafana dashboard in Filesystems...
https://tracker.ceph.com/issues/40556 (2019-06-26T02:52:00Z, Kiefer Chang)
<p>Currently, when clicking a filesystem on the Filesystems page, performance counter charts of its MDS daemons are displayed.</p>
<p><img src="https://tracker.ceph.com/attachments/download/4249/cephfs_md_counter_chart.png" style="width: 60%;" alt="" /></p>
<p>It would be nice to migrate these charts to a Grafana dashboard, since fs metrics are already exported by the Prometheus module.</p>
<p>Example of metrics:</p>
<pre>
# HELP ceph_mds_mem_dn Dentries
# TYPE ceph_mds_mem_dn gauge
ceph_mds_mem_dn{ceph_daemon="mds.a"} 1354.0
ceph_mds_mem_dn{ceph_daemon="mds.b"} 10.0
ceph_mds_mem_dn{ceph_daemon="mds.c"} 10.0
</pre>