Ceph : Issues https://tracker.ceph.com/ https://tracker.ceph.com/favicon.ico 2021-11-12T08:38:45Z Ceph
Redmine
Dashboard - Bug #53241 (Fix Under Review): mgr/prometheus: fix mgr/prometheus default standby beh... https://tracker.ceph.com/issues/53241 2021-11-12T08:38:45Z Patrick Seidensal
<p>Since the merge of <a class="external" href="https://github.com/ceph/ceph/pull/43464">https://github.com/ceph/ceph/pull/43464</a>, the standby mgr modules return an error instead of HTML containing a link to the active mgr/prometheus instance.</p>
Dashboard - Bug #48465 (Triaged): mgr/dashboard: Using FQDNs fails for embedding of Grafana https://tracker.ceph.com/issues/48465 2020-12-04T16:26:47Z Patrick Seidensal
<a name="Description-of-problem"></a>
<h3 >Description of problem<a href="#Description-of-problem" class="wiki-anchor">¶</a></h3>
<p>When FQDNs are used for Ceph Dashboard, loading the embedded Grafana will fail, as the URL used for the iframe is not a FQDN. The dashboard will show</p>
<a name="Environment"></a>
<h3 >Environment<a href="#Environment" class="wiki-anchor">¶</a></h3>
<ul>
<li><code>ceph version</code> string: Octopus</li>
<li>Browser used (e.g.: <code>Version 86.0.4240.198 (Official Build) (64-bit)</code>): Firefox</li>
</ul>
<a name="How-reproducible"></a>
<h3 >How reproducible<a href="#How-reproducible" class="wiki-anchor">¶</a></h3>
<p>Deploy a cluster in a network which requires FQDNs.</p>
<a name="Actual-results"></a>
<h3 >Actual results<a href="#Actual-results" class="wiki-anchor">¶</a></h3>
<p><img src="https://tracker.ceph.com/attachments/download/5280/grafana_cant_connect.png" alt="" /></p>
<a name="Expected-results"></a>
<h3 >Expected results<a href="#Expected-results" class="wiki-anchor">¶</a></h3>
<p>A more helpful error message, indicating to the user probable causes of or potential solutions to the issue.</p>
Orchestrator - Feature #47970 (New): cephadm: enable user to retrieve configuration templates for... https://tracker.ceph.com/issues/47970 2020-10-23T10:29:50Z Patrick Seidensal
<p>Because the configuration files of monitoring components have to be migrated manually, the user should be able to conveniently retrieve the Jinja2 template that cephadm uses to create these configuration files, in order to determine how the configuration file to be migrated has to be adapted.</p>
mgr - Bug #46703 (Fix Under Review): mgr/prometheus: introduce metric for collection time https://tracker.ceph.com/issues/46703 2020-07-24T10:12:23Z Patrick Seidensal
<p>To be warned by a Prometheus alert when the time it takes to collect metrics becomes critical, it is required to export this time as a metric. This would also allow the Grafana dashboards to show a graph of how long it takes to gather all metrics in the prometheus manager module.</p>
Dashboard - Bug #45448 (New): mgr/dashboard: automated tests for Prometheus configuration https://tracker.ceph.com/issues/45448 2020-05-08T12:17:54Z Patrick Seidensal
<p>Currently, there is no mechanism in place that prevents breaking changes or syntax errors from being introduced into the Prometheus configuration, which also includes the Prometheus alerts.</p>
<p>By utilizing `promtool`, we can have a very cheap but working test to ensure that compatibility or syntax isn't broken.<br /> The scope of this tool is limited to Prometheus' configuration and will not prevent any breaking changes in Grafana dashboards.</p>
<p>For Nautilus and Octopus, promtool should be used to ensure compatibility with Prometheus v2.7.2. Pacific may be tested with the most recent promtool.</p>
Dashboard - Documentation #45406 (In Progress): mgr/dashboard: revise monitoring documentation https://tracker.ceph.com/issues/45406 2020-05-06T14:07:49Z Patrick Seidensal
<p>The monitoring documentation needs to be revised as things mentioned there are not up-to-date or could be enhanced with additional information.</p>
Orchestrator - Cleanup #43700 (In Progress): cephadm: make it a proper python package https://tracker.ceph.com/issues/43700 2020-01-20T15:03:30Z Sebastian Wagner
<p>Having everything in a single file in the source tree has some disadvantages:</p>
<ul>
<li>we cannot reference things from the rest of the Ceph tree<br />python tool support is more awkward</li>
<li>We don't have the possibility to structure the code properly (whatever that means)</li>
<li>no one in the community can install c-d via pip</li>
<li>nothing can have c-d as a python dependency</li>
<li>we're breaking IDE support</li>
</ul>
<p>That would be resolvable by making c-d a proper python package and then creating a zip that can be curled from somewhere else and directly executed.</p>
Dashboard - Bug #43641 (New): mgr/dashboard: monitoring: grouped alerts do not indicate how many ... https://tracker.ceph.com/issues/43641 2020-01-16T21:09:32Z Patrick Seidensal
<p>Although 2 of 3 OSDs were up and a "nearly full OSD" alert has been triggered for two OSDs, nothing in the UI indicates that two alerts have fired and hence that two OSDs are affected by the alert.</p>
<p>It may additionally be advisable to show the annotation `description` of each alert. Otherwise, details about those alerts may never be shown, as common annotations are rare.</p>
<p><img src="https://tracker.ceph.com/attachments/download/4665/1.png" alt="" /></p>
<p><img src="https://tracker.ceph.com/attachments/download/4666/2.png" alt="" /></p>
Dashboard - Bug #43604 (Fix Under Review): mgr/dashboard: monitoring: improve generic "Could not ... https://tracker.ceph.com/issues/43604 2020-01-15T08:56:31Z Patrick Seidensal
<p>"Could not reach external API" could mean that either the Prometheus API or the Alertmanager API couldn't be reached. There's no way to tell the two cases apart in the front-end. This makes it unnecessarily difficult for users to configure monitoring correctly in the dashboard.</p>
Dashboard - Bug #43449 (Fix Under Review): mgr/dashboard: front-end Grafana dashboard verificatio... https://tracker.ceph.com/issues/43449 2020-01-02T17:43:00Z Patrick Seidensal
<p>The message says that the Grafana dashboard doesn't exist when the back-end returns anything other than "200", but that is not always correct. The same message is, for instance, also shown when the SSL configuration doesn't match the host or Grafana returns a 401, which makes it incorrect in those cases.</p>
<p>It would generally be more appropriate to have the message say that the dashboard couldn't be validated.</p>
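<p>A minimal sketch of how the front-end message could depend on the response, assuming the back-end passes on the HTTP status it received from Grafana. The function name, the message texts and the treatment of 404 as "dashboard does not exist" are assumptions for illustration and not taken from the dashboard code:</p>

```typescript
// Hypothetical helper; name, message wording and status mapping are assumptions.
// `status` is the HTTP status received from Grafana, or null when no response
// was received at all (for example, because of a TLS or network failure).
function grafanaValidationMessage(status: number | null): string | null {
  if (status === 200) {
    return null; // dashboard verified, nothing to report
  }
  if (status === 404) {
    return 'Grafana dashboard does not exist.';
  }
  if (status === 401 || status === 403) {
    return 'Grafana dashboard could not be validated: access was denied.';
  }
  if (status === null) {
    return 'Grafana dashboard could not be validated: Grafana could not be reached.';
  }
  // Any other status: do not claim that the dashboard is missing.
  return `Grafana dashboard could not be validated (HTTP ${status}).`;
}
```

<p>With such a mapping, only a response that actually indicates a missing dashboard produces the "does not exist" message; every other failure is reported as a validation problem.</p>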
<p><del>In addition to that, it might be helpful for the user to be able to display the response from Grafana for debugging purposes.</del></p>
Dashboard - Cleanup #43003 (New): mgr/dashboard: selectionType can be any string https://tracker.ceph.com/issues/43003 2019-11-25T10:24:39Z Patrick Seidensal
<p>For clarity and consistency, selectionType shouldn't be allowed to be any string for 'single' in CdTable class.</p>
Dashboard - Cleanup #42907 (New): mgr/dashboard: make CdTableSelection generic https://tracker.ceph.com/issues/42907 2019-11-20T15:32:50Z Patrick Seidensal
<p>The CdTableSelection class of the dashboard's frontend is a perfect candidate to become a generic class.</p>
<pre>
export class CdTableSelection&lt;T&gt; {
  selected: T[] = [];
  hasMultiSelection = false;
  hasSingleSelection = false;
  hasSelection = false;

  constructor() {
    this.update();
  }

  /**
   * Recalculate the variables based on the current number
   * of selected rows.
   */
  update() {
    this.hasSelection = this.selected.length > 0;
    this.hasSingleSelection = this.selected.length === 1;
    this.hasMultiSelection = this.selected.length > 1;
  }

  /**
   * Get the first selected row, or null if nothing is selected.
   * @return {T | null}
   */
  first(): T | null {
    return this.hasSelection ? this.selected[0] : null;
  }
}
</pre>
Dashboard - Bug #42224 (New): mgr/dashboard: move smart data integration to devices tab https://tracker.ceph.com/issues/42224 2019-10-08T09:53:00Z Patrick Seidensal
<p>After the implementation of the <a href="https://github.com/ceph/ceph/pull/30759" class="external">device list</a>, the best place for the smart data integration is inside the `Devices` tab of the aforementioned implementation and shall hence be moved there.</p>
<p>This will clean up the UI, which currently has two tabs, namely `Devices` (`ceph device ls`) and `Device Health` (`smartctl`). The device health information shall be shown below the table of the `Devices` tab when a particular device is selected.</p>
<p>This is how it might look:</p>
<p><img src="https://tracker.ceph.com/attachments/download/4474/fixed_proposal.png" alt="" /></p>
Dashboard - Documentation #42068 (New): mgr/dashboard: add requirements of showing smart data https://tracker.ceph.com/issues/42068 2019-09-26T14:03:43Z Patrick Seidensal
<p>To be able to see the smart data of devices on the OSD and Host page of Ceph Dashboard, it is required for Ceph</p>
<p>1. to be able to determine information about the device using udev<br />2. to have the `smartmontools` (`smartctl` binary) package installed on the hosts</p>
<p>This shall be added to the documentation of Ceph Dashboard.</p>
Dashboard - Feature #40708 (Need More Info): mgr/dashboard: Enable simultaneous E2E test suite ex... https://tracker.ceph.com/issues/40708 2019-07-10T09:52:18Z Patrick Seidensal
<p>A suite is a single file. That means it's about creating a configuration that runs multiple browser instances (of the same browser) on those files simultaneously.</p>
<p>Starting such a browser instance takes a few seconds. Currently, the complete test run, with its very simplistic checks, takes only a few seconds, too. Enabling simultaneous test suite execution now would therefore add overhead, and the tests would take longer to complete. Still, I recommend enabling simultaneous E2E test suite execution as soon as possible to ensure the tests are written the way they are supposed to be written. According to the <a href="https://www.protractortest.org/#/style-guide" class="external">protractor style guide</a>, tests should at least be independent on the file level.</p>
<p>To make the E2E tests run simultaneously, a small adaptation in the <em>protractor.conf.js</em> is sufficient.</p>
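<p>A sketch of what such an adaptation might look like, written as a typed object for illustration. <code>shardTestFiles</code> and <code>maxInstances</code> are Protractor capability options; the spec path, the browser and the instance count are assumptions, and the local interface only stands in for the <code>Config</code> type of the <code>protractor</code> package:</p>

```typescript
// Minimal local type so the sketch is self-contained; a real project would
// use the `Config` type exported by the `protractor` package instead.
interface ShardedConfig {
  specs: string[];
  capabilities: {
    browserName: string;
    shardTestFiles: boolean;
    maxInstances: number;
  };
}

const config: ShardedConfig = {
  specs: ['./e2e/**/*.e2e-spec.ts'], // assumed location of the suite files
  capabilities: {
    browserName: 'chrome',           // assumed browser
    shardTestFiles: true,            // run each spec file in its own browser instance
    maxInstances: 2                  // assumed limit of parallel browser instances
  }
};
```

<p>Without <code>shardTestFiles</code>, <code>maxInstances</code> has no effect, so both options need to be set together.</p>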
<p><strong>Things to be aware of</strong></p>
<p><strong>Placement groups</strong></p>
<p>After enabling the parallel execution of multiple suites and creating some tests (for instance, a pool creation test and an RBD creation test), complications might already arise. On a vstart cluster with three OSDs (the default), it is not possible to create two additional pools (one for the pool test, one for the RBD test) with 128 placement groups. The placement groups therefore need to be limited for a single suite, even though more would be available, to ensure that other suites have enough placement groups left. Sooner or later, it might make sense to create a PlacementGroup class which takes care of managing the number of placement groups available on a cluster.</p>
<p>Benefits of that class would be:</p>
<p>1. Realistic values for placement groups can be used and hence, the modifications to the cluster while running E2E tests could be more realistic<br />2. Adaptation to other values of placement groups, depending on the cluster size, is easier<br />(3). Eventually, the number of available placement groups could be determined automatically, depending on the cluster to be tested.</p>
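<p>One possible shape for such a class, as a rough sketch only. The class name, its methods and the total of 256 placement groups are hypothetical; nothing here reflects an existing helper in the dashboard's E2E tests:</p>

```typescript
// Hypothetical sketch: tracks how many placement groups each suite has
// reserved out of a fixed total. All names and numbers are assumptions.
class PlacementGroupBudget {
  private reserved: { [suite: string]: number } = {};

  constructor(private readonly total: number) {}

  /** Placement groups not yet reserved by any suite. */
  available(): number {
    let used = 0;
    for (const suite of Object.keys(this.reserved)) {
      used += this.reserved[suite];
    }
    return this.total - used;
  }

  /** Reserve `count` placement groups for `suite`; false if the budget would be exceeded. */
  reserve(suite: string, count: number): boolean {
    if (count <= 0 || count > this.available()) {
      return false;
    }
    this.reserved[suite] = (this.reserved[suite] || 0) + count;
    return true;
  }

  /** Return everything reserved by `suite` to the budget. */
  release(suite: string): void {
    delete this.reserved[suite];
  }
}

const budget = new PlacementGroupBudget(256);      // arbitrary example total
const poolsOk = budget.reserve('pools', 128);      // true
const rbdOk = budget.reserve('rbd', 128);          // true
const extraOk = budget.reserve('extra', 1);        // false, budget is used up
```

<p>A suite would call <code>reserve</code> before creating a pool and <code>release</code> after cleaning up, so that suites running in parallel cannot exhaust the cluster between them.</p>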
<p><strong>Adding more browsers to the tests</strong></p>
<p>If more browsers are to be added to the tests (which is not in scope of this issue), we will need to be aware that a single suite might be run simultaneously on different browsers. All tests written to contain a certain name (for instance of a pool, bucket, user, RBD image, etc.) could cause a clash and would need to be adapted before a new browser is added to the tests. This won't be an issue if the tests run simultaneously.</p>
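<p>One way to avoid such clashes would be to derive resource names from the browser and the suite. The helper below is a hypothetical sketch; its name and the naming scheme are assumptions:</p>

```typescript
// Hypothetical helper: builds a resource name that differs per browser and
// per suite, so that parallel runs do not create the same pool, bucket,
// user or image twice.
function uniqueName(base: string, browserName: string, suite: string): string {
  // Normalize to lowercase letters, digits and dashes (an assumed naming rule).
  const sanitize = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]+/g, '-');
  return [base, sanitize(browserName), sanitize(suite)].join('-');
}

const chromePool = uniqueName('e2e-pool', 'Chrome', 'pools');   // 'e2e-pool-chrome-pools'
const firefoxPool = uniqueName('e2e-pool', 'Firefox', 'pools'); // 'e2e-pool-firefox-pools'
```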