Ceph : Issueshttps://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2020-10-01T10:03:53ZCeph
Redmine Dashboard - Bug #47714 (New): mgr/dashboard: Implement an expert settinghttps://tracker.ceph.com/issues/477142020-10-01T10:03:53ZStephan Müller
<p>To simplify forms implement an expert setting that if disabled hides not mandatory fields in the forms as the first step.</p>
<p>How should it look like?<br />Maybe on the top right an expert slider on the form and on the top panel. As it will have impact on what a users sees in the future.</p>
<p>Tip before getting to deep into the implementation please ask in the stand up if that's path we want to go down.</p> Orchestrator - Tasks #46551 (Resolved): cephadm: Add better a better hint how to add a hosthttps://tracker.ceph.com/issues/465512020-07-15T14:14:32ZStephan Müller
<p>Currently:</p>
<pre>
master:~ # ceph orch host add mgr0 192.168.121.230
Error ENOENT: Failed to connect to mgr0 (192.168.121.230).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > key
> ssh -F ssh_config -i key root@mgr0
</pre>
<p>What actually needs to be done:<br /><pre>
master:~ # ceph config-key get mgr/cephadm/ssh_identity_pub > key.pub
master:~ # ssh-copy-id -i "key.pub" root@mgr0
</pre></p>
<p>What the message should look like in the end:<br /><pre>
master:~ # ceph orch host add mgr0 192.168.121.230
Error ENOENT: Failed to connect to mgr0 (192.168.121.230).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to add the SSH key to the host:
> ceph config-key get mgr/cephadm/ssh_identity_pub > ~/cephadm_ssh_key.pub
> ssh-copy-id -i ~/cephadm_ssh_key.pub root@mgr0
you may want to check that everything works, before rerunning the command:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_ssh_key
> ssh -F ssh_config -i ~/cephadm_ssh_key root@mgr0
</pre></p> Orchestrator - Support #46547 (Resolved): cephadm: Exception adding host via FQDN if host was alr...https://tracker.ceph.com/issues/465472020-07-15T12:17:02ZStephan Müller
<p>To reproduce you need nodes that have a subdomain (not like in current Vagrantfile). I used sesdev to find this issue.</p>
<pre>
master:~ # ceph orch host add node1.pacific.test
Error ENOENT: New host node1.pacific.test (node1.pacific.test) failed check: [
'INFO:cephadm:podman|docker (/usr/bin/podman) is present',
'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present',
'INFO:cephadm:Unit chronyd.service is enabled and running',
'INFO:cephadm:Hostname "node1.pacific.test" matches what is expected.',
'ERROR: hostname "node1" does not match expected hostname "node1.pacific.test"'
]
</pre>
<p>With `ceph -W cephadm` one observes</p>
<pre>
2020-07-15T13:24:21.159126+0200 mgr.node1.zybwkb [ERR] _Promise failed
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 277, in _finalize
next_result = self._on_complete(self._value)
File "/usr/share/ceph/mgr/cephadm/module.py", line 132, in <lambda>
return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/usr/share/ceph/mgr/cephadm/module.py", line 1098, in add_host
return self._add_host(spec)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1087, in _add_host
spec.hostname, spec.addr, err))
orchestrator._interface.OrchestratorError: New host node1.pacific.test (node1.pacific.test) failed check: ['INFO:cephadm:podman|docker (/usr/bin/podman) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit chronyd.service is enabled and running', 'INFO:cephadm:Hostname "node1.pacific.test" matches what is expected.', 'ERROR: hostname "node1" does not match expected hostname "node1.pacific.test"']
</pre> Orchestrator - Documentation #46377 (Resolved): cephadm: Missing 'service_id' in last example in ...https://tracker.ceph.com/issues/463772020-07-06T15:33:59ZStephan Müller
<p>Missing 'service_id' in last example in orchestrator#service-specification. Example can be found right above <a class="external" href="https://docs.ceph.com/docs/master/mgr/orchestrator/#placement-specification">https://docs.ceph.com/docs/master/mgr/orchestrator/#placement-specification</a> and it should look like specified in <a class="external" href="https://docs.ceph.com/docs/master/cephadm/drivegroups/#osd-service-specification">https://docs.ceph.com/docs/master/cephadm/drivegroups/#osd-service-specification</a> .</p> Orchestrator - Tasks #46376 (Resolved): cephadm: Make vagrant usage more comfortablehttps://tracker.ceph.com/issues/463762020-07-06T15:28:51ZStephan Müller
<p>Currently you can only use a big scale factor using the vagrant setup. You can have x * (mgr, mon, osd with 2 disks). It would be nicer to use the same constants as vstart is using to select how many mgr, mons and osds one likes to have. I would go further and add a disks constant two.</p>
<p>This would make the creation a lot more flexible. Another thing that is missing is an script to easily snapshot the created vm's and recreate them</p> Ceph - Documentation #45874 (Fix Under Review): doc: Extend resolving conflict section in "Submit...https://tracker.ceph.com/issues/458742020-06-04T08:09:20ZStephan Müller
<p>Currently it's not clear how to easily continue with the backport script when a conflict is encountered.</p> Dashboard - Cleanup #45433 (Resolved): mgr/dashboard: Don't have two different unit test mechanicshttps://tracker.ceph.com/issues/454332020-05-07T15:13:20ZStephan Müller
<p>Use the <em>DEV</em> test method as default. To be able to do that our workaround needs to include a new statement. It has to set `testBedApi._instantiated` to false after every test.</p>
<p>This will look like the following in the end:</p>
<pre>
diff --git a/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts b/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts
index b7de47f303..e94c1a3df1 100644
--- a/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts
+++ b/src/pybind/mgr/dashboard/frontend/src/testing/unit-test-helper.ts
@@ -1,5 +1,5 @@
import { LOCALE_ID, TRANSLATIONS, TRANSLATIONS_FORMAT, Type } from '@angular/core';
-import { async, ComponentFixture, TestBed } from '@angular/core/testing';
+import { ComponentFixture, getTestBed, TestBed } from '@angular/core/testing';
import { AbstractControl } from '@angular/forms';
import { By } from '@angular/platform-browser';
@@ -18,31 +18,28 @@ import {
AlertmanagerNotificationAlert,
PrometheusRule
} from '../app/shared/models/prometheus-alerts';
-import { _DEV_ } from '../unit-test-configuration';
import { CrushNode } from '../app/shared/models/crush-node';
import { CrushRule, CrushRuleConfig } from '../app/shared/models/crush-rule';
-export function configureTestBed(configuration: any, useOldMethod?: boolean) {
- if (_DEV_ && !useOldMethod) {
- const resetTestingModule = TestBed.resetTestingModule;
- beforeAll((done) =>
- (async () => {
- TestBed.resetTestingModule();
- TestBed.configureTestingModule(configuration);
- // prevent Angular from resetting testing module
- TestBed.resetTestingModule = () => TestBed;
- })()
- .then(done)
- .catch(done.fail)
- );
- afterAll(() => {
- TestBed.resetTestingModule = resetTestingModule;
- });
- } else {
- beforeEach(async(() => {
+export function configureTestBed(configuration: any) {
+ const testBedApi: any = getTestBed();
+ const resetTestingModule = TestBed.resetTestingModule;
+ beforeAll((done) =>
+ (async () => {
+ // prevent Angular from resetting testing module
+ TestBed.resetTestingModule = () => TestBed;
+ TestBed.resetTestingModule();
TestBed.configureTestingModule(configuration);
- }));
- }
+ })()
+ .then(done)
+ .catch(done.fail)
+ );
+ afterEach(() => {
+ testBedApi._instantiated = false;
+ });
+ afterAll(() => {
+ TestBed.resetTestingModule = resetTestingModule;
+ });
}
export class PermissionHelper {
</pre>
<p>This means to remove the unit test configuration and amending the documentation as well as the unit test script (And remove the usage of the second parameter).</p> Dashboard - Feature #45306 (New): mgr/dashboard: asynchronous back-end: Use HTTP2 or websocketshttps://tracker.ceph.com/issues/453062020-04-28T13:11:52ZStephan Müller
<p>In order to determine what we want to use in future. I will compare both HTTP2 and websockets.</p>
<p>First a bunch of information.</p>
<p>Currently we use the protocol HTTP1.1, which only allows one request per connection.</p>
<p>With HTTP2 and websockets it is possible to allow an unlimited amount of request per connection.</p>
<p>What does one request per connection mean? For example a client asks the server for a file, this will open a connection telling the server GET me something, the server will respond and close the connection. As our dashboard does not only consist of one file, a lot of connections are made. To meet the demand of any modern site of so many connections all modern browsers will do 8 connections simultaneously. On every connection also the same header is send.</p>
<p>What does unlimited amount of requests per connection mean? For example a client asks for a (whole) website. The client sends the first request like in HTTP1.1, the server responds with an HTTP1.1 Upgrade header, client and server negotiate which protocol to use (handshake). A connection is established and left open for requests. The client sends requests for multiple files while the server already responds with the files. This maxes out the established connection, as both participants can send at the same time (for example a video chat). As the connection is left open the server can PUSH data to the client even if he had not explicitly asked for (removes polling). To save data, only the headers during the handshake are send, they will not be send multiple times.</p>
<p>Whats the difference between HTTP2 (released as standard 2015) and websockets (released as standard 2011)?<br />Both only need one connection. Websockets can run insecure using port 80 and both can run secure using port 443. Websockets use a different URL prefix <strong>ws://</strong> for insecure connections or <strong>wss://</strong> for secure ones, HTTP2 uses only <strong>https://</strong> as prefix. If HTTP2 is used data will automatically be compressed and the handshake is easier to implement than with websockets.</p>
<p>Sure HTTP2 is the better one as the protocol is much newer, but can we use it with cherrypy?<br />Currently I only found a <a href="https://docs.cherrypy.org/en/latest/advanced.html#websocket-support" class="external">plugin</a> for cherrypy to allow websockets.<br />I've not found one for HTTP2 yet but I'm still collecting information.</p> Dashboard - Bug #44753 (New): mgr/dashboard: Secure the Alertmanger receiver endpointhttps://tracker.ceph.com/issues/447532020-03-25T14:10:42ZStephan Müller
<p>Currently it is possible send push notification unauthenticated to the dashboard and the push notifications are not verified if they actually are coming from an Alertmanager instance.</p>
<p>To see whats configurable see <a class="external" href="https://prometheus.io/docs/alerting/configuration/#http_config">https://prometheus.io/docs/alerting/configuration/#http_config</a></p>
<p>Removing the endpoint is not a solution to be considered as ceph orchestrator is configuring every Alertmanager instance to talk to the receiver of the dashboard.</p>
<p>The receiver is at the moment the only part that can handle multiple Altermanger instances.</p> Dashboard - Bug #44224 (New): mgr/dashboard: Timeouts for rbd.py callshttps://tracker.ceph.com/issues/442242020-02-20T10:10:30ZStephan Müller
<p>As the corner cases are not implemented in many rbd.py methods, they can fail without a response on a specific pool (mostly bad pools).</p>
<p>If this is implemented remove the workaround that was implemented to fix <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: mgr/dashboard: Dashboard breaks on the selection of a bad pool (Resolved)" href="https://tracker.ceph.com/issues/43765">#43765</a>.</p>
<p>For details what known issue exists see <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: pybind/rbd: config_list hangs if given an pool with a bad pg state (Rejected)" href="https://tracker.ceph.com/issues/43771">#43771</a>.</p>
<p>For details about the discussion that was made look at the PR that fixed <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: mgr/dashboard: Dashboard breaks on the selection of a bad pool (Resolved)" href="https://tracker.ceph.com/issues/43765">#43765</a>.</p>
<p>Make sure that <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: pybind/rbd: config_list hangs if given an pool with a bad pg state (Rejected)" href="https://tracker.ceph.com/issues/43771">#43771</a> is still not addressed before starting with this issue.</p>
<p>For details how this was implemented in openATTIC look <a href="https://bitbucket.org/openattic/openattic/pull-requests/682/add-librados-command-name-to-external/diff" class="external">here</a></p> Dashboard - Bug #44223 (Duplicate): mgr/dashboard: Timeouts for rbd.py callshttps://tracker.ceph.com/issues/442232020-02-20T10:05:49ZStephan Müller
<p>As the corner cases are not implemented in many rbd methods, they can fail without a response on a specific pool (mostly bad pools).</p>
<p>If this is implemented remove the workaround that was implemented to fix <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: mgr/dashboard: Dashboard breaks on the selection of a bad pool (Resolved)" href="https://tracker.ceph.com/issues/43765">#43765</a>.</p>
<p>For details what known issue exists see <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: pybind/rbd: config_list hangs if given an pool with a bad pg state (Rejected)" href="https://tracker.ceph.com/issues/43771">#43771</a>.</p>
<p>For details about the discussion that was made look at the PR that fixed <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: mgr/dashboard: Dashboard breaks on the selection of a bad pool (Resolved)" href="https://tracker.ceph.com/issues/43765">#43765</a>.</p>
<p>Make sure that <a class="issue tracker-1 status-6 priority-4 priority-default closed" title="Bug: pybind/rbd: config_list hangs if given an pool with a bad pg state (Rejected)" href="https://tracker.ceph.com/issues/43771">#43771</a> is still not addressed before starting with this issue.</p> Dashboard - Feature #43930 (New): mgr/dashboard: Make user creation with password change on logon...https://tracker.ceph.com/issues/439302020-01-31T11:37:12ZStephan Müller
<p>After <a class="issue tracker-2 status-5 priority-6 priority-high2 closed child" title="Feature: mgr/dashboard: Enforce password change upon first login (Closed)" href="https://tracker.ceph.com/issues/24655">#24655</a> has been resolved, I can create a new user without a password an check "User must change password at next logon".</p>
<p>But I can't click the logon button without typing a password, but the user has no password as it should be set on next logon.</p>
<p>There are 3 ways to solve this:<br />1. Enable logon without a password<br />2. Enable passwords without password rule if the user has to change his PW anyway on his first logon according to the rules.<br />3. Button to automatically generate a password that complies with the rules and copy into to the clipboard.</p> rbd - Bug #43771 (Rejected): pybind/rbd: config_list hangs if given an pool with a bad pg statehttps://tracker.ceph.com/issues/437712020-01-23T16:53:11ZStephan Müller
<p>If the dashboard tries to get the configuration of RBDs on a pool basis with a pool in the pg state 'creating+incomplete', it will stop working waiting for a response of `config_list` in `rbd.pyx`.</p>
<p>The pg state 'creating+incomplete' is an edge case as it will only appear if one creates a pool that needs more buckets as the cluster can provide. The current workaround in the dashboard is to omit this call if a pool is in this state.</p>
<p>Here is the manual stack trace found by debugging:<br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/controllers/pool.py#L206">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/controllers/pool.py#L206</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/services/rbd.py#L104">https://github.com/ceph/ceph/blob/master/src/pybind/mgr/dashboard/services/rbd.py#L104</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2215">https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2215</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2935">https://github.com/ceph/ceph/blob/master/src/pybind/rbd/rbd.pyx#L2935</a></p> Dashboard - Bug #43384 (Duplicate): mgr/dashboard: Pool size is calculated with data with differe...https://tracker.ceph.com/issues/433842019-12-19T14:01:20ZStephan Müller
<p>Currently we use the following calculation in the dashboard to calculate the usage:</p>
<pre><code>const avail = stats.bytes_used.latest + stats.max_avail.latest;<br /> pool['usage'] = avail > 0 ? stats.bytes_used.latest / avail : avail;</code></pre>
<p>[pool-list.component.ts:253-4]</p>
<p>The problem is that "max_avail" is calculated somewhat strangely as it does not show the real available space as it divides the available space at least through the number of replications for a replicated and by "(m+k)/k" for an ec pool.</p>
<p>To look the calculation up go to the following locations:<br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6033">https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6033</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6040">https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6040</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6054">https://github.com/ceph/ceph/blob/master/src/osd/OSDMap.cc#L6054</a></p>
<p><a class="external" href="https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L909">https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L909</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L920">https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L920</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L924">https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L924</a><br /><a class="external" href="https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L947">https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L947</a></p>
<p>This problem was found out by a user who notified us about the percentage mismatch, here is the link to the mail:<br /><a class="external" href="http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-December/037680.html">http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-December/037680.html</a></p>
<p>Also found this thread on the new mailing list regarding the used percentage and max_avail topic:<br /><a class="external" href="https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NH2LMMX5KVRWCURI3BARRUAETKE2T2QN/#JDHXOQKWF6NZLQMOGEPAQCLI44KB54A3">https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NH2LMMX5KVRWCURI3BARRUAETKE2T2QN/#JDHXOQKWF6NZLQMOGEPAQCLI44KB54A3</a></p> Dashboard - Feature #43351 (New): mgr/dashboard: [RFC] Actions assistanthttps://tracker.ceph.com/issues/433512019-12-17T10:34:10ZStephan Müller
<p>Not sure if this is needed for all pages but it could help users.</p>
<p>I just looked at the OSD page which is crowed by actions.</p>
<p>There are two ways to implement it,<br />as an modal that triggers the action modal,<br />or as an modal that describes what to do in order to trigger the action.</p>
<p>The second approach could be implemented globally and not only for a specific page as it could search through every page actions that are available.</p>
<p>How the modal should look like?<br />It should be pretty straight forward like an FAQ search.<br />If you open it you will see a big input field to type in words that describe what you want to do.</p>
<p>The string will be used to calculate a score for each available description. Than the highest ranked actions (3, 5 or 10?) will be shown, sorted by rank.</p>
<p>The action will shown as accordion showing the description of it if expanded and the button to take the action or the help text to get to the page and action.</p>
<p>As said in the beginning, I'm not sure if we need this.</p>