Ceph: Issues (https://tracker.ceph.com/, 2020-07-15T14:14:32Z, Ceph, Redmine)
Orchestrator - Tasks #46551 (Resolved): cephadm: Add a better hint on how to add a host (https://tracker.ceph.com/issues/46551, 2020-07-15T14:14:32Z, Stephan Müller)
<p>Currently:</p>
<pre>
master:~ # ceph orch host add mgr0 192.168.121.230
Error ENOENT: Failed to connect to mgr0 (192.168.121.230).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > key
> ssh -F ssh_config -i key root@mgr0
</pre>
<p>What actually needs to be done:<br /><pre>
master:~ # ceph config-key get mgr/cephadm/ssh_identity_pub > key.pub
master:~ # ssh-copy-id -i "key.pub" root@mgr0
</pre></p>
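<p>As a minimal sketch, the two steps above could be generated for any host. The function name <code>ssh_key_setup_commands</code> and its parameters are hypothetical and not part of cephadm; only the two command lines themselves come from this report.</p>

```python
import shlex
from typing import List


def ssh_key_setup_commands(host: str, user: str = "root",
                           pub_key_path: str = "~/cephadm_ssh_key.pub") -> List[str]:
    """Return, in order, the shell commands that install the cephadm public key on a host.

    Hypothetical helper: it only builds the command strings, it does not run them.
    """
    # Quote the SSH target so unusual host names cannot break the command line.
    target = shlex.quote(f"{user}@{host}")
    return [
        # Step 1: export the public key that cephadm uses for SSH.
        f"ceph config-key get mgr/cephadm/ssh_identity_pub > {pub_key_path}",
        # Step 2: authorize that key on the new host.
        f"ssh-copy-id -i {pub_key_path} {target}",
    ]


if __name__ == "__main__":
    for line in ssh_key_setup_commands("mgr0"):
        print(">", line)
```

<p>Building the commands from the host name keeps the hint copy-and-paste ready, which is the point of the request.</p>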
<p>What the message should look like in the end:<br /><pre>
master:~ # ceph orch host add mgr0 192.168.121.230
Error ENOENT: Failed to connect to mgr0 (192.168.121.230).
Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to add the SSH key to the host:
> ceph config-key get mgr/cephadm/ssh_identity_pub > ~/cephadm_ssh_key.pub
> ssh-copy-id -i ~/cephadm_ssh_key.pub root@mgr0
you may want to check that everything works, before rerunning the command:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_ssh_key
> ssh -F ssh_config -i ~/cephadm_ssh_key root@mgr0
</pre></p>
Orchestrator - Support #46547 (Resolved): cephadm: Exception adding host via FQDN if host was alr... (https://tracker.ceph.com/issues/46547, 2020-07-15T12:17:02Z, Stephan Müller)
<p>To reproduce this, you need nodes that have a subdomain (unlike in the current Vagrantfile). I used sesdev to find this issue.</p>
<pre>
master:~ # ceph orch host add node1.pacific.test
Error ENOENT: New host node1.pacific.test (node1.pacific.test) failed check: [
'INFO:cephadm:podman|docker (/usr/bin/podman) is present',
'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present',
'INFO:cephadm:Unit chronyd.service is enabled and running',
'INFO:cephadm:Hostname "node1.pacific.test" matches what is expected.',
'ERROR: hostname "node1" does not match expected hostname "node1.pacific.test"'
]
</pre>
<p>With `ceph -W cephadm` one observes:</p>
<pre>
2020-07-15T13:24:21.159126+0200 mgr.node1.zybwkb [ERR] _Promise failed
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 277, in _finalize
next_result = self._on_complete(self._value)
File "/usr/share/ceph/mgr/cephadm/module.py", line 132, in <lambda>
return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
File "/usr/share/ceph/mgr/cephadm/module.py", line 1098, in add_host
return self._add_host(spec)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1087, in _add_host
spec.hostname, spec.addr, err))
orchestrator._interface.OrchestratorError: New host node1.pacific.test (node1.pacific.test) failed check: ['INFO:cephadm:podman|docker (/usr/bin/podman) is present', 'INFO:cephadm:systemctl is present', 'INFO:cephadm:lvcreate is present', 'INFO:cephadm:Unit chronyd.service is enabled and running', 'INFO:cephadm:Hostname "node1.pacific.test" matches what is expected.', 'ERROR: hostname "node1" does not match expected hostname "node1.pacific.test"']
</pre>
Orchestrator - Documentation #46377 (Resolved): cephadm: Missing 'service_id' in last example in ... (https://tracker.ceph.com/issues/46377, 2020-07-06T15:33:59Z, Stephan Müller)
<p>The 'service_id' is missing in the last example in orchestrator#service-specification. The example can be found right above <a class="external" href="https://docs.ceph.com/docs/master/mgr/orchestrator/#placement-specification">https://docs.ceph.com/docs/master/mgr/orchestrator/#placement-specification</a> and it should look as specified in <a class="external" href="https://docs.ceph.com/docs/master/cephadm/drivegroups/#osd-service-specification">https://docs.ceph.com/docs/master/cephadm/drivegroups/#osd-service-specification</a>.</p>
Orchestrator - Tasks #46376 (Resolved): cephadm: Make vagrant usage more comfortable (https://tracker.ceph.com/issues/46376, 2020-07-06T15:28:51Z, Stephan Müller)
<p>Currently the vagrant setup can only be scaled with a single factor: you get x * (mgr, mon, osd with 2 disks). It would be nicer to use the same constants as vstart to select how many mgrs, mons and osds one would like to have. I would go further and add a disks constant too.</p>
<p>This would make the creation a lot more flexible. Another thing that is missing is a script to easily snapshot the created VMs and recreate them.</p>
Orchestrator - Bug #45724 (Resolved): check-host should not fail using fqdn or not that hard (https://tracker.ceph.com/issues/45724, 2020-05-27T09:26:51Z, Stephan Müller)
<p>I would suggest either detecting that the given name is an FQDN or answering "Host not found. Use 'ceph orch host ls' to see all managed hosts."</p>
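<p>A minimal sketch of that suggestion, assuming the module has a list of managed host names at hand. The names <code>resolve_managed_host</code> and <code>HostNotFound</code> are hypothetical and do not exist in the cephadm module; the error text is the one proposed in this report.</p>

```python
from typing import Iterable


class HostNotFound(Exception):
    """Hypothetical error type for a host that is not managed by the orchestrator."""


def resolve_managed_host(name: str, managed_hosts: Iterable[str]) -> str:
    """Map a user-supplied name (short name or FQDN) to a managed host name."""
    hosts = list(managed_hosts)
    # An exact match always wins.
    if name in hosts:
        return name
    # Otherwise compare only the first DNS label, so "node3.ses7.com" finds "node3".
    short = name.split(".", 1)[0]
    candidates = [h for h in hosts if h.split(".", 1)[0] == short]
    if len(candidates) == 1:
        return candidates[0]
    # No match or an ambiguous match: fail with a readable message, not a traceback.
    raise HostNotFound(
        "Host not found. Use 'ceph orch host ls' to see all managed hosts.")
```

<p>Either branch avoids reaching the SSH connection code with an unknown host, which is where the traceback in this report originates.</p>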
<pre>
# ceph cephadm check-host node3.ses7.com
Error EINVAL: Traceback (most recent call last):
File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
return self.handle_command(inbuf, cmd)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
return self.func(mgr, **kwargs)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
return func(*args, **kwargs)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1482, in check_host
error_ok=True, no_fsid=True)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1569, in _run_cephadm
conn, connr = self._get_connection(addr)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1521, in _get_connection
n = self.ssh_user + '@' + host
TypeError: must be str, not NoneType
</pre>
Dashboard - Feature #45306 (New): mgr/dashboard: asynchronous back-end: Use HTTP2 or websockets (https://tracker.ceph.com/issues/45306, 2020-04-28T13:11:52Z, Stephan Müller)
<p>To decide what we want to use in the future, I will compare HTTP/2 and WebSockets.</p>
<p>First, some background.</p>
<p>Currently we use HTTP/1.1, which handles only one request at a time per connection.</p>
<p>With HTTP/2 and WebSockets it is possible to send an effectively unlimited number of requests over one connection.</p>
<p>What does one request per connection mean? For example, a client asks the server for a file: it opens a connection and tells the server to GET something, and the server has to respond before the connection can be reused or closed. As our dashboard does not consist of only one file, many connections are made. To meet this demand, which any modern site has, browsers open several connections in parallel (typically about six per host). The same headers are also sent with every request.</p>
<p>What does an unlimited number of requests per connection mean? For example, a client asks for a (whole) website. The client sends the first request as in HTTP/1.1, the server responds with an HTTP/1.1 Upgrade header, and client and server negotiate which protocol to use (handshake); for HTTP/2 over TLS, browsers negotiate the protocol during the TLS handshake instead. A connection is established and left open for requests. The client sends requests for multiple files while the server is already responding with earlier ones. This makes full use of the established connection, as both participants can send at the same time (as in a video chat). As the connection is left open, the server can push data to the client even if the client has not explicitly asked for it, which removes the need for polling. To save data, the full headers are sent only during the handshake and are not repeated.</p>
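<p>For WebSockets specifically, the handshake mentioned above includes one small computation on the server side: the <code>Sec-WebSocket-Accept</code> response header is derived from the client's <code>Sec-WebSocket-Key</code> as defined in RFC 6455. The sketch below shows only that step; the function name <code>websocket_accept</code> is chosen here for illustration and is not part of CherryPy or the dashboard.</p>

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    # Concatenate the client's key with the GUID, hash with SHA-1, then base64-encode.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


if __name__ == "__main__":
    # Sample key taken from the example handshake in RFC 6455.
    print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

<p>Once the client has verified this value, the connection stays open and both sides can send frames at any time.</p>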
<p>What's the difference between HTTP/2 (standardized in 2015) and WebSockets (standardized in 2011)?<br />Both need only one connection. WebSockets can run unencrypted on port 80, and both can run encrypted on port 443. WebSockets use a different URL prefix, <strong>ws://</strong> for unencrypted connections or <strong>wss://</strong> for encrypted ones; HTTP/2 in browsers uses only <strong>https://</strong> as prefix. With HTTP/2, headers are compressed automatically and the handshake is easier to implement than with WebSockets.</p>
<p>Sure, HTTP/2 looks like the better choice as the protocol is much newer, but can we use it with CherryPy?<br />So far I have only found a <a href="https://docs.cherrypy.org/en/latest/advanced.html#websocket-support" class="external">plugin</a> for CherryPy that enables WebSockets.<br />I have not found one for HTTP/2 yet, but I'm still collecting information.</p>
Dashboard - Feature #44621 (Pending Backport): mgr/dashboard: Automatic preselection of failure d... (https://tracker.ceph.com/issues/44621, 2020-03-16T12:12:19Z, Stephan Müller)
<p>Use the automatic preselection from the crush rule creation form inside the erasure code profile form as well, to prevent misconfigured EC profiles that cannot be used in the end.</p>
Dashboard - Bug #44620 (Resolved): mgr/dashboard: Pool form max size (https://tracker.ceph.com/issues/44620, 2020-03-16T12:07:27Z, Stephan Müller)
<p>Currently the maximum size in the pool form is determined by the "max_size" of the selected rule or by the number of available OSDs. That number can be wrong if the failure domain of the rule is not OSD.</p>
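<p>One possible way to avoid the wrong limit is to count the buckets of the rule's failure domain instead of the OSDs. This is only a sketch under that assumption: the function <code>pool_max_size</code> and the <code>buckets_by_type</code> mapping are hypothetical and not the dashboard's actual data model.</p>

```python
from typing import Dict, List


def pool_max_size(rule_max_size: int, failure_domain: str,
                  buckets_by_type: Dict[str, List[str]]) -> int:
    """Upper bound for a pool's size given a rule and the cluster topology.

    buckets_by_type maps a CRUSH bucket type (e.g. "osd", "host") to the
    names of the buckets of that type; the mapping is an assumed input.
    """
    # Each replica needs its own bucket of the failure-domain type.
    available = len(buckets_by_type.get(failure_domain, []))
    return min(rule_max_size, available)


if __name__ == "__main__":
    topology = {
        "osd": ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4", "osd.5"],
        "host": ["host-a", "host-b", "host-c"],
    }
    # With failure domain "host", six OSDs on three hosts allow at most size 3.
    print(pool_max_size(10, "host", topology))
```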
<p>I'm also currently not sure whether "max_size" and "min_size" are useful values to show; at least pools created on a vstart cluster always show the same min and max size values. Please make sure that those values can still be used safely.</p>
Dashboard - Backport #40982 (Resolved): nautilus: mgr/dashboard: Fix the table mouseenter event h... (https://tracker.ceph.com/issues/40982, 2019-07-26T13:04:35Z, Stephan Müller)
<p><a class="external" href="https://github.com/ceph/ceph/pull/29354">https://github.com/ceph/ceph/pull/29354</a></p>
Dashboard - Backport #40699 (Resolved): nautilus: mgr/dashboard: Silence Alertmanager alerts (https://tracker.ceph.com/issues/40699, 2019-07-09T10:57:04Z, Stephan Müller)
<p><a class="external" href="https://github.com/ceph/ceph/pull/28968">https://github.com/ceph/ceph/pull/28968</a></p>
Dashboard - Bug #40330 (Resolved): mgr/dashboard: Warning about stale data makes it hard to click... (https://tracker.ceph.com/issues/40330, 2019-06-13T12:29:53Z, Stephan Müller)
<p>The warning about stale data in the datatable makes the content move up and down, making it hard to hit a certain row.</p>
Dashboard - Feature #40296 (In Progress): mgr/dashboard: Maintain and improve code coverage on da... (https://tracker.ceph.com/issues/40296, 2019-06-12T12:46:28Z, Stephan Müller)
<p>Find <a href="https://github.com/marketplace?utf8=%E2%9C%93&query=coverage" class="external">github plugins</a> or extend Jenkins to enforce a test coverage that cannot decrease.</p>
<p>Ideally this would be enforced on a per-file basis for unit tests.</p>
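<p>A per-file check of this kind could be sketched as follows, assuming coverage percentages per file are available from a baseline run and from the current run. The function <code>coverage_regressions</code> and the input format are hypothetical; they are not tied to any particular GitHub plugin or Jenkins job.</p>

```python
from typing import Dict, Tuple


def coverage_regressions(baseline: Dict[str, float],
                         current: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Return the files whose coverage dropped, mapped to (old, new) percentages."""
    regressions = {}
    for path, old in baseline.items():
        new = current.get(path)
        # Files removed since the baseline are ignored; new files have no baseline yet.
        if new is not None and new < old:
            regressions[path] = (old, new)
    return regressions


if __name__ == "__main__":
    before = {"pool.component.ts": 91.0, "rbd.service.ts": 80.0}
    after = {"pool.component.ts": 88.5, "rbd.service.ts": 80.0, "new.service.ts": 10.0}
    failed = coverage_regressions(before, after)
    for path, (old, new) in failed.items():
        print(f"{path}: coverage dropped from {old}% to {new}%")
    # A CI job would exit non-zero here if `failed` is not empty.
```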
<p>Currently I have no idea how to measure E2E and API tests; maybe there is a way.</p>
Dashboard - Bug #39579 (Resolved): mgr/dashboard: Fix run-tox script to accept cli arguments again (https://tracker.ceph.com/issues/39579, 2019-05-03T10:30:12Z, Stephan Müller)
<p>A regression was introduced by <a href="https://github.com/ceph/ceph/commit/9426f1f2045d0ae0f319530c3dc3a9240d838d07#diff-cc2ee9d8e56f3a2cd98b8148935d3829L37" class="external">this change</a>, causing the script to no longer accept command line arguments. As a result, the command described in hacking.rst for running only a single tox test ("WITH_PYTHON2=OFF ./run-tox.sh pytest tests/test_rgw_client.py::RgwClientTest::test_ssl_verify") no longer worked and caused tox to <a href="https://paste.opensuse.org/view//64659695" class="external">fail</a>.</p>
<pre>
Traceback (most recent call last):
File "/usr/bin/tox", line 11, in <module>
load_entry_point('tox==3.7.0', 'console_scripts', 'tox')()
File "/usr/lib/python3.7/site-packages/tox/session.py", line 47, in cmdline
main(args)
File "/usr/lib/python3.7/site-packages/tox/session.py", line 54, in main
retcode = build_session(config).runcommand()
File "/usr/lib/python3.7/site-packages/tox/session.py", line 467, in runcommand
return self.subcommand_test()
File "/usr/lib/python3.7/site-packages/tox/session.py", line 590, in subcommand_test
self.run_sequential()
File "/usr/lib/python3.7/site-packages/tox/session.py", line 609, in run_sequential
self.runtestenv(venv)
File "/usr/lib/python3.7/site-packages/tox/session.py", line 728, in runtestenv
self.hook.tox_runtest(venv=venv, redirect=redirect)
File "/usr/lib/python3.7/site-packages/pluggy/hooks.py", line 289, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/usr/lib/python3.7/site-packages/pluggy/manager.py", line 68, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/usr/lib/python3.7/site-packages/pluggy/manager.py", line 62, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/usr/lib/python3.7/site-packages/pluggy/callers.py", line 208, in _multicall
return outcome.get_result()
File "/usr/lib/python3.7/site-packages/pluggy/callers.py", line 80, in get_result
raise ex[1].with_traceback(ex[2])
File "/usr/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/usr/lib/python3.7/site-packages/tox/venv.py", line 597, in tox_runtest
venv.test(redirect=redirect)
File "/usr/lib/python3.7/site-packages/tox/venv.py", line 468, in test
if argv[0].startswith("-"):
IndexError: list index out of range
</pre>
Dashboard - Backport #39534 (Resolved): nautilus: mgr/dashboard: New RBD snapshot names should be... (https://tracker.ceph.com/issues/39534, 2019-04-30T10:38:01Z, Stephan Müller)
<p><a class="external" href="https://github.com/ceph/ceph/pull/27890">https://github.com/ceph/ceph/pull/27890</a></p>
Dashboard - Cleanup #38936 (New): mgr/dashboard: Unify polling behavior (https://tracker.ceph.com/issues/38936, 2019-03-25T13:41:58Z, Stephan Müller)
<p>Unifying the polling behavior means that all API calls should be handled similarly on failure.</p>
<p>The idea is that the dashboard can recover from connection issues automatically, but it should either not send a notification for every failure after the initial one, or it should increase the polling interval on each failure.</p>
<p>IMHO, muting the notifications that would be triggered after the initial failure sounds like the best idea.</p>
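<p>Both options could be tracked with a small piece of per-endpoint state. The sketch below combines them for illustration only (either half can be dropped); the class <code>PollingState</code>, its default intervals and the doubling strategy are assumptions, and the dashboard front end itself is not written in Python.</p>

```python
class PollingState:
    """Tracks consecutive failures of one polled API call (hypothetical helper)."""

    def __init__(self, base_interval: float = 5.0, max_interval: float = 60.0) -> None:
        self.base_interval = base_interval
        self.max_interval = max_interval
        self.interval = base_interval  # seconds until the next poll
        self.failures = 0              # consecutive failures so far

    def on_success(self) -> None:
        # Recovery: reset the failure count and return to the normal interval.
        self.failures = 0
        self.interval = self.base_interval

    def on_failure(self) -> bool:
        """Record a failure and return True only if the user should be notified."""
        self.failures += 1
        # Back off: double the interval up to the configured maximum.
        self.interval = min(self.interval * 2, self.max_interval)
        # Notify on the first failure only; later ones are muted until recovery.
        return self.failures == 1
```

<p>After a successful poll the state resets, so a later outage produces exactly one new notification.</p>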