Bug #48973
mgr/dashboard: dashboard hangs when accessing it
Status: closed
Description
Description of problem
On the first attempt to access the dashboard (typing https://<dashboard_url> and pressing ENTER), the browser (tested in Chrome Version 87.0.4280.88 (Official Build) (64-bit) and Firefox 84.0.1 (64-bit)) doesn't show any page and simply keeps waiting.
Environment
- ceph version string:
  - Initially identified in master on Jan 24th (97480142a69e7ff5bd2abaceb42cffe4b749d00c)
  - Reproduced in Pacific (16.1.0)
  - Also reproduced with master from 1 month earlier, Dec 24th (6793756f45f669240de952edd92946541385d090). This rules out the latest changes to the ceph-mgr C++ code related to GIL/locks.
  - Also reproduced with Dec 16 (63a5cd41c8b4e1ff5ee01854b4aa1425fe2da1bf). This rules out the CVE changes, including JWT and account lock-out.
- Platform (OS/distro/release):
  - CentOS 8.3 / Fedora 32
  - python3-cherrypy-18.4.0-1.el8.noarch
  - NOT REPRODUCED in openSUSE Tumbleweed (CherryPy 18.6.0-2.1, Cheroot 8.3.0)
- Cluster details (nodes, monitors, OSDs): minimal vstart cluster with 1 mon + 1 mgr + 3 OSDs. It also happens in cephadm deployments.
- Browser used (e.g.: Version 86.0.4240.198 (Official Build) (64-bit)):
  - Chrome Version 87.0.4280.88 (Official Build) (64-bit)
  - Firefox 84.0.1 (64-bit)
  - Chrome
- Other: Initially reported as NOT REPRODUCED with plain HTTP (HTTPS is required). Correction: it happens with plain HTTP too, so the likelihood of hitting the issue seems to grow with the time needed to establish the connection (HTTP < static assets over HTTPS < HTTPS + Auth).
How reproducible
From a freshly launched dashboard (or an immediately restarted mgr), wait until the initialization finishes (curl -kv https://<dashboard_url> returns the index.html). Then switch to a browser (either Chrome or Firefox), type the dashboard URL in the navigation bar, and press ENTER. That's enough to trigger the issue.
Sporadic requests via `curl` don't trigger the issue. It happens when multiple requests are issued at the same time. It can be reproduced from the CLI with ApacheBench (`ab`):
> ab -c20 -n1000 "https://<dashboard_url>/docs"
Benchmarking <dashboard_url> (be patient)
Completed 1000 requests
Completed 2000 requests
SSL handshake failed (5).
Completed 3000 requests
SSL handshake failed (5).
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
...
Complete requests: 10000
Failed requests: 2
(Connect: 0, Receive: 0, Length: 2, Exceptions: 0)
Total transferred: 13387322 bytes
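The concurrent load above can also be sketched in Python, which makes it easier to count timed-out requests separately from other failures. This is a minimal sketch, not part of the original report: the helper names `fetch_once` and `run_load` are hypothetical, certificate verification is disabled to mirror `curl -k`, and each request opens a new connection (as `ab` does without keep-alive).

```python
import http.client
import socket
import ssl
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def fetch_once(url, timeout=5.0):
    """Issue one GET and classify it as 'ok', 'timeout', or 'error'."""
    # Equivalent of `curl -k`: the dashboard certificate is self-signed.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # bypass any proxy from the environment
        urllib.request.HTTPSHandler(context=ctx),
    )
    try:
        with opener.open(url, timeout=timeout) as resp:
            resp.read()
        return "ok"
    except socket.timeout:
        # Raised directly when the server accepts the request but never answers.
        return "timeout"
    except urllib.error.URLError as exc:
        # Timeouts during connect/handshake arrive wrapped in URLError.
        return "timeout" if isinstance(exc.reason, socket.timeout) else "error"
    except (OSError, http.client.HTTPException):
        return "error"


def run_load(fetch, total=100, concurrency=20):
    """Call fetch() `total` times from `concurrency` threads; count outcomes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(fetch) for _ in range(total)]
        return Counter(f.result() for f in futures)


if __name__ == "__main__":
    # Replace the placeholder with the real dashboard URL before running.
    url = "https://<dashboard_url>/docs"
    print(run_load(lambda: fetch_once(url), total=1000, concurrency=20))
```

A growing `timeout` count once the run is under way would point to the same hang seen in the browser, whereas `error` covers refused connections, TLS failures, and HTTP error statuses.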
Actual results
Dashboard login page is not displaying and the browser keeps loading/waiting until manually stopped (minutes). After that, the curl requests no longer work:
curl -kv https://localhost:11000
* Rebuilt URL to: https://localhost:11000/
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 11000 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server did not agree to a protocol
* Server certificate:
* subject: O=IT; CN=ceph-dashboard
* start date: Jan 20 16:42:29 2021 GMT
* expire date: Jan 18 16:42:29 2031 GMT
* issuer: O=IT; CN=ceph-dashboard
* SSL certificate verify result: self signed certificate (18), continuing anyway.
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> GET / HTTP/1.1
> Host: localhost:11000
> User-Agent: curl/7.61.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
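In the trace above, the TCP connection and the TLS handshake both complete, the request is sent, and then no HTTP response arrives. The following sketch automates that observation by reporting the stage at which the server stops answering. The function name `probe_stage` and its parameters are hypothetical, and certificate verification is disabled to mirror `curl -k`.

```python
import socket
import ssl


def probe_stage(host, port, timeout=5.0, use_tls=True):
    """Return 'ok', or the stage that failed: 'connect', 'handshake', 'response'."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "connect"
    try:
        if use_tls:
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            try:
                # The wrapped socket inherits the timeout set above.
                conn = ctx.wrap_socket(conn, server_hostname=host)
            except OSError:
                return "handshake"
        request = (
            f"GET / HTTP/1.1\r\nHost: {host}:{port}\r\nConnection: close\r\n\r\n"
        )
        try:
            conn.sendall(request.encode("ascii"))
            first_bytes = conn.recv(1024)
        except OSError:
            # Includes socket.timeout: the request was sent but nothing came back.
            return "response"
        return "ok" if first_bytes.startswith(b"HTTP/") else "response"
    finally:
        conn.close()


if __name__ == "__main__":
    # Host and port taken from the curl example in this report.
    print(probe_stage("localhost", 11000))
```

For the hang shown in the curl output, this sketch would be expected to report `response`, since the handshake finishes and only the reply is missing.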
Expected results
Dashboard login page loads normally.
Additional info
Current efforts are directed at finding where this issue comes from:
- Dashboard Python code:
- Ceph-mgr Python
- Ceph-mgr C++
- CherryPy: reproduced with both the builtin and PyOpenSSL transport wrappers.
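For anyone repeating the comparison between the two transport wrappers outside of ceph-mgr, CherryPy selects the wrapper through its `server.ssl_module` setting (`builtin` or `pyopenssl`). The sketch below only builds the settings dictionary for a standalone CherryPy server. The helper name `ssl_server_config` and the certificate file names are hypothetical, and this is not necessarily how the dashboard module itself configures CherryPy.

```python
VALID_SSL_MODULES = ("builtin", "pyopenssl")


def ssl_server_config(ssl_module, cert_path, key_path, port=11000):
    """Build CherryPy server settings for one TLS transport wrapper."""
    if ssl_module not in VALID_SSL_MODULES:
        raise ValueError(f"unknown ssl_module: {ssl_module!r}")
    return {
        "server.socket_port": port,
        "server.ssl_module": ssl_module,
        "server.ssl_certificate": cert_path,
        "server.ssl_private_key": key_path,
    }


# Usage with CherryPy installed (not executed here; App is a placeholder):
#   import cherrypy
#   cherrypy.config.update(ssl_server_config("pyopenssl", "cert.pem", "key.pem"))
#   cherrypy.quickstart(App())
```

Running the same load against both settings on a bare CherryPy application would help separate a CherryPy/Cheroot problem from one in the dashboard or ceph-mgr code.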