mgr/dashboard: (re-)explore a dashboard-proxified Grafana
Currently, in some deployment scenarios (e.g. OpenStack, or any other with separate public and provisioning networks), Grafana/Prometheus are actual back-end components and Ceph-Dashboard is the main user-facing component.
Pros:
- Minimal changes on deployment/config: additionally, self-signed certs could be perfectly secure between back-end only components.
- Smoother integration: currently Dashboard just acts as a 'broker' when it comes to Grafana, and it's the end-user's browser that is responsible for dealing with it. This scenario would allow Dashboard to monitor Grafana/Prometheus health.
- Improved security: it allows for better isolation of components (only mgr/dashboard is public facing, while Grafana/Prometheus can be kept away).
Cons:
- Added processing/load to the ceph-mgr (Grafana is pretty intensive in networking activity). ceph-mgr/dashboard might become a bottleneck.
- Increased latency.
- Version compatibility issues: proxy integration could be non-trivial.
#2 Updated by Patrick Seidensal 5 months ago
I'd like to share some experiences and thoughts we've already had on this. Please don't take them as irrefutable truths. Please also note that being able to reach Grafana directly has so far also been a goal of Ceph Dashboard.
I am aware of two different approaches for proxying Grafana.
The supported way is to use the root_url and domain settings, and possibly the serve_from_sub_path option as well.
By doing that, the Grafana instance will no longer be reachable directly, only through the proxy. I know that this is the idea, but not being able to reach Grafana directly anymore has some implications we should be aware of:
- All Grafana dashboards would need to be shown through Ceph Dashboard or wouldn't be available anymore.
- We currently ship a dashboard that is only reachable through the Grafana UI. It would not be reachable anymore. It is kind of a landing page, but Ceph Dashboard decided to implement its own for users that do not want to use Grafana.
- No custom Grafana dashboards can be browsed by the user anymore.
- Grafana might need to be configured with a single domain for Ceph Dashboard (all mgr instances), which might require an additional proxy for Ceph Dashboard itself.
- It may become impossible to use the GUI for any configuration of Grafana, unless Ceph Dashboard also enables that by proxying not only graphs but other pages as well, or even by offering a non-embedded view.
- If anything goes wrong, it may be harder to figure out where and why.
In particular, the inability to get to Grafana directly to perform any of the tasks that were not possible through the frontend was criticized in previous openATTIC releases.
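For illustration, the supported settings mentioned above live in the [server] section of grafana.ini. The sketch below renders such a section with Python's configparser. The host name dashboard.example.com, the /grafana/ sub-path and the helper name grafana_server_section are placeholders made up for this example, not values from any real deployment.

```python
import configparser
import io


def grafana_server_section(domain: str, sub_path: str, scheme: str = "https") -> str:
    """Render a [server] section for grafana.ini for use behind a reverse proxy.

    Hypothetical helper: only the option names (domain, root_url,
    serve_from_sub_path) come from Grafana; everything else is illustrative.
    """
    # Normalise the sub-path to "/name/" ("/" if no sub-path is given).
    stripped = sub_path.strip("/")
    normalised = f"/{stripped}/" if stripped else "/"

    config = configparser.ConfigParser(interpolation=None)
    config["server"] = {
        "domain": domain,
        "root_url": f"{scheme}://{domain}{normalised}",
        "serve_from_sub_path": "true",
    }

    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


print(grafana_server_section("dashboard.example.com", "grafana"))
```

With these settings Grafana generates its links relative to the proxied URL, which is why it stops being usable at its original address.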
Unsupported Way (from Grafana's POV)
The unsupported way to provide a reverse proxy for Grafana is by not using the aforementioned configuration options, which results in the capability to reach the Grafana instance directly and through the proxy.
However, to make that work for Ceph Dashboard, content would need to be rewritten by Ceph Dashboard when proxying Grafana. Depending on how it's done or doable, it could result in:
- increased maintenance effort
- being bound to the specific Grafana version the proxy supports
- extensive testing to support different Grafana versions
- very tedious updates to support newer Grafana versions
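To give an idea of what "rewriting content" means in practice, here is a minimal sketch, assuming the proxy only needs to prefix root-relative src/href attributes in HTML. The function name and the /api/grafana/proxy prefix are hypothetical. This is not the openATTIC implementation.

```python
import re

# Matches src= / href= attributes whose value is a root-relative URL,
# e.g. href="/public/app.css". Protocol-relative URLs ("//host/...") are skipped.
_ROOT_RELATIVE_ATTR = re.compile(r"""(?P<attr>\b(?:src|href)=)(?P<quote>["'])/(?!/)""")


def rewrite_root_relative_urls(html: str, prefix: str) -> str:
    """Prefix root-relative src/href URLs so they resolve through the proxy.

    Limitation (deliberate, to show the brittleness): URLs built at runtime
    inside JavaScript bundles are not touched, and already prefixed URLs
    would be prefixed a second time.
    """
    prefix = "/" + prefix.strip("/")
    return _ROOT_RELATIVE_ATTR.sub(
        lambda m: f"{m.group('attr')}{m.group('quote')}{prefix}/", html
    )


page = '<script src="/public/build/app.js"></script><a href="//cdn.example.com/x">x</a>'
print(rewrite_root_relative_urls(page, "api/grafana/proxy"))
```

Anything the pattern does not anticipate (new asset paths, URLs assembled in JavaScript, WebSocket endpoints) has to be handled separately for each Grafana release, which is where the maintenance effort listed above comes from.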
openATTIC indeed had the same idea. The result was hard to maintain and bound to a specific Grafana version, which was workable at the time because openATTIC was only supported on SUSE.
This is what it looked like: bitbucket.org/openattic/openattic/.../grafana_proxy.py
Of course, this would make it possible to control Grafana authentication through the proxy in Ceph Dashboard, so that graphs would only be visible to authenticated users.
Best of Both Worlds
The best of both worlds turned out to be using neither Grafana's proxy settings nor content rewriting, in other words, not proxying at all. This was achieved by embedding Grafana graphs that were publicly available, so that no authentication was required. As a result, Grafana dashboards can be embedded in Ceph Dashboard, and Grafana can still be reached and used directly.
This is the solution we currently have in Ceph Dashboard.
Grafana could be left out of the equation and be replaced by a frontend library which renders the graphs natively in Ceph Dashboard. The frontend could either query the backend, where a Prometheus proxy would be implemented, or query Prometheus directly (which might also not be wanted). As we currently do not rely on any other features of Grafana, like alerting, this would be possible.
Pros:
- Probably really nice-looking, native widgets (smoother integration)
- No need for Grafana and no frontend-proxying issues
- Possibly easier to write a proxy from Ceph Dashboard to Prometheus
- Improved security (better isolation, authentication would always be required)
- Fewer certificates to configure
Cons:
- Quite some work
- Probably even more work to get on par with Grafana features
- Creating and maintaining dashboards might be more difficult or even impossible for non-developers, depending on the solution
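As a rough sketch of the backend-side Prometheus proxy mentioned above, the following assumes the backend simply forwards a range query to the Prometheus HTTP API (/api/v1/query_range). The base URL prometheus.internal:9090, the function names and the metric in the usage line are assumptions for illustration only. A real endpoint would also need authentication and some restriction on which queries are allowed.

```python
import json
import urllib.parse
import urllib.request


def build_query_range_url(prometheus_base: str, query: str,
                          start: float, end: float, step: int) -> str:
    """Build the upstream URL for a Prometheus range query."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{prometheus_base.rstrip('/')}/api/v1/query_range?{params}"


def fetch_range(prometheus_base: str, query: str,
                start: float, end: float, step: int, timeout: float = 5.0):
    """Fetch the series for a range query; this is what the frontend widget would render."""
    url = build_query_range_url(prometheus_base, query, start, end, step)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload.get('error')}")
    return payload["data"]["result"]


# Only builds the URL; fetch_range() needs a reachable Prometheus instance.
print(build_query_range_url("http://prometheus.internal:9090", "ceph_health_status", 1000, 1060, 15))
```

Since the response is plain JSON, nothing has to be rewritten on the way back to the browser, which is why this proxy is likely simpler than one in front of Grafana.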
By removing Grafana from the stack, it would not need to be reached separately and outside of Ceph Dashboard. This may also be seen as a disadvantage by Grafana users who enjoy its integration, though it would serve as a plus in terms of the goals of this ticket (enhanced security, smoother integration, caching).
Though, I'm not sure if caching should be considered.
#3 Updated by Ernesto Puerta 5 months ago
Thanks a lot for your feedback, Patrick. Much appreciated since it comes from battleground experience. I browsed the old OpenATTIC JIRA for this exact information.
I also like the native widget approach, but it might still require a lot of exploratory/PoC work that, with all the releases in flight, we can barely afford... Perhaps a good subject for our next GSoC/Outreachy internship! The proxy approach seemed a bit more straightforward, given we already have the infra to deal with 3rd party HTTP endpoints (RGW, iSCSI, Prom, ...).
#4 Updated by Patrick Seidensal 4 months ago
A proxy would indeed be faster to implement than native widgets, and if you want to evaluate this possibility again, that's fine. The crux is the ability to access Grafana from outside the dashboard. That alone is a hurdle. As far as I know, Paul Cuzner also wanted to keep this possibility. If you can find a better way to rewrite the content of the transmissions and create a proxy that is easier to maintain, why not. But then Grafana would still be accessible to users, which actually contradicts some of the goals.
This is perhaps more a question of what we want than a technical problem. Do we really want to deny users access to the Grafana instance? If not, and a proxy is still wanted for other reasons, then the dashboard should be extended to allow full access to Grafana. In that case we could simply use the supported way for Grafana to work behind a reverse proxy, and no content would need to be rewritten. The proxy implementation would be simple and easy to maintain.
If, on the other hand, users should no longer be able to access Grafana directly, the supported way to configure a reverse proxy in Grafana would also be the simplest (and easiest to maintain) basis for such a proxy.
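To illustrate why the supported way keeps the proxy simple: the proxy only has to map the incoming path to the upstream Grafana URL, and response bodies are passed through unchanged. A minimal sketch follows, assuming Grafana is published under /grafana and runs at grafana.internal:3000 (both made up here). When serve_from_sub_path is enabled, Grafana expects the sub-path in the request. Otherwise the proxy strips it before forwarding.

```python
def upstream_url(request_path: str, query_string: str, grafana_base: str,
                 sub_path: str = "/grafana", serve_from_sub_path: bool = True) -> str:
    """Map a path requested on the dashboard to the URL to fetch from Grafana."""
    sub_path = "/" + sub_path.strip("/")
    if request_path != sub_path and not request_path.startswith(sub_path + "/"):
        raise ValueError(f"{request_path!r} is not below {sub_path!r}")

    if serve_from_sub_path:
        # Grafana serves from the sub-path itself: forward the path untouched.
        path = request_path
    else:
        # Grafana serves from "/": strip the sub-path before forwarding.
        path = request_path[len(sub_path):] or "/"

    url = grafana_base.rstrip("/") + path
    return f"{url}?{query_string}" if query_string else url


print(upstream_url("/grafana/d/abc/overview", "orgId=1", "http://grafana.internal:3000"))
```

Either way, no HTML or JavaScript is inspected, so the proxy does not depend on the Grafana version.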