Feature #27049
openmgr/dashboard: retrieve "Data Health" info from dashboard backend
0%
Description
We want to add a chart/info in Dashboard Landing Page
that shows "Data Health" based on PG info as recommended by John Spray:
Ceph already internally has an opinion about the health of PGs
(resulting in the PG_DAMAGED, PG_UNAVAILABLE, PG_DEGRADED,
PG_DEGRADED_FULL health checks). You can see where that's done here:
https://github.com/ceph/ceph/blob/master/src/mon/PGMap.cc#L2179The key point about this is that it's not the status of a PG that
users are really interested in: it's the health of their data (one
difference is that data can't be "working"). So the classifications
we have for the health are:
- Damaged: some (perhaps unrecoverable) data loss has occurred, Ceph
can't fix itself without help
- Degraded: the data is there and accessible, but not as redundant as
we would like
- Unavailable: we can't get at the data right now, but we believe
it's still stored somewhere.So: if the goal is a simplified way to show PG health in the UI, my
suggestion is not to call it PG health at all, call it "data health"
and use exactly the same mappings that we use for the existing health
checks.
AFAIK, statistics that are calculated in the C++ code are required in order
to show the data correctly, but are not currently exposed to python modules.
The goal:
To be able to retrieve this data health group info from dashboard backend
in order to show it in Landing Page.