Ceph : Issues (https://tracker.ceph.com/, updated 2022-07-08T17:26:58Z)
Dashboard - Bug #56514 (New): mgr/dashboard: paginate alerts (https://tracker.ceph.com/issues/56514, 2022-07-08T17:26:58Z, Ernesto Puerta)
<p>This was an unknown unknown: we didn't expect this to become such a massive endpoint. Some alerts are triggered per daemon, so an issue affecting OSDs triggers one alert per OSD: 8k entries for an 8k-OSD cluster with 1 alert. If each OSD triggers 2 or 3 alerts, that number multiplies by 2 or 3, to 16k or 24k items.</p>
<p>Target count: 32k alerts.</p>
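<p>The multiplication above can be checked with a few lines of Python. This is only a worked example: the function name is made up for illustration, and "8k" is taken to mean 8,000 OSDs.</p>

```python
def expected_alert_count(num_osds: int, alerts_per_osd: int) -> int:
    """Worst-case number of alert entries when every OSD fires the same alerts."""
    return num_osds * alerts_per_osd


# Assumption: "8k" in the issue text means 8,000 OSDs.
for per_osd in (1, 2, 3, 4):
    print(per_osd, expected_alert_count(8_000, per_osd))
```

<p>For reference, 4 alerts per OSD on the same cluster would give 32,000 entries.</p>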
<p>Strategy: Prometheus Alerts API (<a class="external" href="https://prometheus.io/docs/prometheus/latest/querying/api/#alerts">https://prometheus.io/docs/prometheus/latest/querying/api/#alerts</a>).</p>
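<p>As a rough sketch of what server-side paging over that API's response could look like: the Prometheus alerts endpoint returns the alerts under <code>data.alerts</code>, and the helper below slices that list by offset and limit, optionally filtering by <code>state</code>. The function name, its parameters, the returned envelope, and the sample alert names are assumptions made for illustration, not the dashboard's actual implementation.</p>

```python
from typing import Any, Dict, Optional


def paginate_alerts(payload: Dict[str, Any], offset: int = 0, limit: int = 10,
                    state: Optional[str] = None) -> Dict[str, Any]:
    """Return one page of alerts from a Prometheus /api/v1/alerts style payload."""
    alerts = payload.get("data", {}).get("alerts", [])
    if state is not None:
        # Filter before slicing so 'total' reflects the filtered set.
        alerts = [a for a in alerts if a.get("state") == state]
    return {
        "total": len(alerts),
        "offset": offset,
        "limit": limit,
        "alerts": alerts[offset:offset + limit],
    }


# Hypothetical sample payload: 25 per-OSD alerts, alternating firing/pending.
sample = {
    "status": "success",
    "data": {"alerts": [
        {"labels": {"alertname": "ExampleOSDAlert", "ceph_daemon": f"osd.{i}"},
         "state": "firing" if i % 2 == 0 else "pending"}
        for i in range(25)
    ]},
}

last_page = paginate_alerts(sample, offset=20, limit=10)
firing_only = paginate_alerts(sample, state="firing")
```

<p>Returning the total alongside the page lets a table render its pager without fetching every entry.</p>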
From Pawsey:
<ul>
<li>"The generic table does [not] work that well across all the use cases. For example in the monitoring it would make sense to have a checkboxes for active/suppressed for example. As it stands you have an overlay number on the monitoring menu item indicating the number of issues, but when you look at the alerts you see suppressed as well!"</li>
</ul>
Dashboard - Bug #56513 (Pending Backport): mgr/dashboard: paginate hosts (https://tracker.ceph.com/issues/56513, 2022-07-08T17:20:57Z, Ernesto Puerta)
<p>Target count: 100-200s.</p>
<p>Strategy: the orchestrator API should be the source of truth for this (the Ceph server API can retrieve hostnames for Ceph-only services, but we also need to display the hosts of non-Ceph services).</p>
From Pawsey:
<ul>
<li>"The hosts view takes about 1sec to load."</li>
</ul>
Dashboard - Feature #56512 (Pending Backport): mgr/dashboard: paginate services (https://tracker.ceph.com/issues/56512, 2022-07-08T17:17:25Z, Ernesto Puerta)
<p>Target count: 8k+. Services could grow at a similar pace to OSDs.</p>
<p>Strategy: Some service instances are more stable (stateful ones, like OSDs or mons), while others (stateless) can easily be scaled up or down, relocated, or removed. However, the count of those is negligible compared to the OSDs.</p>
From Pawsey:
<ul>
<li>"hosts services shows some but not all entries. You see 2 in the main table, but when you drill down you see there are other daemons " </li>
<li>"some daemon info is showing lag (service-03 last refresh) - problem is how is this issue highlighted? I don’t think it is."</li>
</ul>
Dashboard - Bug #56511 (New): mgr/dashboard: paginate OSDs (https://tracker.ceph.com/issues/56511, 2022-07-08T17:12:46Z, Ernesto Puerta)
<p>Target count: 8k</p>
<p>Strategy: The OSD map is a core data structure that is often exchanged between Ceph services. Efficient API calls allow retrieving the list of OSDs, though OSD metadata might require extra calls.</p>
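<p>One way to read this strategy is to take the cheap OSD list first and pay for the extra metadata calls only for the rows on the visible page. The sketch below illustrates that idea; <code>osd_page</code> and <code>fake_fetch_metadata</code> are hypothetical names, and the metadata fetch is a stand-in rather than a real Ceph call.</p>

```python
from typing import Any, Callable, Dict, Iterable, List


def osd_page(osd_ids: Iterable[int],
             fetch_metadata: Callable[[int], Dict[str, Any]],
             page: int, page_size: int) -> List[Dict[str, Any]]:
    """Return one page of OSD rows, fetching metadata only for visible OSDs."""
    start = (page - 1) * page_size  # pages are 1-based
    visible = sorted(osd_ids)[start:start + page_size]
    return [{"id": osd_id, "metadata": fetch_metadata(osd_id)} for osd_id in visible]


# Stand-in for the per-OSD metadata call; records which OSDs were queried.
calls: List[int] = []


def fake_fetch_metadata(osd_id: int) -> Dict[str, Any]:
    calls.append(osd_id)
    return {"hostname": f"host-{osd_id % 100}"}


rows = osd_page(range(8000), fake_fetch_metadata, page=2, page_size=10)
```

<p>With 8,000 OSDs and a page size of 10, only 10 metadata lookups are made per page instead of 8,000.</p>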
From Pawsey:
<ul>
<li>"osd view takes 40+secs to populate the first page. During this period there is no indication that it is working at all - when you do get the display the UI is very slow no other link works - like the data is still be loaded in the background and consuming cycles. Firefox issues a pop up stating that page is slowing the browser down! I had to close the tab and relogin" </li>
<li>"osd view - refreshes every time you enter this menu option - each time there is a 40+sec lag"</li>
</ul>
Dashboard - Bug #56509 (New): mgr/dashboard: paginate inventory (https://tracker.ceph.com/issues/56509, 2022-07-08T17:09:51Z, Ernesto Puerta)
<p>Target count: 8k instances.</p>
<p>Since the data is unlikely to change for existing hosts, a fixed cache keyed by hostname could be used. However, the current hostnames need to be checked first to detect new or removed hosts.</p>
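<p>A minimal sketch of such a cache, assuming a hypothetical per-host fetch function: on each refresh the cached hostnames are compared with the current ones, entries for removed hosts are dropped, and inventory is fetched only for new hosts. The class and method names are illustrative, not taken from the dashboard code.</p>

```python
from typing import Any, Callable, Dict, Iterable, List


class InventoryCache:
    """Cache of per-host inventory, keyed by hostname."""

    def __init__(self, fetch_inventory: Callable[[str], Dict[str, Any]]):
        self._fetch = fetch_inventory
        self._cache: Dict[str, Dict[str, Any]] = {}

    def refresh(self, current_hostnames: Iterable[str]) -> Dict[str, Dict[str, Any]]:
        current = set(current_hostnames)
        cached = set(self._cache)
        # Drop entries for hosts that no longer exist.
        for hostname in cached - current:
            del self._cache[hostname]
        # Fetch inventory only for hosts not seen before.
        for hostname in sorted(current - cached):
            self._cache[hostname] = self._fetch(hostname)
        return dict(self._cache)


# Stand-in fetch function that records which hosts were actually queried.
fetched: List[str] = []


def fake_fetch(hostname: str) -> Dict[str, Any]:
    fetched.append(hostname)
    return {"devices": []}


cache = InventoryCache(fake_fetch)
cache.refresh(["host-a", "host-b"])
inventory = cache.refresh(["host-b", "host-c"])
```

<p>Existing hosts are never re-fetched here, which matches the assumption that their data is unlikely to change; a real implementation would probably still need some way to invalidate a single host.</p>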