Ceph : Issues (https://tracker.ceph.com/)
Dashboard - Bug #50491 (Resolved): mgr/dashboard: centralized logging
https://tracker.ceph.com/issues/50491, 2021-04-22T17:04:15Z, Ernesto Puerta
<a name="User-Story"></a>
<h3 >User Story<a href="#User-Story" class="wiki-anchor">¶</a></h3>
As a Ceph operator, I want a unified view of the logs from the different daemons, so that:
<ul>
<li>I can perform a backward/post-mortem analysis of events leading to an issue,</li>
<li>I can monitor cluster events in real-time.</li>
</ul>
<a name="Persona"></a>
<h3 >Persona<a href="#Persona" class="wiki-anchor">¶</a></h3>
<ul>
<li>Ceph cluster operator/sys admin</li>
<li>Support engineer</li>
<li>Developers</li>
</ul>
<a name="Context"></a>
<h3 >Context<a href="#Context" class="wiki-anchor">¶</a></h3>
Every daemon in Ceph stores its logs locally (there's a "cluster log", but it's extremely concise and not useful for troubleshooting). This means that a user who wants to perform a post-mortem analysis of an issue first has to collect log traces from multiple hosts, which involves the following steps for each daemon:
<ol>
<li>Identify on which host a daemon is running,</li>
<li>Log in to that host,</li>
<li>Look for the log file in the filesystem,</li>
<li>Open the log and perform a search.</li>
</ol>
<p>To debug a Ceph issue, users often have to follow the operational events from multiple daemons, so this task becomes more complicated with every daemon involved. Additionally, it's almost impossible to perform real-time (vs. post-mortem) troubleshooting.</p>
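<p>To make the goal concrete, here is a minimal Python sketch of what a "unified view" means for post-mortem analysis: log lines that were already collected from several daemons are merged into one time-ordered stream. The timestamp layout, the daemon names, and the sample lines are assumptions for illustration, not the actual Ceph log format or a proposed implementation.</p>

```python
import heapq
from datetime import datetime
from typing import Dict, Iterable, Iterator, Tuple

# Assumed timestamp layout at the start of each line, e.g.
# "2021-04-22T17:04:15.100000+0000 <message>"; adjust to the real log format.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"

LogEntry = Tuple[datetime, str, str]  # (timestamp, daemon, message)


def parse_lines(daemon: str, lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield an entry for every line that starts with a parsable timestamp."""
    for line in lines:
        stamp, _, message = line.rstrip("\n").partition(" ")
        try:
            yield datetime.strptime(stamp, TS_FORMAT), daemon, message
        except ValueError:
            continue  # skip lines without a leading timestamp


def merge_logs(logs: Dict[str, Iterable[str]]) -> Iterator[LogEntry]:
    """Merge per-daemon logs (each already in time order) into one stream."""
    return heapq.merge(*(parse_lines(name, lines) for name, lines in logs.items()))


# Hypothetical sample data standing in for files gathered from two hosts.
sample = {
    "osd.0": [
        "2021-04-22T17:04:15.100000+0000 heartbeat timeout",
        "2021-04-22T17:04:17.000000+0000 restarted",
    ],
    "mon.a": [
        "2021-04-22T17:04:16.500000+0000 calling election",
    ],
}

merged = list(merge_logs(sample))
for ts, daemon, message in merged:
    print(ts.isoformat(), daemon, message)
```

<p>The sketch only covers the merge step; locating the daemons, reaching their hosts, and finding the log files are exactly the manual steps that a centralized logging stack would take over.</p>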
<a name="Implementation-details"></a>
<h3 >Implementation details<a href="#Implementation-details" class="wiki-anchor">¶</a></h3>
<p>Multiple stacks are to be explored: ELK, Fluentd, Loki, etc.</p>
<p>The resulting log view might be embedded via an iframe, as is already done for the Grafana dashboards, or accessed stand-alone.</p>
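<p>Purely to illustrate one of the candidates, the following sketch builds the JSON body that Loki's push endpoint (<code>POST /loki/api/v1/push</code>) accepts: a list of streams, each with a label set and a list of <code>[timestamp_ns, line]</code> pairs. The label names (<code>daemon</code>, <code>host</code>) and the sample entry are hypothetical, and nothing here implies that Loki is the stack that will be selected.</p>

```python
import json
from typing import Dict, Iterable, Tuple


def loki_push_payload(labels: Dict[str, str],
                      entries: Iterable[Tuple[int, str]]) -> str:
    """Build a JSON body for Loki's push endpoint.

    entries are (unix_time_in_nanoseconds, log_line) pairs; Loki expects
    the timestamp to be serialized as a string.
    """
    body = {
        "streams": [
            {
                "stream": labels,
                "values": [[str(ts_ns), line] for ts_ns, line in entries],
            }
        ]
    }
    return json.dumps(body)


# Hypothetical labels and log entry, for illustration only.
payload = loki_push_payload(
    {"daemon": "osd.0", "host": "node1"},
    [(1619111055000000000, "heartbeat timeout")],
)
print(payload)
```

<p>Labelling each stream by daemon and host is one possible design choice; it would let a single query follow events across daemons without the operator logging in to each host.</p>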
<a name="References"></a>
<h3 >References<a href="#References" class="wiki-anchor">¶</a></h3>
<p><a href="https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-assets-prod/summits/26/presentations/23563/slides/Know-more-about-your-Ceph-cluster-ELK-stack2.pdf" class="external">SUSE's Ceph + ELK</a></p>