Bug #64733

open

Monitor keeps crashing on 1 specific node

Added by Leon Streichardt about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: Monitor
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,
One monitor in my 3-node cluster keeps crashing. If I remove the daemon from the cluster and add it again, it runs fine for a few hours or days. Once it crashes, it crashes again on every restart until the restart attempts stop and it stays offline.
Other Ceph services on that node, such as OSDs, RGW gateways, and managers, are working fine and show no issues. The other nodes, which have a very similar configuration apart from a different CPU and motherboard, do not show this issue.

The monitor's underlying storage is ZFS, which shows no errors during normal operation or when running a scrub, so I assume the storage is working as expected and is not causing these crashes.

(The logs have unfortunately rotated since the last crash, but I hope to provide new logs after the next crash.)
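Until fresh logs are available, the saved output of ceph crash info {id} is the main evidence, so a short summary of it may help when comparing crashes. The following is only a minimal sketch: it assumes the JSON contains keys such as crash_id, timestamp, entity_name, ceph_version, assert_msg, and backtrace, which is typical for Ceph crash reports but has not been checked against the attached file. The helper name summarize_crash is hypothetical, and missing keys are tolerated.

```python
import json
import sys

# Key names are assumptions based on typical `ceph crash info` output;
# any key that is absent is reported as "(not present)".
FIELDS = ("crash_id", "timestamp", "entity_name", "ceph_version", "assert_msg")


def summarize_crash(info: dict, max_frames: int = 10) -> str:
    """Return a short, human-readable summary of one crash report."""
    lines = [f"{name}: {info.get(name, '(not present)')}" for name in FIELDS]
    backtrace = info.get("backtrace") or []
    lines.append(
        f"backtrace ({len(backtrace)} frames, showing up to {max_frames}):"
    )
    lines.extend(f"  {frame}" for frame in backtrace[:max_frames])
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_crash.py crash-info.json
    with open(sys.argv[1], encoding="utf-8") as fh:
        print(summarize_crash(json.load(fh)))
```

Running it against each saved crash report and comparing the assert_msg and the top backtrace frames should show whether the repeated crashes fail at the same place.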


Files

crash-info.json (2.8 KB) - The output from ceph crash info {id} - Leon Streichardt, 03/06/2024 07:21 AM
