Bug #7991

closed

ceph-mon crash

Added by Andrei Mikhailovsky about 10 years ago. Updated almost 10 years ago.

Status: Rejected
Priority: Normal
Category: Monitor
Target version: -
% Done: 0%
Source: other
Tags: ceph-mon, rbd
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've had an issue with ceph-mon crashing. It has happened twice over the course of the last two weeks. Attached are the ceph-mon log files from two mon servers. I have three in total, but the crash happened on the two servers whose logs I am sending.

The log files grew to around 7 GB in size. Even compressed with bz2, they are larger than the allowed upload size. I will try to coordinate the upload over IRC.

Actions #1

Updated by Andrei Mikhailovsky about 10 years ago

Logs have been uploaded in the issue7991 folder. Thanks.

Cluster details:

Ubuntu 12.04 - 3 x ceph mons and 2 x ceph osds.
Ceph Emperor
Cluster usage - CloudStack + qemu 1.5.0 + rbd vm volumes.

Thanks

Actions #2

Updated by Ian Colle about 10 years ago

  • Assignee set to Joao Eduardo Luis
Actions #3

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from New to 4

There is no evidence of a crash in the logs.

One of the monitors appears to be working fine.

The other monitor shut down because the available disk space on its data store reached a critical level:

2014-03-24 16:24:08.079989 7ff89584b700  0 mon.arh-ibstorage1-ib@1(peon).data_health(56710) update_stats avail 5% total 14286320 used 12782020 avail 771936
2014-03-24 16:24:08.080251 7ff89584b700 -1 mon.arh-ibstorage1-ib@1(peon).data_health(56710) reached critical levels of available space on data store -- shutdown!
2014-03-24 16:24:08.080257 7ff89584b700  0 ** Shutdown via Data Health Service **
2014-03-24 16:24:08.080284 7ff893e46700 -1 mon.arh-ibstorage1-ib@1(peon) e11 *** Got Signal Interrupt ***
2014-03-24 16:24:08.080307 7ff893e46700  1 mon.arh-ibstorage1-ib@1(peon) e11 shutdown
2014-03-24 16:24:08.080357 7ff893e46700  0 quorum service shutdown
2014-03-24 16:24:08.080370 7ff893e46700  0 mon.arh-ibstorage1-ib@1(shutdown).health(56710) HealthMonitor::service_shutdown 1 services
2014-03-24 16:24:08.080375 7ff893e46700  0 quorum service shutdown
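
To check other monitor logs for the same condition, the update_stats line above can be parsed and compared against the critical threshold. This is a minimal sketch, not the monitor's actual code: the regular expression is written against the one log line shown here, the function names are hypothetical, and the threshold of 5 is an assumption based on the usual default of the `mon_data_avail_crit` option, so verify the value configured on your cluster. The log does not state units for the total/used/avail figures, so the sketch leaves them unlabeled.

```python
import re

# Assumed critical threshold in percent (believed to be the default of
# Ceph's `mon_data_avail_crit`); check your own configuration.
ASSUMED_CRIT_PERCENT = 5

# Pattern derived from the data_health log line quoted in this ticket.
STATS_RE = re.compile(
    r"update_stats avail (?P<pct>\d+)% "
    r"total (?P<total>\d+) used (?P<used>\d+) avail (?P<avail>\d+)"
)


def parse_update_stats(line):
    """Return (avail_percent, total, used, avail), or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return tuple(int(m.group(k)) for k in ("pct", "total", "used", "avail"))


def is_critical(avail_percent, crit_percent=ASSUMED_CRIT_PERCENT):
    """Assumption: the monitor shuts down when avail% <= the threshold."""
    return avail_percent <= crit_percent


line = (
    "2014-03-24 16:24:08.079989 7ff89584b700  0 "
    "mon.arh-ibstorage1-ib@1(peon).data_health(56710) "
    "update_stats avail 5% total 14286320 used 12782020 avail 771936"
)
stats = parse_update_stats(line)
print(stats, is_critical(stats[0]))
```

For the line quoted in this ticket, the reported 5% sits at the assumed threshold, which is consistent with the "reached critical levels of available space" message that follows it.
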
Actions #4

Updated by Josh Durgin almost 10 years ago

  • Project changed from rbd to Ceph
  • Category set to Monitor
Actions #5

Updated by Sage Weil almost 10 years ago

  • Status changed from 4 to Can't reproduce
Actions #6

Updated by Sage Weil almost 10 years ago

  • Status changed from Can't reproduce to Rejected