Bug #7991
ceph-mon crash (closed)
Description
I've had an issue with ceph-mon crashing. It happened twice over the last two weeks. Attached are the ceph-mon log files from two mon servers. I have three mon servers in total, but the crash happened on the two whose logs I am sending.
The log files grew to around 7 GB in size. Even compressed with bz2, they are larger than the allowed upload size. I will try to coordinate the upload over IRC.
Updated by Andrei Mikhailovsky about 10 years ago
Logs have been uploaded via cephdrop@ceph.com in issue7991 folder. Thanks.
Cluster details:
Ubuntu 12.04 - 3 x ceph mons and 2 x ceph osds.
Ceph Emperor
Cluster usage - CloudStack + qemu 1.5.0 + rbd vm volumes.
Thanks
Updated by Joao Eduardo Luis about 10 years ago
- Status changed from New to 4
There is no evidence of a crash in the logs.
One of the monitors appears to be working fine.
The other monitor shut itself down because available disk space on its data store reached the critical level:
2014-03-24 16:24:08.079989 7ff89584b700 0 mon.arh-ibstorage1-ib@1(peon).data_health(56710) update_stats avail 5% total 14286320 used 12782020 avail 771936
2014-03-24 16:24:08.080251 7ff89584b700 -1 mon.arh-ibstorage1-ib@1(peon).data_health(56710) reached critical levels of available space on data store -- shutdown!
2014-03-24 16:24:08.080257 7ff89584b700 0 ** Shutdown via Data Health Service **
2014-03-24 16:24:08.080284 7ff893e46700 -1 mon.arh-ibstorage1-ib@1(peon) e11 *** Got Signal Interrupt ***
2014-03-24 16:24:08.080307 7ff893e46700 1 mon.arh-ibstorage1-ib@1(peon) e11 shutdown
2014-03-24 16:24:08.080357 7ff893e46700 0 quorum service shutdown
2014-03-24 16:24:08.080370 7ff893e46700 0 mon.arh-ibstorage1-ib@1(shutdown).health(56710) HealthMonitor::service_shutdown 1 services
2014-03-24 16:24:08.080375 7ff893e46700 0 quorum service shutdown
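For anyone wanting to check the numbers in the update_stats line, here is a minimal Python sketch that parses it and recomputes the available percentage. The helper names (parse_update_stats, is_critical) are hypothetical and not part of Ceph. The 5% critical threshold is an assumption based on the usual default of the "mon data avail crit" option, and the integer truncation and "less than or equal" comparison are guesses at how the monitor evaluates it, so check your own ceph.conf and version.

```python
import re

# Hypothetical helper, not part of Ceph: matches the data_health
# "update_stats" line format seen in the log excerpt above.
STATS_RE = re.compile(
    r"update_stats avail (\d+)% total (\d+) used (\d+) avail (\d+)"
)

# Assumption: critical threshold of 5%, the usual default for the
# "mon data avail crit" option. Your cluster may be configured differently.
ASSUMED_CRIT_PERCENT = 5


def parse_update_stats(line):
    """Return the disk stats from an update_stats log line, or None."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    reported_pct, total, used, avail = (int(g) for g in m.groups())
    return {
        "reported_pct": reported_pct,
        "total": total,
        "used": used,
        "avail": avail,
        # Recompute the percentage with integer truncation (an assumption
        # about how the monitor rounds).
        "computed_pct": avail * 100 // total,
    }


def is_critical(stats, crit_percent=ASSUMED_CRIT_PERCENT):
    """True if available space is at or below the assumed threshold."""
    return stats["computed_pct"] <= crit_percent


LINE = (
    "2014-03-24 16:24:08.079989 7ff89584b700 0 "
    "mon.arh-ibstorage1-ib@1(peon).data_health(56710) "
    "update_stats avail 5% total 14286320 used 12782020 avail 771936"
)

stats = parse_update_stats(LINE)
print(stats["computed_pct"], is_critical(stats))
```

With the values from the log, 771936 out of 14286320 is about 5.4% available, which truncates to the 5% the monitor reported. Under the assumed threshold that counts as critical, which is consistent with the shutdown message on the next log line.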
Updated by Josh Durgin almost 10 years ago
- Project changed from rbd to Ceph
- Category set to Monitor
Updated by Sage Weil almost 10 years ago
- Status changed from 4 to Can't reproduce
Updated by Sage Weil almost 10 years ago
- Status changed from Can't reproduce to Rejected