Bug #7991
closed
Added by Andrei Mikhailovsky about 10 years ago.
Updated almost 10 years ago.
Description
I've had an issue with ceph-mon crashing. It happened twice over the course of the last two weeks. Attached are the ceph-mon log files from two mon servers. I have three in total, but the crash happened on the two servers whose logs I am sending.
The log files grew to around 7 GB in size. Even bz2-compressed they exceed the allowed upload size, so I will try to coordinate the upload over IRC.
Logs have been uploaded via cephdrop@ceph.com in issue7991 folder. Thanks.
Cluster details:
Ubuntu 12.04 - 3 x ceph mons and 2 x ceph osds.
Ceph Emperor
Cluster usage - CloudStack + qemu 1.5.0 + rbd vm volumes.
Thanks
- Assignee set to Joao Eduardo Luis
- Status changed from New to 4
There is no evidence of a crash in the logs.
One of the monitors appears to be working fine.
The other monitor shut down because it reached the critical level of available disk space on its data store:
2014-03-24 16:24:08.079989 7ff89584b700 0 mon.arh-ibstorage1-ib@1(peon).data_health(56710) update_stats avail 5% total 14286320 used 12782020 avail 771936
2014-03-24 16:24:08.080251 7ff89584b700 -1 mon.arh-ibstorage1-ib@1(peon).data_health(56710) reached critical levels of available space on data store -- shutdown!
2014-03-24 16:24:08.080257 7ff89584b700 0 ** Shutdown via Data Health Service **
2014-03-24 16:24:08.080284 7ff893e46700 -1 mon.arh-ibstorage1-ib@1(peon) e11 *** Got Signal Interrupt ***
2014-03-24 16:24:08.080307 7ff893e46700 1 mon.arh-ibstorage1-ib@1(peon) e11 shutdown
2014-03-24 16:24:08.080357 7ff893e46700 0 quorum service shutdown
2014-03-24 16:24:08.080370 7ff893e46700 0 mon.arh-ibstorage1-ib@1(shutdown).health(56710) HealthMonitor::service_shutdown 1 services
2014-03-24 16:24:08.080375 7ff893e46700 0 quorum service shutdown
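For context, the Data Health Service shuts a monitor down once the free space on its data store drops below a configurable threshold. A minimal ceph.conf sketch of the relevant options (the 5% critical default matches the "avail 5%" line in the log above; the exact option names should be verified against the Emperor documentation for your version):

```ini
[mon]
; warn once available space on the mon data store falls below 30% (default)
mon data avail warn = 30
; shut the monitor down below 5% available (default) -- the behaviour seen here
mon data avail crit = 5
```

The fix in this case is operational rather than a config change: free up space on the monitor's data partition (e.g. compact the store or move the mon data directory to a larger disk) rather than lowering the critical threshold.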
- Project changed from rbd to Ceph
- Category set to Monitor
- Status changed from 4 to Can't reproduce
- Status changed from Can't reproduce to Rejected