Bug #19320
Inconsistent PG makes ceph-osd go down
Description
Hi all.
I am running a Ceph cluster.
There is an inconsistent PG:
pg 3.aff is active+recovery_wait+degraded+inconsistent, acting [267,463,157]
When I start osd.267, recovery does not complete, and then the OSD goes down.
The OSD log is in the attached file.
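For reference (not part of the original report), the PG state can be inspected with standard hammer-era CLI commands; "3.aff" is the PG id from the status line above:

```shell
# Inspect cluster health and the inconsistent PG.
ceph health detail            # lists inconsistent PGs and their acting sets
ceph pg 3.aff query           # dumps peering/recovery state for the PG
ceph pg 3.aff list_missing    # shows objects the PG still needs to recover
```

These only read state; since the OSD is crashing during snap trim rather than scrub, `ceph pg repair` alone is unlikely to help until the crash itself is addressed.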
ceph version
ceph 0.94.7-1trusty amd64 distributed storage and file system
dmesg log:
[Tue Mar 21 16:34:02 2017] init: ceph-osd (ceph/267) respawning too fast, stopped
[Tue Mar 21 18:57:20 2017] init: ceph-osd (ceph/267) main process (3423088) killed by SEGV signal
[Tue Mar 21 18:57:20 2017] init: ceph-osd (ceph/267) main process ended, respawning
[Tue Mar 21 19:09:59 2017] init: ceph-osd (ceph/267) main process (3452503) killed by SEGV signal
[Tue Mar 21 19:09:59 2017] init: ceph-osd (ceph/267) main process ended, respawning
[Tue Mar 21 19:10:57 2017] init: ceph-osd (ceph/267) main process (3482095) killed by SEGV signal
[Tue Mar 21 19:10:57 2017] init: ceph-osd (ceph/267) main process ended, respawning
[Tue Mar 21 19:11:12 2017] init: ceph-osd (ceph/267) main process (3486136) killed by SEGV signal
[Tue Mar 21 19:11:12 2017] init: ceph-osd (ceph/267) respawning too fast, stopped
Thanks all.
Hoan
Files
Updated by Kefu Chai about 7 years ago
- Is duplicate of Bug #16503: OSD's assert during snap trim osd/ReplicatedPG.cc: 2655: FAILED assert(0) added
Updated by Kefu Chai about 7 years ago
backtrace in the attached log_inconsistent.txt
ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
1: /usr/bin/ceph-osd() [0xab701a]
2: (()+0x10330) [0x7f6c8ae51330]
3: (ReplicatedPG::trim_object(hobject_t const&)+0x440) [0x85bdc0]
4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim const&)+0x427) [0x85e287]
5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects, ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xb4) [0x8bf1f4]
6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0x127) [0x8ab787]
7: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x84) [0x8ab954]
8: (ReplicatedPG::snap_trimmer()+0x52c) [0x82f7fc]
9: (OSD::SnapTrimWQ::_process(PG*)+0x1a) [0x6c43aa]
10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xba2a0e]
11: (ThreadPool::WorkThread::entry()+0x10) [0xba3ab0]
12: (()+0x8184) [0x7f6c8ae49184]
13: (clone()+0x6d) [0x7f6c893b437d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
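As the NOTE says, the raw frame addresses only become meaningful against a matching binary. A sketch, assuming a ceph-osd binary (with debug symbols installed) from exactly this build, 0.94.7 / d56bdf9:

```shell
# Resolve backtrace addresses against the crashing binary.
# The binary must match the build that produced the crash, or the
# addresses will point at the wrong code.
objdump -rdS /usr/bin/ceph-osd > ceph-osd.asm   # disassembly interleaved with source
addr2line -Cfe /usr/bin/ceph-osd 0x85bdc0       # e.g. frame 3, inside trim_object
```

`addr2line -C` demangles the C++ symbol, `-f` prints the function name, and `-e` selects the binary; the output pins the assert to a source line in ReplicatedPG::trim_object.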
Updated by Greg Farnum almost 7 years ago
- Project changed from Ceph to RADOS
- Category set to Snapshots
- Component(RADOS) OSD added
Hmm, did one of our official releases ship the broken snapshot-trimming backport semantics? I didn't think so, but maybe; check that when looking at this issue.
Updated by Greg Farnum almost 7 years ago
- Priority changed from Urgent to Normal