Bug #8520

closed

osd: segv in PushOp::print()

Added by Sage Weil almost 10 years ago. Updated over 9 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  -115> 2014-06-02 15:48:18.642958 7f73fab07700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f73fab07700

 ceph version 0.80-702-g9bac31b (9bac31be07098f4c75a209e35dc91b214b72a394)
 1: ceph-osd() [0x9750bf]
 2: (()+0x10340) [0x7f7413221340]
 3: (PushOp::print(std::ostream&) const+0x20) [0x6d01c0]
 4: (ReplicatedBackend::send_pushes(int, std::map<pg_shard_t, std::vector<PushOp, std::allocator<PushOp> >, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, std::vector<PushOp, std::allocator<PushOp> > > > >&)+0x482) [0x7d9e92]
 5: (ReplicatedBackend::do_pull(std::tr1::shared_ptr<OpRequest>)+0x6be) [0x7da77e]
 6: (ReplicatedBackend::handle_message(std::tr1::shared_ptr<OpRequest>)+0x41e) [0x90465e]
 7: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x2db) [0x7aea9b]
 8: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x459) [0x634e89]
 9: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x1f4) [0x635304]
 10: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0x9c) [0x668eec]
 11: (ThreadPool::worker(ThreadPool::WorkThread*)+0xaf1) [0xa48831]
 12: (ThreadPool::WorkThread::entry()+0x10) [0xa49720]
 13: (()+0x8182) [0x7f7413219182]
 14: (clone()+0x6d) [0x7f74115ba30d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

ubuntu@teuthology:/a/teuthology-2014-06-02_02:30:05-rados-master-testing-basic-plana/285586
Actions #1

Updated by Samuel Just almost 10 years ago

  • Status changed from New to Need More Info

I can't find this thread in the core dump, which is odd. Marking as Need More Info for now.

Actions #2

Updated by Samuel Just almost 10 years ago

  • Status changed from Need More Info to Can't reproduce

Haven't seen it recur lately; marking as Can't reproduce.

Actions #3

Updated by Sage Weil over 9 years ago

  • Status changed from Can't reproduce to 12

Program terminated with signal SIGSEGV, Segmentation fault.
#0  PushOp::print (this=this@entry=0xffffffff00000013, out=...) at /usr/include/c++/4.8/bits/stl_map.h:435
435           { return _M_t.size(); }
(gdb) bt
#0  PushOp::print (this=this@entry=0xffffffff00000013, out=...) at /usr/include/c++/4.8/bits/stl_map.h:435
#1  0x00000000006ff6ee in operator<< (out=..., op=...) at osd/osd_types.cc:4154
#2  0x00000000007ca1c2 in ReplicatedBackend::send_pushes (this=this@entry=0x24a8a00, prio=10, pushes=...) at osd/ReplicatedPG.cc:8444
#3  0x00000000007caaae in ReplicatedBackend::do_pull (this=this@entry=0x24a8a00, op=...) at osd/ReplicatedPG.cc:2200
#4  0x000000000090415e in ReplicatedBackend::handle_message (this=0x24a8a00, op=...) at osd/ReplicatedBackend.cc:138
#5  0x000000000079df6b in ReplicatedPG::do_request (this=0x257ec00, op=..., handle=...) at osd/ReplicatedPG.cc:1113
#6  0x00000000005fca01 in OSD::dequeue_op (this=0x218c000, pg=..., op=..., handle=...) at osd/OSD.cc:7715
#7  0x0000000000617054 in OSD::OpWQ::_process (this=0x218ce28, pg=..., handle=...) at osd/OSD.cc:7685
#8  0x0000000000659ebc in ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process (this=0x218ce28, handle=...) at ./common/WorkQueue.h:190
#9  0x0000000000a4d951 in ThreadPool::worker (this=0x218c470, wt=0x2154220) at common/WorkQueue.cc:125
#10 0x0000000000a4e840 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:317
#11 0x00007fb7a175e182 in start_thread (arg=0x7fb78b678700) at pthread_create.c:312
#12 0x00007fb79fed238d in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#13 0x0000000000000000 in ?? ()

The log is incomplete, but the core is good: ubuntu@teuthology:/a/sage-2014-08-10_18:40:12-rados-firefly-next-distro-basic-multi/414840

#1  0x00000000006ff6ee in operator<< (out=..., op=...) at osd/osd_types.cc:4154
4154    osd/osd_types.cc: No such file or directory.
(gdb) p op
$6 = (const PushOp &) @0xffffffff00000013: <error reading variable>
(gdb) up
#2  0x00000000007ca1c2 in ReplicatedBackend::send_pushes (this=this@entry=0x24a8a00, prio=10, pushes=...) at osd/ReplicatedPG.cc:8444
8444    osd/ReplicatedPG.cc: No such file or directory.
(gdb) p j
$7 = {_M_current = 0xffffffff00000013}
(gdb) p pushes
$9 = 0

The function argument pushes can't be inspected from this frame because it is shadowed by a local int pushes.
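
The shadowing that hides the argument from gdb can be illustrated with a minimal C++ sketch. The function names below are hypothetical and this is not the Ceph code; it only shows that inside a nested block a local named pushes wins name lookup over a parameter of the same name, which is consistent with p pushes printing 0 in the session above.

```cpp
// Hypothetical illustration only; these functions are not part of Ceph.
// Requires C++14 or later (constexpr functions with local variables).

// Returns the value that the name `pushes` resolves to inside the nested
// block, where a local int of the same name shadows the parameter.
constexpr int value_seen_in_inner_scope(int pushes) {
  int seen = pushes;   // refers to the parameter here
  {
    int pushes = 0;    // local int shadows the parameter in this block
    seen = pushes;     // refers to the local, not the parameter
  }
  return seen;
}

// Returns the value that `pushes` resolves to once the nested block has
// ended and the parameter is visible again.
constexpr int value_seen_in_outer_scope(int pushes) {
  {
    int pushes = 0;    // shadows the parameter only until the closing brace
    (void)pushes;
  }
  return pushes;       // the parameter again
}

static_assert(value_seen_in_inner_scope(42) == 0, "local shadows the parameter");
static_assert(value_seen_in_outer_scope(42) == 42, "parameter visible after the block");
```

A debugger stopped inside the inner block resolves the name the same way the compiler does, so it reports the local. In gdb, info args lists the arguments of the selected frame and may be a way around the shadowing, although that was not tried in this report and optimized builds may still show the value as unavailable.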

The core is on vpm022, with matching packages installed.

Actions #4

Updated by Samuel Just over 9 years ago

  • Status changed from 12 to Can't reproduce