Bug #2017
osd: segfault in snap trimmer
Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
While testing some reasonably solid changes to the rbd code, I ran across an OSD crash.
It looks like it happ
The YAML file for the test is below.
I will save some binaries, etc. on flak. Will update shortly to indicate where.
Here is the stack trace at the end of osd.0.log:
---------------------
2012-02-02 14:15:23.751408 7f5206624700 log [INF] : 2.5 scrub ok
*** Caught signal (Segmentation fault) **
 in thread 7f5204e21700
ceph version 0.41-68-g2a26295 (commit:2a262956947a3fa2f3d621b8af392bdecb653464)
1: /tmp/cephtest/binary/usr/local/bin/ceph-osd() [0x5979a4]
2: (()+0xfb40) [0x7f52154bdb40]
3: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather)+0x1e) [0x4dfcae]
4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim const&)+0x3a7) [0x4e0e47]
5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects, ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xcb) [0x52818b]
6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x6b) [0x51f8cb]
7: (ReplicatedPG::snap_trimmer()+0x51e) [0x4bc3de]
8: (ThreadPool::worker()+0xa28) [0x5abdc8]
9: (ThreadPool::WorkThread::entry()+0xd) [0x57913d]
10: (()+0x7971) [0x7f52154b5971]
11: (clone()+0x6d) [0x7f5213b4092d]
---------------------
nuke-on-error:
targets:
  ubuntu@sepia18.ceph.dreamhost.com: ...
  ubuntu@sepia24.ceph.dreamhost.com: ...
  ubuntu@sepia26.ceph.dreamhost.com: ...
roles:
- [mon.a, mon.c, osd.0]
- [mon.b, mds.a, osd.1]
- [client.0]
kernel:
  osd:
    branch: wip-rbd-new-fixes
  client:
    branch: wip-rbd-new-fixes
tasks:
- ceph:
- kclient:
- rbd:
    all:
- workunit:
    client.0:
    - rbd/copy.sh
    - rbd/import_export.sh
    # The next two produce an error
    # - rbd/kernel.sh
    # - rbd/test_librbd_python.sh
    - rbd/test_librbd.sh
History
#1 Updated by Josh Durgin almost 12 years ago
The segfault was from trying to dereference repop->ctx->op, which was NULL.
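As a rough illustration of that failure mode (not the actual fix that was pushed), the sketch below guards the repop->ctx->op pointer chain before using it. The types Op, OpContext, and RepGather here are simplified, hypothetical stand-ins for the real Ceph structures, and the helper names has_client_op and eval_repop_sketch are invented for this example. It assumes that a repop created internally, for instance by the snap trimmer, may carry no originating client op.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for the Ceph types. Only the pointer
// chain repop->ctx->op is taken from the bug report.
struct Op { int id; };
struct OpContext { Op *op; };
struct RepGather { OpContext *ctx; };

// True only if every pointer in the chain repop->ctx->op is non-null.
bool has_client_op(const RepGather *repop) {
  return repop != nullptr && repop->ctx != nullptr && repop->ctx->op != nullptr;
}

// Sketch of an evaluation step that tolerates a repop without a client op.
// Returns the id of the op that would be replied to, or -1 if there is none.
int eval_repop_sketch(const RepGather *repop) {
  assert(repop != nullptr);  // the repop itself is expected to exist
  if (!has_client_op(repop)) {
    // Assumption: an internally generated repop has nobody to reply to.
    return -1;
  }
  return repop->ctx->op->id;
}
```

Calling eval_repop_sketch on a repop whose ctx->op is NULL returns -1 instead of dereferencing the null pointer, while a repop that carries an op returns that op's id.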
#2 Updated by Alex Elder almost 12 years ago
I bundled up the /tmp/cephtest directory in its entirety. It is here:
flak.ops.newdream.net:~elder/tracker_2017/sepia24-cephtest.tgz
#3 Updated by Sage Weil almost 12 years ago
- Status changed from New to Resolved
Pushed a fix for this (and another similar bug) to master.
#4 Updated by Alex Elder almost 12 years ago
Since Sage has fixed this, I've deleted the archive of /tmp/cephtest I had saved.