Bug #9625

closed

firefly: memory corruption

Added by Samuel Just over 9 years ago. Updated about 9 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I am guessing that these two coredumps are related.

#0 0x00007f1918142f07 in _dl_map_object_deps (map=map@entry=0x7f19183434e8, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648) at dl-deps.c:528
#1 0x00007f1918149aab in dl_open_worker (a=a@entry=0x7f1901981628) at dl-open.c:272
#2 0x00007f1918144ff4 in _dl_catch_error (objname=objname@entry=0x7f1901981618, errstring=errstring@entry=0x7f1901981620, mallocedp=mallocedp@entry=0x7f1901981610, operate=operate@entry=0x7f19181499a0 <dl_open_worker>, args=args@entry=0x7f1901981628) at dl-error.c:187
#3 0x00007f19181493bb in _dl_open (file=0x7f191605b4de "libgcc_s.so.1", mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=4, argv=0x7fff0b0cfe88, env=0x3366000) at dl-open.c:661
#4 0x00007f1916015002 in do_dlopen (ptr=ptr@entry=0x7f1901981840) at dl-libc.c:87
#5 0x00007f1918144ff4 in _dl_catch_error (objname=0x7f1901981820, errstring=0x7f1901981830, mallocedp=0x7f1901981810, operate=0x7f1916014fc0 <do_dlopen>, args=0x7f1901981840) at dl-error.c:187
#6 0x00007f19160150c2 in dlerror_run (args=0x7f1901981840, operate=0x7f1916014fc0 <do_dlopen>) at dl-libc.c:46
#7 __GI___libc_dlopen_mode (name=name@entry=0x7f191605b4de "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#8 0x00007f1915fe9c65 in init () at ../sysdeps/x86_64/backtrace.c:52
#9 0x00007f191786aa90 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#10 0x00007f1915fe9d7c in __GI___backtrace (array=<optimized out>, size=100) at ../sysdeps/x86_64/backtrace.c:103
#11 0x00000000009809af in BackTrace (s=0, this=0x7f1901981b70) at ./common/BackTrace.h:19
#12 handle_fatal_signal (signum=11) at global/signal_handler.cc:90
#13 <signal handler called>
#14 0x00007f191687a3b9 in std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x00007f191687af7b in std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#16 0x00007f191687b014 in std::string::reserve(unsigned long) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#17 0x00007f191687b0b8 in std::string::append(std::string const&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#18 0x00000000006adcbc in std::operator+<char, std::char_traits<char>, std::allocator<char> > (__lhs=..., __rhs=...) at /usr/include/c++/4.8/bits/basic_string.h:2369
#19 0x000000000089fff4 in LFNIndex (_error_injection_probability=<optimized out>, index_version=3,
base_path=0x6530033 "d\021\374-\032\003?O\024\037\222\032\372\035\215n4\005q\260\254\362\062\f\036\225\r\207<\253\216?2U\345\030[\230F\305>g,}O\262G\026\222\n\\\361\217)\223
\a2\217\307c\270A\210\360\200{\237\314\006F\a\235\300\364\343\a#\364]\332igK\016\017v\226n `Y\026\064V?5\006\336\021\347\367\244\247%\232t.\352\033\317F\026\373\001T\355^\212\063^$:\253\b\271\036\242\375\254\036\023C\001a\302\036Ew\253\017e4>\232>]K\312\363\004\276t\002\262\n>\024If^\226\246\272\"\v\tV3\037\065g\230zV({Z\244?\336\313v\353!\b\034\335\016\023R\342\355\300", <incomplete sequence \354\215>..., collection=..., this=0x653dd20)
at os/LFNIndex.h:149
#20 HashIndex (retry_probability=<optimized out>, index_version=3, split_multiple=2, merge_at=10,
base_path=0x6530033 "d\021\374-\032\003?O\024\037\222\032\372\035\215n4\005q\260\254\362\062\f\036\225\r\207<\253\216?2U\345\030[\230F\305>g,}O\262G\026\222\n\\\361\217)\223_\a2\217\307c\270A\210\360\200{\237\314\006F\a\235\300\364\343\a#\364]\332igK\016\017v\226n `Y\026\064V?5\006\336\021\347\367\244\247%\232t.\352\033\317F\026\373\001T\355^\212\063^$:\253\b\271\036\242\375\254\036\023C\001a\302\036Ew\253\017e4>\232>]K\312\363\004\276t\002\262\n>\024If^\226\246\272\"\v\tV3\037\065g\230zV({Z\244?\336\313v\353!\b\034\335\016\023R\342\355\300", <incomplete sequence \354\215>..., collection=..., this=0x653dd20)
at os/HashIndex.h:144
#21 IndexManager::build_index (this=this@entry=0x33ac428, c=..., path=path@entry=0x7f1901982f40 "/var/lib/ceph/osd/ceph-0/current/3.fs0_head", index=index@entry=0x7f1901984020) at os/IndexManager.cc:115
#22 0x00000000008a15ae in IndexManager::get_index (this=this@entry=0x33ac428, c=..., path=path@entry=0x7f1901982f40 "/var/lib/ceph/osd/ceph-0/current/3.fs0_head", index=index@entry=0x7f1901984020) at os/IndexManager.cc:125
#23 0x0000000000856e7b in FileStore::get_index (this=this@entry=0x33ac000, cid=..., index=index@entry=0x7f1901984020) at os/FileStore.cc:152
#24 0x000000000087af95 in FileStore::lfn_open (this=this@entry=0x33ac000, cid=..., oid=..., create=create@entry=false, outfd=outfd@entry=0x7f1901984310, path=path@entry=0x0, index=0x7f1901984020, index@entry=0x0) at os/FileStore.cc:234
#25 0x0000000000882375 in FileStore::getattrs (this=this@entry=0x33ac000, cid=..., oid=..., aset=..., user_only=user_only@entry=false) at os/FileStore.cc:3570
#26 0x00000000008301cf in ObjectStore::getattrs (this=0x33ac000, cid=..., oid=..., aset=..., user_only=false) at ./os/ObjectStore.h:1385
#27 0x0000000000911f7f in ECBackend::objects_get_attrs (this=0x5251f80, hoid=..., out=0x6ac13f8) at osd/ECBackend.cc:1697
#28 0x00000000007d0c98 in ReplicatedPG::get_object_context (this=this@entry=0x4382c00, soid=..., can_create=can_create@entry=false, attrs=attrs@entry=0x0) at osd/ReplicatedPG.cc:7162
#29 0x00000000007db910 in ReplicatedPG::find_object_context (this=this@entry=0x4382c00, oid=..., pobc=pobc@entry=0x7f1901984f10, can_create=can_create@entry=false, map_snapid_to_clone=<optimized out>, pmissing=pmissing@entry=0x7f1901985250) at osd/ReplicatedPG.cc:7344
#30 0x0000000000800e85 in ReplicatedPG::do_op (this=0x4382c00, op=...) at osd/ReplicatedPG.cc:1322
#31 0x00000000007a157d in ReplicatedPG::do_request (this=0x4382c00, op=..., handle=...) at osd/ReplicatedPG.cc:1129
#32 0x00000000005ff6f1 in OSD::dequeue_op (this=0x34c2000, pg=..., op=..., handle=...) at osd/OSD.cc:7779
#33 0x000000000061a054 in OSD::OpWQ::_process (this=0x34c2e58, pg=..., handle=...) at osd/OSD.cc:7749
#34 0x000000000065c77c in ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process (this=0x34c2e58, handle=...) at ./common/WorkQueue.h:191
#35 0x0000000000a541a1 in ThreadPool::worker (this=0x34c2470, wt=0x33cd330) at common/WorkQueue.cc:128
#36 0x0000000000a55090 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#37 0x00007f1917865182 in start_thread (arg=0x7f1901986700) at pthread_create.c:312
#38 0x00007f1915fd938d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

#0 0x00007f75d788af07 in _dl_map_object_deps (map=map@entry=0x7f75d7a8c4e8, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648) at dl-deps.c:528
#1 0x00007f75d7891aab in dl_open_worker (a=a@entry=0x7f75c08cc568) at dl-open.c:272
#2 0x00007f75d788cff4 in _dl_catch_error (objname=objname@entry=0x7f75c08cc558, errstring=errstring@entry=0x7f75c08cc560, mallocedp=mallocedp@entry=0x7f75c08cc550, operate=operate@entry=0x7f75d78919a0 <dl_open_worker>, args=args@entry=0x7f75c08cc568) at dl-error.c:187
#3 0x00007f75d78913bb in _dl_open (file=0x7f75d57a34de "libgcc_s.so.1", mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=4, argv=0x7ffff61c1af8, env=0x3140000) at dl-open.c:661
#4 0x00007f75d575d002 in do_dlopen (ptr=ptr@entry=0x7f75c08cc780) at dl-libc.c:87
#5 0x00007f75d788cff4 in _dl_catch_error (objname=0x7f75c08cc760, errstring=0x7f75c08cc770, mallocedp=0x7f75c08cc750, operate=0x7f75d575cfc0 <do_dlopen>, args=0x7f75c08cc780) at dl-error.c:187
#6 0x00007f75d575d0c2 in dlerror_run (args=0x7f75c08cc780, operate=0x7f75d575cfc0 <do_dlopen>) at dl-libc.c:46
#7 __GI___libc_dlopen_mode (name=name@entry=0x7f75d57a34de "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#8 0x00007f75d5731c65 in init () at ../sysdeps/x86_64/backtrace.c:52
#9 0x00007f75d6fb2a90 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#10 0x00007f75d5731d7c in __GI___backtrace (array=<optimized out>, size=100) at ../sysdeps/x86_64/backtrace.c:103
#11 0x00000000009809af in BackTrace (s=0, this=0x7f75c08ccab0) at ./common/BackTrace.h:19
#12 handle_fatal_signal (signum=11) at global/signal_handler.cc:90
#13 <signal handler called>
#14 _M_init (this=<optimized out>) at /usr/include/c++/4.8/bits/stl_list.h:386
#15 _List_base (__a=..., this=<optimized out>) at /usr/include/c++/4.8/bits/stl_list.h:365
#16 list (__x=..., this=<optimized out>) at /usr/include/c++/4.8/bits/stl_list.h:583
#17 pair (this=0xeea850 <vtable for std::tr1::_Sp_counted_base_impl<DeletingState*, SharedPtrRegistry<spg_t, DeletingState>::OnRemoval, (__gnu_cxx::_Lock_policy)2>+48>) at /usr/include/c++/4.8/bits/stl_pair.h:96
#18 construct (this=<optimized out>, __val=..., __p=0xeea850 <vtable for std::tr1::_Sp_counted_base_impl<DeletingState*, SharedPtrRegistry<spg_t, DeletingState>::OnRemoval, (__gnu_cxx::_Lock_policy)2>+48>) at /usr/include/c++/4.8/ext/new_allocator.h:130
#19 _M_create_node (this=<optimized out>, __x=...) at /usr/include/c++/4.8/bits/stl_tree.h:382
#20 _M_insert_ (__v=..., __p=0x5b46e50, __x=<optimized out>, this=0x5b46e48) at /usr/include/c++/4.8/bits/stl_tree.h:1023
#21 std::_Rb_tree<int, std::pair<int const, std::list<Message*, std::allocator<Message*> > >, std::_Select1st<std::pair<int const, std::list<Message*, std::allocator<Message*> > > >, std::less<int>, std::allocator<std::pair<int const, std::list<Message*, std::allocator<Message*> > > > >::_M_insert_unique_ (this=this@entry=0x5b46e48, __position=..., __position@entry=..., __v=...) at /usr/include/c++/4.8/bits/stl_tree.h:1482
#22 0x0000000000a43b69 in insert (__x=..., __position=..., this=0x5b46e48) at /usr/include/c++/4.8/bits/stl_map.h:648
#23 operator[] (__k=<optimized out>, this=0x5b46e48) at /usr/include/c++/4.8/bits/stl_map.h:469
#24 _send (m=0x55d1d40, this=0x5b46c80) at msg/Pipe.h:248
#25 SimpleMessenger::submit_message (this=this@entry=0x31acc00, m=m@entry=0x55d1d40, con=<optimized out>, dest_addr=..., dest_type=<optimized out>, lazy=<optimized out>) at msg/SimpleMessenger.cc:432
#26 0x0000000000a44654 in SimpleMessenger::_send_message (this=0x31acc00, m=0x55d1d40, dest=..., lazy=<optimized out>) at msg/SimpleMessenger.cc:119
#27 0x000000000062d40f in OSDService::send_message_osd_cluster (this=0x32df6e8, peer=5, m=0x55d1d40, from_epoch=<optimized out>) at osd/OSD.cc:3915
#28 0x0000000000917429 in ECBackend::start_read_op (this=this@entry=0x3b83f80, priority=priority@entry=127, to_read=..., _op=...) at osd/ECBackend.cc:1430
#29 0x0000000000917e7c in ECBackend::dispatch_recovery_messages (this=this@entry=0x3b83f80, m=..., priority=priority@entry=127) at osd/ECBackend.cc:457
#30 0x000000000091e1b0 in ECBackend::handle_message (this=0x3b83f80, _op=...) at osd/ECBackend.cc:703
#31 0x00000000007a121b in ReplicatedPG::do_request (this=0x3cc4400, op=..., handle=...) at osd/ReplicatedPG.cc:1113
#32 0x00000000005ff6f1 in OSD::dequeue_op (this=0x32de000, pg=..., op=..., handle=...) at osd/OSD.cc:7779
#33 0x000000000061a054 in OSD::OpWQ::_process (this=0x32dee58, pg=..., handle=...) at osd/OSD.cc:7749
#34 0x000000000065c77c in ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process (this=0x32dee58, handle=...) at ./common/WorkQueue.h:191
#35 0x0000000000a541a1 in ThreadPool::worker (this=0x32de470, wt=0x3424810) at common/WorkQueue.cc:128
#36 0x0000000000a55090 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#37 0x00007f75d6fad182 in start_thread (arg=0x7f75c08cf700) at pthread_create.c:312
#38 0x00007f75d572138d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #10485: unreadable ceph-osd core dump (firefly) (Resolved, Loïc Dachary, 01/08/2015)

Actions #1

Updated by Samuel Just over 9 years ago

ubuntu@teuthology:/a/sage-2014-09-27_20:55:12-rados-firefly-distro-basic-multi/515818
ubuntu@teuthology:/a/sage-2014-09-27_20:55:12-rados-firefly-distro-basic-multi/515914

Actions #2

Updated by Samuel Just over 9 years ago

Another example: /a/samuelj-2014-09-23_14:40:50-rados-firefly-wip-testing-old-vanilla-basic-multi/507058

Actions #3

Updated by Sage Weil over 9 years ago

  • Assignee set to Sage Weil
Actions #4

Updated by Sage Weil over 9 years ago

Hit it again (or something very similar):

(gdb) bt
#0  0x00007f866bd16f07 in _dl_map_object_deps (map=map@entry=0x7f866bf174e8, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648) at dl-deps.c:528
#1  0x00007f866bd1daab in dl_open_worker (a=a@entry=0x7f86551961e8) at dl-open.c:272
#2  0x00007f866bd18ff4 in _dl_catch_error (objname=objname@entry=0x7f86551961d8, errstring=errstring@entry=0x7f86551961e0, mallocedp=mallocedp@entry=0x7f86551961d0, operate=operate@entry=0x7f866bd1d9a0 <dl_open_worker>, args=args@entry=0x7f86551961e8) at dl-error.c:187
#3  0x00007f866bd1d3bb in _dl_open (file=0x7f8669c2f0fe "libgcc_s.so.1", mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=4, argv=0x7fff84b90a58, env=0x249c000) at dl-open.c:661
#4  0x00007f8669be8c32 in do_dlopen (ptr=ptr@entry=0x7f8655196400) at dl-libc.c:87
#5  0x00007f866bd18ff4 in _dl_catch_error (objname=0x7f86551963e0, errstring=0x7f86551963f0, mallocedp=0x7f86551963d0, operate=0x7f8669be8bf0 <do_dlopen>, args=0x7f8655196400) at dl-error.c:187
#6  0x00007f8669be8cf2 in dlerror_run (args=0x7f8655196400, operate=0x7f8669be8bf0 <do_dlopen>) at dl-libc.c:46
#7  __GI___libc_dlopen_mode (name=name@entry=0x7f8669c2f0fe "libgcc_s.so.1", mode=mode@entry=-2147483647) at dl-libc.c:163
#8  0x00007f8669bbd895 in init () at ../sysdeps/x86_64/backtrace.c:52
#9  0x00007f866b43ea90 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#10 0x00007f8669bbd9ac in __GI___backtrace (array=<optimized out>, size=100) at ../sysdeps/x86_64/backtrace.c:103
#11 0x00000000009809af in BackTrace (s=0, this=0x7f8655196730) at ./common/BackTrace.h:19
#12 handle_fatal_signal (signum=11) at global/signal_handler.cc:90
#13 <signal handler called>
#14 0x00000000006aa340 in pair (this=0xeea850 <vtable for std::tr1::_Sp_counted_base_impl<DeletingState*, SharedPtrRegistry<spg_t, DeletingState>::OnRemoval, (__gnu_cxx::_Lock_policy)2>+48>) at /usr/include/c++/4.8/bits/stl_pair.h:96
#15 construct (this=<optimized out>, __val=..., __p=0xeea850 <vtable for std::tr1::_Sp_counted_base_impl<DeletingState*, SharedPtrRegistry<spg_t, DeletingState>::OnRemoval, (__gnu_cxx::_Lock_policy)2>+48>) at /usr/include/c++/4.8/ext/new_allocator.h:130
#16 _M_create_node (this=<optimized out>, __x=...) at /usr/include/c++/4.8/bits/stl_tree.h:382
#17 _M_insert_ (__v=..., __p=0x9da4000, __x=0x0, this=0x273adc8) at /usr/include/c++/4.8/bits/stl_tree.h:1023
#18 std::_Rb_tree<std::pair<utime_t, std::tr1::shared_ptr<TrackedOp> >, std::pair<utime_t, std::tr1::shared_ptr<TrackedOp> >, std::_Identity<std::pair<utime_t, std::tr1::shared_ptr<TrackedOp> > >, std::less<std::pair<utime_t, std::tr1::shared_ptr<TrackedOp> > >, std::allocator<std::pair<utime_t, std::tr1::shared_ptr<TrackedOp> > > >::_M_insert_unique (this=this@entry=0x273adc8, __v=...) at /usr/include/c++/4.8/bits/stl_tree.h:1382
#19 0x00000000006a5e05 in insert (__x=..., this=0x273adc8) at /usr/include/c++/4.8/bits/stl_set.h:463
#20 OpHistory::insert (this=this@entry=0x273adc8, now=..., now@entry=..., op=...) at common/TrackedOp.cc:43
#21 0x00000000006a63c4 in OpTracker::unregister_inflight_op (this=0x273ad50, i=i@entry=0x8049d10) at common/TrackedOp.cc:131
#22 0x00000000006a6741 in OpTracker::RemoveOnDelete::operator() (this=0x3535d18, op=0x8049d10) at common/TrackedOp.cc:250
#23 0x0000000000664449 in std::tr1::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x3535d00) at /usr/include/c++/4.8/tr1/shared_ptr.h:141
#24 0x000000000061a0b7 in ~__shared_count (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/tr1/shared_ptr.h:341
#25 ~__shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/tr1/shared_ptr.h:541
#26 ~shared_ptr (this=<synthetic pointer>, __in_chrg=<optimized out>) at /usr/include/c++/4.8/tr1/shared_ptr.h:985
#27 OSD::OpWQ::_process (this=0x273ae58, pg=..., handle=...) at osd/OSD.cc:7750
#28 0x000000000065c77c in ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process (this=0x273ae58, handle=...) at ./common/WorkQueue.h:191
#29 0x0000000000a541a1 in ThreadPool::worker (this=0x273a470, wt=0x2d2be30) at common/WorkQueue.cc:128
#30 0x0000000000a55090 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:318
#31 0x00007f866b439182 in start_thread (arg=0x7f8655198700) at pthread_create.c:312
#32 0x00007f8669bacfbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Actions #5

Updated by Sage Weil over 9 years ago

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-bug-9625-e/521446

Actions #6

Updated by Sage Weil over 9 years ago

  • Status changed from New to Need More Info
Actions #7

Updated by Sage Weil about 9 years ago

  • Priority changed from Urgent to High
Actions #8

Updated by Samuel Just about 9 years ago

  • Status changed from Need More Info to Resolved