Bug #2084
closed
segfault in tcmalloc
% Done: 0%
Description
heap corruption?
(gdb) bt
#0 0x00007f844f073a0b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1 0x00000000009bce23 in reraise_fatal (signum=11) at global/signal_handler.cc:59
#2 0x00000000009bcfa7 in handle_fatal_signal (signum=11) at global/signal_handler.cc:95
#3 <signal handler called>
#4 0x00007f844d3fdd52 in ?? () from /usr/lib/libunwind.so.7
#5 0x00007f844d3fbc75 in ?? () from /usr/lib/libunwind.so.7
#6 0x00007f844d3fc24c in ?? () from /usr/lib/libunwind.so.7
#7 0x00007f844d3fc409 in ?? () from /usr/lib/libunwind.so.7
#8 0x00007f844d3fe6ea in _ULx86_64_step () from /usr/lib/libunwind.so.7
#9 0x00007f844e15ba3b in GetStackTrace(void**, int, int) () from /usr/lib/libtcmalloc.so.0
#10 0x00007f844e142fb5 in ?? () from /usr/lib/libtcmalloc.so.0
#11 0x00007f844e15fe44 in tc_new () from /usr/lib/libtcmalloc.so.0
#12 0x00000000009dfd3c in __gnu_cxx::new_allocator<std::_List_node<Message*> >::allocate(unsigned long, void const*) ()
#13 0x00000000009de790 in std::_List_base<Message*, std::allocator<Message*> >::_M_get_node() ()
#14 0x00000000009dc055 in std::list<Message*, std::allocator<Message*> >::_M_create_node(Message* const&) ()
#15 0x00000000009da277 in std::list<Message*, std::allocator<Message*> >::_M_insert(std::_List_iterator<Message*>, Message* const&) ()
#16 0x00000000009d9224 in std::list<Message*, std::allocator<Message*> >::push_back(Message* const&) ()
#17 0x00000000009ce1bf in SimpleMessenger::Pipe::writer (this=0x2e1cc80) at msg/SimpleMessenger.cc:1764
#18 0x000000000077142c in SimpleMessenger::Pipe::Writer::entry (this=0x2e1cec8) at msg/SimpleMessenger.h:173
#19 0x00000000008cb311 in Thread::_entry_func (arg=0x2e1cec8) at common/Thread.cc:41
#20 0x00007f844f06b971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#21 0x00007f844d6f692d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#22 0x0000000000000000 in ?? ()
saw this yesterday, too.
Updated by Sage Weil about 12 years ago
And again (hammer b.yaml). Right before the crash, sched_scrub() was called...
2012-02-21 04:26:17.470245 7fbc67b7b700 -- 10.3.14.199:0/17191 <== osd.3 10.3.14.199:6802/17190 4069 ==== osd_ping(heartbeat e2072 as_of 2059) v1 ==== 39+0+0 (1304125321 0 0) 0x3edfa80 con 0x3085b40
2012-02-21 04:26:17.474387 7fbc6074d700 osd.4 2072 OSD::ms_get_authorizer type=osd
2012-02-21 04:26:17.474639 7fbc6074d700 cephx: verify_authorizer_reply exception in decode_decrypt with AQBYhENPSLe4EBAAZQ1y/Gvw1kmkJukiQpHlwQ==
2012-02-21 04:26:17.474658 7fbc6074d700 -- 10.3.14.199:6807/17191 >> 10.3.14.199:6803/17188 pipe(0x30d2500 sd=44 pgs=0 cs=0 l=0).failed verifying authorize reply
2012-02-21 04:26:17.523352 7fbc6d386700 journal do_write latency 0.292478
2012-02-21 04:26:17.523371 7fbc6d386700 journal do_write queueing finishers through seq 119056
2012-02-21 04:26:17.523385 7fbc6d386700 journal queue_completions_thru seq 119056 queueing seq 119056 0x358ba20 lat 0.292744
2012-02-21 04:26:17.523416 7fbc6d386700 journal put_throttle finished 1 ops and 160 bytes, now 0 ops and 0 bytes
2012-02-21 04:26:17.523432 7fbc6d386700 journal write_thread_entry going to sleep
2012-02-21 04:26:17.523468 7fbc6c384700 filestore(/tmp/cephtest/data/osd.4.data) _journaled_ahead 119056 0x37cf500
2012-02-21 04:26:17.523482 7fbc6c384700 journal op_apply_start 119056 open_ops 0 -> 1
2012-02-21 04:26:17.523497 7fbc6c384700 filestore(/tmp/cephtest/data/osd.4.data) queue_op 0x7aadb40 seq 119056 155 bytes (queue has 1 ops and 155 bytes)
2012-02-21 04:26:17.523556 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _do_op 0x7aadb40 119056 osr 0x304e0f0/0x3056770 start
2012-02-21 04:26:17.523587 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _do_transaction on 0x37cf500
2012-02-21 04:26:17.523642 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) remove meta/1f9f1b4e/pglog_0.0p2/0
2012-02-21 04:26:17.551663 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) remove meta/1f9f1b4e/pglog_0.0p2/0 = 0
2012-02-21 04:26:17.551688 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) remove meta/a04c80d2/pginfo_0.0p2/0
2012-02-21 04:26:17.551840 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) remove meta/a04c80d2/pginfo_0.0p2/0 = 0
2012-02-21 04:26:17.551855 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _destroy_collection /tmp/cephtest/data/osd.4.data/current/0.0p2_head
2012-02-21 04:26:17.551948 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _destroy_collection /tmp/cephtest/data/osd.4.data/current/0.0p2_head = 0
2012-02-21 04:26:17.551960 7fbc6b382700 journal op_apply_finish 119056 open_ops 1 -> 0
2012-02-21 04:26:17.551971 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _do_op 0x7aadb40 119056 r = 0, finisher 0x36fb300 0
2012-02-21 04:26:17.551982 7fbc6b382700 filestore(/tmp/cephtest/data/osd.4.data) _finish_op on osr 0x304e0f0/0x3056770
2012-02-21 04:26:17.733734 7fbc7038c700 osd.4 2072 tick
2012-02-21 04:26:17.733833 7fbc7038c700 osd.4 2072 scrub_should_schedule loadavg 3.66 < max 5 = yes
2012-02-21 04:26:17.733845 7fbc7038c700 osd.4 2072 sched_scrub
2012-02-21 04:26:17.733862 7fbc7038c700 osd.4 2072 on 2012-02-21 04:16:11.508702 2.e
2012-02-21 04:26:17.733888 7fbc7038c700 osd.4 2072 2.1p0 at 2012-02-21 04:16:35.823932 > 2012-02-21 04:16:17.733854 (600 seconds ago)
2012-02-21 04:26:17.733898 7fbc7038c700 osd.4 2072 sched_scrub done
ceph version 0.42-69-g9927671 (commit:9927671b3ddce5c3edaa6be00ef2e8923aea6e6b)
Updated by Sage Weil about 12 years ago
- Target version changed from v0.43 to v0.44
Updated by Sage Weil almost 12 years ago
- Status changed from New to Can't reproduce