Bug #43306

segv in collect_sys_info

Added by Yuri Weinstein 4 months ago. Updated 3 months ago.

Status: Pending Backport
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: nautilus, mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: upgrade/luminous-x
Component(RADOS):
Pull request ID:
Crash signature:

Description

Run: http://pulpito.ceph.com/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/
Job: '4596627'
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2019-12-13_02:25:03-upgrade:luminous-x-nautilus-distro-basic-smithi/4596627/teuthology.log

2019-12-13T04:23:20.446 INFO:tasks.ceph.mon.a.smithi059.stderr:*** Caught signal (Segmentation fault) **
2019-12-13T04:23:20.447 INFO:tasks.ceph.mon.a.smithi059.stderr: in thread 7f034a13a700 thread_name:ms_dispatch
2019-12-13T04:23:20.451 INFO:tasks.ceph.mon.a.smithi059.stderr: ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
2019-12-13T04:23:20.452 INFO:tasks.ceph.mon.a.smithi059.stderr: 1: (()+0x12890) [0x7f0356d45890]
2019-12-13T04:23:20.452 INFO:tasks.ceph.mon.a.smithi059.stderr: 2: (()+0x18a487) [0x7f0355f89487]
2019-12-13T04:23:20.452 INFO:tasks.ceph.mon.a.smithi059.stderr: 3: (_IO_getline()+0x55) [0x7f0355e7ee25]
2019-12-13T04:23:20.452 INFO:tasks.ceph.mon.a.smithi059.stderr: 4: (fgets()+0xad) [0x7f0355e7dbcd]
2019-12-13T04:23:20.452 INFO:tasks.ceph.mon.a.smithi059.stderr: 5: (collect_sys_info(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, CephContext*)+0xdd8) [0x7f0358119de8]
2019-12-13T04:23:20.453 INFO:tasks.ceph.mon.a.smithi059.stderr: 6: (Monitor::collect_metadata(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*)+0x43) [0x5592ac9de0c3]
2019-12-13T04:23:20.453 INFO:tasks.ceph.mon.a.smithi059.stderr: 7: (Elector::start()+0x2b7) [0x5592aca7fb17]
2019-12-13T04:23:20.453 INFO:tasks.ceph.mon.a.smithi059.stderr: 8: (Monitor::start_election()+0x18e) [0x5592ac9ee59e]
2019-12-13T04:23:20.453 INFO:tasks.ceph.mon.a.smithi059.stderr: 9: (Elector::handle_propose(boost::intrusive_ptr<MonOpRequest>)+0x999) [0x5592aca7f579]
2019-12-13T04:23:20.453 INFO:tasks.ceph.mon.a.smithi059.stderr: 10: (Elector::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x96d) [0x5592aca812ad]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 11: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x11ef) [0x5592aca07d9f]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 12: (Monitor::_ms_dispatch(Message*)+0x4aa) [0x5592aca084ba]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 13: (Monitor::ms_dispatch(Message*)+0x26) [0x5592aca37df6]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 14: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5592aca34326]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 15: (DispatchQueue::entry()+0x1a49) [0x7f0358121e79]
2019-12-13T04:23:20.454 INFO:tasks.ceph.mon.a.smithi059.stderr: 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f03581ce6bd]
2019-12-13T04:23:20.455 INFO:tasks.ceph.mon.a.smithi059.stderr: 17: (()+0x76db) [0x7f0356d3a6db]
2019-12-13T04:23:20.455 INFO:tasks.ceph.mon.a.smithi059.stderr: 18: (clone()+0x3f) [0x7f0355f2088f]
2019-12-13T04:23:20.455 INFO:tasks.ceph.mon.a.smithi059.stderr:2019-12-13 04:23:20.450 7f034a13a700 -1 *** Caught signal (Segmentation fault) **
2019-12-13T04:23:20.455 INFO:tasks.ceph.mon.a.smithi059.stderr: in thread 7f034a13a700 thread_name:ms_dispatch
2019-12-13T04:23:20.455 INFO:tasks.ceph.mon.a.smithi059.stderr:
2019-12-13T04:23:20.456 INFO:tasks.ceph.mon.a.smithi059.stderr: ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
2019-12-13T04:23:20.456 INFO:tasks.ceph.mon.a.smithi059.stderr: 1: (()+0x12890) [0x7f0356d45890]
2019-12-13T04:23:20.456 INFO:tasks.ceph.mon.a.smithi059.stderr: 2: (()+0x18a487) [0x7f0355f89487]
2019-12-13T04:23:20.456 INFO:tasks.ceph.mon.a.smithi059.stderr: 3: (_IO_getline()+0x55) [0x7f0355e7ee25]
2019-12-13T04:23:20.456 INFO:tasks.ceph.mon.a.smithi059.stderr: 4: (fgets()+0xad) [0x7f0355e7dbcd]
2019-12-13T04:23:20.457 INFO:tasks.ceph.mon.a.smithi059.stderr: 5: (collect_sys_info(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, CephContext*)+0xdd8) [0x7f0358119de8]
2019-12-13T04:23:20.457 INFO:tasks.ceph.mon.a.smithi059.stderr: 6: (Monitor::collect_metadata(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*)+0x43) [0x5592ac9de0c3]
2019-12-13T04:23:20.457 INFO:tasks.ceph.mon.a.smithi059.stderr: 7: (Elector::start()+0x2b7) [0x5592aca7fb17]
2019-12-13T04:23:20.457 INFO:tasks.ceph.mon.a.smithi059.stderr: 8: (Monitor::start_election()+0x18e) [0x5592ac9ee59e]
2019-12-13T04:23:20.457 INFO:tasks.ceph.mon.a.smithi059.stderr: 9: (Elector::handle_propose(boost::intrusive_ptr<MonOpRequest>)+0x999) [0x5592aca7f579]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 10: (Elector::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x96d) [0x5592aca812ad]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 11: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x11ef) [0x5592aca07d9f]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 12: (Monitor::_ms_dispatch(Message*)+0x4aa) [0x5592aca084ba]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 13: (Monitor::ms_dispatch(Message*)+0x26) [0x5592aca37df6]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 14: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5592aca34326]
2019-12-13T04:23:20.458 INFO:tasks.ceph.mon.a.smithi059.stderr: 15: (DispatchQueue::entry()+0x1a49) [0x7f0358121e79]
2019-12-13T04:23:20.459 INFO:tasks.ceph.mon.a.smithi059.stderr: 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f03581ce6bd]
2019-12-13T04:23:20.459 INFO:tasks.ceph.mon.a.smithi059.stderr: 17: (()+0x76db) [0x7f0356d3a6db]
2019-12-13T04:23:20.459 INFO:tasks.ceph.mon.a.smithi059.stderr: 18: (clone()+0x3f) [0x7f0355f2088f]
2019-12-13T04:23:20.459 INFO:tasks.ceph.mon.a.smithi059.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2019-12-13T04:23:20.459 INFO:tasks.ceph.mon.a.smithi059.stderr:
2019-12-13T04:23:20.471 INFO:tasks.ceph.mon.a.smithi059.stderr:     0> 2019-12-13 04:23:20.450 7f034a13a700 -1 *** Caught signal (Segmentation fault) **
2019-12-13T04:23:20.471 INFO:tasks.ceph.mon.a.smithi059.stderr: in thread 7f034a13a700 thread_name:ms_dispatch
2019-12-13T04:23:20.472 INFO:tasks.ceph.mon.a.smithi059.stderr:
2019-12-13T04:23:20.472 INFO:tasks.ceph.mon.a.smithi059.stderr: ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
2019-12-13T04:23:20.472 INFO:tasks.ceph.mon.a.smithi059.stderr: 1: (()+0x12890) [0x7f0356d45890]
2019-12-13T04:23:20.472 INFO:tasks.ceph.mon.a.smithi059.stderr: 2: (()+0x18a487) [0x7f0355f89487]
2019-12-13T04:23:20.472 INFO:tasks.ceph.mon.a.smithi059.stderr: 3: (_IO_getline()+0x55) [0x7f0355e7ee25]
2019-12-13T04:23:20.473 INFO:tasks.ceph.mon.a.smithi059.stderr: 4: (fgets()+0xad) [0x7f0355e7dbcd]
2019-12-13T04:23:20.473 INFO:tasks.ceph.mon.a.smithi059.stderr: 5: (collect_sys_info(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, CephContext*)+0xdd8) [0x7f0358119de8]
2019-12-13T04:23:20.473 INFO:tasks.ceph.mon.a.smithi059.stderr: 6: (Monitor::collect_metadata(std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*)+0x43) [0x5592ac9de0c3]
2019-12-13T04:23:20.473 INFO:tasks.ceph.mon.a.smithi059.stderr: 7: (Elector::start()+0x2b7) [0x5592aca7fb17]
2019-12-13T04:23:20.473 INFO:tasks.ceph.mon.a.smithi059.stderr: 8: (Monitor::start_election()+0x18e) [0x5592ac9ee59e]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 9: (Elector::handle_propose(boost::intrusive_ptr<MonOpRequest>)+0x999) [0x5592aca7f579]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 10: (Elector::dispatch(boost::intrusive_ptr<MonOpRequest>)+0x96d) [0x5592aca812ad]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 11: (Monitor::dispatch_op(boost::intrusive_ptr<MonOpRequest>)+0x11ef) [0x5592aca07d9f]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 12: (Monitor::_ms_dispatch(Message*)+0x4aa) [0x5592aca084ba]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 13: (Monitor::ms_dispatch(Message*)+0x26) [0x5592aca37df6]
2019-12-13T04:23:20.474 INFO:tasks.ceph.mon.a.smithi059.stderr: 14: (Dispatcher::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26) [0x5592aca34326]
2019-12-13T04:23:20.475 INFO:tasks.ceph.mon.a.smithi059.stderr: 15: (DispatchQueue::entry()+0x1a49) [0x7f0358121e79]
2019-12-13T04:23:20.475 INFO:tasks.ceph.mon.a.smithi059.stderr: 16: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f03581ce6bd]
2019-12-13T04:23:20.475 INFO:tasks.ceph.mon.a.smithi059.stderr: 17: (()+0x76db) [0x7f0356d3a6db]
2019-12-13T04:23:20.475 INFO:tasks.ceph.mon.a.smithi059.stderr: 18: (clone()+0x3f) [0x7f0355f2088f]
2019-12-13T04:23:20.476 INFO:tasks.ceph.mon.a.smithi059.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Related issues

Related to RADOS - Bug #38296: segv in fgets() in collect_sys_info reading /proc/cpuinfo (Resolved)
Copied to RADOS - Backport #43630: mimic: segv in collect_sys_info (Resolved)
Copied to RADOS - Backport #43631: nautilus: segv in collect_sys_info (Resolved)
Copied to RADOS - Backport #43632: luminous: segv in collect_sys_info (New)

History

#1 Updated by Neha Ojha 4 months ago

This looks similar to https://tracker.ceph.com/issues/38296, though the mon seems to have been upgraded to nautilus (which now has the fix). The luminous backport is still pending: https://tracker.ceph.com/issues/39474.

#2 Updated by Nathan Cutler 4 months ago

Neha Ojha wrote:

This looks similar to https://tracker.ceph.com/issues/38296, though the mon seems to have been upgraded to nautilus (which now has the fix). The luminous backport is still pending: https://tracker.ceph.com/issues/39474.

The luminous backport https://tracker.ceph.com/issues/39474 is staged at https://github.com/ceph/ceph/pull/32349.

#3 Updated by Sage Weil 3 months ago

  • Priority changed from Normal to Urgent
  • Backport set to nautilus, mimic, luminous

#38296 changed the buffer to 1024 chars, but /proc/cpuinfo can be bigger than that, too. On smithi (8 CPUs), it's 9920 bytes, and that's not many CPUs.
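
The size mismatch described in this comment can be checked with a short sketch. This is not taken from the Ceph source; `slurp` and `fits_in_fixed_buffer` are hypothetical helper names used only for illustration. Files under /proc typically report a size of 0 to stat(), so the sketch measures the content by reading the stream to its end rather than asking for the file size.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: read everything remaining in the stream into a
// std::string, which grows as needed instead of relying on a fixed size.
std::string slurp(std::istream& in) {
  std::istreambuf_iterator<char> begin(in), end;
  return std::string(begin, end);
}

// Hypothetical helper: true if content of content_size bytes, plus a
// terminating NUL, fits into a fixed buffer of capacity bytes.
bool fits_in_fixed_buffer(std::size_t content_size, std::size_t capacity) {
  assert(capacity > 0);  // a zero-byte buffer cannot even hold the NUL
  return content_size + 1 <= capacity;
}
```

With the 9920-byte figure quoted for smithi, `fits_in_fixed_buffer(9920, 1024)` returns false. On a Linux host, passing a `std::ifstream` opened on /proc/cpuinfo to `slurp` and taking `.size()` of the result gives the actual byte count for that machine.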

#4 Updated by Sage Weil 3 months ago

  • Related to Bug #38296: segv in fgets() in collect_sys_info reading /proc/cpuinfo added

#5 Updated by Sage Weil 3 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 32621

#6 Updated by Sage Weil 3 months ago

  • Subject changed from *** Caught signal (Segmentation fault) ** in upgrade:luminous-x-nautilus to segv in collect_sys_info

#7 Updated by Kefu Chai 3 months ago

https://github.com/ceph/ceph/pull/32630 is posted to avoid using fgets().
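
For readers unfamiliar with the alternative, here is a minimal sketch of reading "key : value" lines from a /proc-style file without fgets() or any fixed-size line buffer. It is an illustration under stated assumptions, not the code from the pull request: `find_field` and `trim_ws` are hypothetical names, and the "key : value" layout is assumed from the usual /proc/cpuinfo format on Linux.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: strip leading and trailing whitespace.
static std::string trim_ws(const std::string& s) {
  const char* ws = " \t\r\n";
  const auto first = s.find_first_not_of(ws);
  if (first == std::string::npos) {
    return "";
  }
  const auto last = s.find_last_not_of(ws);
  return s.substr(first, last - first + 1);
}

// Hypothetical helper: return the value of the first "key : value" line
// whose key matches, or an empty string if no such line exists.
// std::getline grows `line` as needed, so line length is not capped.
std::string find_field(std::istream& in, const std::string& key) {
  assert(!key.empty());
  std::string line;
  while (std::getline(in, line)) {
    const auto colon = line.find(':');
    if (colon == std::string::npos) {
      continue;  // blank separator line or unexpected format
    }
    if (trim_ws(line.substr(0, colon)) == key) {
      return trim_ws(line.substr(colon + 1));
    }
  }
  return "";
}
```

A caller would open the file with `std::ifstream f("/proc/cpuinfo");` and pass `f` to `find_field` together with the key of interest. The design choice here is to let `std::string` own the line storage, which removes the need to pick a buffer size at all.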

#8 Updated by Sage Weil 3 months ago

  • Pull request ID changed from 32621 to 32630

#9 Updated by Kefu Chai 3 months ago

  • Status changed from Fix Under Review to Pending Backport

#10 Updated by Nathan Cutler 3 months ago

#11 Updated by Nathan Cutler 3 months ago

#12 Updated by Nathan Cutler 3 months ago
