Bug #46216

mon: log entry with garbage generated by bad memory access

Added by Patrick Donnelly almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Urgent
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: octopus,nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): Monitor
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A cluster log entry containing garbage, produced by a bad memory access in the mon, causes the mgr to crash with a segmentation fault:

2020-06-25T23:37:11.148+0000 7fc53f034700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fc53f034700 thread_name:mgr-fin

 ceph version 16.0.0-2933-gc11e8fa7765 (c11e8fa7765ea6214ebb0850ec7a2c2f4d7173db) pacific (dev)
 1: (()+0xefbd0c) [0x55d98a593d0c]
 2: (()+0x12dd0) [0x7fc58d882dd0]
 3: (PyDict_SetItem()+0x20e) [0x7fc58eb1e38e]
 4: (PyFormatter::dump_pyobject(std::basic_string_view<char, std::char_traits<char> >, _object*)+0x111) [0x55d98a31097b]
 5: (PyFormatter::dump_string(std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >)+0x59) [0x55d98a310643]
 6: (LogEntry::dump(ceph::Formatter*) const+0x282) [0x7fc5904f6b8e]
 7: (ActivePyModule::notify_clog(LogEntry const&)+0x1b8) [0x55d98a1a73ae]
 8: (()+0xb19a6a) [0x55d98a1b1a6a]
 9: (()+0xb226d2) [0x55d98a1ba6d2]
 10: (Context::complete(int)+0x27) [0x55d98a1bec15]
 11: (Finisher::finisher_thread_entry()+0x39d) [0x7fc5904a67a9]
 12: (Finisher::FinisherThread::entry()+0x1c) [0x55d98a1c1e46]
 13: (Thread::entry_wrapper()+0x78) [0x7fc590517888]
 14: (Thread::_entry_func(void*)+0x18) [0x7fc590517806]
 15: (()+0x82de) [0x7fc58d8782de]
 16: (clone()+0x43) [0x7fc58c7ce4b3]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Probably the same error as #45331. Here's the gdb info:

(gdb) p *this
$2 = {name = {static STR_TO_ENTITY_TYPE = {_M_elems = {{type = 32, str = 0x7fc590eedac0 "auth"}, {type = 1, str = 0x7fc590eedac5 "mon"}, {type = 4,
          str = 0x7fc590eedac9 "osd"}, {type = 2, str = 0x7fc590eedacd "mds"}, {type = 16, str = 0x7fc590eedad1 "mgr"}, {type = 8, str = 0x7fc590eedad5 "client"}}},
    type = 1, id = "a", type_id = "mon.a"}, rank = {_type = 1 '\001', _num = 0, static TYPE_MON = 1, static TYPE_MDS = 2, static TYPE_OSD = 4, static TYPE_CLIENT = 8,
    static TYPE_MGR = 16, static NEW = -1}, addrs = {v = std::vector of length 2, capacity 2 = {{static TYPE_DEFAULT = entity_addr_t::TYPE_MSGR2, type = 2, nonce = 0, u = {
          sa = {sa_family = 2, sa_data = "\237P\254\025\t!\000\000\000\000\000\000\000"}, sin = {sin_family = 2, sin_port = 20639, sin_addr = {s_addr = 554243500},
            sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 20639, sin6_flowinfo = 554243500, sin6_addr = {__in6_u = {
                __u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}}, {
        static TYPE_DEFAULT = entity_addr_t::TYPE_MSGR2, type = 1, nonce = 0, u = {sa = {sa_family = 2, sa_data = "\237Q\254\025\t!\000\000\000\000\000\000\000"}, sin = {
            sin_family = 2, sin_port = 20895, sin_addr = {s_addr = 554243500}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = {sin6_family = 2, sin6_port = 20895,
            sin6_flowinfo = 554243500, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}},
            sin6_scope_id = 0}}}}}, stamp = {tv = {tv_sec = 1593128230, tv_nsec = 652978537}}, seq = 253, prio = CLOG_INFO,
  msg = "MDS daemon mds. Y\371\344\235U\000\000\063.lpevwa is removed because it is dead or otherwise unavailable.", channel = "cluster"}

This was generated by:

https://github.com/ceph/ceph/blob/master/src/mon/MDSMonitor.cc#L2067-L2075

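The reference variable "info" is invalid at this point, so the daemon name in the log message is read from invalid memory. This is the garbage visible in msg in the gdb output above.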
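The general hazard can be sketched in isolation. The following is a minimal illustration, not the MDSMonitor code and not the actual fix: DaemonInfo, DaemonMap and remove_daemon are hypothetical names, and a plain std::map stands in for whatever container owns the daemon records. It assumes the bug has the common shape "take a reference into a container, erase the element, then read through the reference", and shows the safe ordering, where the message is built before the erase.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for a per-daemon record (not a Ceph type).
struct DaemonInfo {
  std::string name;
};

// Hypothetical stand-in for the container that owns the records.
using DaemonMap = std::map<std::uint64_t, DaemonInfo>;

// Removes the daemon identified by gid and returns the log message
// describing the removal.
std::string remove_daemon(DaemonMap& daemons, std::uint64_t gid) {
  auto it = daemons.find(gid);
  assert(it != daemons.end());

  // Reference into storage owned by the map.
  const DaemonInfo& info = it->second;

  // Safe order: format the message while `info` still refers to a
  // live element.
  std::ostringstream msg;
  msg << "MDS daemon mds." << info.name
      << " is removed because it is dead or otherwise unavailable.";

  // Erasing destroys the element; `info` dangles from here on.
  // Reading info.name after this line would be undefined behavior and
  // could yield garbage bytes like those in the gdb dump.
  daemons.erase(it);

  return msg.str();
}
```

Copying the needed fields (here, the name) into a local std::string before the erase would work equally well. Either way, nothing may be read through the reference after the element it refers to has been removed.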


Related issues

Related to mgr - Bug #45331: Segmentation fault (New)
Copied to RADOS - Backport #46286: octopus: mon: log entry with garbage generated by bad memory access (Resolved)
Copied to RADOS - Backport #46287: nautilus: mon: log entry with garbage generated by bad memory access (Rejected)

History

#1 Updated by Patrick Donnelly almost 4 years ago

#2 Updated by Neha Ojha over 3 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 35784

#3 Updated by Patrick Donnelly over 3 years ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Patrick Donnelly over 3 years ago

  • Copied to Backport #46286: octopus: mon: log entry with garbage generated by bad memory access added

#5 Updated by Patrick Donnelly over 3 years ago

  • Copied to Backport #46287: nautilus: mon: log entry with garbage generated by bad memory access added

#6 Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
