Bug #45661

closed

valgrind issue: UninitValue in ProtocolV2

Added by Neha Ojha almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: High
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-05-21T21:01:49.681 INFO:tasks.ceph:Checking for errors in any valgrind logs...
2020-05-21T21:01:49.682 INFO:teuthology.orchestra.run.smithi130:> sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq
2020-05-21T21:01:49.723 INFO:teuthology.orchestra.run.smithi110:> sudo zgrep '<kind>' /var/log/ceph/valgrind/* /dev/null | sort | uniq
2020-05-21T21:01:49.766 INFO:teuthology.orchestra.run.smithi110.stdout:/var/log/ceph/valgrind/osd.5.log:  <kind>SyscallParam</kind>
2020-05-21T21:01:49.766 INFO:teuthology.orchestra.run.smithi110.stdout:/var/log/ceph/valgrind/osd.5.log:  <kind>UninitValue</kind>
2020-05-21T21:01:49.769 INFO:teuthology.orchestra.run.smithi130.stdout:/var/log/ceph/valgrind/osd.2.log:  <kind>SyscallParam</kind>
2020-05-21T21:01:49.769 INFO:teuthology.orchestra.run.smithi130.stdout:/var/log/ceph/valgrind/osd.2.log:  <kind>UninitValue</kind>
2020-05-21T21:01:49.770 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/osd.2.log kind   <kind>SyscallParam</kind>
2020-05-21T21:01:49.770 ERROR:tasks.ceph:saw valgrind issue   <kind>SyscallParam</kind> in /var/log/ceph/valgrind/osd.2.log
2020-05-21T21:01:49.771 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/osd.2.log kind   <kind>UninitValue</kind>
2020-05-21T21:01:49.771 ERROR:tasks.ceph:saw valgrind issue   <kind>UninitValue</kind> in /var/log/ceph/valgrind/osd.2.log
2020-05-21T21:01:49.771 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/osd.5.log kind   <kind>SyscallParam</kind>
2020-05-21T21:01:49.772 ERROR:tasks.ceph:saw valgrind issue   <kind>SyscallParam</kind> in /var/log/ceph/valgrind/osd.5.log
2020-05-21T21:01:49.772 DEBUG:tasks.ceph:file /var/log/ceph/valgrind/osd.5.log kind   <kind>UninitValue</kind>
2020-05-21T21:01:49.772 ERROR:tasks.ceph:saw valgrind issue   <kind>UninitValue</kind> in /var/log/ceph/valgrind/osd.5.log
  <threadname>msgr-worker-2</threadname>
  <kind>UninitValue</kind>
  <what>Use of uninitialised value of size 8</what>
  <stack>
    <frame>
      <ip>0x13DF30F</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ceph_crc32c_intel_baseline</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/common</dir>
      <file>crc32c_intel_baseline.c</file>
      <line>123</line>
    </frame>
    <frame>
      <ip>0xF2FE8E</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ceph_crc32c</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/include</dir>
      <file>crc32c.h</file>
      <line>50</line>
    </frame>
    <frame>
      <ip>0xF2FE8E</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ceph_crc32c</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/include</dir>
      <file>crc32c.h</file>
      <line>43</line>
    </frame>
    <frame>
      <ip>0xF2FE8E</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ceph::buffer::v15_2_0::list::crc32c(unsigned int) const</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/common</dir>
      <file>buffer.cc</file>
      <line>1981</line>
    </frame>
    <frame>
      <ip>0x10F39C8</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>get_buffer</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/msg/async</dir>
      <file>frames_v2.h</file>
      <line>288</line>
    </frame>
    <frame>
      <ip>0x10F39C8</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ProtocolV2::write_message(Message*, bool)</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>536</line>
    </frame>
    <frame>
      <ip>0x110B69A</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>ProtocolV2::write_event()</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/msg/async</dir>
      <file>ProtocolV2.cc</file>
      <line>646</line>
    </frame>
    <frame>
      <ip>0x10D06F6</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>AsyncConnection::handle_write()</fn>
      <dir>/usr/src/debug/ceph-16.0.0-1779.gb728eae6766.el8.x86_64/src/msg/async</dir>
      <file>AsyncConnection.cc</file>
      <line>710</line>
    </frame>
    <frame>
...

/a/nojha-2020-05-21_19:33:40-rados-wip-32601-distro-basic-smithi/5076910

Actions #1

Updated by Brad Hubbard almost 4 years ago

  • Category set to Correctness/Safety
  • Affected Versions v16.0.0 added

/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085545
/a/yuriw-2020-05-23_15:15:01-rados-wip-yuri-master_5.22.20-distro-basic-smithi/5085506

Actions #2

Updated by Neha Ojha almost 4 years ago

  • Priority changed from Normal to High

Actions #3

Updated by Brad Hubbard almost 4 years ago

/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5088037
/a/yuriw-2020-05-24_19:30:40-rados-wip-yuri-master_5.24.20-distro-basic-smithi/5087786

Actions #4

Updated by Kefu Chai almost 4 years ago

/a/kchai-2020-05-27_23:43:53-rados-wip-kefu-testing-2020-05-27-2242-distro-basic-smithi/5097299/remote/*/log/valgrind/osd.7.log.gz

Actions #5

Updated by Brad Hubbard almost 4 years ago

/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5103952
/a/yuriw-2020-05-30_02:18:17-rados-wip-yuri-master_5.29.20-distro-basic-smithi/5104355

Actions #6

Updated by Neha Ojha almost 4 years ago

  • Assignee set to Radoslaw Zarzynski

Actions #7

Updated by Radoslaw Zarzynski almost 4 years ago

  ...
  <auxwhat>Uninitialised value was created by a stack allocation</auxwhat>
  <stack>
    <frame>
      <ip>0x85C267</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>PrimaryLogPG::do_manifest_flush(boost::intrusive_ptr&lt;OpRequest&gt;, std::shared_ptr&lt;ObjectContext&gt;, std::shared_ptr&lt;PrimaryLogPG::FlushOp&gt;, unsigned long, bool)</fn>
      <dir>/usr/src/debug/ceph-16.0.0-2011.ge84b54a3023.el8.x86_64/src/osd</dir>
      <file>PrimaryLogPG.cc</file>
      <line>2528</line>
    </frame>
  </stack>

There are a few recent changes in the PrimaryLogPG::do_manifest_flush() area.

UPDATE: the changes are clean-ups.

Actions #8

Updated by Radoslaw Zarzynski almost 4 years ago

Pinpointed to a branch of PrimaryLogPG::do_manifest_flush():

    if (pg_pool_t::fingerprint_t fp_algo = pool.info.get_fingerprint_type();
        iter->second.has_reference() &&
        fp_algo != pg_pool_t::TYPE_FINGERPRINT_NONE) {
      // ...
      // add data op
      ceph_osd_op osd_op;
      osd_op.extent.offset = 0;
      osd_op.extent.length = chunk_data.length();
      encode(osd_op, in);

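The ceph_osd_op instance is left with uninitialized members: only extent.offset and extent.length are assigned, so op, flags, truncate_size, truncate_seq and payload_len are never set: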

struct ceph_osd_op {
        __le16 op;           /* CEPH_OSD_OP_* */
        __le32 flags;        /* CEPH_OSD_OP_FLAG_* */
        union {
                struct {
                        __le64 offset, length;
                        __le64 truncate_size;
                        __le32 truncate_seq; 
                } __attribute__ ((packed)) extent;
                // ...
        } __attribute__ ((packed));
        __le32 payload_len;
} __attribute__ ((packed));

The encoder is defined as:

WRITE_RAW_ENCODER(ceph_osd_op)

which amounts to a raw byte copy of the whole struct, uninitialized members included:

template<class T>
inline void encode_raw(const T& t, bufferlist& bl)
{
  bl.append((char*)&t, sizeof(t));
}
template<class T>
inline void decode_raw(T& t, bufferlist::const_iterator &p)
{
  p.copy(sizeof(t), (char*)&t);
}

#define WRITE_RAW_ENCODER(type)                                         \
  inline void encode(const type &v, ::ceph::bufferlist& bl, uint64_t features=0) { ::ceph::encode_raw(v, bl); } \
  inline void decode(type &v, ::ceph::bufferlist::const_iterator& p) { ::ceph::decode_raw(v, p); }
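
For illustration only, here is a minimal sketch of one way to avoid this class of problem: zero the whole struct before assigning individual fields, so that a raw byte copy never reads undefined memory. The names demo_osd_op, demo_encode_raw and encode_data_op are hypothetical stand-ins rather than Ceph code, std::string replaces bufferlist, and this is not necessarily what the actual fix does.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for ceph_osd_op (assumption: simplified, no union).
// Packed, so sizeof() covers exactly the members and there is no padding.
struct demo_osd_op {
  std::uint16_t op;
  std::uint32_t flags;
  struct {
    std::uint64_t offset, length;
    std::uint64_t truncate_size;
    std::uint32_t truncate_seq;
  } __attribute__((packed)) extent;
  std::uint32_t payload_len;
} __attribute__((packed));

// Raw encoding in the spirit of encode_raw(): append sizeof(t) bytes verbatim.
template <class T>
void demo_encode_raw(const T& t, std::string& out) {
  out.append(reinterpret_cast<const char*>(&t), sizeof(t));
}

// Builds the "data op" with every byte defined before the raw copy.
std::string encode_data_op(std::uint64_t length) {
  demo_osd_op op;
  std::memset(&op, 0, sizeof(op));  // define all bytes, not just the fields set below
  op.extent.offset = 0;
  op.extent.length = length;

  std::string out;
  demo_encode_raw(op, out);
  return out;
}
```

Calling encode_data_op(4096) yields a buffer of sizeof(demo_osd_op) bytes in which every field except extent.length is zero, so a later checksum over the buffer reads only defined bytes.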

Actions #9

Updated by Radoslaw Zarzynski almost 4 years ago

  • Status changed from New to Fix Under Review

Actions #10

Updated by Neha Ojha almost 4 years ago

  • Pull request ID set to 35407

Actions #11

Updated by Radoslaw Zarzynski almost 4 years ago

  • Status changed from Fix Under Review to Resolved
  • Backport set to nautilus

In master, PR #35407 has been closed in favor of https://github.com/ceph/ceph/pull/35186.
#35407 might still be useful if we need to backport.
