Project

General

Profile

Actions

Bug #15413

closed

msg/simple: "existing" Pipe can get reaped prior to replacement during accept()

Added by Sage Weil about 8 years ago. Updated almost 3 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  <kind>InvalidRead</kind>
  <what>Invalid read of size 4</what>
  <stack>
    <frame>
      <ip>0xBB9F164</ip>
      <obj>/lib/x86_64-linux-gnu/libpthread-2.19.so</obj>
      <fn>__pthread_mutex_cond_lock</fn>
      <dir>/build/buildd/eglibc-2.19/nptl/../nptl</dir>
      <file>pthread_mutex_lock.c</file>
      <line>66</line>
    </frame>
    <frame>
      <ip>0xBB994A3</ip>
      <obj>/lib/x86_64-linux-gnu/libpthread-2.19.so</obj>
      <fn>pthread_cond_wait@@GLIBC_2.3.2</fn>
      <dir>/build/buildd/eglibc-2.19/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64</dir>
      <file>pthread_cond_wait.S</file>
      <line>259</line>
    </frame>
    <frame>
      <ip>0xBD5F3E</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>Pipe::accept()</fn>
      <dir>/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-10.1.0-571-g3611e39/src/./common</dir>
      <file>Cond.h</file>
      <line>56</line>
    </frame>
    <frame>
      <ip>0xBDAD3E</ip>
      <obj>/usr/bin/ceph-osd</obj>
      <fn>Pipe::reader()</fn>
      <dir>/srv/autobuild-ceph/gitbuilder.git/build/out~/ceph-10.1.0-571-g3611e39/src/msg/simple</dir>
      <file>Pipe.cc</file>
      <line>1499</line>
    </frame>

Can't tell exactly which cond, but my guess is that here the existing Pipe * goes away...
      if (existing->reader_dispatching) {
    /** we need to wait, or we can deadlock if downstream
     *  fast_dispatchers are (naughtily!) waiting on resources
     *  held by somebody trying to make use of the SimpleMessenger lock.
     *  So drop locks, wait, and retry. It just looks like a slow network
     *  to everybody else.
     */
    pipe_lock.Unlock();
    msgr->lock.Unlock();
    existing->notify_on_dispatch_done = true;
    while (existing->reader_dispatching)
      existing->cond.Wait(existing->pipe_lock);
    existing->pipe_lock.Unlock();
    goto retry_existing_lookup;
      }

/a/sage-2016-04-06_19:36:52-rados-wip-sage-testing-distro-basic-smithi/112920

Actions #1

Updated by Haomai Wang about 8 years ago

not sure. Looks like PR(https://github.com/ceph/ceph/pull/8336) may cover this.

it may only occur in theory if existing->state is marked as CLOSED while existing->reader unlock pipe_lock then do inline fast_dispatch. So accept(this invalid read) will still wait the fast dispatch process. After existing's reader completes fast dispatch, the marking_down thread may firstly wakeup and reap pipe? so this happen?

Actions #2

Updated by Greg Farnum about 8 years ago

  • Subject changed from msg/simple: bad read in Pipe::accept to msg/simple: "existing" Pipe can get reaped prior to replacement during accept()

Haomai's right. The logic for reading those values is pretty compact and obviously correct. We don't do anything to guarantee the existing Pipe doesn't get reaped until we're done looking at it. Given the number of lock steps, you'd have to get pretty unlucky, but it is possible.

And, d'oh, the valgrind log includes that information. Yep, the existing thread got freed. I'm kind of confused we didn't get crashes on the invalid lock actions, though...

Actions #3

Updated by Sage Weil almost 3 years ago

  • Status changed from New to Won't Fix
Actions

Also available in: Atom PDF