Ceph Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1213492018-09-24T20:45:19ZPatrick Donnellypdonnell@redhat.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/121349/diff?detail_id=120002">diff</a>)</li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>Urgent</i></li></ul> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1213522018-09-24T21:31:20ZPatrick Donnellypdonnell@redhat.com
<ul></ul><p>Another: /ceph/teuthology-archive/pdonnell-2018-09-23_19:22:20-multimds-wip-pdonnell-testing-20180923.160923-distro-basic-smithi/3062102/remote/smithi085/coredump/1537811733.35246.core</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1213722018-09-25T09:19:15ZRicardo Diasrdias@suse.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li><li><strong>Assignee</strong> set to <i>Ricardo Dias</i></li></ul><p>The problem is that there are always messages to read in the socket, and the message handling is implemented as a sequence of recursive functions, which after some time fills up the stack and causes a crash.</p>
<p>The previous implementation used a "while loop" with a switch inside and hence no stack overflow would happen.</p>
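<p>As a rough illustration of the loop-with-switch shape described above, here is a minimal sketch. It is not the actual Ceph code: the <code>State</code> values, <code>LoopReader</code>, and its fields are hypothetical names, and the "socket" is just a counter of already-buffered messages. Each step returns to the loop rather than calling the next handler, so the stack depth stays constant however many messages are waiting.</p>

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical states for illustration only; the real protocol has more.
enum class State { WaitMessage, Header, Body, Footer, Suspended };

struct LoopReader {
  std::size_t buffered_messages;  // stand-in for data already in the socket
  std::size_t handled = 0;        // messages fully processed
  std::size_t steps = 0;          // state transitions executed
  State state = State::WaitMessage;

  explicit LoopReader(std::size_t n) : buffered_messages(n) {}

  // Drives the state machine from a single stack frame.
  void process() {
    state = State::WaitMessage;
    while (state != State::Suspended) {
      ++steps;
      switch (state) {
        case State::WaitMessage:
          // With nothing buffered, a real implementation would register
          // a read callback here and return to the event loop.
          state = buffered_messages > 0 ? State::Header : State::Suspended;
          break;
        case State::Header:
          state = State::Body;
          break;
        case State::Body:
          state = State::Footer;
          break;
        case State::Footer:
          --buffered_messages;
          ++handled;
          state = State::WaitMessage;  // loop again instead of recursing
          break;
        case State::Suspended:
          break;
      }
    }
  }
};
```

<p>Calling <code>process()</code> on a <code>LoopReader</code> constructed with a million buffered messages handles all of them from one frame, whereas a chain of handlers that call each other would need a new set of frames per message.</p>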
<p>I'll create a patch to fix this.</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1214852018-09-26T14:36:59ZPatrick Donnellypdonnell@redhat.com
<ul></ul><p>Ricardo Dias wrote:</p>
<blockquote>
<p>The problem is that there are always messages to read in the socket, and the message handling is implemented as a sequence of recursive functions, which after some time fills up the stack and causes a crash.</p>
</blockquote>
<p>I'm looking at the code and don't see a termination condition for the recursive calls. In particular, ProtocolV1::handle_message_footer always calls ProtocolV1::wait_message which always reads data from the connection and calls back into ProtocolV1::handle_message, starting the whole stack over again. What am I missing?</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1216062018-09-27T06:22:08ZRicardo Diasrdias@suse.com
<ul></ul><p>Patrick Donnelly wrote:</p>
<blockquote>
<p>I'm looking at the code and don't see a termination condition for the recursive calls. In particular, ProtocolV1::handle_message_footer always calls ProtocolV1::wait_message which always reads data from the connection and calls back into ProtocolV1::handle_message, starting the whole stack over again. What am I missing?</p>
</blockquote>
<p>In every READ call, if the socket does not have enough bytes to read, the control flow is broken and the connection registers a callback to be invoked once all the requested bytes are available. Also, if the state has changed to anything other than OPENED, the recursive cycle is broken when ProtocolV1::wait_message executes.</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1216312018-09-27T15:17:08ZRicardo Diasrdias@suse.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li></ul><p>PR: <a class="external" href="https://github.com/ceph/ceph/pull/24305">https://github.com/ceph/ceph/pull/24305</a></p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1216322018-09-27T15:18:05ZRicardo Diasrdias@suse.com
<ul></ul><p>@Patrick can you test the patch in the PR to see if it fixes the problem?</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1216382018-09-27T16:15:32ZPatrick Donnellypdonnell@redhat.com
<ul></ul><p>Ricardo Dias wrote:</p>
<blockquote>
<p>@Patrick can you test the patch in the PR to see if it fixes the problem?</p>
</blockquote>
<p>ACK, will do</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1217682018-10-01T12:42:16ZIlya Dryomov
<ul></ul><p>FWIW this is the tip of the stack trace that I hit:</p>
<pre>
#0 0x00007fe545afa18e in _IO_vfprintf_internal (s=s@entry=0x7fe542ee01a0, format=format@entry=0x7fe545c3ff20 <fmt> "%u.%u.%u.%u", ap=ap@entry=0x7fe542ee02c8) at vfprintf.c:1267
#1 0x00007fe545b1d1cb in __IO_vsprintf (string=0x7fe542ee0410 "", format=0x7fe545c3ff20 <fmt> "%u.%u.%u.%u", args=args@entry=0x7fe542ee02c8) at iovsprintf.c:42
#2 0x00007fe545b029c7 in __sprintf (s=s@entry=0x7fe542ee0410 "", format=format@entry=0x7fe545c3ff20 <fmt> "%u.%u.%u.%u") at sprintf.c:32
#3 0x00007fe545bd260f in inet_ntop4 (size=1025, dst=0x7fe542ee0ad0 "", src=0x7fe542ee0414 "") at inet_ntop.c:92
#4 __GI_inet_ntop (af=af@entry=2, src=src@entry=0xb58c0dc, dst=dst@entry=0x7fe542ee0ad0 "", size=size@entry=1025) at inet_ntop.c:63
#5 0x00007fe545bce65e in gni_host_inet_numeric (tmpbuf=0x7fe542ee0640, addrlen=<optimized out>, flags=3, hostlen=1025, host=0x7fe542ee0ad0 "", sa=0xb58c0d8) at getnameinfo.c:369
#6 gni_host_inet (addrlen=<optimized out>, flags=3, hostlen=1025, host=0x7fe542ee0ad0 "", sa=0xb58c0d8, tmpbuf=0x7fe542ee0640) at getnameinfo.c:392
#7 gni_host (addrlen=<optimized out>, flags=3, hostlen=1025, host=0x7fe542ee0ad0 "", sa=0xb58c0d8, tmpbuf=0x7fe542ee0640) at getnameinfo.c:425
#8 __GI_getnameinfo (sa=sa@entry=0xb58c0d8, addrlen=<optimized out>, host=host@entry=0x7fe542ee0ad0 "", hostlen=hostlen@entry=1025, serv=serv@entry=0x7fe542ee0ab0 "", servlen=servlen@entry=32, flags=3) at getnameinfo.c:542
#9 0x000000000111e28a in operator<< (out=..., sa=sa@entry=0xb58c0d8) at /build/ceph-14.0.0-3768-g49f4614/src/msg/msg_types.cc:214
#10 0x000000000111e57f in operator<< (out=..., addr=...) at /build/ceph-14.0.0-3768-g49f4614/src/msg/msg_types.cc:178
#11 0x0000000001283b10 in operator<< (av=..., out=...) at /build/ceph-14.0.0-3768-g49f4614/src/msg/msg_types.h:619
#12 DispatchQueue::pre_dispatch (this=this@entry=0xc0aaa00, m=...) at /build/ceph-14.0.0-3768-g49f4614/src/msg/DispatchQueue.cc:41
#13 0x0000000001285d44 in DispatchQueue::fast_dispatch (this=0xc0aaa00, m=...) at /build/ceph-14.0.0-3768-g49f4614/src/msg/DispatchQueue.cc:70
#14 0x00000000012c1065 in DispatchQueue::fast_dispatch (m=0xed323c0, this=<optimized out>) at /build/ceph-14.0.0-3768-g49f4614/src/msg/DispatchQueue.h:204
#15 ProtocolV1::handle_message_footer (this=0xebd7c00, buffer=<optimized out>, r=<optimized out>) at /build/ceph-14.0.0-3768-g49f4614/src/msg/async/Protocol.cc:1000
#16 0x00000000012b616c in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=<optimized out>, __args#0=<optimized out>, this=0x7fe542ee13c0) at /usr/include/c++/7/bits/std_function.h:706
#17 AsyncConnection::read(unsigned int, char*, std::function<void (char*, long)>) (this=this@entry=0xc570300, len=len@entry=21, buffer=0xce6f000 "%\262\337\352", callback=...) at /build/ceph-14.0.0-3768-g49f4614/src/msg/async/AsyncConnection.cc:176
</pre>
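<p>Frames #15 to #17 above hint at how the stack grows: AsyncConnection::read appears to invoke its std::function callback directly when the requested bytes are already buffered, and the callback goes on to wait for the next message. The sketch below models that pattern under stated assumptions. <code>FakeConnection</code>, <code>DepthProbe</code>, and <code>kMsgLen</code> are hypothetical names, not the Ceph API, and the length 21 is borrowed from <code>len=21</code> in frame #17 purely for illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

constexpr std::size_t kMsgLen = 21;  // illustrative size only

// Hypothetical stand-in for a connection; not the real AsyncConnection.
class FakeConnection {
 public:
  using Callback = std::function<void(std::size_t)>;

  // If enough bytes are buffered, run the callback on the caller's stack.
  // Otherwise store it and return, which unwinds the call chain.
  void read(std::size_t len, Callback cb) {
    if (buffered_ >= len) {
      buffered_ -= len;
      cb(len);
    } else {
      wanted_ = len;
      pending_ = std::move(cb);
    }
  }

  // Simulates the event loop delivering more bytes later.
  void feed(std::size_t n) {
    buffered_ += n;
    if (pending_ && buffered_ >= wanted_) {
      Callback cb = std::move(pending_);
      pending_ = nullptr;
      const std::size_t len = wanted_;
      buffered_ -= len;
      cb(len);
    }
  }

  bool has_pending() const { return static_cast<bool>(pending_); }

 private:
  std::size_t buffered_ = 0;
  std::size_t wanted_ = 0;
  Callback pending_;
};

// Counts how deep the wait -> read -> callback -> wait chain nests.
struct DepthProbe {
  FakeConnection& conn;
  std::size_t depth = 0;
  std::size_t max_depth = 0;
  std::size_t messages = 0;

  explicit DepthProbe(FakeConnection& c) : conn(c) {}

  void wait_message() {
    ++depth;
    if (depth > max_depth) max_depth = depth;
    conn.read(kMsgLen, [this](std::size_t) {
      ++messages;
      wait_message();  // nests one level per message already buffered
    });
    --depth;
  }
};
```

<p>In this model the nesting depth grows by one for every message that is already buffered, and only resets once a read has to wait for more bytes. A peer that keeps the socket full would therefore keep the chain from ever unwinding.</p>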
<p>I was looking at it in a small terminal pane, didn't see (didn't bother to check) anything past frame 17 out of 64265 and as a result spent quite some time trying to figure out why getnameinfo() would segfault on a perfectly valid AF_INET sockaddr...</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1225002018-10-11T12:27:53ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Resolved</i></li><li><strong>Backport</strong> deleted (<del><i>mimic,luminous</i></del>)</li></ul><p>No backport needed: this bug is in the protocol refactor for nautilus.</p> Messengers - Bug #36167: msg: infinite recursion while reading messageshttps://tracker.ceph.com/issues/36167?journal_id=1318232019-03-12T23:16:32ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Project</strong> changed from <i>RADOS</i> to <i>Messengers</i></li></ul>