Project

General

Profile

Actions

Bug #14120

closed

Pipe::do_recv() may loop infinitely

Added by Zheng Yan over 8 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After cutting down network between osd and client, I saw one OSD thread use 100% CPU, The thread loops Pipe::do_recv

#0 Pipe::do_recv (this=0x22d09000, buf=0x23262000 "", len=4096, flags=64) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:2508
#1 0x000000000205de08 in Pipe::buffered_recv (this=0x22d09000, buf=0x7f78a4c297fc "\001", len=53, flags=64) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:2552
#2 0x000000000205debe in Pipe::tcp_read_nonblocking (this=0x22d09000, buf=0x7f78a4c297f0 "\265", <incomplete sequence \356>, len=53) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:2570
#3 0x000000000205d927 in Pipe::tcp_read (this=0x22d09000, buf=0x7f78a4c297f0 "\265", <incomplete sequence \356>, len=53) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:2457
#4 0x000000000205970f in Pipe::read_message (this=0x22d09000, pm=0x7f78a4c29c18, auth_handler=0x226aaca0) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:1901
#5 0x00000000020564b8 in Pipe::reader (this=0x22d09000) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.cc:1581
#6 0x000000000205e788 in Pipe::Reader::entry (this=0x22d09018) at /home/zhyan/Ceph/ceph/src/msg/simple/Pipe.h:50
#7 0x0000000001f66b60 in Thread::entry_wrapper (this=0x22d09018) at /home/zhyan/Ceph/ceph/src/common/Thread.cc:84
#8 0x0000000001f66aae in Thread::_entry_func (arg=0x22d09018) at /home/zhyan/Ceph/ceph/src/common/Thread.cc:66
#9 0x00007f78cfef460a in start_thread () from /lib64/libpthread.so.0
#10 0x00007f78ce521a9d in clone () from /lib64/libc.so.6


Files

perf.out.txt (844 KB) perf.out.txt Zheng Yan, 05/05/2016 12:35 PM

Related issues 2 (0 open2 closed)

Related to Ceph - Bug #17411: Wierd deadlock in OSDResolved09/27/2016

Actions
Copied to Ceph - Backport #18155: Pipe::do_recv() may loop infinitelyRejectedSage WeilActions
Actions #1

Updated by Zheng Yan over 8 years ago

  • Status changed from New to Fix Under Review
Actions #2

Updated by Zheng Yan almost 8 years ago

I encountered this again. I ran 'perf record -g -e cycles:u -p xxx', perf report > perf.out.txt

Actions #3

Updated by Sage Weil almost 8 years ago

  • Status changed from Fix Under Review to Resolved
Actions #4

Updated by Sage Weil over 7 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to jewel
Actions #5

Updated by Abhishek Lekshmanan over 7 years ago

Actions #6

Updated by Greg Farnum over 7 years ago

  • Status changed from Pending Backport to In Progress

So, I think this patch was a mistake and we need to revert it wherever it's appeared.

From "man 2 recv":

RETURN VALUE
       These calls return the number of bytes received, or -1 if an error occurred.  In the
event of an error, errno is set to indicate the error.  The return value will be 0 when the
peer has performed an orderly shutdown.

ERRORS
       These are some standard errors generated by the socket layer.  Additional errors may
be generated and returned from the underlying protocol modules; see their manual pages.

       EAGAIN or EWOULDBLOCK
              The socket is marked nonblocking and the receive operation would block, or a
receive timeout had been set and the timeout expired before data was received.  POSIX.1-2001
allows either error to be returned for this case, and does not require these constants to
have the same value, so a portable application should check for both  possibilities.

So, the only way we get EAGAIN is if we have no data available when making the ::recv() call, or it would block for another undetermined reason. But we always make sure data is available before doing invoking this function (by calling tcp_read_wait())! So marking a failure on EAGAIN is either blatantly incorrect, or masking some other kind of failure as turned out to be the root cause here (see #18184 for more details).

Actions #7

Updated by Greg Farnum over 7 years ago

According to http://pubs.opengroup.org/onlinepubs/009695399/functions/recv.html,

[EAGAIN] or [EWOULDBLOCK]
The socket's file descriptor is marked O_NONBLOCK and no data is waiting to be received; or
MSG_OOB is set and no out-of-band data is available and either the socket's file descriptor
is marked O_NONBLOCK or the socket does not support blocking to await out-of-band data.

That makes it a little more explicit than the Linux manpage does that we shouldn't get EAGAIN if we have pre-checked for data, but I'm nervous about the edges. Sage suggests printing out a warning if we get EAGAIN here; we could also make it some kind of config that defaults to looping on EAGAIN but lets us disable that if we find errors in the future?

Actions #8

Updated by Greg Farnum over 7 years ago

Revert PR at https://github.com/ceph/ceph/pull/12374 may get more discussion of options.

Actions #9

Updated by Greg Farnum over 7 years ago

  • Assignee set to Greg Farnum
Actions #10

Updated by Zheng Yan over 7 years ago

Greg Farnum wrote:

So, I think this patch was a mistake and we need to revert it wherever it's appeared.

From "man 2 recv":
[...]

So, the only way we get EAGAIN is if we have no data available when making the ::recv() call, or it would block for another undetermined reason. But we always make sure data is available before doing invoking this function (by calling tcp_read_wait())! So marking a failure on EAGAIN is either blatantly incorrect, or masking some other kind of failure as turned out to be the root cause here (see #18184 for more details).

As I stated in commit message, tcp_read_wait() does not guarantee the socket has data (it return 0 if there are pending data). For example, the loop happens if Pipe::tcp_read() want to read 100 bytes, there are only 50 bytes pending data and the socket has no data.

Actions #11

Updated by Alexey Sheplyakov over 7 years ago

Greg Farnum wrote:

That makes it a little more explicit than the Linux manpage does that we shouldn't get EAGAIN if we have pre-checked for data

It's perfectly fine to get EAGAIN in non-blocking mode (even if the socket is reportedly ready for read):

Under Linux, select() may report a socket file descriptor as "ready for reading",
while nevertheless a subsequent read blocks. This could for example happen when data
has arrived but upon examination has wrong checksum and is discarded. There may be other
circumstances in which a file descriptor is spuriously reported as ready.
Thus it may be safer to use O_NONBLOCK on sockets that should not block

https://linux.die.net/man/2/select

Actions #12

Updated by Samuel Just over 7 years ago

  • Status changed from In Progress to Resolved
Actions #13

Updated by Sage Weil almost 7 years ago

  • Related to Bug #17411: Wierd deadlock in OSD added
Actions #14

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)
Actions

Also available in: Atom PDF