Bug #4324
kernel messenger: leaks in ceph_con_in_msg_alloc()
0%
Description
I posted some code for review and Greg caught what looks
like a memory leak in error handling. I looked a bit into
it but am giving up for now, because I think this function
is a bit of a mess and it's going to take a bit more focus
to get this right.
I'll get to this soon, but in the mean time I wanted to
document it so I can move on.
Here's a message that documents the discussion.
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg12910.html
History
#1 Updated by Greg Farnum about 11 years ago
FYI I mentioned this to Sage and he said the code has been refactored incorrectly a couple of times so it's definitely become a mess. But my money is that a BUG_ON(con->in_msg) is actually appropriate. :)
#2 Updated by Alex Elder about 11 years ago
- Status changed from New to In Progress
FYI I mentioned this to Sage and he said the code has been
refactored incorrectly a couple of times so it's definitely
become a mess.
That would likely be my handiwork. It's unfortunate if it
was known to be incorrect and allowed to get committed.
But my money is that a BUG_ON(con->in_msg) is actually appropriate. :)
The caller actually does that in this case. I'm looking at
it a little more broadly at the moment though, to see if I can
straighten things out a bit. The logic with the skip parameter
is a little janky.
#3 Updated by Alex Elder about 11 years ago
- Status changed from In Progress to Fix Under Review
The following patch has been posted for review.
(This patch is available as the top commit in branch
"review/wip-4324" in the ceph-client git repository.)
[PATCH] libceph: clean up skipped message logic In ceph_con_in_msg_alloc() it is possible for a connection's alloc_msg method to indicate an incoming message should be skipped. By default, read_partial_message() initializes the skip variable to 0 before it gets provided to ceph_con_in_msg_alloc(). The osd client, mon client, and mds client each supply an alloc_msg method. The mds client always assigns skip to be 0. The other two leave the skip value of as-is, or assigns it to zero, except: - if no (osd or mon) request having the given tid is found, in which case skip is set to 1 and NULL is returned; or - in the osd client, if the data of the reply message is not adequate to hold the message to be read, it assigns skip value 1 and returns NULL. So the returned message pointer will always be NULL if skip is ever non-zero. Clean up the logic a bit in ceph_con_in_msg_alloc() to make this state of affairs more obvious. Add a comment explaining how a null message pointer can mean either a message that should be skipped or a problem allocating a message. This resolves: http://tracker.ceph.com/issues/4324 Reported-by: Greg Farnum <greg@inktank.com> Signed-off-by: Alex Elder <elder@inktank.com>
#4 Updated by Sage Weil about 11 years ago
I seem to remember having a hand in it as well. :)
#5 Updated by Alex Elder about 11 years ago
- Status changed from Fix Under Review to Resolved
This has been committed to the ceph-client testing branch:
d11db70 libceph: clean up skipped message logic