Bug #481

closed

cosd leaking messenger threads

Added by Sage Weil over 13 years ago. Updated over 13 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%


Description

600 threads on ballpit3, running 0.22~rc, almost all messenger threads.

Actions #1

Updated by Sage Weil over 13 years ago

see ballpit3:/tmp/a

Actions #2

Updated by Greg Farnum over 13 years ago

  • Status changed from New to In Progress
  • Priority changed from Urgent to High

The problem here is that tcp_read never times out, and OSDs don't write to sessions unless they're replying to something. So once formed, a connection hangs around until the client says to kill it (which can't happen if the client dies or loses its network)!
I've begun fixing this by setting a timeout on the tcp socket, but initial testing is revealing odd problems.

Since the OSD isn't losing thread references and I've got another bug that's an actual error, I'm going to move to that and come back to this.
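
For illustration, the kind of fix described above (a timeout on the TCP socket so a blocking read cannot wait forever on a dead peer) could look roughly like the sketch below. This is not the actual messenger change: `set_read_timeout`, `read_some`, and `ReadResult` are hypothetical names, and the sketch assumes a POSIX system where `SO_RCVTIMEO` is supported on the socket.

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cerrno>
#include <cstddef>

// Hypothetical helper (not Ceph's actual code): give a socket a receive
// timeout so a blocking recv() returns instead of waiting indefinitely.
bool set_read_timeout(int fd, int seconds, int microseconds = 0) {
  struct timeval tv;
  tv.tv_sec = seconds;
  tv.tv_usec = microseconds;
  return ::setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) == 0;
}

// Outcome of one read attempt, so the caller can tell an idle or dead
// peer (TimedOut) apart from an orderly shutdown (Closed).
enum class ReadResult { Data, Closed, TimedOut, Error };

// Hypothetical read wrapper: retries on EINTR and reports a timeout
// separately from other errors. *got receives the raw recv() result.
ReadResult read_some(int fd, char *buf, size_t len, ssize_t *got) {
  ssize_t r;
  do {
    r = ::recv(fd, buf, len, 0);
  } while (r < 0 && errno == EINTR);
  *got = r;
  if (r > 0) return ReadResult::Data;
  if (r == 0) return ReadResult::Closed;  // peer closed the connection
  if (errno == EAGAIN || errno == EWOULDBLOCK) return ReadResult::TimedOut;
  return ReadResult::Error;
}
```

With something like this, a reader thread that sees `TimedOut` can decide whether the session is still wanted and tear it down if not, rather than lingering until the client asks for the connection to be closed. Note that retrying after `EINTR` restarts the timeout in this sketch, so it bounds each wait rather than the total time spent in `read_some`.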

Actions #3

Updated by Sage Weil over 13 years ago

  • Status changed from In Progress to Resolved