Bug #1093

closed

msgr: race condition with replaced pipe's connection_state

Added by Josh Durgin almost 13 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When a non-lossy connection is replaced, the messenger sets the pipe's connection_state to NULL while holding the pipe_lock. However, Pipe::read_message and Pipe::write_message read this variable without holding the pipe_lock, so they can dereference it after it has been cleared.

I hit this while running 10 OSDs on one disk:

 1: (ceph::BackTrace::BackTrace(int)+0x32) [0x8c024a]
 2: ./cosd() [0x97b69e]
 3: (()+0xef60) [0x7f205711ff60]
 4: (Connection::has_feature(int) const+0x18) [0x701f26]
 5: (SimpleMessenger::Pipe::write_message(Message*)+0x283) [0x6fbbef]
 6: (SimpleMessenger::Pipe::writer()+0x852) [0x6f966e]
 7: (SimpleMessenger::Pipe::Writer::entry()+0x21) [0x6e3e4d]
 8: (Thread::_entry_func(void*)+0x28) [0x701231]
 9: (()+0x68ba) [0x7f20571178ba]
 10: (clone()+0x6d) [0x7f2055dac02d]

Relevant logs and core files are at vit:/home/joshd/weekend_run/osd.4 and vit:/home/joshd/weekend_run/core.2556.1305416401
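
For illustration only, here is a minimal sketch of one way to avoid this kind of race. It is not the fix that was applied in Ceph. The writer takes its own reference to connection_state while holding pipe_lock, then works on that local copy, so a concurrent replacement cannot pull the pointer out from under it. The Pipe and Connection classes below are simplified, hypothetical stand-ins for the real messenger types, and std::mutex and std::shared_ptr are assumed in place of Ceph's own lock and reference counting.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical stand-in for the messenger's Connection; not the real class.
struct Connection {
    std::uint64_t features = 0;
    bool has_feature(std::uint64_t f) const { return (features & f) != 0; }
};

// Hypothetical stand-in for SimpleMessenger::Pipe.
class Pipe {
public:
    explicit Pipe(std::uint64_t features)
        : connection_state(std::make_shared<Connection>()) {
        connection_state->features = features;
    }

    // Replacement path: clears connection_state while holding pipe_lock,
    // as described in the report.
    void replace() {
        std::lock_guard<std::mutex> l(pipe_lock);
        connection_state.reset();
    }

    // Writer path: snapshot the pointer under pipe_lock, then use only the
    // local copy. Returns false if the pipe was already replaced; otherwise
    // stores the feature check result in has_feature_out and returns true.
    bool write_message(std::uint64_t feature, bool& has_feature_out) {
        std::shared_ptr<Connection> con;
        {
            std::lock_guard<std::mutex> l(pipe_lock);
            con = connection_state;  // takes a reference under the lock
        }
        if (!con) {
            return false;  // replaced: nothing safe to dereference
        }
        has_feature_out = con->has_feature(feature);
        return true;
    }

private:
    std::mutex pipe_lock;
    std::shared_ptr<Connection> connection_state;
};
```

With this shape, a call to write_message after replace() returns false instead of calling has_feature on a null pointer, which is where the backtrace above shows the crash. The unlocked read in the original code is the step this sketch removes.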
