Bug #1220

closed

failed assert: peer_addr > msgr->ms_addr

Added by Josh Durgin almost 13 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I got this crash soon after OSD startup during a teuthology run:

msg/SimpleMessenger.cc: In function 'int SimpleMessenger::Pipe::accept()', in thread '0x7eff21a42700'
msg/SimpleMessenger.cc: 827: FAILED assert(peer_addr > msgr->ms_addr)
 ceph version 0.29.1-369-gdf2e3bc (commit:df2e3bcb2ac790e179e97f6b8017b6fa6a8087bf)
 1: (SimpleMessenger::Pipe::accept()+0x2b50) [0x623710]
 2: (SimpleMessenger::Pipe::reader()+0xd4d) [0x624a2d]
 3: (SimpleMessenger::Pipe::Reader::entry()+0xd) [0x48779d]
 4: (()+0x7971) [0x7eff324ca971]
 5: (clone()+0x6d) [0x7eff30f6292d]

The logs and core are in vit:/home/joshd/startup_crash/ - the relevant OSD is osd.1 on sepia82.

Actions #1

Updated by Sage Weil almost 13 years ago

  • Target version set to v0.31
Actions #2

Updated by Greg Farnum almost 13 years ago

  • Category set to msgr
  • Assignee set to Greg Farnum
Actions #3

Updated by Greg Farnum almost 13 years ago

  • Category changed from msgr to OSD
  • Status changed from New to In Progress

Actually, this is an OSD bug:

2011-06-22 14:35:36.502538 7eff27c50700 -- 0.0.0.0:6801/19482 --> osd1 10.3.14.209:6801/19482 -- osd_sub_op(unknown0.0:0 0.0 /0 [scrub-unreserve] v 0'0 snapset=0=[]:[] snapc=0=[]) v1 -- ?+0 0x1bb4000

Actions #4

Updated by Greg Farnum almost 13 years ago

But not just an OSD bug -- the messenger shouldn't form connections to itself. There is code to prevent this and asserts to check it in various places, but there must be some path that isn't guarded.
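
To illustrate why a self-connection trips an assert like `peer_addr > msgr->ms_addr`, here is a minimal C++ sketch. It is not the SimpleMessenger code: `Addr`, `resolve_race`, and `is_self` are hypothetical names, and it assumes the assert sits in a tie-break that orders two endpoints by address. A strict ordering has no answer when both sides have the same address, so that case has to be rejected before the tie-break is reached.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <tuple>

// Hypothetical stand-in for a messenger address. The log above prints
// addresses as ip:port/nonce, so the sketch carries the same three fields.
struct Addr {
    std::string ip;
    std::uint16_t port;
    std::uint32_t nonce;
};

inline bool operator==(const Addr& a, const Addr& b) {
    return std::tie(a.ip, a.port, a.nonce) == std::tie(b.ip, b.port, b.nonce);
}

inline bool operator<(const Addr& a, const Addr& b) {
    return std::tie(a.ip, a.port, a.nonce) < std::tie(b.ip, b.port, b.nonce);
}

enum class RaceWinner { Incoming, Existing };

// Assumed tie-break for two endpoints that connected to each other at the
// same time: the strictly smaller address decides the winner. Equal
// addresses yield nullopt, which is the case a strict-inequality assert
// would fail on.
inline std::optional<RaceWinner> resolve_race(const Addr& self, const Addr& peer) {
    if (peer < self) return RaceWinner::Incoming;
    if (self < peer) return RaceWinner::Existing;
    return std::nullopt;  // self == peer: a connection to ourselves
}

// Hypothetical guard to run before opening a connection.
inline bool is_self(const Addr& self, const Addr& target) {
    return self == target;
}
```

Note that in the log line quoted in the previous comment, the sender still reports its own address as 0.0.0.0 while the target is 10.3.14.209 with the same port and nonce. A plain equality check like `is_self` would not match in that state; how the actual fix handles it is not stated in this thread.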

Actions #5

Updated by Greg Farnum almost 13 years ago

  • Category changed from OSD to msgr
  • Status changed from In Progress to Resolved

I fixed the self-connection problems in commit:8bcc639ab2171827286dafb42ef4635477dee8f1.

Then I created bug #1242 for the OSD issues.

Actions #6

Updated by Greg Farnum about 5 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)
  • Target version deleted (v0.31)