Bug #1434

closed

osds all failing each other

Added by John Leach over 12 years ago. Updated over 12 years ago.

Status: Resolved
Priority: High
Assignee:
Category: OSD
Target version:
% Done: 0%


Description

Build of git tree 3a623bb327.

The cluster boots, then within a minute all the OSDs start failing each other. The logs show heartbeats not being received.

It was working fine before, though this is the same cluster that was previously reporting errant scrub stat mismatches (but otherwise apparently functioning fine). See ticket #1376.

See the attached log from one of my 4 OSDs.


Files

osd.2.log.gz (936 KB) - John Leach, 08/22/2011 03:46 PM
osd.2.log.gz (1.18 MB) - survived - John Leach, 08/22/2011 03:58 PM
osd.3.log.gz (1.45 MB) - crashed - John Leach, 08/22/2011 03:58 PM

Updated by John Leach over 12 years ago

Reproduced, but with "debug ms = 1" this time.

Now I notice that all but one of the OSDs crash:

osd/PG.cc: 3892: FAILED assert(0 == "we got a bad state machine event")

Stack traces are in the log files. Attaching logs from two OSDs: one that crashed and one that didn't.
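
For anyone trying to reproduce this, the "debug ms = 1" setting mentioned above raises messenger logging. A minimal sketch of how it might appear in ceph.conf, assuming it is applied to all OSDs through the [osd] section (the report does not say where the option was actually set):

```ini
[osd]
    ; assumption: applied to every OSD; the report does not show the real config
    debug ms = 1
```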

Actions #2

Updated by Sage Weil over 12 years ago

  • Category set to OSD
  • Priority changed from Normal to High
  • Target version set to v0.35
Actions #3

Updated by Sage Weil over 12 years ago

That assert is fixed 4 commits later by cf3b7cf6a9d3f873ad27a313cc1635822bdd89a1. Can you still reproduce with the latest? There's not enough here (before the crash) to see any heartbeat misbehavior.

Thanks!

Actions #4

Updated by Sage Weil over 12 years ago

  • Assignee set to Sage Weil
Actions #5

Updated by Sage Weil over 12 years ago

  • Status changed from New to 4
Actions #6

Updated by John Leach over 12 years ago

Looks fixed with latest code.

Actions #7

Updated by Sage Weil over 12 years ago

  • Status changed from 4 to Resolved