Bug #3816

closed

osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())

Added by Sage Weil over 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority: High
Assignee:
Category: OSD
Target version: -
% Done: 0%
Spent time:
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I had two of these systems running, but yesterday I wanted to add a third one.

So I had 8 OSDs (one per disk) running on 0.56.1, and I added one host, bringing
the total to 12.

The cluster went into a degraded state (about 50%) and started to recover
until it reached about 48%.

Within about 5 minutes, all of the original 8 OSDs had crashed with the same
backtrace:

    -1> 2013-01-15 17:20:29.058426 7f95a0fd8700 10 -- [2a00:f10:113:0:6051:e06c:df3:f374]:6803/4913 reaper done
     0> 2013-01-15 17:20:29.061054 7f959cfd0700 -1 osd/OSD.cc: In function 'void OSD::do_waiters()' thread 7f959cfd0700 time 2013-01-15 17:20:29.057714
    osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())

     ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
     1: (OSD::do_waiters()+0x2c3) [0x6251f3]
     2: (OSD::ms_dispatch(Message*)+0x1c4) [0x62d714]
     3: (DispatchQueue::entry()+0x349) [0x8ba289]
     4: (DispatchQueue::DispatchThread::entry()+0xd) [0x8137cd]
     5: (()+0x7e9a) [0x7f95a95dae9a]
     6: (clone()+0x6d) [0x7f95a805ecbd]
     NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

from Wido on ML
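
The failed assertion in the backtrace is a precondition check: OSD::do_waiters() expects osd_lock to be held when it is called from the dispatch path. As a rough illustration of that pattern only, here is a minimal sketch. It is not Ceph's code and does not try to reproduce the root cause of this bug; TrackedMutex, ToyOSD, dispatch_locked() and dispatch_unlocked() are hypothetical names, and the check returns false instead of aborting so the two call paths can be compared.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for a mutex that remembers whether it is held.
// Not Ceph's Mutex class; it only illustrates an is_locked() style check.
class TrackedMutex {
public:
  void lock()   { m_.lock(); locked_.store(true); }
  void unlock() { locked_.store(false); m_.unlock(); }
  // True if some thread holds the lock (not necessarily the caller).
  bool is_locked() const { return locked_.load(); }
private:
  std::mutex m_;
  std::atomic<bool> locked_{false};
};

// Hypothetical toy object, loosely named after the frames in the backtrace.
struct ToyOSD {
  TrackedMutex osd_lock;
  int waiters_run = 0;

  // Precondition: the caller holds osd_lock.
  // The code in the report asserts here; this sketch reports the violation.
  bool do_waiters() {
    if (!osd_lock.is_locked())
      return false;
    ++waiters_run;
    return true;
  }

  // Call path that takes the lock first: the precondition holds.
  bool dispatch_locked() {
    std::lock_guard<TrackedMutex> guard(osd_lock);
    return do_waiters();
  }

  // Call path that reaches do_waiters() without the lock: the check fails.
  bool dispatch_unlocked() {
    return do_waiters();
  }
};
```

In this sketch, calling dispatch_locked() on a ToyOSD succeeds and increments waiters_run, while dispatch_unlocked() trips the check. A check of this kind only tells you that the lock is held by someone, which is why it catches a missing lock on the call path but not every locking mistake.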

Files

ceph-osd.0.log.1.gz (2.3 MB), Wido den Hollander, 01/16/2013 12:43 PM
ceph-osd.1.log.1.gz (2.29 MB), Wido den Hollander, 01/16/2013 12:43 PM
ceph-osd.2.log.1.gz (2.25 MB), Wido den Hollander, 01/16/2013 12:43 PM
ceph-osd.3.log.1.gz (2.27 MB), Wido den Hollander, 01/16/2013 12:43 PM
ceph-osd.0.log.gz (6.85 MB), osd.0 log, Wido den Hollander, 02/27/2013 09:56 AM
ceph-osd.3.log.gz (9.43 MB), osd.3 log, Wido den Hollander, 02/27/2013 09:56 AM
ceph-osd.0.log.1.gz (3.63 MB), osd.0 crash 06-03-2013, Wido den Hollander, 03/07/2013 05:57 AM
ceph-osd.32.log.1.gz (3.03 MB), osd.32 crash 06-03-2013, Wido den Hollander, 03/07/2013 05:57 AM
ceph-osd.26.log.1.gz (4.3 MB), osd.26 crash 06-03-2013, Wido den Hollander, 03/07/2013 05:57 AM
ceph-osd-logs.tar.gz (1.91 MB), OSD logs from March 15th, Wido den Hollander, 03/15/2013 11:57 AM
ceph-osd-0-3.tar.gz (6.82 MB), osd 0 - 3 March 15th, Wido den Hollander, 03/15/2013 12:31 PM

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #4556: OSDs crash with OSD::handle_op during recovery (Resolved, Sage Weil, 03/26/2013)
