
Bug #292

OSD crash raw_pg_to_pg

Added by Wido den Hollander over 13 years ago. Updated over 13 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In my cluster, osd28 just got marked down; I assume it is the heartbeat problem again (I was playing with the S3 gateway). That is not a real problem by itself, but my cluster then started to recover from its degraded state, and during that recovery osd18 crashed.

10.07.20_10:52:05.241731 7fd6ab4f4710 7fd6ab4f4710 default allow=0 default_deny=0
osd/OSDMap.h: In function 'pg_t OSDMap::raw_pg_to_pg(pg_t)':
osd/OSDMap.h:898: FAILED assert(pools.count(pg.pool()))
 1: (OSD::handle_op(MOSDOp*)+0x4d3) [0x4d5023]
 2: (OSD::_dispatch(Message*)+0x37d) [0x4e7e2d]
 3: (OSD::ms_dispatch(Message*)+0x39) [0x4e86c9]
 4: (SimpleMessenger::dispatch_entry()+0x749) [0x460659]
 5: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45727c]
 6: (Thread::_entry_func(void*)+0xa) [0x46afaa]
 7: (()+0x69ca) [0x7fd6cdd169ca]
 8: (clone()+0x6d) [0x7fd6ccf366cd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Since I saw a message about pools, I thought this might be relevant:

root@logger:~# rados lspools
data
metadata
casdata
rbd
.rgw
.users
.users.email
thesimpsons
flashforward
wido
root@logger:~#

Just before the crash I had been creating some buckets via the S3 gateway.

I made a copy of the log, coredump, and binary on logger.ceph.widodh.nl and placed the files in /srv/ceph/issues/raw_pg_to_pg_osd_crash

History

#1 Updated by Sage Weil over 13 years ago

  • Status changed from New to Resolved
