Bug #4855: peek map assert - Ceph - Ceph

Bug #4855

closed

peek map assert

Added by Samuel Just about 11 years ago. Updated almost 11 years ago.

Status: Can't reproduce

Priority: High

Assignee: Samuel Just

Category: OSD

Source: Community (user)

Severity: 3 - minor

Description

From list:

Hey folks,

I'm helping put together a new test/experimental cluster, and hit this today when bringing the cluster up for the first time (using mkcephfs).

After doing the normal "service ceph -a start", I noticed one OSD was down, and a lot of PGs were stuck creating. I tried restarting the down OSD, but it would not come up. It always failed with this error:

-1> 2013-04-27 18:11:56.179804 b6fcd000  2 osd.1 0 boot
     0> 2013-04-27 18:11:56.402161 b6fcd000 -1 osd/PG.cc: In function 'static epoch_t PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::bufferlist*)' thread b6fcd000 time 2013-04-27 18:11:56.399089
osd/PG.cc: 2556: FAILED assert(values.size() == 1)

ceph version 0.60-401-g17a3859 (17a38593d60f5f29b9b66c13c0aaa759762c6d04)
 1: (PG::peek_map_epoch(ObjectStore*, coll_t, hobject_t&, ceph::buffer::list*)+0x1ad) [0x2c3c0a]
 2: (OSD::load_pgs()+0x357) [0x28cba0]
 3: (OSD::init()+0x741) [0x290a16]
 4: (main()+0x1427) [0x2155c0]
 5: (__libc_start_main()+0x99) [0xb69bcf42]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

I then did a full cluster restart, and now I have ten OSDs down -- each showing the same exception/failed assert.
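The failed check, assert(values.size() == 1), fires when a lookup that is expected to yield exactly one value yields some other number of values. The following is a minimal C++ sketch of that pattern, not the code in osd/PG.cc: KeyValueStore, get_values and peek_single_value are hypothetical names, and a std::map stands in for whatever store the OSD actually reads. It shows how a missing key produces an empty result set, which a hard assert would turn into a crash at startup, and how the same condition could instead be reported to the caller.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a key/value store; this is not a Ceph type.
using KeyValueStore = std::map<std::string, std::string>;

// Bulk lookup: returns the values for the requested keys and silently
// skips keys that are absent, so the result may be smaller than `keys`.
std::map<std::string, std::string> get_values(const KeyValueStore& store,
                                              const std::set<std::string>& keys) {
    std::map<std::string, std::string> values;
    for (const auto& key : keys) {
        auto it = store.find(key);
        if (it != store.end()) {
            values.emplace(key, it->second);
        }
    }
    return values;
}

// Look up exactly one key. Writing `assert(values.size() == 1)` here would
// abort the process when the key is missing; this sketch returns false
// instead so the caller can decide how to handle the missing entry.
bool peek_single_value(const KeyValueStore& store,
                       const std::string& key,
                       std::string* out) {
    const std::map<std::string, std::string> values = get_values(store, {key});
    if (values.size() != 1) {
        return false;  // key absent: the condition the assert guards against
    }
    *out = values.begin()->second;
    return true;
}
```

With a store holding only the key "pg_1_epoch", peek_single_value succeeds for that key and returns false for any other key. Whether a missing key is what happened on the reporter's OSDs is not established by this report.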

Files

dmesg.txt (60 KB)

dmesg output for node

Nigel Williams, 05/31/2013 03:55 PM

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just almost 11 years ago

Updated by Nigel Williams almost 11 years ago