Bug #1690 (closed)

osd re-created from scratch will crash on start-up

Added by Alexandre Oliva over 12 years ago. Updated over 12 years ago.

Status: Can't reproduce
Priority: Normal
Category: -
% Done: 0%

Description

Some time ago it was possible to re-create an osd after its filesystem failed simply by running “cosd -i # --mkfs --mkjournal” and then starting it again. This no longer works: ceph-osd --mkfs --mkjournal completes successfully, but once the osd is started its log shows:

2011-11-05 09:49:31.203140 7f7ed41c0740 filestore(/etc/ceph/osd2) mount: enabling PARALLEL journal mode: btrfs, SNAP_CREATE_V2 detected and 'filestore btrfs snap' mode is enabled
2011-11-05 09:49:31.203557 7f7ed41c0740 journal _open /etc/ceph/osd2/journal2 fd 14: 1610612736 bytes, block size 4096 bytes, directio = 1
2011-11-05 09:49:31.203683 7f7ed41c0740 journal read_entry 4096 : seq 1 212 bytes
2011-11-05 09:49:31.203736 7f7ed41c0740 journal _open /etc/ceph/osd2/journal2 fd 14: 1610612736 bytes, block size 4096 bytes, directio = 1
*** Caught signal (Aborted) **
 in thread 0x7f7ec17fa700
*** Caught signal (Segmentation fault) **
 in thread 0x7f7ec17fa700
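
For reference, a minimal sketch of the re-creation sequence described above, assuming osd id 2 and the /etc/ceph/osd2 data path implied by the log (both are assumptions; adjust to the local ceph.conf):

# wipe and re-create the osd data directory and journal (formerly "cosd")
ceph-osd -i 2 --mkfs --mkjournal
# start the daemon again; with current code it aborts shortly after mounting the filestore
ceph-osd -i 2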

To bring the osd back up, I rsynced (with --exclude=/*_head) a recent snap of another osd, adjusted the osd number in the superblock, and duplicated the snapshot into current. The osd then recovered successfully, but only because it was supposed to hold copies of the same PGs as the other osd; I'm not sure how one would recover an osd when that isn't the case.
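
A rough sketch of that workaround, assuming hypothetical paths (/etc/ceph/osd1 on otherhost as the healthy osd, /etc/ceph/osd2 locally) and a hypothetical snapshot name snap_1234; the superblock edit itself was done by hand and is not shown as a command:

# drop the empty "current" subvolume left behind by --mkfs, if present
btrfs subvolume delete /etc/ceph/osd2/current
# create a subvolume to receive the copy (filestore expects snap_* and current to be btrfs subvolumes)
btrfs subvolume create /etc/ceph/osd2/snap_1234
# copy a recent snap of the healthy osd, skipping its PG (*_head) directories
rsync -a --exclude='/*_head' otherhost:/etc/ceph/osd1/snap_1234/ /etc/ceph/osd2/snap_1234/
# (adjust the osd number in the copied superblock here, by hand)
# duplicate the snapshot into "current" so the osd mounts it on start-up
btrfs subvolume snapshot /etc/ceph/osd2/snap_1234 /etc/ceph/osd2/current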
