Bug #18347

osd crashed due to journal corrupt

Added by huang jun over 7 years ago. Updated almost 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi all,

We have a test cluster with 12 OSDs on 3 nodes (4 OSDs per node), using SSDs as OSD journals.
Ceph version: 0.94.5
OS: Oracle Linux 6.5
Linux kernel: 3.18

We used dd to generate heavy writes, then pulled the power cables on all nodes and booted them again.
One OSD crashed because of a corrupt journal.

Here is the log:
2016-12-22 22:45:48.820184 7feb89a14800 10 filestore(/var/lib/ceph/osd/ceph-8) open_journal at /var/lib/ceph/osd/ceph-8/journal
2016-12-22 22:45:48.820216 7feb89a14800 0 filestore(/var/lib/ceph/osd/ceph-8) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-12-22 22:45:48.820220 7feb89a14800 10 filestore(/var/lib/ceph/osd/ceph-8) list_collections
2016-12-22 22:45:48.821622 7feb89a14800 10 journal journal_replay fs op_seq 12872
2016-12-22 22:45:48.821634 7feb89a14800 2 journal open /var/lib/ceph/osd/ceph-8/journal fsid 592e7c5d-6f89-4984-b6e7-bdc8e380c506 fs_op_seq 12872
2016-12-22 22:45:48.821664 7feb89a14800 10 journal _open_block_device: ignoring osd journal size. We'll use the entire block device (size: 19995295744)
2016-12-22 22:45:48.821650 7feb7ff9f700 20 filestore(/var/lib/ceph/osd/ceph-8) sync_entry waiting for max_interval 10.000000
2016-12-22 22:45:49.043037 7feb89a14800 20 journal _check_disk_write_cache: disk write cache is on, but your kernel is new enough to handle it correctly. (fn:/var/lib/ceph/osd/ceph-8/journal)
2016-12-22 22:45:49.043118 7feb89a14800 1 journal _open /var/lib/ceph/osd/ceph-8/journal fd 19: 19995295744 bytes, block size 4096 bytes, directio = 1, aio = 1
2016-12-22 22:45:49.043132 7feb89a14800 10 journal read_header
2016-12-22 22:45:49.044216 7feb89a14800 10 journal header: block_size 4096 alignment 4096 max_size 19995295744
2016-12-22 22:45:49.044226 7feb89a14800 10 journal header: start 10925465600
2016-12-22 22:45:49.044227 7feb89a14800 10 journal write_pos 4096
2016-12-22 22:45:49.044231 7feb89a14800 10 journal open header.fsid = 592e7c5d-6f89-4984-b6e7-bdc8e380c506
2016-12-22 22:45:49.055570 7feb89a14800 2 journal read_entry 10925465600 : seq 13045 4196506 bytes
2016-12-22 22:45:49.055592 7feb89a14800 10 journal open entry 13045 len 4196506 > next_seq 12873, ignoring journal contents
2016-12-22 22:45:49.057073 7feb89a14800 25 journal read_entry -1 : bad header magic, end of journal
2016-12-22 22:45:49.057084 7feb89a14800 -1 journal Unable to read past sequence 12873 but header indicates the journal has committed up through 13060, journal is corrupt
2016-12-22 22:45:49.058633 7feb89a14800 -1 os/FileJournal.cc: In function 'bool FileJournal::read_entry(ceph::bufferlist&, uint64_t&, bool*)' thread 7feb89a14800 time 2016-12-22 22:45:49.057091
os/FileJournal.cc: 1780: FAILED assert(0)
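
Reading the log, the assert seems to fire because the last sequence replay could read back (12873) is behind what the journal header claims was committed (13060). Below is a minimal standalone sketch of that relationship, paraphrased only from the log messages above; the struct and function names are illustrative, not the real FileJournal types.

// Illustrative model of the sequence check implied by the log above;
// journal_header_model and end_of_replay are made-up names, not Ceph code.
#include <cassert>
#include <cstdint>
#include <iostream>

struct journal_header_model {
  uint64_t committed_up_to;  // highest seq the on-disk header claims is committed
};

// Called when replay can no longer read a valid entry.
void end_of_replay(uint64_t last_readable_seq, const journal_header_model &h) {
  if (last_readable_seq < h.committed_up_to) {
    std::cerr << "Unable to read past sequence " << last_readable_seq
              << " but header indicates the journal has committed up through "
              << h.committed_up_to << ", journal is corrupt" << std::endl;
    assert(0);  // corresponds to the FAILED assert(0) at os/FileJournal.cc:1780
  }
}

int main() {
  journal_header_model h{13060};  // from the log: committed up through 13060
  end_of_replay(12873, h);        // replay stopped at seq 12873, so this aborts
}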

The message "read_entry -1 : bad header magic, end of journal" seems odd to me.

Related code:

if (!h->check_magic(cur_pos, header.get_fsid64())) {
  dout(25) << "read_entry " << init_pos
           << " : bad header magic, end of journal" << dendl;
  if (ss)
    *ss << "bad header magic";
  if (next_pos)
    *next_pos = init_pos + (4<<10); // check 4k ahead
  return MAYBE_CORRUPT;
}

bool check_magic(off64_t pos, uint64_t fsid) {
  return
    magic1 == (uint64_t)pos &&
    magic2 == (fsid ^ seq ^ len);
}

In this case cur_pos is -1, so I think check_magic will always fail. Is that intended?
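
For illustration, here is a toy standalone model of that check; the field layout and the values in main() are made up, only the magic scheme (magic1 = entry offset, magic2 = fsid ^ seq ^ len) follows the snippet above. With pos == -1, magic1 == (uint64_t)pos can essentially never hold, so the check reports "bad header magic" regardless of what is on disk.

// Toy model of the entry header magic check; not the real entry_header_t.
#include <cstdint>
#include <iostream>

struct entry_header_model {
  uint64_t seq;
  uint64_t magic1;  // written as the on-disk offset of the entry
  uint64_t magic2;  // written as fsid ^ seq ^ len
  uint64_t len;

  bool check_magic(int64_t pos, uint64_t fsid) const {
    return magic1 == (uint64_t)pos &&
           magic2 == (fsid ^ seq ^ len);
  }
};

int main() {
  const uint64_t fsid = 0x1234;     // made-up fsid value
  entry_header_model h;
  h.seq = 13045;
  h.len = 4196506;
  h.magic1 = 10925465600ULL;        // entry written at this offset
  h.magic2 = fsid ^ h.seq ^ h.len;

  std::cout << h.check_magic(10925465600LL, fsid) << "\n";  // prints 1: offset matches
  std::cout << h.check_magic(-1, fsid) << "\n";             // prints 0: pos == -1 never matches
}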

We ran the test 5 times, and each time one OSD crashed because of a corrupt journal.

At first I thought the power outage made the disk misbehave, but I cannot confirm that; dmesg shows everything is OK.

History

#1 Updated by Greg Farnum almost 7 years ago

  • Status changed from New to Closed

Yep, that's some kind of bad setting if a power loss results in a busted journal.
