Bug #2843

closed

filestore: replay failure on xfs

Added by Sage Weil almost 12 years ago. Updated over 11 years ago.

Status: Can't reproduce
Priority: High
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Support

Description

congress osd.328 crashed with:


   -31> 2012-07-26 04:50:31.096568 7ff6ff903780  2 osd.328 0 mounting /srv/ceph/osd/328 /srv/ceph/devices/osd.328.journal
   -30> 2012-07-26 04:50:31.096599 7ff6ff903780  5 filestore(/srv/ceph/osd/328) basedir /srv/ceph/osd/328 journal /srv/ceph/devices/osd.328.journal
   -29> 2012-07-26 04:50:31.096626 7ff6ff903780 10 filestore(/srv/ceph/osd/328) mount fsid is d46d9806-017d-4172-ae28-cfef8660eef5
   -28> 2012-07-26 04:50:31.099377 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount FIEMAP ioctl is supported and appears to work
   -27> 2012-07-26 04:50:31.099392 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
   -26> 2012-07-26 04:50:31.099666 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount did NOT detect btrfs
   -25> 2012-07-26 04:50:31.102271 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount syncfs(2) syscall fully supported (by glibc and kernel)
   -24> 2012-07-26 04:50:31.102329 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount found snaps <>
   -23> 2012-07-26 04:50:31.102343 7ff6ff903780  5 filestore(/srv/ceph/osd/328) mount op_seq is 1085689
   -22> 2012-07-26 04:50:31.103499 7ff6ff903780 20 filestore (init)dbobjectmap: seq is 27809
   -21> 2012-07-26 04:50:31.103514 7ff6ff903780 10 filestore(/srv/ceph/osd/328) open_journal at /srv/ceph/devices/osd.328.journal
   -20> 2012-07-26 04:50:31.103524 7ff6ff903780  0 filestore(/srv/ceph/osd/328) mount: enabling WRITEAHEAD journal mode: btrfs not detected
   -19> 2012-07-26 04:50:31.103527 7ff6ff903780 10 filestore(/srv/ceph/osd/328) list_collections
   -18> 2012-07-26 04:50:31.114766 7ff6ff903780  2 journal open /srv/ceph/devices/osd.328.journal fsid d46d9806-017d-4172-ae28-cfef8660eef5 fs_op_seq 1085689
   -17> 2012-07-26 04:50:31.114778 7ff6f6d75700 20 filestore(/srv/ceph/osd/328) sync_entry waiting for max_interval 5.000000
   -16> 2012-07-26 04:50:31.123102 7ff6ff903780  1 journal _open /srv/ceph/devices/osd.328.journal fd 32: 10736398336 bytes, block size 4096 bytes, directio = 1, aio = 0
   -15> 2012-07-26 04:50:31.884099 7ff6ff903780  2 journal read_entry 10318311424 : seq 1085689 114185663 bytes
   -14> 2012-07-26 04:50:31.884140 7ff6ff903780  2 journal read_entry 10432499712 : bad header magic, end of journal
   -13> 2012-07-26 04:50:31.884154 7ff6ff903780  2 journal read_entry 10432499712 : bad header magic, end of journal
   -12> 2012-07-26 04:50:31.884157 7ff6ff903780  3 journal journal_replay: end of journal, done.
   -11> 2012-07-26 04:50:31.930818 7ff6ff903780  1 journal _open /srv/ceph/devices/osd.328.journal fd 32: 10736398336 bytes, block size 4096 bytes, directio = 1, aio = 0
   -10> 2012-07-26 04:50:31.931001 7ff6f3d6f700 20 filestore(/srv/ceph/osd/328) flusher_entry start
    -9> 2012-07-26 04:50:31.931027 7ff6f3d6f700 20 filestore(/srv/ceph/osd/328) flusher_entry sleeping
    -8> 2012-07-26 04:50:31.931074 7ff6ff903780  2 osd.328 0 boot
    -7> 2012-07-26 04:50:31.931086 7ff6ff903780 15 filestore(/srv/ceph/osd/328) read meta/23c2fcde/osd_superblock/0//-1 0~0
    -6> 2012-07-26 04:50:31.931176 7ff6ff903780 10 filestore(/srv/ceph/osd/328) FileStore::read meta/23c2fcde/osd_superblock/0//-1 0~144/144
    -5> 2012-07-26 04:50:31.931188 7ff6ff903780 10 osd.328 0 read_superblock sb(31b8be2f-ac05-4e56-96b7-e702df166e29 osd.328 d46d9806-017d-4172-ae28-cfef8660eef5 e74853 [69038,74853] lci=[72194,74853])
    -4> 2012-07-26 04:50:31.931215 7ff6ff903780 20 osd.328 0 get_map 74853 - loading and decoding 0x2621700
    -3> 2012-07-26 04:50:31.931222 7ff6ff903780 15 filestore(/srv/ceph/osd/328) read meta/4f711459/osdmap.74853/0//-1 0~0
    -2> 2012-07-26 04:50:31.931254 7ff6ff903780 10 filestore(/srv/ceph/osd/328) FileStore::read meta/4f711459/osdmap.74853/0//-1 0~0/0
    -1> 2012-07-26 04:50:31.931260 7ff6ff903780 10 osd.328 0 add_map_bl 74853 0 bytes
     0> 2012-07-26 04:50:31.932648 7ff6ff903780 -1 *** Caught signal (Aborted) **
 in thread 7ff6ff903780

 ceph version 0.48argonaut-48-g16302ac (commit:16302acefd8def98fc4597366d6ba2845e17fcb6)
 1: ceph-osd() [0x6f4eba]
 2: (()+0xfcb0) [0x7ff6ff2e1cb0]
 3: (gsignal()+0x35) [0x7ff6fd7d2445]
 4: (abort()+0x17b) [0x7ff6fd7d5bab]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7ff6fe12069d]
 6: (()+0xb5846) [0x7ff6fe11e846]
 7: (()+0xb5873) [0x7ff6fe11e873]
 8: (()+0xb596e) [0x7ff6fe11e96e]
 9: (ceph::buffer::list::iterator::copy(unsigned int, char*)+0x127) [0x7b1fe7]
 10: (OSDMap::decode(ceph::buffer::list::iterator&)+0x3f) [0x76f72f]
 11: (OSDMap::decode(ceph::buffer::list&)+0x3e) [0x77082e]
 12: (OSD::get_map(unsigned int)+0x326) [0x5ced36]
 13: (OSD::init()+0x4ee) [0x5dc90e]
 14: (main()+0x2377) [0x522f37]
 15: (__libc_start_main()+0xed) [0x7ff6fd7bd76d]
 16: ceph-osd() [0x525109]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
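
To compare the sequence numbers in a log like the one above, a short script can pull out the filestore's op_seq and the seq of each journal entry that was read. This is only a triage sketch: the regular expressions are written against the exact line formats quoted in this report and are an assumption for any other Ceph version, and `extract_seqs` is a hypothetical helper, not part of Ceph.

```python
import re

# Patterns match the log lines quoted in this report only;
# other versions or debug levels may format these lines differently.
OP_SEQ_RE = re.compile(r"mount op_seq is (\d+)")
ENTRY_RE = re.compile(r"journal read_entry (\d+) : seq (\d+) (\d+) bytes")


def extract_seqs(log_text):
    """Return (fs_op_seq, [(offset, seq, length), ...]) found in log_text."""
    fs_op_seq = None
    entries = []
    for line in log_text.splitlines():
        m = OP_SEQ_RE.search(line)
        if m:
            fs_op_seq = int(m.group(1))
            continue
        m = ENTRY_RE.search(line)
        if m:
            entries.append(tuple(int(g) for g in m.groups()))
    return fs_op_seq, entries


# Three lines copied from the log in this report.
SAMPLE = """\
   -23> 2012-07-26 04:50:31.102343 7ff6ff903780  5 filestore(/srv/ceph/osd/328) mount op_seq is 1085689
   -15> 2012-07-26 04:50:31.884099 7ff6ff903780  2 journal read_entry 10318311424 : seq 1085689 114185663 bytes
   -14> 2012-07-26 04:50:31.884140 7ff6ff903780  2 journal read_entry 10432499712 : bad header magic, end of journal
"""

fs_op_seq, entries = extract_seqs(SAMPLE)
print("fs op_seq:", fs_op_seq)
for offset, seq, length in entries:
    print("journal entry at", offset, "seq", seq, "bytes", length)
```

On the sample lines, the only journal entry found carries the same seq as the filestore's op_seq.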

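The osdmap.74853 file on disk is 0 bytes (the log shows "add_map_bl 74853 0 bytes"). A dump-journal is attached. The journal's one entry writes that object, but it is not replayed because of the current seq value: the entry's seq (1085689) is the same as the filestore's op_seq (1085689).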
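
The replay decision described here can be sketched as follows. This is an illustration under an assumed rule (replay only entries whose seq is greater than the op_seq recorded by the filestore), not the actual FileStore code; `should_replay` and `replay` are hypothetical names.

```python
def should_replay(entry_seq, fs_op_seq):
    """Assumed rule: an entry is replayed only if it is newer than
    the last op_seq the filestore recorded as committed."""
    return entry_seq > fs_op_seq


def replay(entries, fs_op_seq, apply):
    """Apply every (seq, payload) entry that passes should_replay.

    Returns the list of seq values that were actually applied.
    """
    applied = []
    for seq, payload in entries:
        if should_replay(seq, fs_op_seq):
            apply(payload)
            applied.append(seq)
    return applied


# Values taken from the log in this report: the journal holds one
# entry with seq 1085689 and the filestore's op_seq is also 1085689.
store = []
applied = replay([(1085689, "write osdmap.74853")], 1085689, store.append)
print("applied:", applied)
```

Under that rule the single entry is treated as already committed and skipped, which would leave the 0-byte object in place even though the journal holds the write for it.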


Files

fooo (89.2 KB) Sage Weil, 07/25/2012 09:54 PM

Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #2830: [argonaut] osd/OSD.cc: 3906: FAILED assert(_get_map_bl(epoch, bl)) Duplicate 07/24/2012
