Bug #7448

closed

os/FileJournal.cc: FAILED assert(fd >= 0)

Added by Joao Eduardo Luis about 10 years ago. Updated about 10 years ago.

Status:
Duplicate
Priority:
High
Category:
OSD
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Triggered while running the ceph-deploy suite against the next branch on Debian (http://pulpito.ceph.com/sage-2014-02-15_17:05:48-ceph-deploy-wip-7212-sage-b-testing-basic-vps/84897/).

Logs can be found in teuthology:/a/sage-2014-02-15_17:05:48-ceph-deploy-wip-7212-sage-b-testing-basic-vps/84897/

In particular, the log for osd.5 (/a/sage-2014-02-15_17:05:48-c/log/ceph-osd.5.log.gz) shows:

   -32> 2014-02-16 02:09:25.551178 7f1dbd965780  0 ceph version 0.76-406-gcf4f702 (cf4f7027e7a05090d3d1c14f2ac4db73bf1d8fa4), process ceph-osd, pid 28588
   -31> 2014-02-16 02:09:25.551490 7f1dbd965780  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6811/28588 need_addr=1
   -30> 2014-02-16 02:09:25.551511 7f1dbd965780  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6812/28588 need_addr=1
   -29> 2014-02-16 02:09:25.551523 7f1dbd965780  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6813/28588 need_addr=1
   -28> 2014-02-16 02:09:25.551535 7f1dbd965780  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6814/28588 need_addr=1
   -27> 2014-02-16 02:09:25.551548 7f1dbd965780  1 accepter.accepter.bind my_inst.addr is 0.0.0.0:6815/28588 need_addr=1
   -26> 2014-02-16 02:09:25.559059 7f1dbd965780  1 finished global_init_daemonize
   -25> 2014-02-16 02:09:25.559311 7f1dbd965780  5 asok(0x28a01c0) init /var/run/ceph/ceph-osd.5.asok
   -24> 2014-02-16 02:09:25.559330 7f1dbd965780  5 asok(0x28a01c0) bind_and_listen /var/run/ceph/ceph-osd.5.asok
   -23> 2014-02-16 02:09:25.559361 7f1dbd965780  5 asok(0x28a01c0) register_command 0 hook 0x288c030
   -22> 2014-02-16 02:09:25.559374 7f1dbd965780  5 asok(0x28a01c0) register_command version hook 0x288c030
   -21> 2014-02-16 02:09:25.559376 7f1dbd965780  5 asok(0x28a01c0) register_command git_version hook 0x288c030
   -20> 2014-02-16 02:09:25.559382 7f1dbd965780  5 asok(0x28a01c0) register_command help hook 0x288e090
   -19> 2014-02-16 02:09:25.559386 7f1dbd965780  5 asok(0x28a01c0) register_command get_command_descriptions hook 0x288e080
   -18> 2014-02-16 02:09:25.559506 7f1dbd965780  1 filestore(/var/lib/ceph/osd/ceph-5) mount detected xfs
   -17> 2014-02-16 02:09:25.559524 7f1dbd965780  1 filestore(/var/lib/ceph/osd/ceph-5)  disabling 'filestore replica fadvise' due to known issues with fadvise(DONTNEED) on xfs
   -16> 2014-02-16 02:09:25.561505 7f1dba241700  5 asok(0x28a01c0) entry start
   -15> 2014-02-16 02:09:25.562157 7f1dbd965780  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-5) detect_features: FIEMAP ioctl is supported and appears to work
   -14> 2014-02-16 02:09:25.562169 7f1dbd965780  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-5) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
   -13> 2014-02-16 02:09:25.575152 7f1dbd965780  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-5) detect_features: syscall(SYS_syncfs, fd) fully supported
   -12> 2014-02-16 02:09:25.578424 7f1dbd965780  0 filestore(/var/lib/ceph/osd/ceph-5) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
   -11> 2014-02-16 02:09:25.578543 7f1dbd965780  2 journal open /var/lib/ceph/osd/ceph-5/journal fsid 0e461957-5502-414a-a279-1638f9648739 fs_op_seq 2
   -10> 2014-02-16 02:09:25.586098 7f1dbd965780 -1 journal _check_disk_write_cache: pclose failed: (61) No data available
    -9> 2014-02-16 02:09:25.586143 7f1dbd965780  1 journal _open /var/lib/ceph/osd/ceph-5/journal fd 21: 5367660544 bytes, block size 4096 bytes, directio = 1, aio = 1
    -8> 2014-02-16 02:09:25.594747 7f1dbd965780  2 journal read_entry 8192 : seq 2 505 bytes
    -7> 2014-02-16 02:09:25.594774 7f1dbd965780  2 journal No further valid entries found, journal is most likely valid
    -6> 2014-02-16 02:09:25.594781 7f1dbd965780  2 journal No further valid entries found, journal is most likely valid
    -5> 2014-02-16 02:09:25.594822 7f1dbd965780  3 journal journal_replay: end of journal, done.
    -4> 2014-02-16 02:09:25.594842 7f1dbd965780  2 journal FileJournal::_open: unable to open journal: open() failed: (2) No such file or directory
    -3> 2014-02-16 02:09:25.598370 7f1db6a3a700  1 FileStore::op_tp worker finish
    -2> 2014-02-16 02:09:25.598433 7f1db723b700  1 FileStore::op_tp worker finish
    -1> 2014-02-16 02:09:25.598476 7f1dbd965780  1 journal close /var/lib/ceph/osd/ceph-5/journal
     0> 2014-02-16 02:09:25.601126 7f1dbd965780 -1 os/FileJournal.cc: In function 'virtual void FileJournal::close()' thread 7f1dbd965780 time 2014-02-16 02:09:25.598585
os/FileJournal.cc: 547: FAILED assert(fd >= 0)

 ceph version 0.76-406-gcf4f702 (cf4f7027e7a05090d3d1c14f2ac4db73bf1d8fa4)
 1: (FileJournal::close()+0x185) [0xa10a65]
 2: (JournalingObjectStore::journal_stop()+0x64) [0x9ad9c4]
 3: (FileStore::umount()+0xd4) [0x970a24]
 4: (OSD::do_convertfs(ObjectStore*)+0x1c1) [0x776d21]
 5: (main()+0x2234) [0x721274]
 6: (__libc_start_main()+0xfd) [0x7f1dbb889ead]
 7: /usr/bin/ceph-osd() [0x724e29]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
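
For readers unfamiliar with this failure pattern, here is a minimal sketch of how a close path that asserts on a valid descriptor can be reached after a failed open. This is not Ceph's FileJournal code: the class SketchJournal, its methods, and the path used in the usage note are all hypothetical, and the sketch only assumes what the log above shows, namely an open() that failed with ENOENT followed by a close() that requires fd >= 0.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <string>
#include <unistd.h>
#include <utility>

// Hypothetical stand-in for a journal object; NOT Ceph's FileJournal.
class SketchJournal {
public:
  explicit SketchJournal(std::string path) : path_(std::move(path)) {}

  // Returns 0 on success or a negative errno value.
  // On failure fd_ keeps its "not open" value of -1.
  int open_journal() {
    int r = ::open(path_.c_str(), O_RDONLY);
    if (r < 0)
      return -errno;
    fd_ = r;
    return 0;
  }

  bool is_open() const { return fd_ >= 0; }

  // Strict close: same precondition as the failed assert in this report.
  // Calling it after a failed open_journal() aborts the process.
  void close_strict() {
    assert(fd_ >= 0);
    ::close(fd_);
    fd_ = -1;
  }

  // Tolerant close: does nothing when no descriptor is held.
  // Returns true only if a descriptor was actually closed.
  bool close_if_open() {
    if (fd_ < 0)
      return false;
    ::close(fd_);
    fd_ = -1;
    return true;
  }

private:
  std::string path_;
  int fd_ = -1;
};
```

With a path that does not exist (for example the made-up "/nonexistent/sketch-journal-path"), open_journal() returns -ENOENT, is_open() stays false, and close_strict() would trip the assert, while close_if_open() simply returns false. Whether the real fix belongs in the caller or in close() itself is not something this report settles.
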
Actions #1

Updated by Alfredo Deza about 10 years ago

  • Priority changed from Normal to High

Increasing the priority on this, since we have not been able to get passing ceph-deploy tests for months.

Actions #2

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from New to In Progress
  • Assignee set to Joao Eduardo Luis
Actions #3

Updated by Samuel Just about 10 years ago

  • Status changed from In Progress to Duplicate