Bug #11428

"FAILED assert(0)" in rados-giant-distro-basic-multi run

Added by Yuri Weinstein almost 9 years ago. Updated almost 9 years ago.

Status: Rejected
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is validation of the giant v0.87.2 release.
Run: http://pulpito.ceph.com/teuthology-2015-04-17_16:28:50-rados-giant-distro-basic-multi/
Job: 852039
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-04-17_16:28:50-rados-giant-distro-basic-multi/852039/

description: rados/monthrash/{ceph/ceph.yaml clusters/9-mons.yaml fs/xfs.yaml msgr-failures/mon-delay.yaml thrashers/sync.yaml workloads/snaps-few-objects.yaml}

2015-04-17 20:37:27.878772 7f450d7a3700 -1 journal FileJournal::write_bl : write_fd failed: (5) Input/output error
2015-04-17 20:37:27.878787 7f450d7a3700 -1 journal FileJournal::do_write: write_bl(pos=48775168) failed
2015-04-17 20:37:27.880180 7f450d7a3700 -1 os/FileJournal.cc: In function 'void FileJournal::do_write(ceph::bufferlist&)' thread 7f450d7a3700 time 2015-04-17 20:37:27.878801
os/FileJournal.cc: 1050: FAILED assert(0)

Assertion: os/FileJournal.cc: 1050: FAILED assert(0)
ceph version 0.87.1-108-gc1301e8 (c1301e84aee0f399db85e2d37818a66147a0ce78)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xb8061b]
 2: (FileJournal::do_write(ceph::buffer::list&)+0x91a) [0xa49b4a]
 3: (FileJournal::write_thread_entry()+0x151) [0xa4d2c1]
 4: (FileJournal::Writer::entry()+0xd) [0x91778d]
 5: (()+0x8182) [0x7f4519b57182]
 6: (clone()+0x6d) [0x7f45180c338d]

Related issues

Related to Ceph - Bug #11267: 2015-03-27T19:11:41.127 INFO:tasks.rados.rados.0.burnupi56.stderr:Error: finished tid 1 when last_acked_tid was 6 Resolved 03/29/2015
Related to Ceph - Bug #11332: mon failed to read inc osdmap Resolved 04/06/2015

History

#1 Updated by Loïc Dachary almost 9 years ago

  • Description updated (diff)

#2 Updated by Loïc Dachary almost 9 years ago

It's not http://tracker.ceph.com/issues/9443 because it's xfs and not btrfs

#3 Updated by Loïc Dachary almost 9 years ago

  • Description updated (diff)

#4 Updated by Loïc Dachary almost 9 years ago

  • Assignee set to Yuri Weinstein

The line from the log

2015-04-17 20:37:27.878772 7f450d7a3700 -1 journal FileJournal::write_bl : write_fd failed: (5) Input/output error

suggests a hardware failure, but I'm not familiar with how such failures manifest. What do you think, Yuri? I know you had to deal with that a lot recently and I'm hoping you know more. I tried to ssh to burnupi30, which had the core dump, but I get
ssh: connect to host burnupi30 port 22: No route to host
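
For reference, the "(5)" in that journal line is the errno value returned by the failed write, and errno 5 is EIO, which write() typically reports when the underlying device or filesystem hits a low-level I/O error. Below is a minimal sketch for decoding the "(N) message" suffix of such a line. The helper name parse_errno and the regular expression are hypothetical and are not part of Ceph or teuthology.

```python
import errno
import re

# Assumption: the log line ends with "(N) message", as in the
# FileJournal::write_bl line quoted above.
ERRNO_RE = re.compile(r"\((\d+)\) (.+)$")


def parse_errno(line):
    """Return (number, symbolic name, message) for a line ending in
    '(N) message', or None if the line has no such suffix."""
    m = ERRNO_RE.search(line)
    if m is None:
        return None
    num = int(m.group(1))
    # errno.errorcode maps numeric values to names such as 'EIO'.
    return num, errno.errorcode.get(num, "UNKNOWN"), m.group(2)


line = ("2015-04-17 20:37:27.878772 7f450d7a3700 -1 journal "
        "FileJournal::write_bl : write_fd failed: (5) Input/output error")
print(parse_errno(line))
```

This only decodes the error code. It cannot tell a failing disk from a controller or cabling problem, so checking the kernel log on the host is still needed once it is reachable again.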

#5 Updated by Yuri Weinstein almost 9 years ago

burnupi30 has been a pain lately, so it's a good working assumption. We will know more when I rerun the failed jobs.

#6 Updated by Yuri Weinstein almost 9 years ago

  • Status changed from New to Rejected
