Backport #14236

closed

"OSDMonitor.cc: 2116: FAILED assert(0)" in rados-hammer-distro-basic-openstack

Added by Yuri Weinstein over 8 years ago. Updated about 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Joao Eduardo Luis
Target version:
v0.94.6
Release:
hammer

Actions #1

Updated by Samuel Just over 8 years ago

  • Assignee set to Samuel Just
  • Priority changed from Normal to Urgent
Actions #2

Updated by Samuel Just over 8 years ago

  • Assignee changed from Samuel Just to Joao Eduardo Luis
Actions #3

Updated by Joao Eduardo Luis over 8 years ago

  • Category set to Monitor
  • Status changed from New to Fix Under Review
  • Target version set to v0.94.6
  • Affected Versions v0.94.6 added

Got a candidate fix in https://github.com/ceph/ceph/pull/7150

It needs review and testing.

Actions #4

Updated by Joao Eduardo Luis over 8 years ago

  • Regression changed from No to Yes
Actions #5

Updated by Loïc Dachary over 8 years ago

  • Tracker changed from Bug to Backport
  • Description updated (diff)
  • Target version deleted (v0.94.6)

original description

Run: http://pulpito.ovh.sepia.ceph.com:8081/teuthology-2016-01-04_21:00:02-rados-hammer-distro-basic-openstack/
Job: 59165
Logs: http://teuthology.ovh.sepia.ceph.com/teuthology/teuthology-2016-01-04_21:00:02-rados-hammer-distro-basic-openstack/59165/teuthology.log

2016-01-05T00:44:04.497 INFO:tasks.ceph.osd.5.target084154.stderr:2016-01-05 00:44:04.397876 7f2fdb652700 -1 osd.5 336 heartbeat_check: no reply from osd.1 since back 2016-01-05 00:43:39.522911 front 2016-01-05 00:43:39.522911 (cutoff 2016-01-05 00:43:44.397873)
2016-01-05T00:44:04.583 INFO:tasks.ceph.mon.b.target084154.stderr:mon/OSDMonitor.cc: In function 'MOSDMap* OSDMonitor::build_incremental(epoch_t, epoch_t)' thread 7f76ae096700 time 2016-01-05 00:44:04.479479
2016-01-05T00:44:04.583 INFO:tasks.ceph.mon.b.target084154.stderr:mon/OSDMonitor.cc: 2116: FAILED assert(0)
2016-01-05T00:44:04.592 INFO:tasks.ceph.mon.b.target084154.stderr: ceph version 0.94.5-178-g9739d4d (9739d4de49f8167866eda556b2f1581c068ec8a7)
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7dffeb]
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 2: (OSDMonitor::build_incremental(unsigned int, unsigned int)+0x97e) [0x61cc4e]
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 3: (OSDMonitor::send_incremental(PaxosServiceMessage*, unsigned int)+0x54c) [0x61d74c]
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 4: (OSDMonitor::send_latest(PaxosServiceMessage*, unsigned int)+0x81) [0x61e951]
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 5: (OSDMonitor::process_failures()+0x1ea) [0x61ecaa]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 6: (OSDMonitor::update_from_paxos(bool*)+0x12c4) [0x623c74]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 7: (PaxosService::refresh(bool*)+0x19a) [0x60476a]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 8: (Monitor::refresh_from_paxos(bool*)+0x1db) [0x5b079b]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 9: (Paxos::do_refresh()+0x2e) [0x5eeece]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 10: (Paxos::commit_finish()+0x569) [0x5fc359]
2016-01-05T00:44:04.594 INFO:tasks.ceph.mon.b.target084154.stderr: 11: (C_Committed::finish(int)+0x2b) [0x6007cb]
2016-01-05T00:44:04.595 INFO:tasks.ceph.mon.b.target084154.stderr: 12: (Context::complete(int)+0x9) [0x5d51b9]
2016-01-05T00:44:04.595 INFO:tasks.ceph.mon.b.target084154.stderr: 13: (MonitorDBStore::C_DoTransaction::finish(int)+0x8c) [0x5ff8fc]
2016-01-05T00:44:04.595 INFO:tasks.ceph.mon.b.target084154.stderr: 14: (Context::complete(int)+0x9) [0x5d51b9]
2016-01-05T00:44:04.595 INFO:tasks.ceph.mon.b.target084154.stderr: 15: (Finisher::finisher_thread_entry()+0x158) [0x7172a8]
2016-01-05T00:44:04.596 INFO:tasks.ceph.mon.b.target084154.stderr: 16: (()+0x8182) [0x7f76b3194182]
2016-01-05T00:44:04.596 INFO:tasks.ceph.mon.b.target084154.stderr: 17: (clone()+0x6d) [0x7f76b16ff47d]
2016-01-05T00:44:04.596 INFO:tasks.ceph.mon.b.target084154.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
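For triage, the backtrace above can be reduced to a plain list of frames. The following is only a minimal sketch, assuming the teuthology log line layout shown in this excerpt; `FRAME_RE`, `extract_frames` and `SAMPLE` are hypothetical names for illustration and are not part of teuthology or Ceph.

```python
import re

# Matches backtrace frames as they appear in the excerpt above, e.g.
#   "...stderr: 2: (OSDMonitor::build_incremental(unsigned int, unsigned int)+0x97e) [0x61cc4e]"
# Assumption: every frame line ends with "+0x<offset>) [0x<address>]".
FRAME_RE = re.compile(r"stderr: (\d+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def extract_frames(log_text):
    """Return (frame_number, symbol) pairs for each backtrace frame in log_text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames


# Three lines copied from the log excerpt in this ticket.
SAMPLE = """\
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 2: (OSDMonitor::build_incremental(unsigned int, unsigned int)+0x97e) [0x61cc4e]
2016-01-05T00:44:04.593 INFO:tasks.ceph.mon.b.target084154.stderr: 3: (OSDMonitor::send_incremental(PaxosServiceMessage*, unsigned int)+0x54c) [0x61d74c]
2016-01-05T00:44:04.596 INFO:tasks.ceph.mon.b.target084154.stderr: 16: (()+0x8182) [0x7f76b3194182]
"""

if __name__ == "__main__":
    for number, symbol in extract_frames(SAMPLE):
        print(number, symbol)
```

Frames without a resolved symbol (such as frame 16) come back with the symbol `()`, so they are easy to filter out when comparing traces between runs.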
Actions #6

Updated by Loïc Dachary over 8 years ago

Note: the rationale for this backport is to fix a regression introduced by a previous backport (details can be found in the commit message).

Actions #7

Updated by Yuri Weinstein over 8 years ago

  • Related to Bug #14306: "RadosModel.h: 854: FAILED assert(0)" in rados-hammer-distro-basic-openstack added
Actions #8

Updated by Samuel Just over 8 years ago

  • Related to deleted (Bug #14306: "RadosModel.h: 854: FAILED assert(0)" in rados-hammer-distro-basic-openstack)
Actions #9

Updated by Nathan Cutler over 8 years ago

So it's not a conventional backport, but rather a hammer-specific fix.

Actions #10

Updated by Loïc Dachary about 8 years ago

  • Status changed from Fix Under Review to In Progress
Actions #11

Updated by Loïc Dachary about 8 years ago

  • Status changed from In Progress to Resolved
  • Target version set to v0.94.6