Bug #1601

closed

mds crash during snaps workunit

Added by Josh Durgin over 12 years ago. Updated over 10 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%


Description

From teuthology:~teuthworker/archive/nightly_coverage_2011-10-06/163/teuthology.log:

2011-10-06T01:50:02.158 INFO:teuthology.task.ceph.mon.0.err:daemon-helper: command crashed with signal 9
2011-10-06T03:49:26.808 INFO:teuthology.task.ceph.mds.0.err:*** Caught signal (Bus error) **
2011-10-06T03:49:26.809 INFO:teuthology.task.ceph.mds.0.err: in thread 0x7f8abb311700
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: ceph version 0.36-224-g5ab7f8f (commit:5ab7f8fab37e7a61436e073fa30951bd4e23f20a)
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: 1: /tmp/cephtest/binary/usr/local/bin/ceph-mds() [0x8f1194]
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: 2: (()+0xfb40) [0x7f8ac038cb40]
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: 3: (operator delete(void*)+0x1e4) [0x7f8abf8827f4]
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: 4: (std::_Rb_tree<int, std::pair<int const, std::vector<snapid_t, std::allocator<snapid_t> > >, std::_Select1st<std::pair<int const, std::vector<snapid_t, std::allocator<snapid_t> > > >, std::less<int>, std::allocator<std::pair<int const, std::vector<snapid_t, std::allocator<snapid_t> > > > >::_M_erase(std::_Rb_tree_node<std::pair<int const, std::vector<snapid_t, std::allocator<snapid_t> > > >*)+0x4e) [0x77963e]
2011-10-06T03:49:29.387 INFO:teuthology.task.ceph.mds.0.err: 5: (MRemoveSnaps::~MRemoveSnaps()+0x32) [0x77cfc2]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 6: (SimpleMessenger::Pipe::discard_queue()+0x7ba) [0x82806a]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 7: (SimpleMessenger::Pipe::fail()+0x5d) [0x833ecd]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 8: (SimpleMessenger::Pipe::fault(bool, bool)+0x334) [0x834694]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 9: (SimpleMessenger::Pipe::reader()+0x1b10) [0x840840]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 10: (SimpleMessenger::Pipe::Reader::entry()+0x15) [0x4931e5]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 11: (Thread::_entry_func(void*)+0x12) [0x81a0b2]
2011-10-06T03:49:29.388 INFO:teuthology.task.ceph.mds.0.err: 12: (()+0x7971) [0x7f8ac0384971]
2011-10-06T03:49:29.389 INFO:teuthology.task.ceph.mds.0.err: 13: (clone()+0x6d) [0x7f8abee1892d]
Actions #1

Updated by Josh Durgin over 12 years ago

  • Category set to 1
  • Target version set to v0.38
Actions #2

Updated by Sage Weil over 12 years ago

This looks like heap corruption to me. We should run that teuthology task with valgrind on the mds.
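
For context on why the trace reads as heap corruption: frames 3 to 5 are just a message destructor freeing the nodes of a std::map<int, std::vector<snapid_t>>, which is safe when the heap is healthy. A fault inside operator delete on that path therefore suggests the allocator's bookkeeping was damaged earlier, somewhere else. The sketch below reproduces only the shape of that teardown; snapid_t, RemoveSnapsSketch, the member name snaps, and build_and_destroy are hypothetical stand-ins, and only the container type is taken from frame 4.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Stand-in for Ceph's snapid_t (assumed here to wrap a 64-bit id).
struct snapid_t {
    std::uint64_t val;
};

// Hypothetical stand-in for the message in frame 5. Only the container
// type comes from the backtrace; the member name is an assumption.
struct RemoveSnapsSketch {
    std::map<int, std::vector<snapid_t>> snaps;
};

// Fills a message with snap ids, counts them, then lets it go out of scope.
std::size_t build_and_destroy(int pools, int snaps_per_pool) {
    std::size_t total = 0;
    {
        RemoveSnapsSketch m;
        for (int p = 0; p < pools; ++p) {
            for (int s = 0; s < snaps_per_pool; ++s) {
                m.snaps[p].push_back(snapid_t{static_cast<std::uint64_t>(s + 1)});
            }
        }
        for (const auto& kv : m.snaps) {
            total += kv.second.size();
        }
    }  // Destructor runs here: the map erases its tree and frees each node.
    return total;
}
```

Nothing in this path writes to freed or out-of-bounds memory, which is why a tool such as valgrind on the mds is the natural next step for locating the original bad write.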

Actions #3

Updated by Sage Weil over 12 years ago

  • Position set to 1
  • Position changed from 1 to 948
Actions #4

Updated by Sage Weil over 12 years ago

  • Target version changed from v0.38 to v0.39
Actions #5

Updated by Sage Weil over 12 years ago

  • Target version changed from v0.39 to v0.40
Actions #6

Updated by Sage Weil over 12 years ago

  • Target version deleted (v0.40)
  • Position deleted (1072)
  • Position set to 215
Actions #7

Updated by Sage Weil over 11 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
Actions #8

Updated by Sage Weil over 10 years ago

  • Status changed from New to Can't reproduce