Bug #6117

closed

osd: bad mutex assert in ReplicatedPG::context_registry_on_change()

Added by Sage Weil over 10 years ago. Updated over 10 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: OSD
Target version: -
% Done: 100%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

pull request

2013-08-25 12:26:35.911288 7fda96367700 -1 common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7fda96367700 time 2013-08-25 12:26:35.905967
common/Mutex.cc: 93: FAILED assert(r == 0)

 ceph version 0.67-352-g8b1b745 (8b1b74598bae0e13691e6244c647fb89cc9e21a7)
 1: (Mutex::Lock(bool)+0x1c3) [0x8979f3]
 2: (SharedPtrRegistry<hobject_t, ObjectContext>::OnRemoval::operator()(ObjectContext*)+0x21) [0x770c31]
 3: (std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2>::operator=(std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2> const&)+0x86) [0x633776]
 4: (ReplicatedPG::context_registry_on_change()+0x23a) [0x70888a]
 5: (ReplicatedPG::on_change(ObjectStore::Transaction*)+0xec) [0x7236fc]
 6: (PG::start_peering_interval(std::tr1::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, ObjectStore::Transaction*)+0x5e3) [0x69abb3]
 7: (PG::RecoveryState::Reset::react(PG::AdvMap const&)+0x313) [0x69fe23]
 8: (boost::statechart::detail::reaction_result boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list5<boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed, boost::statechart::detail::no_context<boost::statechart::event_base>, &(boost::statechart::detail::no_context<boost::statechart::event_base>::no_function(boost::statechart::event_base const&))> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> >(boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0x100) [0x6ea220]
 9: (boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x4e) [0x6ea2fe]
 10: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xfb) [0x6cd35b]
 11: (PG::RecoveryState::handle_event(boost::statechart::event_base const&, PG::RecoveryCtx*)+0x57) [0x6cd5d7]
 12: (PG::handle_advance_map(std::tr1::shared_ptr<OSDMap const>, std::tr1::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&, PG::RecoveryCtx*)+0x464) [0x6b9004]
 13: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >*)+0x206) [0x6139a6]
 14: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x23a) [0x61400a]
 15: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x6592d2]
 16: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8c5156]
 17: (ThreadPool::WorkThread::entry()+0x10) [0x8c6f60]
 18: (()+0x7e9a) [0x7fdaa9c24e9a]
 19: (clone()+0x6d) [0x7fdaa7db7ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

ubuntu@teuthology:/a/teuthology-2013-08-25_09:25:20-krbd-master-testing-basic-plana/4964$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: c2f29906882bd30794da6993e755a0dab2b7a665
machine_type: plana
nuke-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 8b1b74598bae0e13691e6244c647fb89cc9e21a7
  ceph-deploy:
    branch:
      dev: master
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: 8b1b74598bae0e13691e6244c647fb89cc9e21a7
  s3tests:
    branch: master
  workunit:
    sha1: 8b1b74598bae0e13691e6244c647fb89cc9e21a7
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds: null
- rbd:
    all:
      image_size: 20480
- workunit:
    clients:
      all:
      - suites/ffsb.sh
teuthology_branch: master

Actions #1

Updated by Sage Weil over 10 years ago

also

ubuntu@teuthology:/a/teuthology-2013-08-25_09:24:44-rbd-master-testing-basic-plana/4847
ubuntu@teuthology:/a/teuthology-2013-08-25_09:24:44-rbd-master-testing-basic-plana/4849
ubuntu@teuthology:/a/teuthology-2013-08-25_09:24:44-rbd-master-testing-basic-plana/4853

Actions #2

Updated by Sage Weil over 10 years ago

ubuntu@teuthology:/a/teuthology-2013-08-26_01:01:39-krbd-master-testing-basic-plana/5937

Actions #3

Updated by Loïc Dachary over 10 years ago

  • Assignee set to Loïc Dachary
Actions #4

Updated by Loïc Dachary over 10 years ago

Is this a possible scenario for this error:

But gregaf shows it can't happen:

<gregaf> I doubt the issue is with our handling of the pthreads atomicity stuff; it's probably a bad pointer deref or something :)
<gregaf> the only error conditions pthread_mutex_lock() can return are if you've done something *very* naughty; probably one of them is what's been violated but it doesn't involve racing with other threads or anything that I can see?
<gregaf> in particular that trylock/lock thing can't race as lock is blocking; the trylock is just to avoid doing perfcounters overhead on non-blocking mutex locks
<gregaf> loicd: that assert usually means we're looking at some deallocated memory for some reason, or another similar issue
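
As an illustration of the trylock/lock pattern gregaf describes, here is a minimal sketch. SketchMutex and its contended counter are hypothetical stand-ins, not Ceph's Mutex class; the counter only plays the role of the perf-counter bookkeeping mentioned above. The blocking pthread_mutex_lock() is what acquires the mutex on the slow path, so the trylock in front of it should not introduce a race by itself.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical wrapper, for illustration only (not Ceph's Mutex).
struct SketchMutex {
  pthread_mutex_t m;
  unsigned long contended = 0;  // how often the slow (blocking) path was taken

  SketchMutex() { pthread_mutex_init(&m, nullptr); }
  ~SketchMutex() { pthread_mutex_destroy(&m); }
  SketchMutex(const SketchMutex &) = delete;
  SketchMutex &operator=(const SketchMutex &) = delete;

  void lock() {
    // Fast path: the mutex was free, so skip the bookkeeping entirely.
    if (pthread_mutex_trylock(&m) == 0)
      return;
    // Slow path: block until the mutex is acquired. Any nonzero result
    // here means the mutex itself is being misused or is no longer valid.
    int r = pthread_mutex_lock(&m);
    assert(r == 0);
    (void)r;
    ++contended;  // updated while holding the mutex
  }

  void unlock() { pthread_mutex_unlock(&m); }
};
```

With a single thread the fast path always succeeds, so contended stays at zero.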

Actions #5

Updated by Loïc Dachary over 10 years ago

  • Description updated (diff)
Actions #6

Updated by Loïc Dachary over 10 years ago

Trying to run the config.yaml standalone to see if it reproduces the problem.

Actions #7

Updated by Loïc Dachary over 10 years ago

/a/teuthology-2013-08-25_09:24:/log/ceph-osd.3.log.gz

 ceph version 0.67-352-g8b1b745 (8b1b74598bae0e13691e6244c647fb89cc9e21a7)
 1: ceph-osd() [0x80b12a]
 2: (()+0xfcb0) [0x7fe6d730ecb0]
 3: (gsignal()+0x35) [0x7fe6d53dc425]
 4: (abort()+0x17b) [0x7fe6d53dfb8b]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7fe6d5d2e69d]
 6: (()+0xb5846) [0x7fe6d5d2c846]
 7: (()+0xb5873) [0x7fe6d5d2c873]
 8: (()+0xb596e) [0x7fe6d5d2c96e]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1df) [0x8d203f]
 10: (Mutex::Lock(bool)+0x1c3) [0x8979f3]
 11: (SharedPtrRegistry<hobject_t, ObjectContext>::OnRemoval::operator()(ObjectContext*)+0x21) [0x770c31]
 12: (std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2>::operator=(std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2> const&)+0x86) [0x633776]
 13: (ReplicatedPG::context_registry_on_change()+0x23a) [0x70888a]
 14: (ReplicatedPG::on_change(ObjectStore::Transaction*)+0xec) [0x7236fc]
 15: (PG::start_peering_interval(std::tr1::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> > const&, std::vector<int, std::allocator<int> > const&, ObjectStore::Transaction*)+0x5e3) [0x69abb3]
 16: (PG::RecoveryState::Reset::react(PG::AdvMap const&)+0x313) [0x69fe23]
 17: (boost::statechart::detail::reaction_result boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list5<boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed, boost::statechart::detail::no_context<boost::statechart::event_base>, &(boost::statechart::detail::no_context<boost::statechart::event_base>::no_function(boost::statechart::event_base const&))> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> >(boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>&, boost::statechart::event_base const&, void const*)+0x100) [0x6ea220]
 18: (boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x4e) [0x6ea2fe]
 19: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xfb) [0x6cd35b]
 20: (PG::RecoveryState::handle_event(boost::statechart::event_base const&, PG::RecoveryCtx*)+0x57) [0x6cd5d7]
 21: (PG::handle_advance_map(std::tr1::shared_ptr<OSDMap const>, std::tr1::shared_ptr<OSDMap const>, std::vector<int, std::allocator<int> >&, std::vector<int, std::allocator<int> >&, PG::RecoveryCtx*)+0x464) [0x6b9004]
 22: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >*)+0x206) [0x6139a6]
 23: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x23a) [0x61400a]
 24: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x6592d2]
 25: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8c5156]
 26: (ThreadPool::WorkThread::entry()+0x10) [0x8c6f60]
 27: (()+0x7e9a) [0x7fe6d7306e9a]
 28: (clone()+0x6d) [0x7fe6d5499ccd]

Actions #8

Updated by Loïc Dachary over 10 years ago

teuthology-2013-08-26_01:01:/log/ceph-osd.4.log.gz
teuthology-2013-08-26_01:01:/log/ceph-osd.1.log.gz
teuthology-2013-08-26_01:01:/log/ceph-osd.2.log.gz
teuthology-2013-08-26_01:01:/log/ceph-osd.4.log.gz
teuthology-2013-08-26_01:01:/log/ceph-osd.4.log.gz
teuthology-2013-08-26_01:01:/log/ceph-osd.4.log.gz

All logs show the exact same trace.

Actions #9

Updated by Loïc Dachary over 10 years ago

context_registry_on_change was:

void ReplicatedPG::context_registry_on_change()
{
  list<ObjectContext *> contexts;
  for (map<hobject_t, ObjectContext*>::iterator i = object_contexts.begin();
       i != object_contexts.end();
       ++i) {
    i->second->get();
    contexts.push_back(i->second);
    for (map<pair<uint64_t, entity_name_t>, WatchRef>::iterator j =
           i->second->watchers.begin();
         j != i->second->watchers.end();
         i->second->watchers.erase(j++)) {
      j->second->discard();
    }
  }
  for (list<ObjectContext *>::iterator i = contexts.begin();
       i != contexts.end();
       contexts.erase(i++)) {
    put_object_context(*i);
  }
}

and was changed to:

void ReplicatedPG::context_registry_on_change()
{
  pair<hobject_t, ObjectContextRef> i;
  while (object_contexts.get_next(i.first, &i)) {
    ObjectContextRef obc(i.second);
    if (obc) {
      for (map<pair<uint64_t, entity_name_t>, WatchRef>::iterator j =
             obc->watchers.begin();
           j != obc->watchers.end();
           obc->watchers.erase(j++)) {
        j->second->discard();
      }
    }
  }
}

Actions #10

Updated by Loïc Dachary over 10 years ago

r = 35
#define    EDEADLK        35    /* Resource deadlock would occur */
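
For reference, EDEADLK is what pthread_mutex_lock() reports when a thread tries to relock an error-checking mutex it already owns. The sketch below assumes a mutex of type PTHREAD_MUTEX_ERRORCHECK; whether the mutex in the failing trace was created that way is an assumption here, and relock_result is a hypothetical helper name.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Locks an error-checking mutex twice from the same thread and returns
// the result of the second pthread_mutex_lock() call.
int relock_result() {
  pthread_mutexattr_t attr;
  pthread_mutexattr_init(&attr);
  pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);

  pthread_mutex_t m;
  pthread_mutex_init(&m, &attr);
  pthread_mutexattr_destroy(&attr);

  int first = pthread_mutex_lock(&m);   // mutex is free: returns 0
  assert(first == 0);
  (void)first;
  int second = pthread_mutex_lock(&m);  // same thread already owns it

  pthread_mutex_unlock(&m);             // release the one successful lock
  pthread_mutex_destroy(&m);
  return second;
}
```

The second call fails without blocking, which would make an assert(r == 0) after the lock call fire in the owning thread rather than hang.
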
Actions #11

Updated by Loïc Dachary over 10 years ago

The OSD crashes with the provided yaml file, and running sudo gdb /usr/bin/ceph-osd cephtest/lo1308271455/archive/coredump/1377608368.3942.core on the resulting core dump provides a more detailed stack trace.

#0  0x00007f4893097b7b in raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x000000000080b27e in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:105
#3  <signal handler called>
#4  0x00007f4891165425 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5  0x00007f4891168b8b in __GI_abort () at abort.c:91
#6  0x00007f4891ab769d in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007f4891ab5846 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007f4891ab5873 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007f4891ab596e in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00000000008d203f in ceph::__ceph_assert_fail (assertion=0xa47584 "r == 0", file=<optimized out>, line=93, func=0xa5a410 "void Mutex::Lock(bool)") at common/assert.cc:77
#11 0x00000000008979f3 in Mutex::Lock (this=0x27af730, no_lockdep=<optimized out>) at common/Mutex.cc:93
#12 0x0000000000770c31 in Locker (m=..., this=<synthetic pointer>) at ./common/Mutex.h:120
#13 SharedPtrRegistry<hobject_t, ObjectContext>::OnRemoval::operator() (this=0x2946748, to_remove=0x2858780) at ./common/sharedptr_registry.hpp:46
#14 0x0000000000633776 in _M_release (this=0x2946730) at /usr/include/c++/4.6/tr1/shared_ptr.h:147
#15 std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2>::operator= (this=<optimized out>, __r=...) at /usr/include/c++/4.6/tr1/shared_ptr.h:367
#16 0x000000000070888a in operator= (this=0x7f487f7d0b10) at /usr/include/c++/4.6/tr1/shared_ptr.h:548
#17 operator= (this=0x7f487f7d0b10) at /usr/include/c++/4.6/tr1/shared_ptr.h:992
#18 operator= (this=0x7f487f7d0ae0) at /usr/include/c++/4.6/bits/stl_pair.h:87
#19 get_next (next=0x7f487f7d0ae0, key=..., this=0x27af728) at ./common/sharedptr_registry.hpp:76
#20 ReplicatedPG::context_registry_on_change (this=0x27ae000) at osd/ReplicatedPG.cc:4555
#21 0x00000000007236fc in ReplicatedPG::on_change (this=0x27ae000, t=0x4360600) at osd/ReplicatedPG.cc:6695
#22 0x000000000069abb3 in PG::start_peering_interval (this=0x27ae000, lastmap=..., newup=..., newacting=..., t=0x4360600) at osd/PG.cc:4612
#23 0x000000000069fe23 in PG::RecoveryState::Reset::react (this=<optimized out>, advmap=...) at osd/PG.cc:5256
#24 0x00000000006ea220 in react<PG::RecoveryState::Reset, boost::statechart::event_base, void const*> (evt=..., stt=..., eventType=<optimized out>)
    at /usr/include/boost/statechart/custom_reaction.hpp:42
#25 boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list5<boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed, boost::statechart::detail::no_context<boost::statechart::event_base>, &boost::statechart::detail::no_context<boost::statechart::event_base>::no_function> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> > (stt=..., evt=..., 
    eventType=0xd70c40) at /usr/include/boost/statechart/simple_state.hpp:816
#26 0x00000000006ea2fe in local_react<boost::mpl::list5<boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed> > > (
    eventType=0xd70c40, evt=..., this=0x2786dc0) at /usr/include/boost/statechart/simple_state.hpp:851
#27 local_react_impl<boost::mpl::list<boost::statechart::custom_reaction<PG::QueryState>, boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed> >, boost::statechart::simple_state<PG::RecoveryState::Reset, PG::RecoveryState::RecoveryMachine, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0> > (stt=..., eventType=0xd70c40, evt=...) at /usr/include/boost/statechart/simple_state.hpp:820
#28 local_react<boost::mpl::list<boost::statechart::custom_reaction<PG::QueryState>, boost::statechart::custom_reaction<PG::AdvMap>, boost::statechart::custom_reaction<PG::ActMap>, boost::statechart::custom_reaction<PG::NullEvt>, boost::statechart::custom_reaction<PG::FlushedEvt>, boost::statechart::transition<boost::statechart::event_base, PG::RecoveryState::Crashed> > > (eventType=0xd70c40, evt=..., this=0x2786dc0) at /usr/include/boost/statechart/simple_state.hpp:851

Actions #12

Updated by Loïc Dachary over 10 years ago

The following:

diff --git a/src/test/common/test_sharedptr_registry.cc b/src/test/common/test_sharedptr_registry.cc
index 233412a..a73a32b 100644
--- a/src/test/common/test_sharedptr_registry.cc
+++ b/src/test/common/test_sharedptr_registry.cc
@@ -238,6 +238,20 @@ TEST_F(SharedPtrRegistry_all, get_next) {

     EXPECT_FALSE(registry.get_next(i.first, &i));
   }
+  {
+    SharedPtrRegistryTest registry;
+    const unsigned int key1 = 111;
+    shared_ptr<int> *ptr1 = new shared_ptr<int>(registry.lookup_or_create(key1));
+    const unsigned int key2 = 222;
+    shared_ptr<int> ptr2 = registry.lookup_or_create(key2);
+    
+    pair<unsigned int, shared_ptr<int> > i;
+    EXPECT_TRUE(registry.get_next(i.first, &i));
+    EXPECT_EQ(key1, i.first);
+    delete ptr1;
+    EXPECT_TRUE(registry.get_next(i.first, &i));    
+    EXPECT_EQ(key2, i.first);
+  }
 }

 class SharedPtrRegistry_destructor : public ::testing::Test {

triggers the error
common/Mutex.cc: In function 'void Mutex::Lock(bool)' thread 7fa02940d780 time 2013-08-27 15:53:35.985533
common/Mutex.cc: 93: FAILED assert(r == 0)
 ceph version 0.67-370-gaf5281e (af5281e0f672554a322fef826d2229f563ae8577)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x836e49]
 2: (Mutex::Lock(bool)+0x14e) [0x83679e]
 3: (Mutex::Locker::Locker(Mutex&)+0x2f) [0x82e78f]
 4: (SharedPtrRegistry<unsigned int, int>::OnRemoval::operator()(int*)+0x2b) [0x834011]
 5: (std::tr1::_Sp_counted_base_impl<int*, SharedPtrRegistry<unsigned int, int>::OnRemoval, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x27) [0x836009]
 6: (std::tr1::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x42) [0x830dca]
 7: (std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2>::operator=(std::tr1::__shared_count<(__gnu_cxx::_Lock_policy)2> const&)+0x56) [0x82f98c]
 8: (std::tr1::__shared_ptr<int, (__gnu_cxx::_Lock_policy)2>::operator=(std::tr1::__shared_ptr<int, (__gnu_cxx::_Lock_policy)2> const&)+0x39) [0x82ede9]
 9: (std::tr1::shared_ptr<int>::operator=(std::tr1::shared_ptr<int> const&)+0x23) [0x82ee13]
 10: (std::pair<unsigned int, std::tr1::shared_ptr<int> >::operator=(std::pair<unsigned int, std::tr1::shared_ptr<int> > const&)+0x37) [0x8304ed]
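
One reading of the traces above is that get_next() assigns to *next while the registry lock is held. The assignment drops the last reference to the previous entry, and its OnRemoval deleter then tries to take the same lock. The sketch below reproduces that pattern in miniature. MiniRegistry, its members, the assign_under_lock flag and deleter_error_for are all hypothetical; it uses std::shared_ptr and an error-checking pthread mutex (an assumption) rather than Ceph's types, and it records the lock error instead of asserting. Deferring the assignment until after the unlock is shown as one possible way to avoid the relock, not necessarily the change that was merged.

```cpp
#include <cassert>
#include <cerrno>
#include <map>
#include <memory>
#include <pthread.h>
#include <utility>

// Hypothetical miniature registry; names are illustrative, not Ceph's API.
class MiniRegistry {
  pthread_mutex_t lock_;
  std::map<unsigned, std::weak_ptr<int> > contents_;

  // Deleter attached to every value: unregisters the key under the lock.
  struct OnRemoval {
    MiniRegistry *parent;
    unsigned key;
    void operator()(int *to_remove) const {
      int r = pthread_mutex_lock(&parent->lock_);
      if (r != 0) {
        parent->deleter_lock_error = r;  // the real code asserts r == 0 here
      } else {
        parent->contents_.erase(key);
        pthread_mutex_unlock(&parent->lock_);
      }
      delete to_remove;
    }
  };

public:
  typedef std::pair<unsigned, std::shared_ptr<int> > Entry;
  int deleter_lock_error = 0;  // nonzero lock result seen by a deleter, if any

  MiniRegistry() {
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&lock_, &attr);
    pthread_mutexattr_destroy(&attr);
  }
  ~MiniRegistry() { pthread_mutex_destroy(&lock_); }
  MiniRegistry(const MiniRegistry &) = delete;
  MiniRegistry &operator=(const MiniRegistry &) = delete;

  std::shared_ptr<int> lookup_or_create(unsigned key) {
    pthread_mutex_lock(&lock_);
    std::shared_ptr<int> p = contents_[key].lock();
    if (!p) {
      p.reset(new int(0), OnRemoval{this, key});
      contents_[key] = p;
    }
    pthread_mutex_unlock(&lock_);
    return p;
  }

  // Finds the first live entry with a key greater than `key`.
  bool get_next(unsigned key, Entry *next, bool assign_under_lock) {
    std::shared_ptr<int> found;
    unsigned found_key = 0;
    pthread_mutex_lock(&lock_);
    for (auto it = contents_.upper_bound(key); it != contents_.end(); ++it) {
      found = it->second.lock();
      if (found) { found_key = it->first; break; }
    }
    // Problematic variant: overwriting *next may release the last reference
    // to the previous entry, running OnRemoval while lock_ is still held.
    if (found && next && assign_under_lock)
      *next = Entry(found_key, found);
    pthread_mutex_unlock(&lock_);
    // Alternative: overwrite *next only once the lock has been released.
    if (found && next && !assign_under_lock)
      *next = Entry(found_key, found);
    return static_cast<bool>(found);
  }
};

// Mirrors the test case above; returns the lock error seen by a deleter.
int deleter_error_for(bool assign_under_lock) {
  MiniRegistry registry;
  auto *ptr1 = new std::shared_ptr<int>(registry.lookup_or_create(111));
  std::shared_ptr<int> ptr2 = registry.lookup_or_create(222);

  MiniRegistry::Entry i;                              // i.first starts at 0
  registry.get_next(i.first, &i, assign_under_lock);  // i now holds key 111
  delete ptr1;                                        // i.second is the last reference
  registry.get_next(i.first, &i, assign_under_lock);  // advances to 222, drops 111
  return registry.deleter_lock_error;
}
```

Calling deleter_error_for(true) exercises the assign-under-lock path and deleter_error_for(false) the deferred assignment; only the former should report a lock error from the deleter.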

Actions #13

Updated by Loïc Dachary over 10 years ago

  • Description updated (diff)
  • Status changed from New to Fix Under Review
Actions #14

Updated by Loïc Dachary over 10 years ago

Running the following against the wip-6117 branch

interactive-on-error: true
os_type: ubuntu
overrides:
  admin_socket:
    branch: master
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: ea2fc85e091683ced062594ad25fa569e5c1bbd7
  ceph-deploy:
    branch:
      dev: master
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
  install:
    ceph:
      sha1: ea2fc85e091683ced062594ad25fa569e5c1bbd7
  s3tests:
    branch: master
  workunit:
    sha1: ea2fc85e091683ced062594ad25fa569e5c1bbd7
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock.check: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds: null
- rbd:
    all:
      image_size: 20480
- workunit:
    clients:
      all:
      - suites/ffsb.sh
teuthology_branch: master

passes
...
INFO:teuthology.run:Summary data:
{duration: 6715.834357976913, flavor: basic, owner: loic@dachary.org, success: true}
INFO:teuthology.run:pass

Actions #15

Updated by Loïc Dachary over 10 years ago

  • Due date set to 08/27/2013
  • % Done changed from 0 to 90
Actions #16

Updated by Greg Farnum over 10 years ago

  • Status changed from Fix Under Review to Resolved
  • % Done changed from 90 to 100

Sam reviewed and merged this into master, ea2fc85e091683ced062594ad25fa569e5c1bbd7
