Bug #6982
closed
osd crashed when running mixed versions of dumpling and master
Status:
Duplicate
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
With a cluster up and running on dumpling, upgrading only the OSDs on one node to the master branch and then thrashing OSDs causes the OSD that is on master to crash with a core dump.
The config file used:
overrides:
  ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - mon.c
  - osd.3
  - osd.4
  - osd.5
  - client.0
targets:
  ubuntu@mira057.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC+IR8k7Hrnnl7zZUj9Kyb5GrlCodfuyvxkpokRLWZGLbjPOzd3gdszhLCWa0F2FVzl/2upKr9VfMzoVYF5Q3eKn7sQJ1AmDdvHINKM6hYnm2ruKxzLCjK11wdr5Gt/WFQ3g6U5YFjIX19cLVLhrPwj0aM+27cTv+6KZrl56dPwRj7vVnyB7CIUmc1NpbD/LN+Oan+DISnWNvSUrdq0e70owvuZv2uHgWOJstErLD/arxQ97A1AdxLcfi8sAA12Gu3if4t+Aq+6KmZorrxQimni06b7vWr9EC5NDuxcOm5ReGkyy45ED4QK7yVCmzmQFpRX+X1ZLbQq71zcQf7E3EW7
  ubuntu@mira074.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCuzTGrQ9CaYud6lfhAXosbRh8/P1xCeTQfxuj5QYWYJf079r2b4IPlhW+rOc2ZfK5HkOatZH0+eV6eMZREYMLZNn8n+S3jQclWpyyoI6U0B0TP65ByYRtI2f+wvab5TGBWXHasGLNQh7zzxadhLWMVQ9AT/7c5oJTEHe1+BRIfvR0dBpK/cCrOlVjwcGUYkZn6s/My216zbVVuENHXa62NJBAlmNEWJsJHRh9IEDB+Cl+PmD+qD5zAWgJr2e2OtOWh+9v8v6YWOyO3KEhg/BKKxmBevkdcKZTcybjARDjU2IMu9nyeOhH1F+8xQQJ7dDRQ5TA7DYH6lKO9iEHD8YFr
tasks:
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0:
      branch: master
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rados/test.sh
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    clients:
      client.0:
      - rados/test.sh
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    clients:
      client.0:
      - rados/test.sh

    -1> 2013-12-11 14:01:50.054564 7fc76b7a1700 5 --OSD::tracker-- reqid: unknown.0.0:0, seq: 30236, time: 2013-12-11 14:01:50.054564, event: done, request: pg_query(281.1 epoch 902) v2
     0> 2013-12-11 14:01:50.055138 7fc766797700 -1 osd/PG.cc: In function 'void PG::start_flush(ObjectStore::Transaction*, std::list<Context*>*, std::list<Context*>*)' thread 7fc766797700 time 2013-12-11 14:01:50.049548
osd/PG.cc: 4517: FAILED assert(!flushed)

 ceph version 0.67.4-36-g9875c8b (9875c8b1992c59cc0c40901a44573676cdff2669)
 1: (PG::start_flush(ObjectStore::Transaction*, std::list<Context*, std::allocator<Context*> >*, std::list<Context*, std::allocator<Context*> >*)+0x1b4) [0x6f3e64]
 2: (PG::RecoveryState::ReplicaActive::ReplicaActive(boost::statechart::state<PG::RecoveryState::ReplicaActive, PG::RecoveryState::Started, PG::RecoveryState::RepNotRecovering, (boost::statechart::history_mode)0>::my_context)+0x11c) [0x70a6dc]
 3: (boost::statechart::state<PG::RecoveryState::ReplicaActive, PG::RecoveryState::Started, PG::RecoveryState::RepNotRecovering, (boost::statechart::history_mode)0>::shallow_construct(boost::intrusive_ptr<PG::RecoveryState::Started> const&, boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>&)+0x5c) [0x743c0c]
 4: (boost::statechart::detail::safe_reaction_result boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::transit_impl<PG::RecoveryState::ReplicaActive, PG::RecoveryState::RecoveryMachine, boost::statechart::detail::no_transition_function>(boost::statechart::detail::no_transition_function const&)+0x91) [0x7458f1]
 5: (PG::RecoveryState::Stray::react(PG::MLogRec const&)+0x6df) [0x72d05f]
 6: (boost::statechart::simple_state<PG::RecoveryState::Stray, PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x166) [0x74f3f6]
 7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7387cb]
 8: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x11) [0x738ae1]
 9: (PG::handle_peering_event(std::tr1::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x313) [0x6fa143]
 10: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x2cb) [0x68fc1b]
 11: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x12) [0x6cd7a2]
 12: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x8b4f06]
 13: (ThreadPool::WorkThread::entry()+0x10) [0x8b6d10]
 14: (()+0x7e9a) [0x7fc77a09be9a]
 15: (clone()+0x6d) [0x7fc7781e7ccd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Logs are copied to ubuntu@mira057.front.sepia.ceph.com:/home/ubuntu/bug
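For triage, a backtrace like the one in this report can be reduced to the failed assertion plus the frames that are not boost::statechart plumbing. The following is only a minimal sketch in Python: the function name `summarize`, the regular expressions, and the shortened `SAMPLE` text are illustrative assumptions, not part of Ceph or teuthology, and the sketch assumes the log has one frame per line.

```python
import re

# Assumed line shapes, taken from the backtrace in this report:
#   "osd/PG.cc: 4517: FAILED assert(!flushed)"
#   " 5: (symbol+0x6df) [0x72d05f]"
ASSERT_RE = re.compile(r"^(\S+): (\d+): FAILED assert\((.*)\)$")
FRAME_RE = re.compile(r"^(\d+): \((.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")


def summarize(trace):
    """Return (assertion, frames).

    assertion is (file, line, expression) or None if no assert line is found.
    frames is a list of (index, function name) with boost:: frames skipped.
    """
    assertion = None
    frames = []
    for raw in trace.splitlines():
        line = raw.strip()
        m = ASSERT_RE.match(line)
        if m:
            assertion = (m.group(1), int(m.group(2)), m.group(3))
            continue
        m = FRAME_RE.match(line)
        if m:
            symbol = m.group(2)
            if symbol.startswith("boost::"):
                continue  # state-machine dispatch, rarely the interesting part
            # Keep only the function name, dropping the argument list.
            frames.append((int(m.group(1)), symbol.split("(")[0]))
    return assertion, frames


# Shortened excerpt of the backtrace from this report.
SAMPLE = """
osd/PG.cc: 4517: FAILED assert(!flushed)
 1: (PG::start_flush(ObjectStore::Transaction*, std::list<Context*, std::allocator<Context*> >*, std::list<Context*, std::allocator<Context*> >*)+0x1b4) [0x6f3e64]
 7: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base const&)+0x5b) [0x7387cb]
 9: (PG::handle_peering_event(std::tr1::shared_ptr<PG::CephPeeringEvt>, PG::RecoveryCtx*)+0x313) [0x6fa143]
 10: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x2cb) [0x68fc1b]
"""

if __name__ == "__main__":
    assertion, frames = summarize(SAMPLE)
    print(assertion)
    for index, name in frames:
        print(index, name)
```

Frames with no resolved symbol, such as frame 14 above, would come back with an empty name under this sketch.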
Updated by Sage Weil over 10 years ago
- Status changed from New to Duplicate