Bug #8091

osd/SnapMapper.cc: 217: FAILED assert(r == -2)

Added by Sage Weil almost 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ubuntu@teuthology:/var/lib/teuthworker/archive/sage-2014-04-13_09:43:35-rados:thrash-testing-testing-basic-plana/189285

     0> 2014-04-13 18:36:45.919426 7f3577db5700 -1 osd/SnapMapper.cc: In function 'void SnapMapper::add_oid(const hobject_t&, std::set<snapid_t>, MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)' thread 7f3577db5700 time 2014-04-13 18:36:45.913029
osd/SnapMapper.cc: 217: FAILED assert(r == -2)

 ceph version 0.79-175-ga469374 (a469374a19eecf0142d0fc134f4158a15f70f16a)
 1: (SnapMapper::add_oid(hobject_t const&, std::set<snapid_t, std::less<snapid_t>, std::allocator<snapid_t> >, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x51e) [0x6b392e]
 2: (PG::update_snap_map(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, ObjectStore::Transaction&)+0x37b) [0x7353db]
 3: (PG::append_log(std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, eversion_t, ObjectStore::Transaction&, bool)+0x395) [0x73b765]
 4: (ECBackend::handle_sub_write(pg_shard_t, std::tr1::shared_ptr<OpRequest>, ECSubWrite&, Context*)+0x3f3) [0x900993]
 5: (ECBackend::start_write(ECBackend::Op*)+0xcc0) [0x907850]
 6: (ECBackend::submit_transaction(hobject_t const&, eversion_t const&, PGBackend::PGTransaction*, eversion_t const&, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&, Context*, Context*, Context*, unsigned long, osd_reqid_t, std::tr1::shared_ptr<OpRequest>)+0xc67) [0x9087c7]
 7: (ReplicatedPG::issue_repop(ReplicatedPG::RepGather*, utime_t)+0x49e) [0x7bec2e]
 8: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x1439) [0x802ab9]
 9: (CopyFromCallback::finish(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x38) [0x842f58]
 10: (GenContext<boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type> >::complete(boost::tuples::tuple<int, ReplicatedPG::CopyResults*, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>)+0x9) [0x817e49]
 11: (ReplicatedPG::process_copy_chunk(hobject_t, unsigned long, int)+0x4dd) [0x7ee8fd]
 12: (C_Copyfrom::finish(int)+0x84) [0x842c44]
 13: (Context::complete(int)+0x9) [0x65a7a9]
 14: (Finisher::finisher_thread_entry()+0x1b8) [0x98bc08]
 15: (()+0x8182) [0x7f35909d4182]
 16: (clone()+0x6d) [0x7f358ed7512d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


Related issues

Related to Ceph - Bug #7892: osd/ReplicatedPG.cc: 7881: FAILED assert((data_included.empty() && data.length() == 0) || (!data_included.empty() && data.length() > 0)) (Duplicate, 03/28/2014)
Related to Ceph - Bug #11565: "FAILED assert(r == -2)" in rados-firefly-distro-basic-magna run (Can't reproduce, 05/07/2015)
Duplicated by Ceph - Bug #8044: osd/ReplicatedPG.cc: 2276: FAILED assert(p != snapset.clones.end()) (Duplicate, 04/08/2014)
Duplicated by Ceph - Bug #8099: LibRBD.DiffIterateStress failure - extra extent in diff (Duplicate, 04/14/2014)

Associated revisions

Revision 7e697b1b (diff)
Added by Samuel Just almost 10 years ago

ReplicatedPG::recover_replicas: do not recover clones while snap obj is missing

Otherwise, we cannot safely read the snapset for the clone.

Fixes: #8091
Signed-off-by: Samuel Just <>

History

#1 Updated by Samuel Just almost 10 years ago

  • Status changed from 12 to 7
  • Assignee set to Samuel Just

recover_replicas can cause us to read the snapset from an obsolete snapdir or head object, so it should not try to recover a clone until the corresponding head/snapdir is no longer missing.
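
The ordering rule described above can be sketched roughly as follows. This is a simplified illustration, not the code from revision 7e697b1b: the `Obj` and `Kind` types and the `must_defer` and `recoverable_now` helpers are hypothetical stand-ins, and the real OSD works with `hobject_t` and per-peer missing sets rather than a single `std::set`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical, simplified model of an object; not the real hobject_t.
enum class Kind { Head, SnapDir, Clone };

struct Obj {
  std::string name;  // name shared by the head, the snapdir and all clones
  Kind kind;
  unsigned snap;     // clone snap id; 0 for head and snapdir
  bool operator<(const Obj& o) const {
    return std::tie(name, kind, snap) < std::tie(o.name, o.kind, o.snap);
  }
};

// A clone must wait while the object that carries its snapset (the head
// or the snapdir) is itself still missing, because the snapset read from
// a stale copy cannot be trusted.
bool must_defer(const Obj& o, const std::set<Obj>& missing) {
  if (o.kind != Kind::Clone)
    return false;  // head and snapdir objects are never deferred here
  return missing.count(Obj{o.name, Kind::Head, 0u}) > 0 ||
         missing.count(Obj{o.name, Kind::SnapDir, 0u}) > 0;
}

// Returns the objects that may be recovered in this pass; deferred clones
// are simply skipped and picked up once their head/snapdir is recovered.
std::vector<Obj> recoverable_now(const std::set<Obj>& missing) {
  std::vector<Obj> out;
  for (const Obj& o : missing) {
    if (!must_defer(o, missing))
      out.push_back(o);
  }
  return out;
}
```

In this sketch, a missing set that contains both the head of an object and one of its clones yields only the head in the first pass; the clone becomes eligible after the head has been removed from the missing set.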

#2 Updated by Sage Weil almost 10 years ago

  • Status changed from 7 to Resolved
