Bug #13395

closed

mixed infernalis+hammer crashes on temp object cleanup

Added by Sage Weil over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

 -2234> 2015-10-06 17:51:31.975587 7fd6f693c700 20 snap_mapper.remove_oid -1/00000000/temp_4.18s2_2_4115_1/head
     0> 2015-10-06 17:51:39.481109 7fd6f693c700 -1 osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::basic_string<char>, ceph::buffer::list>*)' thread 7fd6f693c700 time 2015-10-06 17:51:34.106607
osd/SnapMapper.cc: 282: FAILED assert(check(oid))

 ceph version 9.0.3-2037-gfb50ff6 (fb50ff6250d10c300bfa135739d4fda4ac55d7c1)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fd71a25fc6b]
 2: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x1ed) [0x7fd719d44ecd]
 3: (remove_dir(CephContext*, ObjectStore*, SnapMapper*, OSDriver*, ObjectStore::Sequencer*, coll_t, std::shared_ptr<DeletingState>, bool*, ThreadPool::TPHandle&)+0x454) [0x7fd719ca8dc4]
 4: (OSD::RemoveWQ::_process(std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, ThreadPool::TPHandle&)+0x1f4) [0x7fd719ca9834]
 5: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> >, std::pair<boost::intrusive_ptr<PG>, std::shared_ptr<DeletingState> > >::_void_process(void*, ThreadPool::TPHandle&)+0x10a) [0x7fd719cfceba]
 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56) [0x7fd71a2516e6]
 7: (ThreadPool::WorkThread::entry()+0x10) [0x7fd71a2525b0]
 8: (()+0x8182) [0x7fd718616182]
 9: (clone()+0x6d) [0x7fd71695d47d]

The primary is osd.8, which is still running hammer and picked a sloppy temp object name. Later, when we clean up this PG, we have problems deleting the object.
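
One possible reading of the failed check(oid), sketched in Python rather than taken from the Ceph source: the temp object is logged as pool -1 with hash 00000000, while the PG is 4.18 (pool 4, seed 0x18). If the check compares the low-order bits of the object hash against the PG seed, this object cannot match. The helper names, the comparison itself, and the bit count of 5 are assumptions made for illustration; the report does not state the PG's actual mask width.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectId:
    pool: int
    hash: int
    name: str
    snap: str


def parse_object_id(text: str) -> ObjectId:
    """Parse the 'pool/hash/name/snap' form that appears in the log lines."""
    pool, hash_hex, name, snap = text.split("/")
    return ObjectId(int(pool), int(hash_hex, 16), name, snap)


def hash_matches_pg(obj_hash: int, pg_seed: int, bits: int) -> bool:
    """Compare only the low `bits` bits of the object hash and the PG seed.

    Hypothetical stand-in for the check; not the real Ceph function.
    """
    mask = (1 << bits) - 1
    return (obj_hash & mask) == (pg_seed & mask)


oid = parse_object_id("-1/00000000/temp_4.18s2_2_4115_1/head")
PG_POOL = 4        # from pg 4.18
PG_SEED = 0x18     # from pg 4.18
ASSUMED_BITS = 5   # assumption: the real mask width is not given in the report

print("pool matches:", oid.pool == PG_POOL)
print("hash matches:", hash_matches_pg(oid.hash, PG_SEED, ASSUMED_BITS))
```

Under these assumptions neither the pool nor the hash lines up with the PG, which would be consistent with the assert firing on this object.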

here:

2015-10-06 17:47:55.316657 7fd6fc948700 10 osd.3 pg_epoch: 646 pg[4.18s0( v 640'424 (0'0,640'424] local-les=629 n=12 ec=34 les/c/f 629/616/0 646/646/646) [3,1,5,9] r=0 lpr=646 pi=571-645/5 crt=620'417 inactive NIBBLEWISE] on_change_cleanup: Removing oid -1/00000000/temp_4.18s2_2_4115_1/head from the temp collection
...
2015-10-06 17:47:55.337484 7fd70a0c7700 15 filestore(/var/lib/ceph/osd/ceph-3) remove 4.18s0_head/0:-1/00000000/temp_4.18s2_2_4115_1/head
2015-10-06 17:47:55.337558 7fd70a0c7700 10 filestore(/var/lib/ceph/osd/ceph-3) remove 4.18s0_head/0:-1/00000000/temp_4.18s2_2_4115_1/head = -2
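
The trailing "= -2" on the remove line reads like a negated errno, assuming the usual convention of returning -errno on failure. A minimal sketch for decoding it (the variable names are illustrative only):

```python
import errno
import os

ret = -2                      # value logged by the filestore remove call
code = -ret                   # assumption: return values are negated errno codes

attempted = "4.18s0_head"     # collection the remove was issued against (from the log)
on_disk = "4.18s0_TEMP"       # directory where the file is later found (from the ls output)

print(errno.errorcode[code], "-", os.strerror(code))
print("same collection:", attempted == on_disk)
```

On Linux, errno 2 is ENOENT, which fits a remove issued against the _head collection while the file sits in the _TEMP directory.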

And yet here it is, still on disk:
root@vpm057:/var/lib/ceph/osd/ceph-3/current/4.18s0_TEMP# ls -al
total 2748
drwxr-xr-x   2 root root    4096 Oct  6 17:48 .
drwxr-xr-x 114 root root    4096 Oct  6 17:51 ..
-rw-r--r--   1 root root 2797408 Oct  6 17:47 temp\u4.18s2\u2\u4115\u1__head_00000000__none_ffffffffffffffff_0
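
The on-disk filename can be mapped back to the object named in the logs. The sketch below assumes the fields are joined with "_" in the order name, key, snap, hash, namespace, pool, generation, shard, and that "\u" is an escaped underscore inside a field; that layout is inferred from this one filename and the log lines, and the helper names are hypothetical.

```python
FIELDS = ["name", "key", "snap", "hash", "namespace", "pool", "generation", "shard"]


def unescape(field: str) -> str:
    """Undo the '\\u' -> '_' escaping observed in the filename (assumed rule)."""
    return field.replace("\\u", "_")


def parse_filename(fname: str) -> dict:
    """Split an on-disk object filename into its assumed fields."""
    parts = fname.split("_")  # literal underscores in fields appear escaped
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return {key: unescape(value) for key, value in zip(FIELDS, parts)}


fname = r"temp\u4.18s2\u2\u4115\u1__head_00000000__none_ffffffffffffffff_0"
obj = parse_filename(fname)
for key in FIELDS:
    print(f"{key:>10}: {obj[key]!r}")
```

Read this way, the file is the same temp_4.18s2_2_4115_1 head object with hash 00000000 that on_change_cleanup tried to remove, with "none" in the pool position where the log shows -1.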

#1

Updated by Sage Weil over 8 years ago

  • Status changed from New to Fix Under Review

#2

Updated by Sage Weil over 8 years ago

  • Status changed from Fix Under Review to Resolved