Bug #23828

closed

ec gen object leaks into different filestore collection just after split

Added by Sage Weil about 6 years ago. Updated almost 4 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) _split_collection(5721): 2.1bs4_head bits: 7
2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4943): /var/lib/ceph/osd/ceph-9/current/2.1bs4_head
2018-04-23 16:46:11.811 7f65a126d700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4947): /var/lib/ceph/osd/ceph-9/current/2.1bs4_head = 0
2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4943): /var/lib/ceph/osd/ceph-9/current/2.5bs4_head
2018-04-23 16:46:11.811 7f65a126d700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4947): /var/lib/ceph/osd/ceph-9/current/2.5bs4_head = 0

The log above shows the split of 2.1b into 2.1b and 2.5b.
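A minimal sketch of the split arithmetic, assuming the split adds exactly one hash bit (the log reports "bits: 7") and that the number after the pool id in a PG name is the hex seed. The helper name `split_child_seed` is hypothetical and is not a Ceph function.

```python
def split_child_seed(parent_seed: int, new_bits: int) -> int:
    """Seed of the new child PG once the hash mask grows to new_bits bits.

    Assumption: the split adds exactly one bit, so the child differs from
    the parent only in the newly significant (highest) bit.
    """
    return parent_seed | (1 << (new_bits - 1))


parent = 0x1B                         # PG 2.1b
child = split_child_seed(parent, 7)   # "bits: 7" from _split_collection
print(f"2.{parent:x} splits into 2.{parent:x} and 2.{child:x}")
```

Under that assumption the new child is 2.5b, which matches the second collection_stat target in the log.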
2018-04-23 16:46:12.735 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE ob 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52

In the listing above, the object is correctly seen in 2.5b.
2018-04-23 16:46:12.739 7f658ed4e700  5 filestore(/var/lib/ceph/osd/ceph-9) queue_transactions(2246): osr 0x55b291a7c780 osr(2.5bs4_head)
2018-04-23 16:46:12.739 7f658ed4e700 10 journal prepare_entry [Transaction(0x55b29055ad80)]
2018-04-23 16:46:12.739 7f658ed4e700 10 journal  len 5821 -> 8192 (head 40 pre_pad 0 bl 5821 post_pad 2291 tail 40) (bl alignment -1)
2018-04-23 16:46:12.739 7f658ed4e700 10 journal op_submit_start 12611
2018-04-23 16:46:12.739 7f658ed4e700  5 filestore(/var/lib/ceph/osd/ceph-9) queue_transactions(2288): (writeahead) 12611 [Transaction(0x55b29055ad80)]
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 #-1:c0371625:::snapmapper:0# (0x55b29177f860)
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:97# (0x55b291ff19e0)
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:ac# (0x55b291ff1920)
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:cd# (0x55b291ff1860)
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52 (0x55b291fef0a0)

The log above shows a _register_apply on the object, also in 2.5b (osr(2.5bs4_head)).
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5071): pool is 2 shard is 4 pgid 2.1bs4
2018-04-23 16:46:12.739 7f658ed4e700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5079): first checking temp pool
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5071): pool is -4 shard is 4 pgid 2.1bs4
2018-04-23 16:46:12.739 7f658ed4e700 20 _collection_list_partial start:GHMIN end:GHMAX-30 ls.size 0
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) objects: []
2018-04-23 16:46:12.739 7f658ed4e700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5087): fall through to non-temp collection, start 4#-1:00000000::::0#
2018-04-23 16:46:12.739 7f658ed4e700 20 _collection_list_partial start:4#-1:00000000::::0# end:GHMAX-30 ls.size 0
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B1000000
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B1000000 ob 4#2:d8000000::::head#
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:db#
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:106#
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE
2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE ob 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52

...and then the object appears inside 2.1b (the collection_list above is for pgid 2.1bs4).
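As a cross-check of the listings above, here is a hedged sketch that decodes the two hash forms printed in the log. It assumes the hex after the pool id in the object name (e.g. db15bbb7) is the bit-reversed 32-bit hash, that the list_by_hash_bitwise prefix (e.g. BD8ADDDE) is the same hash with its hex digits reversed, and that the PG seed is the low 7 bits of the hash. The helper names are illustrative only.

```python
def bit_reverse32(value: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    return int(f"{value:032b}"[::-1], 2)


def nibble_reverse32(value: int) -> int:
    """Reverse the order of the 8 hex digits of a 32-bit value."""
    return int(f"{value:08x}"[::-1], 16)


def pg_seed(raw_hash: int, bits: int) -> int:
    """Low `bits` bits of the hash (assumed to select the PG)."""
    return raw_hash & ((1 << bits) - 1)


# (hex printed in the object name, list_by_hash_bitwise prefix), both from the log
logged = {
    "smithi04114511-35": (0xDB15BBB7, 0xBD8ADDDE),
    "smithi04114511-3": (0xD81CF89C, 0xB183F193),
}

for name, (printed, prefix) in logged.items():
    raw = bit_reverse32(printed)
    consistent = nibble_reverse32(raw) == prefix
    print(f"{name}: hash={raw:08x} prefix_ok={consistent} pg=2.{pg_seed(raw, 7):x}")
```

If these assumptions hold, smithi04114511-35 maps to 2.5b and smithi04114511-3 maps to 2.1b, so only the first object is out of place in the 2.1bs4 listing.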

A bit later, we crash when deleting the PG due to the SnapMapper mask check:

2018-04-23 16:46:12.743 7f658ed4e700 -1 /build/ceph-13.0.2-1662-ge51e976/src/osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::__cxx11::basic_string<char>, ceph::buffer::list>*)' thread 7f658ed4e700 time 2018-04-23 16:46:12.745431
/build/ceph-13.0.2-1662-ge51e976/src/osd/SnapMapper.cc: 330: FAILED assert(check(oid))
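A rough model of why the assert would fire, assuming check(oid) amounts to comparing the low mask bits of the object's hash against the PG's seed. This is a sketch, not a copy of the SnapMapper code. The hash value 0xeddda8db is derived from the db15bbb7 shown in the object name by bit reversal, which is itself an assumption about the print format.

```python
def matches_pg(raw_hash: int, mask_bits: int, match: int) -> bool:
    """True if the low mask_bits bits of raw_hash equal those of match."""
    mask = (1 << mask_bits) - 1
    return (raw_hash & mask) == (match & mask)


obj_hash = 0xEDDDA8DB  # assumed hash of the smithi04114511-35 object

print("belongs to 2.5b:", matches_pg(obj_hash, 7, 0x5B))
print("belongs to 2.1b:", matches_pg(obj_hash, 7, 0x1B))
```

Under this model the object passes the check for 2.5b and fails it for 2.1b, so removing it while deleting 2.1b would trip assert(check(oid)).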

/a/sage-2018-04-23_15:07:57-rados-wip-sage3-testing-2018-04-23-0831-distro-basic-smithi/2430790

#1

Updated by Sage Weil about 6 years ago

  • Description updated

#2

Updated by Kefu Chai over 5 years ago

 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x5610a92ec574]
 6: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x152) [0x5610a963e9f2]
 7: (PG::_delete_some(ObjectStore::Transaction*)+0x2f4) [0x5610a94f73d4]
 8: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x38) [0x5610a94f8228]
 9: (boost::statechart::simple_state<PG::RecoveryState::Deleting, PG::RecoveryState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na
, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na
, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base
 const&, void const*)+0x16a) [0x5610a952e3fa]
 10: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost
::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x5a) [0x5610a950e1ea]
 11: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>, PG::RecoveryCtx*)+0x139) [0x5610a94f6a19]
 12: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1a4) [0x5610a943af24]
 13: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0x234) [0x5610a943b364]
 14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x641) [0x5610a942f421]
 15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x3e6) [0x5610a9a4bb26]
 16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5610a9a532d0]
 17: (()+0x7e25) [0x7f78461c3e25]
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.0-3418-gd4c8323/rpm/el7/BUILD/ceph-14.0.0-3418-gd4c8323/src/osd/SnapMapper.cc: 341: FAILED ceph_assert(check(oid))

/a/kchai-2018-09-20_12:40:38-rados-wip-kefu-testing-2018-09-19-1821-distro-basic-smithi/3046768/remote/*/log/ceph-osd.3.log.gz

#3

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New

#4

Updated by Neha Ojha almost 4 years ago

  • Status changed from New to Can't reproduce
  • Priority changed from High to Normal