Bug #23828
ec gen object leaks into different filestore collection just after split
Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) _split_collection(5721): 2.1bs4_head bits: 7
2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4943): /var/lib/ceph/osd/ceph-9/current/2.1bs4_head
2018-04-23 16:46:11.811 7f65a126d700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4947): /var/lib/ceph/osd/ceph-9/current/2.1bs4_head = 0
2018-04-23 16:46:11.811 7f65a126d700 15 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4943): /var/lib/ceph/osd/ceph-9/current/2.5bs4_head
2018-04-23 16:46:11.811 7f65a126d700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_stat(4947): /var/lib/ceph/osd/ceph-9/current/2.5bs4_head = 0
split of 2.1b into 2.1b and 2.5b
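For readers following the PG names, here is a minimal sketch of how a split fans a parent out into children. It assumes pg_num is a power of two and that the PG previously used 6 significant hash bits; the log only states the new value, "bits: 7", so the old value is inferred from 0x5b = 0x1b | 0x40. The helper name `split_children` is made up for this illustration and is not a Ceph function.

```python
from typing import List


def split_children(parent_seed: int, old_bits: int, new_bits: int) -> List[int]:
    """Seeds a parent PG fans out into when the number of significant
    hash bits grows from old_bits to new_bits (power-of-two pg_num assumed)."""
    fanout = 1 << (new_bits - old_bits)
    # Each child keeps the parent's low bits and takes one value of the new high bits.
    return [parent_seed | (i << old_bits) for i in range(fanout)]


# old_bits=6 is an assumption; the log only shows the new value (7).
children = split_children(0x1b, old_bits=6, new_bits=7)
print([f"2.{seed:x}" for seed in children])
```

Under these assumptions the two seeds are 0x1b and 0x5b, which is consistent with the two collections stat'ed in the log above.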
2018-04-23 16:46:12.735 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE ob 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52
object correctly seen in 2.5b
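To check by hand which collection the object belongs to, the hash can be decoded from the log line. The sketch below assumes that the hex field after the pool id in the printed object name (db15bbb7) is the bit-reversed 32-bit object hash, and that the list_by_hash_bitwise prefix is the same hash written with its hex digits in reverse order. Both are assumptions about the log format, but they agree with each other for the two objects that appear in this report. The function names are illustrative only.

```python
def reverse_bits32(value: int) -> int:
    """Reverse the bit order of a 32-bit integer."""
    return int(f"{value:032b}"[::-1], 2)


def nibblewise_prefix(obj_hash: int) -> str:
    """Hex digits of the hash, least significant nibble first."""
    return f"{obj_hash:08X}"[::-1]


bitwise_key = 0xdb15bbb7  # field printed after "2:" in the object name
obj_hash = reverse_bits32(bitwise_key)

print(f"hash   = {obj_hash:08x}")
print(f"prefix = {nibblewise_prefix(obj_hash)}")   # compare with the prefix in the log
print(f"low 7 bits = {obj_hash & 0x7f:#x}")         # seed of the PG that owns the object
```

With these assumptions the low 7 bits come out as 0x5b, which matches the observation that the object belongs in 2.5b.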
2018-04-23 16:46:12.739 7f658ed4e700 5 filestore(/var/lib/ceph/osd/ceph-9) queue_transactions(2246): osr 0x55b291a7c780 osr(2.5bs4_head) 2018-04-23 16:46:12.739 7f658ed4e700 10 journal prepare_entry [Transaction(0x55b29055ad80)] 2018-04-23 16:46:12.739 7f658ed4e700 10 journal len 5821 -> 8192 (head 40 pre_pad 0 bl 5821 post_pad 2291 tail 40) (bl alignment -1) 2018-04-23 16:46:12.739 7f658ed4e700 10 journal op_submit_start 12611 2018-04-23 16:46:12.739 7f658ed4e700 5 filestore(/var/lib/ceph/osd/ceph-9) queue_transactions(2288): (writeahead) 12611 [Transaction(0x55b29055ad80)] 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 #-1:c0371625:::snapmapper:0# (0x55b29177f860) 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:97# (0x55b291ff19e0) 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:ac# (0x55b291ff1920) 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 
oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:cd# (0x55b291ff1860) 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore.osr(0x55b291a7c780) _register_apply 0x55b291139360 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52 (0x55b291fef0a0)
a register_apply on the object, also in 2.5b
2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5071): pool is 2 shard is 4 pgid 2.1bs4 2018-04-23 16:46:12.739 7f658ed4e700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5079): first checking temp pool 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5071): pool is -4 shard is 4 pgid 2.1bs4 2018-04-23 16:46:12.739 7f658ed4e700 20 _collection_list_partial start:GHMIN end:GHMAX-30 ls.size 0 2018-04-23 16:46:12.739 7f658ed4e700 20 filestore(/var/lib/ceph/osd/ceph-9) objects: [] 2018-04-23 16:46:12.739 7f658ed4e700 10 filestore(/var/lib/ceph/osd/ceph-9) collection_list(5087): fall through to non-temp collection, start 4#-1:00000000::::0# 2018-04-23 16:46:12.739 7f658ed4e700 20 _collection_list_partial start:4#-1:00000000::::0# end:GHMAX-30 ls.size 0 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B1000000 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B1000000 ob 4#2:d8000000::::head# 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:db# 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:106# 2018-04-23 16:46:12.739 7f658ed4e700 20 
list_by_hash_bitwise prefix B183F193 ob 4#2:d81cf89c:::smithi04114511-3 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head# 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE 2018-04-23 16:46:12.739 7f658ed4e700 20 list_by_hash_bitwise prefix BD8ADDDE ob 4#2:db15bbb7:::smithi04114511-35 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head#52
...and then the object appears inside 2.1b.
A bit later, we crash when deleting the PG because of the snapmapper mask check:
2018-04-23 16:46:12.743 7f658ed4e700 -1 /build/ceph-13.0.2-1662-ge51e976/src/osd/SnapMapper.cc: In function 'int SnapMapper::remove_oid(const hobject_t&, MapCacher::Transaction<std::__cxx11::basic_string<char>, ceph::buffer::list>*)' thread 7f658ed4e700 time 2018-04-23 16:46:12.745431
/build/ceph-13.0.2-1662-ge51e976/src/osd/SnapMapper.cc: 330: FAILED assert(check(oid))
/a/sage-2018-04-23_15:07:57-rados-wip-sage3-testing-2018-04-23-0831-distro-basic-smithi/2430790
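A rough model of why the assert fires, assuming the mask check compares the low mask bits of the object hash with the PG seed. This is a sketch of the idea, not the SnapMapper code. `matches_pg` is a hypothetical helper, and the hash value 0xeddda8db is derived by bit-reversing the db15bbb7 field from the object name, which is itself an assumption about the log format.

```python
def matches_pg(obj_hash: int, pg_seed: int, bits: int) -> bool:
    """True if the low `bits` bits of the object hash equal those of the PG seed."""
    mask = (1 << bits) - 1
    return (obj_hash & mask) == (pg_seed & mask)


OBJ_HASH = 0xeddda8db  # assumed: bit-reversal of db15bbb7 from the log

before_split = matches_pg(OBJ_HASH, 0x1b, 6)   # parent 2.1b, 6 bits (assumed old value)
in_child = matches_pg(OBJ_HASH, 0x5b, 7)       # child 2.5b after the split
in_parent = matches_pg(OBJ_HASH, 0x1b, 7)      # parent 2.1b after the split

print(before_split, in_child, in_parent)
```

In this model the object passes the check for 2.1b only before the split. Once 2.1b uses 7 bits, an object with this hash listed in the 2.1b collection would fail the check, which is consistent with the crash during PG deletion.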
Updated by Kefu Chai over 5 years ago
5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x5610a92ec574]
6: (SnapMapper::remove_oid(hobject_t const&, MapCacher::Transaction<std::string, ceph::buffer::list>*)+0x152) [0x5610a963e9f2]
7: (PG::_delete_some(ObjectStore::Transaction*)+0x2f4) [0x5610a94f73d4]
8: (PG::RecoveryState::Deleting::react(PG::DeleteSome const&)+0x38) [0x5610a94f8228]
9: (boost::statechart::simple_state<PG::RecoveryState::Deleting, PG::RecoveryState::ToDelete, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x16a) [0x5610a952e3fa]
10: (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine, PG::RecoveryState::Initial, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x5a) [0x5610a950e1ea]
11: (PG::do_peering_event(std::shared_ptr<PGPeeringEvent>, PG::RecoveryCtx*)+0x139) [0x5610a94f6a19]
12: (OSD::dequeue_peering_evt(OSDShard*, PG*, std::shared_ptr<PGPeeringEvent>, ThreadPool::TPHandle&)+0x1a4) [0x5610a943af24]
13: (OSD::dequeue_delete(OSDShard*, PG*, unsigned int, ThreadPool::TPHandle&)+0x234) [0x5610a943b364]
14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x641) [0x5610a942f421]
15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x3e6) [0x5610a9a4bb26]
16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5610a9a532d0]
17: (()+0x7e25) [0x7f78461c3e25]
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.0.0-3418-gd4c8323/rpm/el7/BUILD/ceph-14.0.0-3418-gd4c8323/src/osd/SnapMapper.cc: 341: FAILED ceph_assert(check(oid))
/a/kchai-2018-09-20_12:40:38-rados-wip-kefu-testing-2018-09-19-1821-distro-basic-smithi/3046768/remote/*/log/ceph-osd.3.log.gz
Updated by Neha Ojha almost 4 years ago
- Status changed from New to Can't reproduce
- Priority changed from High to Normal