Bug #45195
closed
ceph_test_objectstore: src/os/bluestore/bluestore_types.h: 734: FAILED ceph_assert(p != extents.end())
Description
(gdb) bt
#0  raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x0000561ba61b15b3 in reraise_fatal (signum=6) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/global/signal_handler.cc:332
#2  handle_fatal_signal (signum=6) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/global/signal_handler.cc:332
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007f6b818ddcf5 in __GI_abort () at abort.c:79
#6  0x00007f6b83360411 in ceph::__ceph_assert_fail(char const*, char const*, int, char const*) () from /usr/lib64/ceph/libceph-common.so.2
#7  0x00007f6b833605da in ceph::__ceph_assert_fail(ceph::assert_data const&) () from /usr/lib64/ceph/libceph-common.so.2
#8  0x0000561ba6020811 in bluestore_blob_t::map<BlueStore::_prepare_read_ioc(BlueStore::blobs2read_t&, std::vector<ceph::buffer::v15_2_0::list>*, IOContext*)::<lambda(uint64_t, uint64_t)> > (f=..., x_len=8192, x_off=<optimized out>, this=<optimized out>) at /usr/include/c++/8/bits/stl_map.h:468
#9  BlueStore::_prepare_read_ioc (this=0x561bbcebe000, blobs2read=std::map with 4 elements = {...}, compressed_blob_bls=0x7f6b70453e30, ioc=0x7f6b70453f00) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/os/bluestore/BlueStore.cc:9700
#10 0x0000561ba604a0ff in BlueStore::_do_read (this=<optimized out>, c=0x561d587ef860, o=..., offset=0, length=<optimized out>, bl=..., op_flags=0, retry_count=0) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/os/bluestore/BlueStore.cc:9883
#11 0x0000561ba604b1b7 in BlueStore::read (this=this@entry=0x561bbcebe000, c_=..., oid=..., offset=offset@entry=0, length=212992, bl=..., op_flags=0) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/os/bluestore/BlueStore.h:3388
#12 0x0000561ba5f150f7 in SyntheticWorkloadState::C_SyntheticOnReadable::finish (this=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/include/buffer.h:1017
#13 0x0000561ba5ed2eed in Context::complete (this=0x561bbb50d180, r=<optimized out>) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/include/Context.h:77
#14 0x00007f6b833f1325 in Finisher::finisher_thread_entry() () from /usr/lib64/ceph/libceph-common.so.2
#15 0x00007f6b8dd1c2de in start_thread (arg=<optimized out>) at pthread_create.c:486
#16 0x00007f6b819b8133 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) f
#9  BlueStore::_prepare_read_ioc (this=0x561bbcebe000, blobs2read=std::map with 4 elements = {...}, compressed_blob_bls=0x7f6b70453e30, ioc=0x7f6b70453f00) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/os/bluestore/BlueStore.cc:9700
9700        auto r = bptr->get_blob().map(
(gdb) l
9695          << " reading 0x" << req.r_off
9696          << "~" << req.r_len << std::dec
9697          << dendl;
9698
9699        // read it
9700        auto r = bptr->get_blob().map(
9701          req.r_off, req.r_len,
9702          [&](uint64_t offset, uint64_t length) {
9703            int r = bdev->aio_read(offset, length, &req.bl, ioc);
9704            if (r < 0)
(gdb) p req.r_off
$6 = 40960
(gdb) p ((BlueStore::Blob*)0x561bbc1d7110)->blob->extents
$9 = std::vector of length 1, capacity 1 = {{
  <bluestore_interval_t<unsigned long, unsigned int>> = {
    static INVALID_OFFSET = 18446744073709551615,
    offset = 87281664,
    length = 40960
  }, <No data fields>}}
Both the offset of the request (req.r_off) and the length of the blob's only extent are 40960, so the request starts exactly where the extent ends and 'p' does appear to equal extents.end(). Perhaps some sort of off-by-one error?
/a/bhubbard-2020-04-16_09:57:54-rados-wip-badone-testing-distro-basic-smithi/4957897
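To illustrate the observation above, here is a minimal sketch of an extent walk that hits the end of the extent list when the requested offset equals the total extent length. It is not the BlueStore code: `Extent` and `find_extent` are hypothetical names, and only the numbers (offset 87281664, length 40960, request offset 40960) come from the gdb session.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a blob extent: physical offset plus length.
struct Extent {
  uint64_t offset;
  uint32_t length;
};

// Walk the extents until the one containing logical offset x_off is found.
// Returns the index of that extent, or extents.size() when x_off lies at or
// beyond the total mapped length (analogous to reaching extents.end()).
std::size_t find_extent(const std::vector<Extent>& extents, uint64_t x_off) {
  std::size_t i = 0;
  while (i < extents.size() && x_off >= extents[i].length) {
    x_off -= extents[i].length;  // skip this extent entirely
    ++i;
  }
  return i;
}

// Usage with the values seen in the gdb session.
void demo() {
  std::vector<Extent> extents = {Extent{87281664, 40960}};
  // Last byte of the extent is still inside it.
  assert(find_extent(extents, 40959) == 0);
  // Offset equal to the extent length falls off the end.
  assert(find_extent(extents, 40960) == extents.size());
}
```

In this sketch the failing case is a request that begins one byte past the mapped range, which is consistent with either an off-by-one in the caller or a blob that is shorter than the caller expects.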
Updated by Brad Hubbard about 4 years ago
- Severity changed from 3 - minor to 2 - major
Updated by Brad Hubbard about 4 years ago
- Subject changed from src/os/bluestore/bluestore_types.h: 734: FAILED ceph_assert(p != extents.end()) to ceph_test_objectstore: src/os/bluestore/bluestore_types.h: 734: FAILED ceph_assert(p != extents.end())
Not sure whether this is an issue with the test?
Updated by Brad Hubbard about 4 years ago
(gdb) t 16
[Switching to thread 16 (Thread 0x7f6b8e13a240 (LWP 18023))]
#7  0x0000561ba5ead9a5 in StoreTest::doSyntheticTest (this=<optimized out>, num_ops=50000, max_obj=<optimized out>, max_wr=<optimized out>, align=<optimized out>) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/test/objectstore/store_test.cc:4664
4664        test_obj.clone_range();
(gdb) f
#7  0x0000561ba5ead9a5 in StoreTest::doSyntheticTest (this=<optimized out>, num_ops=50000, max_obj=<optimized out>, max_wr=<optimized out>, align=<optimized out>) at /usr/src/debug/ceph-16.0.0-834.g3ef1bee.el8.x86_64/src/test/objectstore/store_test.cc:4664
4664        test_obj.clone_range();
I dumped out test_obj and it does show some objects with _len = 4096. Need to verify if this is correct or an error.
[{
  hobj = {
    static POOL_META = -1,
    static POOL_TEMP_START = -2,
    oid = { name = "OBJ_4154" },
    snap = { val = 1049056086 },
    hash = 0,
    max = false,
    nibblewise_key_cache = 0,
    hash_reverse_bits = 0,
    pool = 555,
    nspace = "",
    key = ""
  },
  generation = 18446744073709551615,
  shard_id = {
    id = -1 '\377',
    static NO_SHARD = {
      id = -1 '\377',
      static NO_SHARD = <same as static member of an already seen type>
    }
  },
  max = false,
  static NO_GEN = 18446744073709551615
}] = {
  data = {
    _buffers = {
      _root = { next = 0x561d1d5bd000 },
      _tail = 0x561d1d5bd000
    },
    _carriage = 0x561ba67ae540 <ceph::buffer::v15_2_0::list::always_empty_bptr>,
    _len = 4096,
    _num = 1,
    static always_empty_bptr = { _raw = 0x0, _off = 0, _len = 0 }
  },
  attrs = std::map with 0 elements
},
Updated by Igor Fedotov about 4 years ago
The following backtrace is likely a symptom of the same bug.
-1> 2020-04-24T16:25:20.971+0300 7efe4897e0c0 -1 /home/if/ceph/src/common/Checksummer.h: In function 'static int Checksummer::calculate(typename Alg::init_value_t, size_t, size_t, size_t, const ceph::buffer::v15_2_0::list&, ceph::buffer::v15_2_0::ptr*) [with Alg = Checksummer::crc32c; typename Alg::init_value_t = unsigned int; size_t = long unsigned int]' thread 7efe4897e0c0 time 2020-04-24T16:25:20.968742+0300
/home/if/ceph/src/common/Checksummer.h: 214: FAILED ceph_assert(length % csum_block_size == 0)
ceph version Development (no_version) pacific (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x151) [0x7efe495d71bb]
2: (()+0x26e3a5) [0x7efe495d73a5]
3: (bluestore_blob_t::calc_csum(unsigned long, ceph::buffer::v15_2_0::list const&)+0x546) [0x559f7192b526]
4: (BlueStore::_do_write_big_apply_deferred(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive::tree_iterator<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3u>, false>, BlueStore::BigDeferredWriteContext&, ceph::buffer::v15_2_0::list::iterator&, BlueStore::WriteContext*)+0x526) [0x559f718cfdf6]
5: (BlueStore::_do_write_big(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v15_2_0::list::iterator&, BlueStore::WriteContext*)+0x18ba) [0x559f718d1ada]
6: (BlueStore::_do_write_data(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, unsigned long, ceph::buffer::v15_2_0::list&, BlueStore::WriteContext*)+0x176) [0x559f718d6776]
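The assertion in this trace, `length % csum_block_size == 0`, requires the buffer being checksummed to consist of whole checksum blocks. A minimal sketch of that condition follows. The helper names are hypothetical and the 4096-byte block size used in the usage function is an assumption for illustration only; neither is taken from Checksummer.h.

```cpp
#include <cassert>
#include <cstddef>

// True when a buffer of `length` bytes splits into whole checksum blocks.
bool is_csum_aligned(std::size_t length, std::size_t csum_block_size) {
  assert(csum_block_size != 0);
  return length % csum_block_size == 0;
}

// Number of trailing bytes that do not fill a whole checksum block.
std::size_t csum_remainder(std::size_t length, std::size_t csum_block_size) {
  assert(csum_block_size != 0);
  return length % csum_block_size;
}

// Usage with an assumed 4096-byte checksum block.
void demo_csum() {
  const std::size_t block = 4096;            // assumption, not from the trace
  assert(is_csum_aligned(8192, block));      // two whole blocks
  assert(!is_csum_aligned(6144, block));     // one and a half blocks
  assert(csum_remainder(6144, block) == 2048);
}
```

A write path that hands over a buffer whose length leaves a non-zero remainder would trip this kind of assertion, which fits a miscalculated offset or length earlier in the write.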
Updated by Igor Fedotov about 4 years ago
- Status changed from New to In Progress
- Assignee set to Igor Fedotov
Updated by Igor Fedotov about 4 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 34754
Updated by Brad Hubbard about 4 years ago
Igor, does this need to be backported? If so, could you set the appropriate releases please? I know you set 16 as the affected version, but I'm just double-checking. Thanks.
Updated by Brad Hubbard about 4 years ago
- Status changed from Fix Under Review to In Progress
Updated by Igor Fedotov about 4 years ago
Hi Brad,
No, this is specific to master (Pacific) only.
Updated by Igor Fedotov about 4 years ago
Well, this fix is required if/when the 'deferring big writes' feature (https://github.com/ceph/ceph/pull/33434) is backported.
Updated by Igor Fedotov almost 4 years ago
- Status changed from In Progress to Fix Under Review
Updated by Kefu Chai almost 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Igor Fedotov almost 4 years ago
- Copied to Backport #45354: octopus: ceph_test_objectstore: src/os/bluestore/bluestore_types.h: 734: FAILED ceph_assert(p != extents.end()) added
Updated by Igor Fedotov almost 4 years ago
- Status changed from Pending Backport to Resolved
- Backport deleted (octopus)
I doubt we'll backport deferring big writes to Octopus, hence marking this as resolved.