Ceph : Issues (feed: https://tracker.ceph.com/, updated 2023-11-03T19:36:39Z)
Redmine bluestore - Bug #63436 (Pending Backport): Typo in reshard example (https://tracker.ceph.com/issues/63436, 2023-11-03T19:36:39Z, Adam Kupczyk)
<p>See <a class="external" href="https://tracker.ceph.com/issues/63353">https://tracker.ceph.com/issues/63353</a>.<br />I missed the fact that "o"->"O" should be done too.</p> bluestore - Bug #62730 (New): ceph-bluestore-tool reshard brokenhttps://tracker.ceph.com/issues/627302023-09-06T19:49:52ZAdam Kupczyk
<p>It is possible to specify the same prefix twice. Here is an example with "p" defined twice.</p>
<p>ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-1 --sharding="m(3) p(3,0-12) o(3,0-13)=block_cache={type=binned_lru} l p" reshard</p>
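A validation pass could catch the duplicate before resharding starts. The following is a minimal Python sketch of such a check (illustrative only; the real parser lives in Ceph's C++ RocksDBStore code, and the helper name here is invented):

```python
import re

def duplicate_prefixes(sharding_spec):
    """Return column-family prefixes that appear more than once
    in a ceph-bluestore-tool --sharding specification string."""
    # Drop per-column options such as "=block_cache={type=binned_lru}"
    spec = re.sub(r'=\S+', '', sharding_spec)
    # Each remaining token is a prefix, optionally followed by "(shards,range)"
    prefixes = [re.sub(r'\(.*?\)', '', tok) for tok in spec.split()]
    seen, dups = set(), []
    for p in prefixes:
        if p in seen:
            dups.append(p)
        seen.add(p)
    return dups

spec = "m(3) p(3,0-12) o(3,0-13)=block_cache={type=binned_lru} l p"
print(duplicate_prefixes(spec))  # ['p']
```

With such a check the tool could reject the spec up front instead of asserting later in do_open.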
<p>After resharding we get:<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: do_open existing_cfs=11<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=l shard_idx=0 hash_l=0 hash_h=4294967295 handle=0x55e12cbf89e0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=m shard_idx=0 hash_l=0 hash_h=4294967295 handle=0x55e12cbf8300<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=m shard_idx=1 hash_l=0 hash_h=4294967295 handle=0x55e12cbf8980<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=m shard_idx=2 hash_l=0 hash_h=4294967295 handle=0x55e12cbf91c0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=o shard_idx=0 hash_l=0 hash_h=13 handle=0x55e12cbf8cc0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=o shard_idx=1 hash_l=0 hash_h=13 handle=0x55e12cbf8ec0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=o shard_idx=2 hash_l=0 hash_h=13 handle=0x55e12cbf8bc0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=p shard_idx=0 hash_l=0 hash_h=12 handle=0x55e12cbf8ce0<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=p shard_idx=1 hash_l=0 hash_h=12 handle=0x55e12cbf8c00<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: add_column_family column_name=p shard_idx=2 hash_l=0 hash_h=12 handle=0x55e12cc92e20<br />2023-07-11T15:34:08.601+0000 7f138baaf200 10 rocksdb: do_open missing_cfs=1<br />2023-07-11T15:34:08.605+0000 7f138baaf200 -1 /builddir/build/BUILD/ceph-16.2.10/src/kv/RocksDBStore.cc: In function 'int RocksDBStore::do_open(std::ostream&, bool, bool, const string&)' thread 7f138baaf200 time 2023-07-11T15:34:08.602980+<br />0000<br />/builddir/build/BUILD/ceph-16.2.10/src/kv/RocksDBStore.cc: 1215: FAILED 
ceph_assert(recreate_mode)</p> bluestore - Feature #62371 (New): Copies of superblock(s)https://tracker.ceph.com/issues/623712023-08-09T08:44:45ZAdam Kupczyk
<p>Currently there is only one BlueStore superblock and only one BlueFS superblock.<br />There should be multiple copies of each superblock.</p>
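The idea can be modeled as follows (a hypothetical Python sketch, not BlueStore code): write N checksummed copies at fixed offsets, and on open return the first copy whose checksum verifies.

```python
import struct
import zlib

def write_superblocks(dev: bytearray, payload: bytes, offsets):
    """Write identical checksummed superblock copies at fixed offsets."""
    rec = struct.pack("<I", zlib.crc32(payload)) + payload
    for off in offsets:
        dev[off:off + len(rec)] = rec

def read_superblock(dev, offsets, size):
    """Return the payload of the first copy whose checksum verifies."""
    for off in offsets:
        crc, = struct.unpack_from("<I", dev, off)
        payload = bytes(dev[off + 4:off + 4 + size])
        if zlib.crc32(payload) == crc:
            return payload
    raise IOError("all superblock copies corrupted")

dev = bytearray(8192)
write_superblocks(dev, b"bluestore-sb-v1", offsets=(0, 4096))
dev[0:8] = b"\xff" * 8          # corrupt the first copy
print(read_superblock(dev, (0, 4096), size=15))
```

Since the offsets are fixed, a repair tool can locate the copies without any other metadata, which is the point of fixing the locations.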
<p>When one superblock is unreadable or corrupted, another copy could be selected.<br />The locations of the superblocks should be fixed.</p> bluestore - Bug #62361 (Won't Fix): BlueStore compatibility broken inside Pacific linehttps://tracker.ceph.com/issues/623612023-08-08T10:53:57ZAdam Kupczyk
<p>Commit 21ac4f918cef16ec5b3d59d45077353795deadaf<br />introduced OP_UPDATE_INC into the BlueFS replay mechanism.<br />It came in via <a class="external" href="https://github.com/ceph/ceph/pull/48915">https://github.com/ceph/ceph/pull/48915</a>.</p>
<p>The result is that one cannot downgrade from v16.2.11+ to v16.2.10.</p> bluestore - Bug #62341 (New): Hybrid allocator fails free space consistency checkhttps://tracker.ceph.com/issues/623412023-08-05T18:10:25ZAdam Kupczyk
<p>The problem was discovered after updating the aging test:<br /><a class="external" href="https://github.com/ceph/ceph/pull/52830">https://github.com/ceph/ceph/pull/52830</a></p>
<p>When testing the hybrid allocator with a low memory cap:<br /><pre>
./bin/unittest_alloc_aging --gtest_filter=\*/4 --bluestore_hybrid_alloc_mem_cap=1000000 --log-to-stderr=true --debug-bluestore=1/1
</pre></p>
<p>There is an inconsistency between the reported free space and the iterated free space:<br /><pre>
Expected equality of these values:
sum
Which is: 2199020109824
capacity
Which is: 2199023255552
</pre><br />It is still an open question which of these two values is correct.</p> bluestore - Bug #62282 (New): BlueFS and BlueStore use the same space (init_rm_free assert)https://tracker.ceph.com/issues/622822023-08-02T12:19:05ZAdam Kupczyk
<p>The problem is triggered when BlueFS mounts and tries to reserve its allocations on the shared device.</p>
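What init_rm_free does can be modeled like this (an illustrative Python toy, not the C++ AvlAllocator): each BlueFS-owned extent is removed from the free set, asserting it was actually free, so when BlueStore has already allocated part of the range the assert fires.

```python
class ToyAllocator:
    """Free space tracked as a set of allocation-unit ids;
    models the init_rm_free contract."""
    def __init__(self, total_aus):
        self.free = set(range(total_aus))

    def init_rm_free(self, offset, length):
        aus = set(range(offset, offset + length))
        missing = aus - self.free       # overlap that is already allocated
        assert not missing, f"Unexpected extent: {offset}~{length}"
        self.free -= aus

alloc = ToyAllocator(100)
alloc.init_rm_free(10, 4)       # BlueFS claims AUs 10..13: fine
alloc.free -= {50}              # BlueStore has already taken AU 50
try:
    alloc.init_rm_free(48, 4)   # BlueFS also claims 48..51 -> overlap
except AssertionError as e:
    print(e)                    # Unexpected extent: 48~4
```

In the real crash below, the range 0x6dda040000~400000 from the BlueFS log plays the role of the overlapping claim.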
<pre>
ceph version 17.2.6-70.el9cp (fe62dcdbb2c6e05782a3e2b67d025b84ff5047cc) quincy (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x11e) [0x557821136c6b]
2: /usr/bin/ceph-osd(+0x3dbe27) [0x557821136e27]
3: /usr/bin/ceph-osd(+0xa432d1) [0x55782179e2d1]
4: (AvlAllocator::_try_remove_from_tree(unsigned long, unsigned long, std::function<void (unsigned long, unsigned long, bool)>)+0x24c) [0x5578217960ec]
5: (HybridAllocator::init_rm_free(unsigned long, unsigned long)+0xc0) [0x55782179dfd0]
6: (BlueFS::mount()+0x1f6) [0x5578217666e6]
7: (BlueStore::_open_bluefs(bool, bool)+0x82) [0x557821690b42]
8: (BlueStore::_prepare_db_environment(bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*)+0x5c0) [0x557821691800]
9: (BlueStore::_open_db(bool, bool, bool)+0x179) [0x557821693439]
10: (BlueStore::_open_db_and_around(bool, bool)+0x429) [0x557821694169]
11: (BlueStore::_mount()+0x2ec) [0x55782169a57c]
12: (OSD::init()+0x4fc) [0x55782127359c]
13: main()
</pre>
<p>The offending range is from BlueFS log: range 0x6dda040000~400000<br /><pre>
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 30 bluefs mount noting alloc for file(ino 1 size 0x89c000 mtime 2023-07-27T18:29:37.033690+0000 allocated c20000 alloc_commit 10000 extents [1:0x5d800000~10000,1:0x5d6f0000~10000,1:0x6dda040000~400000,1:0x1a3d0000~400000,1:0x4fd47c0000~400000])
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x5d800000 length 0x10000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x5d6f0000 length 0x10000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 10 HybridAllocator init_rm_free offset 0x6dda040000 length 0x400000
2023-07-28T18:53:56.313+0000 7fc5b803c2c0 -1 HybridAllocator init_rm_free lambda Uexpected extent: 0x6dda040000~400000
2023-07-28T18:53:56.317+0000 7fc5b803c2c0 -1 /builddir/build/BUILD/ceph-17.2.6/src/os/bluestore/HybridAllocator.cc: In function 'HybridAllocator::init_rm_free(uint64_t, uint64_t)::<lambda(uint64_t, uint64_t, bool)>' thread 7fc5b803c2c0 time 2023-07-28T18:53:56.315192+0000
/builddir/build/BUILD/ceph-17.2.6/src/os/bluestore/HybridAllocator.cc: 175: FAILED ceph_assert(false)
</pre></p>
<p>As a run of fsck shows:<br /><pre>
ceph-bluestore-tool --path /rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46/ --bluestore-allocator=bitmap fsck
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2c0000~10000 or a subset is already allocated (misreferenced)
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda280000~10000 or a subset is already allocated (misreferenced)
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda290000~10000 or a subset is already allocated (misreferenced)
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2a0000~10000 or a subset is already allocated (misreferenced)
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2b0000~10000 or a subset is already allocated (misreferenced)
2023-08-01T15:33:44.344+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3105a3cf:::disk_bw_test_41:0#, extent 0x6dda2d0000~10000 or a subset is already allocated (misreferenced)
.....
2023-08-01T15:33:44.361+0000 7fef99bc4600 -1 bluestore(/rootfs/var/lib/ceph/08416350-2b97-11ee-a9a2-ac1f6b40d3fc/osd.46) operator()::fsck error: oid #-1:3fc5a3cf:::disk_bw_test_40:0#, extent 0x4fd4890000~10000 or a subset is already allocated (misreferenced)
fsck status: remaining 128 error(s) and warning(s)
</pre></p>
<p>The space on disk is currently occupied by both BlueFS and BlueStore.<br />The device is rotational and bitmap_freelist_manager is in use.</p> RADOS - Bug #59099 (New): PG move causes data duplicationhttps://tracker.ceph.com/issues/590992023-03-17T13:51:03ZAdam Kupczyk
<p>Let's imagine we have a pool TEST.<br />In one of its PGs we have an object OBJ of size 1M.</p>
<p>We create snap SNAP-1 and write some 4K to OBJ.<br />As a result we get OBJ.1, which takes 1M, and OBJ.head, which reuses all but 4K.<br />The total data usage is 1M + 4K.</p>
<p>Now we move the PG to another OSD.<br />In some cases OBJ.head + OBJ.1 will then take 2M.</p>
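A toy Python model of the accounting (illustrative only; block ids and sizes are made up) shows why the usage differs: with extent sharing the clone and head together cost 1M + 4K, but a copy that materializes each object independently costs 2M.

```python
AU = 4096  # allocation unit, 4K

def usage(objects):
    """Raw usage in allocation units when each distinct block id
    is counted once (i.e. with extent sharing)."""
    return len({blk for obj in objects for blk in obj})

# OBJ is 1M = 256 AUs; the clone OBJ.1 shares blocks with OBJ.head
obj_1 = [("old", i) for i in range(256)]
obj_head = list(obj_1)
obj_head[0] = ("new", 0)             # the 4K overwrite after the snapshot

print(usage([obj_1, obj_head]))      # 257 AUs = 1M + 4K (shared)

# After a PG move that copies objects independently, nothing is shared
moved_1 = [("a", i) for i in range(256)]
moved_head = [("b", i) for i in range(256)]
print(usage([moved_1, moved_head]))  # 512 AUs = 2M (duplicated)
```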
<p>An example of this happening is in the attached snap-pg-move-history.sh.<br />When the data is in the original PG on OSD.0:</p>
<pre>
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
 0 ssd 0.09859 1.00000 101 GiB 1.1 GiB 101 MiB 0 B 21 MiB 100 GiB 1.09 1.05 2 up
 1 ssd 0.09859 1.00000 101 GiB 1.0 GiB 740 KiB 0 B 20 MiB 100 GiB 0.99 0.95 1 up
 TOTAL 202 GiB 2.1 GiB 101 MiB 0 B 41 MiB 200 GiB 1.04
MIN/MAX VAR: 0.95/1.05 STDDEV: 0.05
</pre>
<p>And after forcibly moving the PG to the other OSD:</p>
<pre>
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
 0 ssd 0.09859 1.00000 101 GiB 1.0 GiB 756 KiB 0 B 21 MiB 100 GiB 0.99 0.91 1 up
 1 ssd 0.09859 1.00000 101 GiB 1.2 GiB 201 MiB 0 B 21 MiB 100 GiB 1.18 1.09 2 up
 TOTAL 202 GiB 2.2 GiB 201 MiB 0 B 42 MiB 200 GiB 1.09
MIN/MAX VAR: 0.91/1.09 STDDEV: 0.10
</pre>
<p>The script was tested on Reef, but I do not believe the issue is limited to that release.</p> bluestore - Feature #59060 (New): Make alloc_size persistenthttps://tracker.ceph.com/issues/590602023-03-14T12:59:49ZAdam Kupczyk
<p>Currently we use the ceph config to retrieve min_alloc_size for the block device.<br />This value should be persisted, so that even a stand-alone ceph-bluestore-tool can operate properly.</p>
<p>There are 2 flavours of min_alloc_size for block:<br />1) the AU size that is used by core BlueStore<br />2) the AU size that is used by BlueFS to hold files when allocating from block.</p>
<p>Both should be part of superblocks info.</p> bluestore - Bug #58966 (Resolved): BlueFS does not prealloc spacehttps://tracker.ceph.com/issues/589662023-03-13T11:07:02ZAdam Kupczyk
<p>We were missing the implementation of <br />`BlueRocksWritableFile::Allocate(off_t offset, off_t len)`.</p> bluestore - Bug #58963 (New): OSD is mute. No std::err, no log.https://tracker.ceph.com/issues/589632023-03-13T09:07:49ZAdam Kupczyk
<p>I encountered a problem where `ceph-osd` prints nothing to stderr and writes nothing to the --log-file.<br />It just starts and then exits.</p>
<p>Env:<br />- root user on virtual machine</p>
<p>Run 0:<br />`ceph-osd -f --cluster test --setuser ceph --setgroup ceph --id 0 --log-to-file=true --log-file=/tmp/log.txt --debug_deliberately_leak_memory=true --osd_data=/tmp/cephtmp`</p>
<p>Result:<br />The process runs for a millisecond and exits.<br />No stderr output and an empty /tmp/log.txt.</p>
<p>Modification 1: strace<br />Command line:<br />`strace -f -e execve ceph-osd -f --cluster test --setuser ceph --setgroup ceph --id 0 --log-to-file=true --log-file=/tmp/log.txt --debug_deliberately_leak_memory=true --osd_data=/tmp/cephtmp`<br />Result:<br />typical strace output+<br />`2023-03-10T11:33:31.286+0000 7ff07f95a640 -1 deliberately leaking some memory`<br />/tmp/log.txt:<br />```2023-03-10T11:33:31.286+0000 7ff07f95a640 0 set uid:gid to 167:167 (ceph:ceph)<br />2023-03-10T11:33:31.286+0000 7ff07f95a640 -1 deliberately leaking some memory<br />2023-03-10T11:33:31.286+0000 7ff07f95a640 0 ceph version 18.0.0-2859-g44e4dfd0 (44e4dfd0d8d31f3da4cb9cb96b8d7ba4f587b37c) reef (dev), process ceph-osd, pid 42895<br />2023-03-10T11:33:31.287+0000 7ff07f95a640 0 pidfile_write: ignore empty --pid-file<br />2023-03-10T11:33:31.287+0000 7ff07f95a640 -1 missing 'type' file and unable to infer osd type```</p>
<p>Modification 2: mkfifo<br />Preparation step, to force the process to pause:<br />mkfifo /tmp/cephtmp/type<br />Now ceph-osd will wait forever for someone to write data to the other side of the pipe.</p>
<p>Command line:<br />`ceph-osd -f --cluster test --setuser ceph --setgroup ceph --id 0 --log-to-file=true --log-file=/tmp/log.txt --debug_deliberately_leak_memory=true --osd_data=/tmp/cephtmp`</p>
<p>Result:<br />std::err:<br />`2023-03-10T13:16:31.590+0000 7ff9bcb54640 -1 deliberately leaking some memory`<br />/tmp/log.txt:<br />```2023-03-10T13:16:31.590+0000 7ff9bcb54640 0 set uid:gid to 167:167 (ceph:ceph)<br />2023-03-10T13:16:31.590+0000 7ff9bcb54640 -1 deliberately leaking some memory<br />2023-03-10T13:16:31.590+0000 7ff9bcb54640 0 ceph version 18.0.0-2859-g44e4dfd0 (44e4dfd0d8d31f3da4cb9cb96b8d7ba4f587b37c) reef (dev), process ceph-osd, pid 43061<br />2023-03-10T13:16:31.590+0000 7ff9bcb54640 0 pidfile_write: ignore empty --pid-file```</p> bluestore - Fix #58759 (New): BlueFS log runway space exhaustedhttps://tracker.ceph.com/issues/587592023-02-17T12:32:45ZAdam Kupczyk
<p>In BlueFS::_flush_and_sync_log_core we have the following data integrity check:<br />ceph_assert(bl.length() <= runway);</p>
<p>It is there because it is unacceptable to write a transaction larger than the currently available runway.<br />If we did, there would be no good way to get the data back (we have the _do_replay_recovery_read() heuristic, but it requires a lengthy recovery).</p>
<p>A solution could be that, if we have less runway than the transaction needs,<br />we inject a log-extending transaction first.</p>
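The proposed fallback can be sketched as follows (a Python pseudomodel of the flush path; the dict fields and the extension-record contents are hypothetical, not BlueFS structures):

```python
def flush_and_sync_log(log, txn_bytes):
    """Never append a transaction larger than the remaining runway:
    first append a small log-extending transaction that adds a new
    log extent, growing the runway, then append the real payload."""
    while len(txn_bytes) > log["runway"]:
        extend_txn = b"OP_FILE_UPDATE_INC:new-log-extent"  # grows the log
        assert len(extend_txn) <= log["runway"]            # it must fit
        log["data"] += extend_txn
        log["runway"] -= len(extend_txn)
        log["runway"] += 1 << 20                           # new 1M extent
    log["data"] += txn_bytes
    log["runway"] -= len(txn_bytes)

log = {"data": b"", "runway": 4096}
flush_and_sync_log(log, b"x" * 10000)   # would have tripped the assert
print(log["runway"] > 0)                # True
```

The key invariant is that the extension record itself is always small enough to fit in whatever runway remains.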
<p>Hitting this was almost impossible before, but these commits make it more likely:<br /><a class="external" href="https://github.com/ceph/ceph/pull/42750">https://github.com/ceph/ceph/pull/42750</a> "incremental update" <br /><a class="external" href="https://github.com/ceph/ceph/pull/48854">https://github.com/ceph/ceph/pull/48854</a> "4K bluefs"</p> bluestore - Bug #54465 (Resolved): BlueFS broken sync compaction modehttps://tracker.ceph.com/issues/544652022-03-03T16:12:00ZAdam Kupczyk
<p>The BlueFS fine-grained locking refactor broke the sync compaction mode.</p>
<p>The problem is an off-by-one in seq, which leads to dropping all but the first of the _replay log entries.</p>
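The effect of the off-by-one can be modeled in a few lines (an illustrative Python sketch, not the BlueFS code): replay applies transactions in order and stops at the first seq that does not match the expected next value, so entries stamped with a stale seq are silently dropped.

```python
def replay(transactions, start_seq):
    """Apply transactions in order; stop at the first seq mismatch,
    the way BlueFS _replay stops ("stop: seq X != expected Y")."""
    applied, expected = [], start_seq
    for seq, op in transactions:
        if seq != expected:
            break
        applied.append(op)
        expected += 1
    return applied

# Correct writer: seqs 1025, 1026, 1027 -> everything replays
print(replay([(1025, "a"), (1026, "b"), (1027, "c")], 1025))

# Off-by-one writer stamps the entry after op_jump_seq 1025 with
# 1025 again, so replay expects 1026 and drops everything after "a"
print(replay([(1025, "a"), (1025, "b"), (1026, "c")], 1025))  # ['a']
```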
<pre>
2022-03-03T07:55:39.765+0000 7ffff7fda840 20 bluefs _replay 0x0: op_dir_create sharding
2022-03-03T07:55:39.765+0000 7ffff7fda840 20 bluefs _replay 0x0: op_dir_link sharding/def to 21
2022-03-03T07:55:39.765+0000 7ffff7fda840 20 bluefs _replay 0x0: op_jump_seq 1025
2022-03-03T07:55:39.765+0000 7ffff7fda840 10 bluefs _read h 0x555557c46400 0x1000~1000 from file(ino 1 size 0x1000 mtime 0.000000 allocated 410000 alloc_commit 410000 extents [1:0x1540000~410000])
2022-03-03T07:55:39.765+0000 7ffff7fda840 20 bluefs _read left 0xff000 len 0x1000
2022-03-03T07:55:39.765+0000 7ffff7fda840 20 bluefs _read got 4096
2022-03-03T07:55:39.765+0000 7ffff7fda840 10 bluefs _replay 0x1000: stop: seq 1025 != expected 1026
2022-03-03T07:55:39.765+0000 7ffff7fda840 10 bluefs _replay log file size was 0x1000
2022-03-03T07:55:39.765+0000 7ffff7fda840 10 bluefs _replay done
</pre>
<p>The default compaction mode is async.</p> bluestore - Bug #54248 (Resolved): BlueFS improperly tracks vselector sizes in _flush_special()https://tracker.ceph.com/issues/542482022-02-10T15:27:23ZAdam Kupczyk
<p>This problem was introduced in the fine-grained locking refactor.</p> RADOS - Bug #53685 (New): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.https://tracker.ceph.com/issues/536852021-12-21T11:22:25ZAdam Kupczyk
<p>Test "rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_election/connectivity msgr-failures/few msgr/async-v2only objectstore/bluestore-low-osd-mem-target rados tasks/rados_api_tests validater/valgrind}"</p>
<p><a class="external" href="http://pulpito.front.sepia.ceph.com/yuriw-2021-12-17_19:17:02-rados-wip-yuri3-testing-2021-12-17-0825-distro-default-smithi/6569207/">http://pulpito.front.sepia.ceph.com/yuriw-2021-12-17_19:17:02-rados-wip-yuri3-testing-2021-12-17-0825-distro-default-smithi/6569207/</a></p>
<p>Caused:<br />Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.</p>
<p>2021-12-18T00:08:44.222 INFO:tasks.ceph.osd.5.smithi151.stderr:ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-9707-ga73102d4/rpm/el8/BUILD/ceph-17.0.0-9707-ga73102d4/src/messages/MOSDRepOp.h:127: virtual void MOSDRepOp::encode_payload(uint64_t): Assertion `HAVE_FEATURE(features, SERVER_OCTOPUS)' failed.<br />2021-12-18T00:08:44.253 INFO:tasks.ceph.osd.3.smithi120.stderr:2021-12-18T00:08:44.137+0000 ea6a700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper term env OPENSSL_ia32cap=~0x1000000000000000 valgrind --trace-children=no --child-silent-after-fork=yes --soname-synonyms=somalloc=*tcmalloc* --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/osd.3.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck ceph-osd -f --cluster ceph -i 3 (PID: 35068) UID: 0<br />2021-12-18T00:08:44.282 INFO:tasks.ceph.osd.2.smithi120.stderr:2021-12-18T00:08:44.136+0000 ee6a700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper term env OPENSSL_ia32cap=~0x1000000000000000 valgrind --trace-children=no --child-silent-after-fork=yes --soname-synonyms=somalloc=*tcmalloc* --num-callers=50 --suppressions=/home/ubuntu/cephtest/valgrind.supp --xml=yes --xml-file=/var/log/ceph/valgrind/osd.2.log --time-stamp=yes --vgdb=yes --exit-on-first-error=yes --error-exitcode=42 --tool=memcheck ceph-osd -f --cluster ceph -i 2 (PID: 35067) UID: 0<br />2021-12-18T00:08:44.284 INFO:tasks.ceph.osd.5.smithi151.stderr:*** Caught signal (Aborted) *<strong><br />2021-12-18T00:08:44.284 INFO:tasks.ceph.osd.5.smithi151.stderr: in thread 2de5d700 thread_name:tp_osd_tp<br />2021-12-18T00:08:44.418 INFO:tasks.ceph.osd.5.smithi151.stderr: ceph version 17.0.0-9707-ga73102d4 (a73102d4a8bb9378f707185ba2d1a9e105c3b138) quincy (dev)<br />2021-12-18T00:08:44.418 
INFO:tasks.ceph.osd.5.smithi151.stderr: 1: /lib64/libpthread.so.0(+0x12b20) [0x6825b20]<br />2021-12-18T00:08:44.419 INFO:tasks.ceph.osd.5.smithi151.stderr: 2: gsignal()<br />2021-12-18T00:08:44.419 INFO:tasks.ceph.osd.5.smithi151.stderr: 3: abort()<br />2021-12-18T00:08:44.419 INFO:tasks.ceph.osd.5.smithi151.stderr: 4: /lib64/libc.so.6(+0x21c89) [0x7a52c89]<br />2021-12-18T00:08:44.419 INFO:tasks.ceph.osd.5.smithi151.stderr: 5: /lib64/libc.so.6(+0x2fa76) [0x7a60a76]<br />2021-12-18T00:08:44.420 INFO:tasks.ceph.osd.5.smithi151.stderr: 6: (MOSDRepOp::encode_payload(unsigned long)+0x2d0) [0xbb0960]<br />2021-12-18T00:08:44.420 INFO:tasks.ceph.osd.5.smithi151.stderr: 7: (Message::encode(unsigned long, int, bool)+0x2e) [0x101dade]<br />2021-12-18T00:08:44.420 INFO:tasks.ceph.osd.5.smithi151.stderr: 8: (ProtocolV2::prepare_send_message(unsigned long, Message</strong>)+0x44) [0x12a3fb4]<br />2021-12-18T00:08:44.420 INFO:tasks.ceph.osd.5.smithi151.stderr: 9: (ProtocolV2::send_message(Message*)+0x3ae) [0x12a460e]<br />2021-12-18T00:08:44.421 INFO:tasks.ceph.osd.5.smithi151.stderr: 10: (AsyncConnection::send_message(Message*)+0x53e) [0x1280cbe]<br />2021-12-18T00:08:44.421 INFO:tasks.ceph.osd.5.smithi151.stderr: 11: (OSDService::send_message_osd_cluster(int, Message*, unsigned int)+0xf5) [0x7cf3c5]<br />2021-12-18T00:08:44.421 INFO:tasks.ceph.osd.5.smithi151.stderr: 12: (ReplicatedBackend::issue_op(hobject_t const&, eversion_t const&, unsigned long, osd_reqid_t, eversion_t, eversion_t, hobject_t, hobject_t, std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> > const&, std::optional<pg_hit_set_history_t>&, ReplicatedBackend::InProgressOp*, ceph::os::Transaction&)+0x557) [0xb99427]<br />2021-12-18T00:08:44.422 INFO:tasks.ceph.osd.5.smithi151.stderr: 13: (ReplicatedBackend::submit_transaction(hobject_t const&, object_stat_sum_t const&, eversion_t const&, std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, eversion_t const&, eversion_t const&, 
std::vector<pg_log_entry_t, std::allocator<pg_log_entry_t> >&&, std::optional<pg_hit_set_history_t>&, Context*, unsigned long, osd_reqid_t, boost::intrusive_ptr<OpRequest>)+0xa94) [0xb9bc24]<br />2021-12-18T00:08:44.422 INFO:tasks.ceph.osd.5.smithi151.stderr: 14: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, PrimaryLogPG::OpContext*)+0xc80) [0x8ed9a0]<br />2021-12-18T00:08:44.422 INFO:tasks.ceph.osd.5.smithi151.stderr: 15: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x1097) [0x94ff97]<br />2021-12-18T00:08:44.422 INFO:tasks.ceph.osd.5.smithi151.stderr: 16: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x39be) [0x95455e]<br />2021-12-18T00:08:44.423 INFO:tasks.ceph.osd.5.smithi151.stderr: 17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xe2e) [0x95b52e]<br />2021-12-18T00:08:44.423 INFO:tasks.ceph.osd.5.smithi151.stderr: 18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x332) [0x7de142]<br />2021-12-18T00:08:44.423 INFO:tasks.ceph.osd.5.smithi151.stderr: 19: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x6f) [0xa9fb8f]<br />2021-12-18T00:08:44.423 INFO:tasks.ceph.osd.5.smithi151.stderr: 20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xac8) [0x7fc088]<br />2021-12-18T00:08:44.423 INFO:tasks.ceph.osd.5.smithi151.stderr: 21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0xeefeb4]<br />2021-12-18T00:08:44.424 INFO:tasks.ceph.osd.5.smithi151.stderr: 22: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0xef1254]<br />2021-12-18T00:08:44.424 INFO:tasks.ceph.osd.5.smithi151.stderr: 23: /lib64/libpthread.so.0(+0x814a) [0x681b14a]<br />2021-12-18T00:08:44.424 INFO:tasks.ceph.osd.5.smithi151.stderr: 24: clone()</p> bluestore - Bug #53261 (Duplicate): pacific: OMAP upgrade to PER-PG format result in skipped 
firs...https://tracker.ceph.com/issues/532612021-11-13T10:39:33ZAdam Kupczyk
<p>This is a regression introduced by the fix to the omap upgrade: <a class="external" href="https://github.com/ceph/ceph/pull/43687">https://github.com/ceph/ceph/pull/43687</a><br />The problem is that we always skipped the first omap entry.<br />This works fine for objects that have an omap header key.<br />For objects without a header key, we skipped the first actual omap key.</p>
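The regression can be illustrated with a small Python sketch (the key layout and function names are hypothetical, not the actual BlueStore upgrade code): the buggy loop unconditionally skips the first key, which is correct only when that key is the omap header.

```python
HEADER = "-"  # stand-in for the omap header key, which sorts first when present

def upgrade_omap(keys):
    """Buggy per-PG omap upgrade: always skip the first entry,
    assuming it is the header."""
    return keys[1:]

def upgrade_omap_fixed(keys):
    """Skip the first entry only if it really is the header."""
    return keys[1:] if keys and keys[0] == HEADER else keys[:]

with_header = [HEADER, "k1", "k2"]
without_header = ["k1", "k2"]

print(upgrade_omap(with_header))           # ['k1', 'k2']  correct
print(upgrade_omap(without_header))        # ['k2']        k1 is lost
print(upgrade_omap_fixed(without_header))  # ['k1', 'k2']
```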