Ceph : Issues
https://tracker.ceph.com/
2024-03-29T12:54:10Z
Ceph
Redmine
RADOS - Bug #65227 (New): noscrub cluster flag prevents deep-scrubs from starting
https://tracker.ceph.com/issues/65227
2024-03-29T12:54:10Z
Wes Dillingham
wes@wesdillingham.com
<p>Observed on a 17.2.7 cluster and confirmed on an additional 17.2.7 cluster.</p>
<p>Reproduction steps:<br />- On a cluster that will reliably have scrubs in flight, set both the noscrub and nodeep-scrub flags.<br />- As expected, ongoing scrubs and deep-scrubs are cancelled immediately and PGs that were scrubbing stop doing so.<br />- Unset the nodeep-scrub flag only (leave noscrub set) - at this point I would expect deep scrubs to be allowed but shallow scrubs not to be. However, no deep-scrubs start.<br />- Unset the "noscrub" flag as well. Only at this point do deep-scrubs begin on the cluster.</p>
<p>Expected behavior:<br />"noscrub" flag prevents shallow (non-deep) scrubs from starting but does not control deep scrubs.</p>
CephFS - Bug #65225 (New): ceph_assert on dn->get_projected_linkage()->is_remote
https://tracker.ceph.com/issues/65225
2024-03-29T10:16:03Z
Abhishek Lekshmanan
abhishek.lekshmanan@gmail.com
<p>In a workload that heavily hardlinks and moves files, we see ceph-mds assert like the following:</p>
<pre>
16.2.9/src/mds/Locker.cc: In function 'bool Locker::acquire_locks(MDRequestRef&, Mutati> /var/log/ceph/ceph-mds.minilevinson-983db664a3.log-20240329.gz
ceph-mds[2693077]: /builddir/build/BUILD/ceph-16.2.9/src/mds/Locker.cc: 404: FAILED ceph_assert(dn->get_projected_linkage()->is_remot>
ceph-mds[2693077]: ceph version 16.2.9-2 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
ceph-mds[2693077]: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f40e912ed98]
ceph-mds[2693077]: 2: /usr/lib64/ceph/libceph-common.so.2(+0x276fb2) [0x7f40e912efb2]
ceph-mds[2693077]: 3: (Locker::acquire_locks(boost::intrusive_ptr<MDRequestImpl>&, MutationImpl::LockOpVec&, CInode*, bool)+0x131e) >
ceph-mds[2693077]: 4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xb43) [0x55eeadce4f73]
ceph-mds[2693077]: 5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xf1b) [0x55eeadd12d3b]
ceph-mds[2693077]: 6: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x3fc) [0x55eeadd1348c]
ceph-mds[2693077]: 7: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x12b) [0x55eeadd179fb]
ceph-mds[2693077]: 8: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0xbb4) [0x55eeadc6e224]
ceph-mds[2693077]: 9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7bb) [0x55eeadc70bdb]
ceph-mds[2693077]: 10: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x55) [0x55eeadc711d5]
ceph-mds[2693077]: 11: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x108) [0x55eeadc60dc8]
ceph-mds[2693077]: 12: (DispatchQueue::entry()+0x126a) [0x7f40e9374b5a]
ceph-mds[2693077]: 13: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f40e9426841]
ceph-mds[2693077]: 14: /lib64/libpthread.so.0(+0x81ca) [0x7f40e810f1ca]
</pre>
<p>The full logs look like the following<br /><pre>
-1 /builddir/build/BUILD/ceph-16.2.9/src/mds/Locker.cc: In function 'bool Locker::acquire_locks(MDRequestRef&, MutationImpl::LockOpV
ec&, CInode*, bool)' thread 7f40e06ef700 time 2024-03-28T19:20:43.701971+0100
/builddir/build/BUILD/ceph-16.2.9/src/mds/Locker.cc: 404: FAILED ceph_assert(dn->get_projected_linkage()->is_remote())
ceph version 16.2.9-2 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7f40e912ed98]
2: /usr/lib64/ceph/libceph-common.so.2(+0x276fb2) [0x7f40e912efb2]
3: (Locker::acquire_locks(boost::intrusive_ptr<MDRequestImpl>&, MutationImpl::LockOpVec&, CInode*, bool)+0x131e) [0x55eeade5669e]
4: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xb43) [0x55eeadce4f73]
5: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xf1b) [0x55eeadd12d3b]
6: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x3fc) [0x55eeadd1348c]
7: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x12b) [0x55eeadd179fb]
8: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0xbb4) [0x55eeadc6e224]
9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7bb) [0x55eeadc70bdb]
10: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x55) [0x55eeadc711d5]
11: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x108) [0x55eeadc60dc8]
12: (DispatchQueue::entry()+0x126a) [0x7f40e9374b5a]
13: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f40e9426841]
14: /lib64/libpthread.so.0(+0x81ca) [0x7f40e810f1ca]
15: clone()
0> 2024-03-28T19:20:43.709+0100 7f40e06ef700 -1 *** Caught signal (Aborted) **
in thread 7f40e06ef700 thread_name:ms_dispatch
ceph version 16.2.9-2 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
1: /lib64/libpthread.so.0(+0x12cf0) [0x7f40e8119cf0]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f40e912ede9]
5: /usr/lib64/ceph/libceph-common.so.2(+0x276fb2) [0x7f40e912efb2]
6: (Locker::acquire_locks(boost::intrusive_ptr<MDRequestImpl>&, MutationImpl::LockOpVec&, CInode*, bool)+0x131e) [0x55eeade5669e]
7: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0xb43) [0x55eeadce4f73]
8: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xf1b) [0x55eeadd12d3b]
9: (Server::handle_client_request(boost::intrusive_ptr<MClientRequest const> const&)+0x3fc) [0x55eeadd1348c]
10: (Server::dispatch(boost::intrusive_ptr<Message const> const&)+0x12b) [0x55eeadd179fb]
11: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0xbb4) [0x55eeadc6e224]
12: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7bb) [0x55eeadc70bdb]
13: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x55) [0x55eeadc711d5]
14: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x108) [0x55eeadc60dc8]
15: (DispatchQueue::entry()+0x126a) [0x7f40e9374b5a]
16: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f40e9426841]
17: /lib64/libpthread.so.0(+0x81ca) [0x7f40e810f1ca]
18: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre></p>
<p>Some interesting log snippets from before the assert:<br /><pre>
-434> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2024-03-28T19:18:03.483427+0100)
-433> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client log_queue is 6 last_log 1380 sent 1374 num 6 unsent 6 sending 6
-432> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701758+0100 mds.minilevinson-983db664a3 (mds.0) 1375 : cluster [WRN] 5 slow requests, 5
included below; oldest blocked for > 13.152129 secs
-431> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701765+0100 mds.minilevinson-983db664a3 (mds.0) 1376 : cluster [WRN] slow request 7.203
507 seconds old, received at 2024-03-28T19:18:25.497610+0100: client_request(mds.0:181210 rename #0x100026f338d/sh15637 #0x606/1000295a11a caller_uid=0, caller_gid=0{}) currently sub
mit entry: journal_and_reply
-430> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701771+0100 mds.minilevinson-983db664a3 (mds.0) 1377 : cluster [WRN] slow request 6.519
057 seconds old, received at 2024-03-28T19:18:26.182061+0100: client_request(mds.0:181239 rename #0x100026f338d/sh15811 #0x608/1000295a58f caller_uid=0, caller_gid=0{}) currently sub
mit entry: journal_and_reply
-429> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701774+0100 mds.minilevinson-983db664a3 (mds.0) 1378 : cluster [WRN] slow request 6.372
089 seconds old, received at 2024-03-28T19:18:26.329029+0100: client_request(mds.0:181300 rename #0x1000185ba3c/file117_hardmv_hard #0x603/10002567d82 caller_uid=0, caller_gid=0{}) c
urrently failed to wrlock, waiting
-428> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701776+0100 mds.minilevinson-983db664a3 (mds.0) 1379 : cluster [WRN] slow request 6.371
098 seconds old, received at 2024-03-28T19:18:26.330020+0100: client_request(mds.0:181320 rename #0x1000185ba3c/file373_hardmv_hard #0x608/1000256807c caller_uid=0, caller_gid=0{}) c
urrently failed to wrlock, waiting
-427> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 log_client will send 2024-03-28T19:18:32.701781+0100 mds.minilevinson-983db664a3 (mds.0) 1380 : cluster [WRN] slow request 5.565
985 seconds old, received at 2024-03-28T19:18:27.135132+0100: client_request(mds.0:181349 rename #0x100026f338d/sh15708 #0x606/1000295a111 caller_uid=0, caller_gid=0{}) currently fai
led to wrlock, waiting
-426> 2024-03-28T19:18:33.482+0100 7f40df6ed700 10 monclient: _send_mon_message to mon.minilevinson-f52cbe8096 at v2:188.184.88.255:3300/0
-425> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client handle_log_ack log(last 1380) v1
-424> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701758+0100 mds.minilevinson-983db664a3 (mds.0) 1375 : cluster [WRN] 5 slow requests, 5 in
cluded below; oldest blocked for > 13.152129 secs
-423> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701765+0100 mds.minilevinson-983db664a3 (mds.0) 1376 : cluster [WRN] slow request 7.203507
seconds old, received at 2024-03-28T19:18:25.497610+0100: client_request(mds.0:181210 rename #0x100026f338d/sh15637 #0x606/1000295a11a caller_uid=0, caller_gid=0{}) currently submit
entry: journal_and_reply
-422> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701771+0100 mds.minilevinson-983db664a3 (mds.0) 1377 : cluster [WRN] slow request 6.519057
seconds old, received at 2024-03-28T19:18:26.182061+0100: client_request(mds.0:181239 rename #0x100026f338d/sh15811 #0x608/1000295a58f caller_uid=0, caller_gid=0{}) currently submit
entry: journal_and_reply
-421> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701774+0100 mds.minilevinson-983db664a3 (mds.0) 1378 : cluster [WRN] slow request 6.372089
seconds old, received at 2024-03-28T19:18:26.329029+0100: client_request(mds.0:181300 rename #0x1000185ba3c/file117_hardmv_hard #0x603/10002567d82 caller_uid=0, caller_gid=0{}) curr
ently failed to wrlock, waiting
-420> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701776+0100 mds.minilevinson-983db664a3 (mds.0) 1379 : cluster [WRN] slow request 6.371098
seconds old, received at 2024-03-28T19:18:26.330020+0100: client_request(mds.0:181320 rename #0x1000185ba3c/file373_hardmv_hard #0x608/1000256807c caller_uid=0, caller_gid=0{}) curr
ently failed to wrlock, waiting
-419> 2024-03-28T19:18:33.765+0100 7f40e06ef700 10 log_client logged 2024-03-28T19:18:32.701781+0100 mds.minilevinson-983db664a3 (mds.0) 1380 : cluster [WRN] slow request 5.565985
seconds old, received at 2024-03-28T19:18:27.135132+0100: client_request(mds.0:181349 rename #0x100026f338d/sh15708 #0x606/1000295a111 caller_uid=0, caller_gid=0{}) currently failed
to wrlock, waiting
</pre></p>
<p>We've configured the mds option `mds_op_complaint_time` to 5s in order to track potential slow locks.</p>
CephFS - Bug #65224 (New): mds: fs subvolume rm fails
https://tracker.ceph.com/issues/65224
2024-03-29T09:52:40Z
Milind Changire
<p>`fs subvolume rm` fails when the subvolume dir is moved to a different dir; the following check in src/mds/Server.cc:Server::handle_client_rename rejects the rename:</p>
<pre>
if (src_realm != dest_realm &&
    src_realm->get_subvolume_ino() != dest_realm->get_subvolume_ino()) {
  respond_to_request(mdr, -CEPHFS_EXDEV);
  return;
}
</pre>
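For context, -EXDEV is the standard "cross-device link" errno that userspace tools handle by falling back to copy-and-delete. A minimal sketch of that client-side pattern (the helper <code>move_across_boundary</code> is hypothetical and not part of Ceph):

```python
import errno
import os
import shutil

def move_across_boundary(src: str, dst: str) -> None:
    """Rename src to dst, falling back to copy+delete when the
    filesystem rejects the rename with EXDEV (e.g. across
    subvolume or device boundaries)."""
    try:
        os.rename(src, dst)
    except OSError as e:
        if e.errno != errno.EXDEV:
            raise
        # shutil.move copies the file and removes the source when a
        # plain rename is not possible
        shutil.move(src, dst)
```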
CephFS - Backport #65223 (New): squid: cephfs-mirror: use snapdiff api for efficient tree traversal
https://tracker.ceph.com/issues/65223
2024-03-29T08:23:14Z
Backport Bot
CephFS - Backport #65222 (New): reef: cephfs-mirror: use snapdiff api for efficient tree traversal
https://tracker.ceph.com/issues/65222
2024-03-29T08:23:06Z
Backport Bot
Dashboard - Backport #65221 (New): reef: ceph-mixin: Add RBD Mirror monitoring alerts
https://tracker.ceph.com/issues/65221
2024-03-29T08:01:18Z
Backport Bot
Dashboard - Backport #65220 (New): squid: ceph-mixin: Add RBD Mirror monitoring alerts
https://tracker.ceph.com/issues/65220
2024-03-29T08:01:08Z
Backport Bot
Dashboard - Bug #65218 (New): mgr/dashboard: Grafana ceph-cluster.json doesn't support cluster label
https://tracker.ceph.com/issues/65218
2024-03-29T07:56:13Z
Ankush Behl
<a name="Description-of-problem"></a>
<h3 >Description of problem<a href="#Description-of-problem" class="wiki-anchor">¶</a></h3>
<p>Grafana ceph-cluster.json doesn't support cluster label.</p>
<p>There is no jsonnet source for the ceph-cluster.json file yet, so the fix adds the label directly to the JSON.</p>
<a name="Environment"></a>
<h3 >Environment<a href="#Environment" class="wiki-anchor">¶</a></h3>
<ul>
<li><code>ceph version</code> string:</li>
<li>Platform (OS/distro/release):</li>
<li>Cluster details (nodes, monitors, OSDs):</li>
<li>Did it happen on a stable environment or after a migration/upgrade?:</li>
<li>Browser used (e.g.: <code>Version 86.0.4240.198 (Official Build) (64-bit)</code>):</li>
</ul>
<a name="How-reproducible"></a>
<h3 >How reproducible<a href="#How-reproducible" class="wiki-anchor">¶</a></h3>
<p>Steps:</p>
<ol>
<li>Add multiple clusters via the Ceph dashboard.</li>
<li>Observe that the graphs break on the Grafana Ceph Cluster dashboard.</li>
</ol>
<a name="Actual-results"></a>
<h3 >Actual results<a href="#Actual-results" class="wiki-anchor">¶</a></h3>
<p>A single graph shows values from multiple clusters.</p>
<a name="Expected-results"></a>
<h3 >Expected results<a href="#Expected-results" class="wiki-anchor">¶</a></h3>
<p>Graphs should respect the cluster label and show data for the selected cluster only.</p>
rgw - Bug #65216 (New): rgw: only accept valid ipv4 from host header
https://tracker.ceph.com/issues/65216
2024-03-29T00:30:04Z
Seena Fallah
<p>Right now the IPv4 validation for the host header is based only on the number of periods, which leads to accepting invalid IPs.</p>
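A period count alone cannot distinguish an IPv4 address from a dotted hostname. A sketch contrasting the naive check with strict parsing (illustrative Python, not the rgw code; both function names are hypothetical):

```python
import ipaddress

def naive_is_ipv4(host: str) -> bool:
    # Roughly the kind of check described above: three periods == IPv4.
    return host.count(".") == 3

def strict_is_ipv4(host: str) -> bool:
    # Parse the string as a real IPv4 address instead of counting dots.
    try:
        ipaddress.IPv4Address(host)
        return True
    except ipaddress.AddressValueError:
        return False

# The naive check accepts strings that are not valid IPv4 addresses:
assert naive_is_ipv4("bucket.s3.example.com") is True   # a hostname, not an IP
assert strict_is_ipv4("bucket.s3.example.com") is False
assert strict_is_ipv4("192.168.0.1") is True
```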
Dashboard - Backport #65211 (New): reef: mgr/dashboard: Mark placement targets as non-required
https://tracker.ceph.com/issues/65211
2024-03-28T16:05:55Z
Backport Bot
Dashboard - Backport #65210 (New): squid: mgr/dashboard: Mark placement targets as non-required
https://tracker.ceph.com/issues/65210
2024-03-28T16:05:48Z
Backport Bot
Dashboard - Backport #65209 (New): reef: mgr/dashboard: Align security fieldset and tag fieldset ...
https://tracker.ceph.com/issues/65209
2024-03-28T16:05:40Z
Backport Bot
Dashboard - Backport #65208 (New): squid: mgr/dashboard: Align security fieldset and tag fieldset...
https://tracker.ceph.com/issues/65208
2024-03-28T16:05:32Z
Backport Bot
Dashboard - Cleanup #65207 (New): mgr/dashboard: Move features to advanced section in create imag...
https://tracker.ceph.com/issues/65207
2024-03-28T15:43:36Z
Afreen Misbah
<a name="Move-features-to-advanced-section-in-create-image-form"></a>
<h3 >Move features to advanced section in create image form<a href="#Move-features-to-advanced-section-in-create-image-form" class="wiki-anchor">¶</a></h3>
<p>A followup from the comment <a class="external" href="https://github.com/ceph/ceph/pull/56514#issuecomment-2022426715">https://github.com/ceph/ceph/pull/56514#issuecomment-2022426715</a></p>
crimson - Bug #65203 (New): ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, crimso...
https://tracker.ceph.com/issues/65203
2024-03-28T15:00:23Z
Matan Breizman
<p>osd.3: <a class="external" href="https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626294">https://pulpito.ceph.com/matan-2024-03-27_13:02:57-crimson-rados-main-distro-crimson-smithi/7626294</a></p>
<p>After adding a restart OSDs to the thrash tests: <a class="external" href="https://github.com/ceph/ceph/pull/56511">https://github.com/ceph/ceph/pull/56511</a></p>
<pre><code class="text syntaxhl"><span class="CodeRay">DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): starting start_pg_operation
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation in await_active stage
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation active, entering await_map
DEBUG 2024-03-27 13:26:06,805 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): start_pg_operation await_map stage
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): got map 26, entering get_pg_mapping
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): can_create=false, target-core=2
DEBUG 2024-03-27 13:26:06,806 [shard 0:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): send 37 to the remote pg core 2
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): entering create_or_wait_pg
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))})): have_pg
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - 0x603000429b00 RecoverySubRequest::with_pg: RecoverySubRequest::with_pg: background_recovery_sub(id=362, detail=MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))}))
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - handle_pull_response: MOSDPGPush(3.d 26/25 {PushOp(3:bd1211d5:::smithi05531420-40:1, version: 18'16, data_included: [655473~716476,2099033~332100], data_size: 1048576, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false), after_progress: ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false), before_progress: ObjectRecoveryProgress(first, data_recovered_to: 0, data_complete: false, omap_recovered_to: , omap_complete: false, error: false))}) v4
DEBUG 2024-03-27 13:26:06,806 [shard 2:main] osd - handle_pull_response ObjectRecoveryInfo(3:bd1211d5:::smithi05531420-40:1@0'0, size: 2655473, copy_subset: [(0, 2655473)], clone_subset: {}, snapset: 1=[]:{1: [1]}, object_exist: false) ObjectRecoveryProgress(!first, data_recovered_to: 2431133, data_complete: false, omap_recovered_to: , omap_complete: true, error: false) data.size() is 1048576 data_included: [(655473, 716476), (2099033, 332100)]
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::with_head_obc: object 3:bd1211d5:::smithi05531420-40:head
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::get_or_load_obc: cache hit on 3:bd1211d5:::smithi05531420-40:head
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - resolve_oid oid.snap=1,head snapset.seq=1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::get_or_load_obc: cache miss on 3:bd1211d5:::smithi05531420-40:1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - load_metadata: object 3:bd1211d5:::smithi05531420-40:1 doesn't exist, returning empty metadata
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::load_obc: loaded obs 3:bd1211d5:::smithi05531420-40:1(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) for 3:bd1211d5:::smithi05531420-40:1
DEBUG 2024-03-27 13:26:06,807 [shard 2:main] osd - pg_epoch 26 pg[3.d( v 26'20 lc 17'15 (0'0,26'20] local-lis/les=25/26 n=0 ec=14/14 lis/c=25/14 les/c/f=26/15/0 sis=25) [3,0] r=0 lpr=25 pi=[14,25)/1 luod=26'21 lua=21'18 crt=26'21 mlcod 17'15 active+recovering+undersized+degraded ObjectContextLoader::load_obc: returning obc 3:bd1211d5:::smithi05531420-40:1(0'0 unknown.0.0:0 s 0 uv 0 alloc_hint [0 0 0]) for 3:bd1211d5:::smithi05531420-40:1
ceph-osd: /home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-2476-g56e21662/rpm/el9/BUILD/ceph-19.0.0-2476-g56e21662/src/crimson/osd/replicated_recovery_backend.cc:886: void ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, crimson::osd::SnapSetContextRef): Assertion `ssc' failed.
Aborting on shard 2.
Backtrace:
0# 0x00007F182BAA154C in /lib64/libc.so.6
1# raise in /lib64/libc.so.6
2# abort in /lib64/libc.so.6
3# 0x00007F182BA2871B in /lib64/libc.so.6
4# 0x00007F182BA4DCA6 in /lib64/libc.so.6
5# ReplicatedRecoveryBackend::recalc_subsets(ObjectRecoveryInfo&, boost::intrusive_ptr<crimson::osd::SnapSetContext>) in ceph-osd
</span></code></pre>