https://tracker.ceph.com/
https://tracker.ceph.com/favicon.ico
2021-04-08T02:24:28Z
Ceph
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=189748
2021-04-08T02:24:28Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/189748/diff?detail_id=196549">diff</a>)</li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=191933
2021-04-16T02:55:51Z
Patrick Donnelly
pdonnell@redhat.com
<ul></ul><p><a class="external" href="https://pulpito.ceph.com/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/6047582/">https://pulpito.ceph.com/pdonnell-2021-04-15_01:35:57-fs-wip-pdonnell-testing-20210414.230315-distro-basic-smithi/6047582/</a></p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=199286
2021-07-16T17:11:06Z
Patrick Donnelly
pdonnell@redhat.com
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-4 priority-default closed" href="/issues/51706">Bug #51706</a>: pacific: qa: osd deep-scrub stat mismatch</i> added</li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=210562
2022-02-15T19:15:14Z
Laura Flores
<ul></ul><p>Looks similar, but different test.</p>
<p>/a/yuriw-2022-02-09_22:52:18-rados-wip-yuri5-testing-2022-02-09-1322-pacific-distro-default-smithi/6672171</p>
<p>Description: rados/upgrade/nautilus-x-singleton/{0-cluster/{openstack start} 1-install/nautilus 2-partial-upgrade/firsthalf 3-thrash/default 4-workload/{rbd-cls rbd-import-export readwrite snaps-few-objects} 5-workload/{radosbench rbd_api} 6-finish-upgrade 7-pacific 8-workload/{rbd-python snaps-many-objects} bluestore-bitmap mon_election/connectivity thrashosds-health ubuntu_18.04}</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=216746
2022-05-26T22:33:17Z
Laura Flores
<ul></ul><p>/a/yuriw-2022-05-13_14:13:55-rados-wip-yuri3-testing-2022-05-12-1609-octopus-distro-default-smithi/6832544</p>
<p>Description: rados/thrash-erasure-code-big/{ceph cluster/{12-osds openstack} msgr-failures/osd-dispatch-delay objectstore/bluestore-comp-zstd rados recovery-overrides/{default} supported-random-distro$/{ubuntu_latest} thrashers/careful thrashosds-health workloads/ec-rados-plugin=jerasure-k=4-m=2}</p>
<p>teuthology.log - Health check reports an inconsistent pg:<br /><pre><code class="text syntaxhl"><span class="CodeRay">2022-05-14T08:13:30.891 INFO:tasks.rados.rados.0.smithi081.stdout:2767: got expected ENOENT (src dne)
2022-05-14T08:13:30.892 INFO:tasks.ceph.mon.a.smithi081.stderr:2022-05-14T08:13:30.884+0000 7fa6e9dab700 -1 log_channel(cluster) log [ERR] : Health check failed: 1 scrub errors (OSD_SCRUB_ERRORS)
2022-05-14T08:13:30.892 INFO:tasks.ceph.mon.a.smithi081.stderr:2022-05-14T08:13:30.884+0000 7fa6e9dab700 -1 log_channel(cluster) log [ERR] : Health check failed: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
</span></code></pre></p>
<p>The affected pg in this instance is 3.1e. Here is the point in ceph-osd.3.log.gz where it first becomes inconsistent. We can see that this occurs during deep scrubbing:<br /><pre><code class="text syntaxhl"><span class="CodeRay">2022-05-14T08:13:24.228+0000 7fa5d70fd700 10 osd.3 pg_epoch: 621 pg[3.1es0( v 620'1485 (0'0,620'1485] local-lis/les=582/583 n=7 ec=246/26 lis/c=582/582 les/c/f=583/587/0 sis=582) [3,9,8,2,7,6]p3(0) r=0 lpr=582 crt=620'1485 lcod 619'1483 mlcod 619'1483 active+clean+scrubbing+deep trimq=[1ff~1] ps=[188~1,1d0~1,1d7~2,1db~1,1e5~2,1eb~1,1ed~4,1f4~2,1fa~4]] deep-scrub got 7/7 objects, 3/3 clones, 7/7 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 11272192/11845632 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes.
2022-05-14T08:13:24.228+0000 7fa5d70fd700 10 osd.3 pg_epoch: 621 pg[3.1es0( v 620'1485 (0'0,620'1485] local-lis/les=582/583 n=7 ec=246/26 lis/c=582/582 les/c/f=583/587/0 sis=582) [3,9,8,2,7,6]p3(0) r=0 lpr=582 crt=620'1485 lcod 619'1483 mlcod 619'1483 active+clean+scrubbing+deep trimq=[1ff~1] ps=[188~1,1d0~1,1d7~2,1db~1,1e5~2,1eb~1,1ed~4,1f4~2,1fa~4]] scrub_process_inconsistent: checking authoritative
2022-05-14T08:13:24.228+0000 7fa5d70fd700 20 osd.3 pg_epoch: 621 pg[3.1es0( v 620'1485 (0'0,620'1485] local-lis/les=582/583 n=7 ec=246/26 lis/c=582/582 les/c/f=583/587/0 sis=582) [3,9,8,2,7,6]p3(0) r=0 lpr=582 crt=620'1485 lcod 619'1483 mlcod 619'1483 active+clean+scrubbing+deep+inconsistent trimq=[1ff~1] ps=[188~1,1d0~1,1d7~2,1db~1,1e5~2,1eb~1,1ed~4,1f4~2,1fa~4]] prepare_stats_for_publish reporting purged_snaps [188~1,1d0~1,1d7~2,1db~1,1e5~2,1eb~1,1ed~4,1f4~2,1fa~4]
2022-05-14T08:13:24.228+0000 7fa5d70fd700 15 osd.3 pg_epoch: 621 pg[3.1es0( v 620'1485 (0'0,620'1485] local-lis/les=582/583 n=7 ec=246/26 lis/c=582/582 les/c/f=583/587/0 sis=582) [3,9,8,2,7,6]p3(0) r=0 lpr=582 crt=620'1485 lcod 619'1483 mlcod 619'1483 active+clean+scrubbing+deep+inconsistent trimq=[1ff~1] ps=[188~1,1d0~1,1d7~2,1db~1,1e5~2,1eb~1,1ed~4,1f4~2,1fa~4]] publish_stats_to_osd 620:3621
</span></code></pre></p>
<p>ceph.log.gz also shows that the "stat mismatch" first occurs during deep scrub on pg 3.1e, which causes the pg to enter an inconsistent state:<br /><pre><code class="text syntaxhl"><span class="CodeRay">2022-05-14T08:13:23.220943+0000 osd.3 (osd.3) 50 : cluster [DBG] 3.1e deep-scrub starts
2022-05-14T08:13:23.850823+0000 mon.a (mon.0) 2182 : cluster [WRN] Health check failed: noscrub flag(s) set (OSDMAP_FLAGS)
2022-05-14T08:13:23.865188+0000 mon.a (mon.0) 2185 : cluster [DBG] osdmap e621: 12 total, 12 up, 12 in
2022-05-14T08:13:24.232155+0000 osd.3 (osd.3) 51 : cluster [ERR] 3.1es0 deep-scrub : stat mismatch, got 7/7 objects, 3/3 clones, 7/7 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 whiteouts, 11272192/11845632 bytes, 0/0 manifest objects, 0/0 hit_set_archive bytes.
2022-05-14T08:13:24.232175+0000 osd.3 (osd.3) 52 : cluster [ERR] 3.1e deep-scrub 1 errors
2022-05-14T08:13:24.461896+0000 mgr.x (mgr.4108) 388 : cluster [DBG] pgmap v936: 65 pgs: 1 recovery_wait+undersized+degraded+peered, 1 active+clean+scrubbing+deep, 1 active+clean+scrubbing, 1 active+remapped+backfill_toofull, 61 active+clean; 187 MiB data, 360 MiB used, 1.0 TiB / 1.1 TiB avail; 0 B/s wr, 5 op/s; 225/456 objects degraded (49.342%); 3/456 objects misplaced (0.658%); 0 B/s, 0 keys/s, 0 objects/s recovering
2022-05-14T08:13:24.860234+0000 mon.a (mon.0) 2187 : cluster [WRN] Health check failed: Degraded data redundancy: 225/456 objects degraded (49.342%), 1 pg degraded (PG_DEGRADED)
2022-05-14T08:13:24.872905+0000 mon.a (mon.0) 2189 : cluster [DBG] osdmap e622: 12 total, 12 up, 12 in
2022-05-14T08:13:26.462690+0000 mgr.x (mgr.4108) 392 : cluster [DBG] pgmap v938: 65 pgs: 1 recovery_wait+undersized+degraded+peered, 1 active+clean+scrubbing+deep, 1 active+clean+scrubbing, 1 active+remapped+backfill_toofull, 61 active+clean; 158 MiB data, 331 MiB used, 1.0 TiB / 1.1 TiB avail; 0 B/s wr, 4 op/s; 225/408 objects degraded (55.147%); 4/408 objects misplaced (0.980%); 0 B/s, 0 keys/s, 0 objects/s recovering
2022-05-14T08:13:26.720695+0000 mon.a (mon.0) 2194 : cluster [DBG] osdmap e623: 12 total, 12 up, 12 in
2022-05-14T08:13:27.876665+0000 mon.a (mon.0) 2195 : cluster [DBG] osdmap e624: 12 total, 12 up, 12 in
2022-05-14T08:13:28.463582+0000 mgr.x (mgr.4108) 393 : cluster [DBG] pgmap v941: 65 pgs: 1 recovering+undersized+degraded+peered, 1 active+clean+snaptrim, 1 active+clean+scrubbing+deep, 1 active+remapped+backfill_toofull, 61 active+clean; 158 MiB data, 335 MiB used, 1.0 TiB / 1.1 TiB avail; 2.3 MiB/s rd, 10 KiB/s wr, 3 op/s; 4/408 objects degraded (0.980%); 29/408 objects misplaced (7.108%); 0 B/s, 0 keys/s, 5 objects/s recovering
2022-05-14T08:13:28.882108+0000 mon.a (mon.0) 2197 : cluster [DBG] osdmap e625: 12 total, 12 up, 12 in
2022-05-14T08:13:29.878578+0000 mon.a (mon.0) 2199 : cluster [WRN] Health check update: nodeep-scrub flag(s) set (OSDMAP_FLAGS)
2022-05-14T08:13:29.885488+0000 mon.a (mon.0) 2201 : cluster [DBG] osdmap e626: 12 total, 12 up, 12 in
2022-05-14T08:13:30.464355+0000 mgr.x (mgr.4108) 396 : cluster [DBG] pgmap v944: 65 pgs: 1 active+clean+inconsistent+snaptrim, 1 active+clean+snaptrim_wait, 1 recovering+undersized+degraded+peered, 2 active+clean+snaptrim, 1 active+remapped+backfill_toofull, 59 active+clean; 161 MiB data, 333 MiB used, 1.0 TiB / 1.1 TiB avail; 4.3 MiB/s rd, 2.5 MiB/s wr, 12 op/s; 4/420 objects degraded (0.952%); 29/420 objects misplaced (6.905%); 0 B/s, 5 objects/s recovering
2022-05-14T08:13:30.709741+0000 mon.a (mon.0) 2203 : cluster [WRN] Health check update: Degraded data redundancy: 4/408 objects degraded (0.980%), 1 pg degraded (PG_DEGRADED)
2022-05-14T08:13:30.889499+0000 mon.a (mon.0) 2204 : cluster [ERR] Health check failed: 1 scrub errors (OSD_SCRUB_ERRORS)
2022-05-14T08:13:30.889554+0000 mon.a (mon.0) 2205 : cluster [ERR] Health check failed: Possible data damage: 1 pg inconsistent (PG_DAMAGED)
</span></code></pre></p>
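<p>For reference when triaging runs like this, an inconsistent PG can be inspected (and, if appropriate, repaired) from a node with admin access to the live cluster. A minimal sketch, assuming the affected PG is 3.1e as above; note that for a pure stat mismatch the per-object inconsistency list may be empty, since the error lives in the PG stats rather than on any object:</p>

```sh
# Confirm which PG(s) the health error refers to.
ceph health detail | grep inconsistent

# Dump the scrub findings for the affected PG (may be empty
# for a stat mismatch, which is recorded at the PG level).
rados list-inconsistent-obj 3.1e --format=json-pretty

# Ask the primary to repair the PG; the inconsistent flag clears
# once the repair (a deep scrub with repair enabled) completes.
ceph pg repair 3.1e
```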
<p>Later in ceph.log.gz, all pgs eventually become active+clean, but osd.0 is reported to fail:<br /><pre><code class="text syntaxhl"><span class="CodeRay">2022-05-14T08:19:28.611362+0000 mgr.x (mgr.4108) 653 : cluster [DBG] pgmap v1418: 59 pgs: 59 active+clean; 0 B data, 129 MiB used, 1.0 TiB / 1.1 TiB avail
2022-05-14T08:19:29.397740+0000 mon.a (mon.0) 3183 : cluster [DBG] osd.0 reported immediately failed by osd.3
2022-05-14T08:19:29.397832+0000 mon.a (mon.0) 3184 : cluster [INF] osd.0 failed (root=default,host=smithi081) (connection refused reported by osd.3)
</span></code></pre></p>
<p>The last pgmap recorded in teuthology.log shows that pg 3.13 is "active+clean+inconsistent":<br /><pre><code class="json syntaxhl">{
    "pgs_by_state": [
        {
            "state_name": "active+clean",
            "count": 81
        },
        {
            "state_name": "active+remapped+backfill_toofull",
            "count": 2
        },
        {
            "state_name": "peering",
            "count": 1
        },
        {
            "state_name": "active+clean+inconsistent",
            "count": 1
        }
    ],
    "num_pgs": 85,
    "num_pools": 3,
    "num_objects": 201,
    "data_bytes": 407617536,
    "bytes_used": 10346008576,
    "bytes_avail": 859384868864,
    "bytes_total": 869730877440,
    "inactive_pgs_ratio": 0.0117647061124444,
    "misplaced_objects": 37,
    "misplaced_total": 1190,
    "misplaced_ratio": 0.031092436974789917,
    "recovering_objects_per_sec": 1,
    "recovering_bytes_per_sec": 4470879,
    "recovering_keys_per_sec": 0,
    "num_objects_recovered": 7,
    "num_bytes_recovered": 22675456,
    "num_keys_recovered": 0,
    "read_bytes_sec": 3431497,
    "write_bytes_sec": 0,
    "read_op_per_sec": 2,
    "write_op_per_sec": 4
}
</code></pre></p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=216747
2022-05-26T22:36:27Z
Laura Flores
<ul><li><strong>Backport</strong> set to <i>quincy,pacific,octopus</i></li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=217974
2022-06-13T22:26:41Z
Laura Flores
<ul></ul><p>/a/yuriw-2022-06-07_19:48:58-rados-wip-yuri6-testing-2022-06-07-0955-pacific-distro-default-smithi/6866688</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=217977
2022-06-13T22:27:58Z
Laura Flores
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-10 priority-3 priority-lowest closed" href="/issues/52737">Bug #52737</a>: osd/tests: stat mismatch </i> added</li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=218804
2022-06-23T20:44:00Z
Laura Flores
<ul><li><strong>Related to</strong> deleted (<i><a class="issue tracker-1 status-10 priority-3 priority-lowest closed" href="/issues/52737">Bug #52737</a>: osd/tests: stat mismatch </i>)</li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=218806
2022-06-23T20:44:10Z
Laura Flores
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-3 priority-lowest closed" href="/issues/52737">Bug #52737</a>: osd/tests: stat mismatch </i> added</li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=218807
2022-06-23T20:44:49Z
Laura Flores
<ul></ul><p>/a/yuriw-2022-06-14_20:42:00-rados-wip-yuri2-testing-2022-06-14-0949-octopus-distro-default-smithi/6878271</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=219310
2022-06-29T18:29:40Z
Radoslaw Zarzynski
rzarzyns@redhat.com
<ul><li><strong>Assignee</strong> set to <i>Laura Flores</i></li></ul><p>Not a terribly high priority.</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=220763
2022-07-19T16:11:19Z
Rishabh Dave
<ul></ul><p>This error showed up in QA runs -<br /><a class="external" href="http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6921153">http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6921153</a><br /><a class="external" href="http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6921085">http://pulpito.front.sepia.ceph.com/rishabh-2022-07-08_23:53:34-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6921085</a></p>
<p>Showed up in re-run as well -<br /><a class="external" href="http://pulpito.front.sepia.ceph.com/rishabh-2022-07-15_06:42:04-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6931942">http://pulpito.front.sepia.ceph.com/rishabh-2022-07-15_06:42:04-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6931942</a><br /><a class="external" href="http://pulpito.front.sepia.ceph.com/rishabh-2022-07-15_06:42:04-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6931950">http://pulpito.front.sepia.ceph.com/rishabh-2022-07-15_06:42:04-fs-wip-rishabh-testing-2022Jul08-1820-testing-default-smithi/6931950</a></p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=220765
2022-07-19T16:24:13Z
Laura Flores
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In Progress</i></li></ul><p>Thanks Rishabh, I am having a look into this.</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=221314
2022-07-26T15:14:57Z
Laura Flores
<ul></ul><p>All the tests that this has failed on involve thrashing. Specifically, they all use thrashosds-health.yaml (<a class="external" href="https://github.com/ceph/ceph/blob/main/qa/tasks/thrashosds-health.yaml">https://github.com/ceph/ceph/blob/main/qa/tasks/thrashosds-health.yaml</a>).</p>
<p>The tests are failing due to a shallow scrub error:<br /><pre><code class="text syntaxhl"><span class="CodeRay">2022-07-15T10:38:05.497+0000 7f2cd10f3700 10 osd.6 pg_epoch: 4260 pg[5.11s0( v 4101'12884 (3893'7859,4101'12884] local-lis/les=4141/4142 n=77 ec=25/25 lis/c=4141/4141 les/c/f=4142/4142/0 sis=4141) [6,4,7,1]p6(0) r=0 lpr=4141 crt=4101'12884 lcod 4101'12883 mlcod 0'0 active+clean+scrubbing+deep [ 5.11s0: REQ_SCRUB ] ] scrubber <WaitDigestUpdate/>: m_pg->recovery_state.update_stats() errors:1/0 deep? 1
</span></code></pre></p>
<p>What I'm noticing is that we do not have auto-repair enabled during these tests:<br /><pre><code class="text syntaxhl"><span class="CodeRay">4 lcod 4101'12883 mlcod 0'0 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] verify_scrub_mode pg: 5.11s0 allow: 1/1 deep errs: 0 auto-repair: 0 (0)
</span></code></pre></p>
<p>A solution we can try is enabling auto-repair by setting `osd_scrub_auto_repair=true` in thrashosds-health.yaml. This config option automatically repairs up to 5 scrub errors found by scrubs or deep-scrubs (see <a class="external" href="https://github.com/ceph/ceph/blob/f5857acd68e2349dc10207cd4110b84225a3ea42/src/common/options/osd.yaml.in#L366-L384">https://github.com/ceph/ceph/blob/f5857acd68e2349dc10207cd4110b84225a3ea42/src/common/options/osd.yaml.in#L366-L384</a>).</p>
<p>I am currently running some tests to reproduce the issue, and then I will try enabling auto-repair to see if this helps.</p>
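<p>For concreteness, the change being proposed would be a small override in thrashosds-health.yaml. A sketch, assuming the teuthology overrides convention already used in that file (not yet tested; per the later comments, it is unclear this helps with deep-scrub errors):</p>

```yaml
# Hypothetical addition to qa/tasks/thrashosds-health.yaml:
# have OSDs automatically repair scrub errors found during
# scrub/deep-scrub, up to osd_scrub_auto_repair_num_errors
# (default 5) errors per PG.
overrides:
  ceph:
    conf:
      osd:
        osd scrub auto repair: 'true'
```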
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=226622
2022-09-29T19:31:13Z
Kamoltat (Junior) Sirivadhna
<ul></ul><p>yuriw-2022-09-27_23:37:28-rados-wip-yuri2-testing-2022-09-27-1455-distro-default-smithi/7046253</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=227058
2022-10-05T18:46:18Z
Radoslaw Zarzynski
rzarzyns@redhat.com
<ul></ul><p>Hi Laura. Any luck with verifying the hypothesis from comment #17?</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=235750
2023-04-28T21:58:36Z
Laura Flores
<ul><li><strong>Tags</strong> set to <i>test-failure</i></li></ul><p>Radoslaw Zarzynski wrote:</p>
<blockquote>
<p>Hi Laura. Any luck with verifying the hypothesis from comment #17?</p>
</blockquote>
<p>I ran this solution by Ronen, but he said auto-repair wouldn't help with deep scrub errors. I will revisit this.</p>
<p>/a/yuriw-2023-04-25_14:15:40-rados-pacific-release-distro-default-smithi/7251528<br />/a/yuriw-2023-04-26_01:16:19-rados-wip-yuri11-testing-2023-04-25-1605-pacific-distro-default-smithi/7253902<br />/a/yuriw-2023-05-10_14:47:51-rados-wip-yuri5-testing-2023-05-09-1324-pacific-distro-default-smithi/7269830<br />/a/yuriw-2023-06-06_18:42:04-rados-wip-yuri8-testing-2023-06-06-0830-reef-distro-default-smithi/7297252</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=240330
2023-06-07T07:02:19Z
Milind Changire
<ul></ul><p>quincy:<br /><a class="external" href="http://pulpito.front.sepia.ceph.com/yuriw-2023-05-31_21:56:15-fs-wip-yuri6-testing-2023-05-31-0933-quincy-distro-default-smithi/7292558">http://pulpito.front.sepia.ceph.com/yuriw-2023-05-31_21:56:15-fs-wip-yuri6-testing-2023-05-31-0933-quincy-distro-default-smithi/7292558</a></p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=240533
2023-06-12T17:14:27Z
Radoslaw Zarzynski
rzarzyns@redhat.com
<ul><li><strong>Assignee</strong> changed from <i>Laura Flores</i> to <i>Ronen Friedman</i></li></ul>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=243138
2023-07-28T16:01:49Z
Laura Flores
<ul></ul><p>/a/yuriw-2023-07-19_14:33:14-rados-wip-yuri11-testing-2023-07-18-0927-pacific-distro-default-smithi/7343461</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=243307
2023-07-31T18:06:07Z
Radoslaw Zarzynski
rzarzyns@redhat.com
<ul></ul><p>bump up.</p>
RADOS - Bug #50222: osd: 5.2s0 deep-scrub : stat mismatch
https://tracker.ceph.com/issues/50222?journal_id=245509
2023-09-01T18:55:13Z
Patrick Donnelly
pdonnell@redhat.com
<ul></ul><p>/teuthology/pdonnell-2023-08-31_15:31:51-fs-wip-batrick-testing-20230831.124848-pacific-distro-default-smithi/7385689/teuthology.log</p>