Ceph RADOS - Bug #23646: scrub interaction with HEAD boundaries and clones is broken (https://tracker.ceph.com/issues/23646)

Updated by Sage Weil (sage@newdream.net) on 2018-04-10T21:09:10Z (journal_id=110832)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-6 priority-high2 closed" href="/issues/22881">Bug #22881</a>: scrub interaction with HEAD boundaries and snapmapper repair is broken</i> added</li></ul>

Updated by Sage Weil (sage@newdream.net) on 2018-04-10T21:09:20Z (journal_id=110833)
<ul><li><strong>Backport</strong> set to <i>luminous</i></li></ul>

Updated by David Zafman (dzafman@redhat.com) on 2018-04-20T16:45:48Z (journal_id=111600)
<p>We don't start trimming if scrubbing is happening, so maybe the only hole is that scrubbing doesn't check for trimming.</p>
<pre>
bool can_trim() {
  return
    pg->is_clean() &&
    !pg->scrubber.active &&
    !pg->snap_trimq.empty() &&
    !pg->get_osdmap()->test_flag(CEPH_OSDMAP_NOSNAPTRIM);
}
</pre>

Updated by David Zafman (dzafman@redhat.com) on 2018-04-20T17:19:05Z (journal_id=111603)
<p>I guess the intention is that scrubbing takes priority and proceeds even if trimming is in progress. Before doing more trim work, the trimmer stops if scrubbing is active. So the only way we could have a problem here is that scrub might proceed with the most recent trims still in flight? No more trim work will be queued once scrubber.active is set in chunky_scrub().</p>

Updated by David Zafman (dzafman@redhat.com) on 2018-04-23T21:40:27Z (journal_id=111675)
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>In Progress</i></li></ul><p>The osd log for primary osd.1 shows that pg 3.0 is a cache pool in a cache tiering configuration. The "_delete_oid has or will have clones but no_whiteout=1" diagnostic could indicate a problem, because we shouldn't evict a head object while any of its clones are still in the cache pool.</p>
<pre>
2018-04-10 02:31:40.048066 7f5c99570700 10 osd.1 pg_epoch: 506 pg[3.0( v 506'2158 (177'603,506'2158] local-lis/les=253/254 n=49 ec=20/20 lis/c 253/253 les/c/f 254/258/0 253/253/20) [1,2,6] r=0 lpr=253 luod=505'2154 lua=506'2155 crt=506'2158 lcod 504'2153 mlcod 504'2153 active+clean+scrubbing] agent_maybe_evict evicting 3:2525d12f:::smithi03315943-21 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head(502'2141 osd.1.0:3764 data_digest|omap_digest s 3949843 uv 1651 dd 51da6cc7 od ffffffff alloc_hint [0 0 0])
2018-04-10 02:31:40.048071 7f5c99570700 20 osd.1 pg_epoch: 506 pg[3.0( v 506'2158 (177'603,506'2158] local-lis/les=253/254 n=49 ec=20/20 lis/c 253/253 les/c/f 254/258/0 253/253/20) [1,2,6] r=0 lpr=253 luod=505'2154 lua=506'2155 crt=506'2158 lcod 504'2153 mlcod 504'2153 active+clean+scrubbing] simple_opc_create 3:2525d12f:::smithi03315943-21 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head
2018-04-10 02:31:40.048082 7f5c99570700 20 osd.1 pg_epoch: 506 pg[3.0( v 506'2158 (177'603,506'2158] local-lis/les=253/254 n=49 ec=20/20 lis/c 253/253 les/c/f 254/258/0 253/253/20) [1,2,6] r=0 lpr=253 luod=505'2154 lua=506'2155 crt=506'2158 lcod 504'2153 mlcod 504'2153 active+clean+scrubbing] _delete_oid has or will have clones but no_whiteout=1
2018-04-10 02:31:40.048087 7f5c99570700 20 osd.1 pg_epoch: 506 pg[3.0( v 506'2158 (177'603,506'2158] local-lis/les=253/254 n=49 ec=20/20 lis/c 253/253 les/c/f 254/258/0 253/253/20) [1,2,6] r=0 lpr=253 luod=505'2154 lua=506'2155 crt=506'2158 lcod 504'2153 mlcod 504'2153 active+clean+scrubbing] _delete_oid 3:2525d12f:::smithi03315943-21 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head whiteout=0 no_whiteout=1 try_no_whiteout=0
2018-04-10 02:31:40.050885 7f5c99570700 10 bluestore(/var/lib/ceph/osd/ceph-1) _remove 3.0_head #3:2525d12f:::smithi03315943-21 oooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo:head# = 0
</pre>

Updated by David Zafman (dzafman@redhat.com) on 2018-04-24T00:42:11Z (journal_id=111677)
<p>The commit below adds code to honor the no_whiteout flag even when it looks like clones exist or will exist soon. agent_maybe_evict() used _verify_no_head_clones() to check whether clones still exist.</p>
<pre>
if (!snapset.clones.empty() ||
    (!ctx->snapc.snaps.empty() && ctx->snapc.snaps[0] > snapset.seq)) {
  if (no_whiteout) {
    dout(20) << __func__ << " has or will have clones but no_whiteout=1"
             << dendl;
</pre>
<pre>
commit de6f09f43a5213b16a9bc952e5fa092c991abb40
Author: Sage Weil <sage@redhat.com>
Date:   Fri Apr 7 13:18:50 2017 -0400

    osd/PrimaryLogPG: delete + ignore_cache is a soft hint

    We may still need to create a whiteout because clones still exist.
    Arguably delete+ignore_cache is not the right way to remove whiteouts and
    we should have a separate RADOS operation for this.  But we don't.

    Signed-off-by: Sage Weil <sage@redhat.com>
</pre>

Updated by Sage Weil (sage@newdream.net) on 2018-04-24T20:36:26Z (journal_id=111754)
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Fix Under Review</i></li></ul><p><a class="external" href="https://github.com/ceph/ceph/pull/21628">https://github.com/ceph/ceph/pull/21628</a></p>
<p>I think this will fix it?</p>

Updated by Sage Weil (sage@newdream.net) on 2018-04-25T15:58:01Z (journal_id=111820)
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li></ul>

Updated by Sage Weil (sage@newdream.net) on 2018-04-25T16:00:01Z (journal_id=111821)
<p>The master commit message says:</p>
<p>Consider a scenario like:<br />- scrub [3:2525d100:::earlier:head,3:2525d12f:::foo:200]<br /> - we see 3:2525d12f:::foo:100 and include it in scrub map<br />- scrub [3:2525d12f:::foo:200, 3:2525dfff:::later:head]<br />- some op(s) that cause scrub to be preempted<br />- agent_work wants to evict 3:2525d12f:::foo:100<br /> - write_blocked_by_scrub sees scrub is preempted, returns false<br /> - 3:2525d12f:::foo:100 is removed, :head SnapSet is updated<br />- scrub rescrubs [3:2525d12f:::foo:200, 3:2525dfff:::later:head]<br /> - includes (updated) :head SnapSet<br /> - issues error like "3:2525d12f:::foo:100 is an unexpected clone"</p>
<p>Fix the problem by checking whether any part of the object-to-evict or its head touches the scrub range; if so, back off. Do not let eviction preempt scrub; we can come back and do it later.</p>
<p>Fixes: <a class="external" href="http://tracker.ceph.com/issues/23646">http://tracker.ceph.com/issues/23646</a><br />Signed-off-by: Sage Weil <<a class="email" href="mailto:sage@redhat.com">sage@redhat.com</a>></p>
<p>The same bug can happen without scrub preemption if there is a scrub chunk between the clone-to-remove and the head (i.e., the object has lots of clones). Thus we should still backport this to luminous!</p>

Updated by Nathan Cutler (ncutler@suse.cz) on 2018-04-25T16:23:27Z (journal_id=111826)
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/23863">Backport #23863</a>: luminous: scrub interaction with HEAD boundaries and clones is broken</i> added</li></ul>

Updated by David Zafman (dzafman@redhat.com) on 2018-05-29T18:49:04Z (journal_id=114268)
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>