https://tracker.ceph.com/ 2010-07-23T11:14:01Z Ceph Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=6922010-07-23T11:14:01ZSage Weilsage@newdream.net
<ul><li><strong>Target version</strong> changed from <i>v0.21</i> to <i>v0.22</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=10932010-09-23T09:22:29ZSage Weilsage@newdream.net
<ul><li><strong>Target version</strong> changed from <i>v0.22</i> to <i>v0.23</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=13322010-10-26T09:05:26ZSage Weilsage@newdream.net
<ul><li><strong>Target version</strong> changed from <i>v0.23</i> to <i>v0.24</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=16162010-11-10T09:46:47ZSage Weilsage@newdream.net
<ul><li><strong>Target version</strong> deleted (<del><i>v0.24</i></del>)</li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=57402011-08-31T18:19:28ZSage Weilsage@newdream.net
<ul><li><strong>Priority</strong> changed from <i>High</i> to <i>Normal</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=67122011-10-18T11:50:16ZSage Weilsage@newdream.net
<ul></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=67142011-10-18T14:06:55ZAnonymous
<ul></ul><p>Isn't the idempotency in that case "clone foo_head -> foo_2 IFF foo_2 does not exist" ?</p> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=67162011-10-18T14:25:19ZSage Weilsage@newdream.net
<ul></ul><p>Tommi Virtanen wrote:</p>
<blockquote>
<p>Isn't the idempotency in that case "clone foo_head -> foo_2 IFF foo_2 does not exist" ?</p>
</blockquote>
<p>That's almost enough for clone() (if we add O_EXCL and whitelist EEXIST for non-btrfs). It wouldn't catch something like</p>
<pre><code>1 clone A->B<br /> 2 modify A<br /> ...<br /> 3 delete B<br /> 4 &lt;crash&gt;<br /> &lt;replay from 1&gt;</code></pre>
<p>That trick also wouldn't work for clone_range(), which doesn't create a file.</p>
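<p>To make the limitation concrete, here is a minimal sketch of an O_EXCL-guarded clone, with plain files standing in for objects. The helper names <code>clone_excl</code> and <code>run_demo</code> are hypothetical and are not FileStore code; the sketch only illustrates that once B has been deleted, the "does B exist" guard can no longer tell that step 1 was already applied.</p>

```python
import os
import shutil
import tempfile


def clone_excl(src, dst):
    """Copy src to dst only if dst does not exist yet.

    Returns True if the clone was performed, False if dst already existed
    (the EEXIST case, treated as "already applied" during replay).
    """
    try:
        # O_EXCL makes the create fail with EEXIST if dst is already there.
        fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
        shutil.copyfileobj(inp, out)
    return True


def run_demo():
    """Run steps 1-3, then replay step 1 and report what it writes into B."""
    workdir = tempfile.mkdtemp()
    a = os.path.join(workdir, "A")
    b = os.path.join(workdir, "B")
    with open(a, "wb") as f:
        f.write(b"old")

    # Original run: 1 clone A->B, 2 modify A, 3 delete B, then crash.
    clone_excl(a, b)
    with open(a, "wb") as f:
        f.write(b"new")
    os.remove(b)

    # Replay from 1: B is gone, so the guard does not fire and the clone
    # copies the already-modified A.
    replayed = clone_excl(a, b)
    with open(b, "rb") as f:
        contents = f.read()
    shutil.rmtree(workdir)
    return replayed, contents
```

<p>In this sketch the replayed clone runs again and B receives the modified contents of A, so any operation between steps 2 and 3 that depended on B would be replayed against the wrong data.</p>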
<p>It may be that we need to make transactions idempotent at a higher level, but that would depend on what you clone to and whether the target is ever modified or removed. It would depend on the particulars, though, and be hard to analyze and verify.</p> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=67182011-10-18T14:52:16ZSage Weilsage@newdream.net
<ul></ul><p>FWIW even if we know what not to replay, we could still be screwed with ext4 (which does not commit everything in order):</p>
<pre><code>clone A->B<br /> modify A<br /> &lt;fs commits A (before B)&gt;<br /> &lt;crash&gt;</code></pre>
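<p>The ordering above can be simulated with a toy model in which writes land in a cache and each file is committed to disk independently. <code>ToyFS</code> and <code>simulate</code> are hypothetical names for illustration only; this is not how ext4 is implemented, just a sketch of the out-of-order commit described here.</p>

```python
class ToyFS:
    """Toy model: writes land in a cache; each file is committed separately."""

    def __init__(self, files):
        self.disk = dict(files)    # survives a crash
        self.cache = dict(files)   # lost on crash

    def write(self, name, data):
        self.cache[name] = data

    def clone(self, src, dst):
        self.cache[dst] = self.cache[src]

    def commit(self, name):
        # Models the filesystem persisting one file independently of others.
        self.disk[name] = self.cache[name]

    def crash(self):
        # Everything not yet committed is lost.
        self.cache = dict(self.disk)


def simulate():
    """Return the on-disk state after the crash and the state after replay."""
    fs = ToyFS({"A": "old"})
    fs.clone("A", "B")      # clone A->B
    fs.write("A", "new")    # modify A
    fs.commit("A")          # fs commits A (before B)
    fs.crash()
    after_crash = dict(fs.disk)

    # Replay the journal from the clone onward.
    fs.clone("A", "B")
    fs.write("A", "new")
    return after_crash, dict(fs.cache)
```

<p>In this model the crash leaves only the modified A on disk, so the replayed clone gives B the new contents rather than the old ones.</p>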
<p>On replay, we don't actually have the old A to clone to B. :(</p> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=67512011-10-21T11:07:08ZSage Weilsage@newdream.net
<ul></ul><p>I think the simplest solution would be:</p>
<pre><code>- for all operations, set an xattr on the file recording the op_seq of the last operation that wrote to it.<br /> - for any operation that is potentially non-idempotent, fsync(2) after doing it.<br /> - on replay, skip any operation whose op_seq is equal to or older than the one recorded in the xattr, to avoid redoing it.</code></pre>
<p>Those operations would be:</p>
<pre><code>- create collection<br /> - clone<br /> - clone range</code></pre>
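<p>A minimal sketch of the scheme above, using an in-memory dictionary in place of real xattrs. The <code>Store</code> class, the <code>JOURNAL</code> layout and the op names are assumptions made for illustration, and the model assumes each object's data and its op_seq stamp are durable together (the role fsync(2) plays in the proposal).</p>

```python
class Store:
    """In-memory stand-in for a FileStore directory (hypothetical)."""

    def __init__(self):
        self.data = {}      # object name -> bytes
        self.op_seq = {}    # object name -> last op_seq applied (the "xattr")

    def apply(self, seq, op, target, arg=None):
        """Apply one journaled op unless the target already reflects it."""
        if self.op_seq.get(target, 0) >= seq:
            return False                  # equal or newer stamp: skip on replay
        if op == "write":
            self.data[target] = arg
        elif op == "truncate":
            self.data[target] = b""
        elif op == "clone":               # arg names the source object
            self.data[target] = self.data[arg]
        else:
            raise ValueError(f"unknown op {op!r}")
        self.op_seq[target] = seq         # stamp every op, not only clones
        return True


# (op_seq, op, target, arg)
JOURNAL = [
    (1, "truncate", "B", None),
    (2, "clone", "B", "A"),
    (3, "write", "A", b"new"),
]


def run_and_replay():
    """Apply the journal once, then replay it from the start after a 'crash'."""
    store = Store()
    store.data = {"A": b"old", "B": b"stale"}
    for entry in JOURNAL:
        store.apply(*entry)
    # Crash here; replay starts again from op 1.
    reapplied = [store.apply(*entry) for entry in JOURNAL]
    return reapplied, store.data
```

<p>With every operation stamped, the replay in this sketch skips all three entries and B keeps the contents it was cloned with.</p>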
<p>We need to set the attr on all operations to avoid something like</p>
<pre><code>1 truncate B<br /> 2 clone A->B<br /> 3 modify A<br /> &lt;crash&gt;<br /> &lt;replay 1&gt;<br /> &lt;skip 2 due to xattr&gt;<br /> &lt;replay 3&gt;</code></pre> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=68672011-10-31T09:06:58ZSage Weilsage@newdream.net
<ul><li><strong>Story points</strong> deleted (<del><i>0</i></del>)</li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=69102011-10-31T10:28:55ZSage Weilsage@newdream.net
<ul></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=69112011-10-31T10:33:51ZSage Weilsage@newdream.net
<ul><li><strong>Story points</strong> set to <i>5</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=69132011-10-31T10:34:28ZSage Weilsage@newdream.net
<ul></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=69492011-10-31T11:02:36ZSage Weilsage@newdream.net
<ul><li><strong>Target version</strong> set to <i>v0.39</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=71012011-11-09T14:55:44ZSage Weilsage@newdream.net
<ul></ul><p>Update: the current first-pass plan is to initiate a FileStore sync after any non-idempotent operation. This updates commit_op_seq on disk and ensures that the operation won't be replayed.</p>
<p>It's also heavyweight as it calls sync(2). So it's a big hammer, but at least it's correct.</p>
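<p>A rough sketch of this plan, assuming a toy store where <code>sync()</code> stands in for the heavyweight FileStore sync and <code>committed_op_seq</code> stands in for commit_op_seq on disk. <code>ToyFileStore</code>, the op tuples and the <code>NON_IDEMPOTENT</code> set are hypothetical; the real FileStore interfaces are not shown in this thread.</p>

```python
# Ops treated as non-idempotent in this sketch (assumed names).
NON_IDEMPOTENT = {"clone", "clone_range", "create_collection"}


class ToyFileStore:
    def __init__(self):
        self.objects = {}
        self.applied_op_seq = 0
        self.committed_op_seq = 0   # stand-in for commit_op_seq on disk
        self.sync_count = 0

    def sync(self):
        """Stand-in for the full sync: everything applied so far is committed."""
        self.committed_op_seq = self.applied_op_seq
        self.sync_count += 1

    def do_transaction(self, seq, ops):
        non_idempotent = False
        for name, *args in ops:
            if name == "write":
                obj, data = args
                self.objects[obj] = data
            elif name == "clone":
                src, dst = args
                self.objects[dst] = self.objects[src]
            else:
                raise ValueError(f"unknown op {name!r}")
            if name in NON_IDEMPOTENT:
                non_idempotent = True
        self.applied_op_seq = seq
        if non_idempotent:
            # Commit before anything else runs, so this transaction
            # falls behind the replay starting point.
            self.sync()

    def replay(self, journal):
        """Re-apply only transactions newer than the committed op_seq."""
        replayed = []
        for seq, ops in journal:
            if seq <= self.committed_op_seq:
                continue
            self.do_transaction(seq, ops)
            replayed.append(seq)
        return replayed


JOURNAL = [
    (1, [("write", "A", b"old")]),
    (2, [("clone", "A", "B")]),
    (3, [("write", "A", b"new")]),
]


def demo():
    store = ToyFileStore()
    for seq, ops in JOURNAL:
        store.do_transaction(seq, ops)
    # Crash here; on restart only transactions after committed_op_seq replay.
    replayed = store.replay(JOURNAL)
    return replayed, store.objects["B"], store.sync_count
```

<p>In this sketch only transaction 3 is replayed, so the clone is never re-run against the modified A, at the cost of one sync per non-idempotent transaction.</p>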
<p>We can set a non-idempotent bool in the do_transaction method on CLONE or anything similar, and then do the commit at the end (before any other operations occur under the current OpSequencer).</p> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=71042011-11-09T16:38:05ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>7</i></li><li><strong>Assignee</strong> set to <i>Sage Weil</i></li></ul> Ceph - Bug #213: non-idempotent transactions (clone) under ext3 may not replay correct resulthttps://tracker.ceph.com/issues/213?journal_id=71802011-11-10T21:38:33ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>7</i> to <i>Resolved</i></li></ul><p><a class="changeset" title="test_filestore_idempotent: detect commit cycles due to non-idempotent ops If we do a non-idempot..." href="https://tracker.ceph.com/projects/ceph/repository/revisions/dae6c956543276e103a272eb1e897db17b840348">dae6c956543276e103a272eb1e897db17b840348</a></p>