Ceph | https://tracker.ceph.com/ | 2017-03-03T02:53:01Z
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87042 | 2017-03-03T02:53:01Z | Sage Weil (sage@newdream.net)
<ul></ul><p>Okay, the root cause here was a bug in my path that was setting/unsetting the full flag on the osdmap. <strong>But</strong>... there is a client bug that was triggered by it, so leaving this bug open!</p>
<p>We could probably just run 'ceph osd set full' randomly during thrashing to trigger these, or something similar (quickly set/unset the flag just to exercise the objecter code paths).</p>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87045 | 2017-03-03T03:15:07Z | Sage Weil (sage@newdream.net)
<ul></ul><p>Okay, two bugs:</p>
<p>1. In jewel,<br /><pre>
  } else if ((op->target.flags & CEPH_OSD_FLAG_WRITE) &&
             !(op->target.flags & (CEPH_OSD_FLAG_FULL_TRY |
                                   CEPH_OSD_FLAG_FULL_FORCE)) &&
             (_osdmap_full_flag() ||
              _osdmap_pool_full(op->target.base_oloc.pool))) {
</pre><br />in objecter, but it should also check RWORDERED. This is fixed by 07b2a22210e26eac1b2825c30629788da05e5e12, which needs to be backported.</p>
<p>2. In master, the resend logic is also broken. The 'pay attention to full' condition is this complex beast:</p>
<pre>
  } else if ((op->target.flags & (CEPH_OSD_FLAG_WRITE | CEPH_OSD_FLAG_RWORDERED)) &&
             !(op->target.flags & (CEPH_OSD_FLAG_FULL_TRY |
                                   CEPH_OSD_FLAG_FULL_FORCE)) &&
</pre><br />but on resend, the force_resend_writes (misnamed) check is<br /><pre>
    case RECALC_OP_TARGET_NO_ACTION:
      if (!force_resend &&
          (!force_resend_writes || !(op->target.flags & CEPH_OSD_FLAG_WRITE)))
        break;
      // -- fall-thru --
</pre><br />I think the fix is to add an Op method, bool respects_full(), that captures this condition, and to use it in both places.
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87046 | 2017-03-03T03:21:52Z | Sage Weil (sage@newdream.net)
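<p>For illustration, the respects_full() idea proposed in the 03:15:07Z comment could look roughly like the sketch below. This is a minimal stand-alone model, not the actual Objecter code: the OpTarget and Op structs are stripped-down stand-ins, the flag bit values are placeholders for this example, and the two helper functions (should_pause_for_full, should_resend_on_no_action) are hypothetical names that mirror the two conditions quoted in that comment.</p>

```cpp
#include <cassert>
#include <cstdint>

// Placeholder bit values for this sketch; the authoritative definitions
// are in the Ceph headers and should be used in real code.
constexpr uint32_t CEPH_OSD_FLAG_WRITE      = 0x0020;
constexpr uint32_t CEPH_OSD_FLAG_RWORDERED  = 0x4000;
constexpr uint32_t CEPH_OSD_FLAG_FULL_TRY   = 0x800000;
constexpr uint32_t CEPH_OSD_FLAG_FULL_FORCE = 0x1000000;

// Hypothetical stand-ins for the Objecter's target and op types.
struct OpTarget {
  uint32_t flags = 0;
};

struct Op {
  OpTarget target;

  // True if this op has to honor the full flag: it is a write or an
  // rw-ordered op, and it does not carry FULL_TRY or FULL_FORCE.
  bool respects_full() const {
    return (target.flags & (CEPH_OSD_FLAG_WRITE | CEPH_OSD_FLAG_RWORDERED)) &&
           !(target.flags & (CEPH_OSD_FLAG_FULL_TRY | CEPH_OSD_FLAG_FULL_FORCE));
  }
};

// Hypothetical helper for the send path: hold the op back while the
// osdmap or the target pool is marked full.
bool should_pause_for_full(const Op& op, bool map_or_pool_full) {
  return op.respects_full() && map_or_pool_full;
}

// Hypothetical helper for the RECALC_OP_TARGET_NO_ACTION resend path:
// resend if forced, or if "writes" are being resent and the op is one
// that was subject to the full check in the first place.
bool should_resend_on_no_action(const Op& op, bool force_resend,
                                bool force_resend_writes) {
  return force_resend || (force_resend_writes && op.respects_full());
}
```

<p>With a single predicate used in both places, an rw-ordered read that was paused because of the full flag is also picked up by the resend check, which the WRITE-only test quoted above would skip.</p>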
<ul></ul><p><a class="external" href="https://github.com/ceph/ceph/pull/13759">https://github.com/ceph/ceph/pull/13759</a></p>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87139 | 2017-03-03T11:37:18Z | Nathan Cutler (ncutler@suse.cz)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-4 priority-default closed" href="/issues/19139">Bug #19139</a>: osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag</i> added</li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87140 | 2017-03-03T11:37:52Z | Nathan Cutler (ncutler@suse.cz)
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Fix Under Review</i></li><li><strong>Backport</strong> set to <i>jewel, kraken</i></li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87153 | 2017-03-03T11:59:56Z | Nathan Cutler (ncutler@suse.cz)
<ul></ul><p>I guess it would make sense to backport <a class="issue tracker-1 status-3 priority-4 priority-default closed" title="Bug: osdc/Objecter: If osd full, it should pause read op which w/ rwordered flag (Resolved)" href="https://tracker.ceph.com/issues/19139">#19139</a> and this to jewel in a single PR.</p>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87320 | 2017-03-08T03:32:07Z | Sage Weil (sage@newdream.net)
<ul><li><strong>Status</strong> changed from <i>Fix Under Review</i> to <i>Pending Backport</i></li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87333 | 2017-03-08T08:40:04Z | Nathan Cutler (ncutler@suse.cz)
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/19224">Backport #19224</a>: jewel: osd ops (sent and?) arrive at osd out of order</i> added</li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=87335 | 2017-03-08T08:40:06Z | Nathan Cutler (ncutler@suse.cz)
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-6 priority-4 priority-default closed" href="/issues/19225">Backport #19225</a>: kraken: osd ops (sent and?) arrive at osd out of order</i> added</li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=88240 | 2017-03-30T17:45:02Z | Sage Weil (sage@newdream.net)
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-3 priority-6 priority-high2 closed" href="/issues/19430">Bug #19430</a>: objecter: full_try behavior not consistent with osd</i> added</li></ul>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=94673 | 2017-07-07T08:20:23Z | Nathan Cutler (ncutler@suse.cz)
<ul></ul><p>The jewel and kraken backports of this fix wreaked havoc in the rados and upgrade suites, and had to be reverted/cancelled.</p>
Ceph - Bug #19133: osd ops (sent and?) arrive at osd out of order | https://tracker.ceph.com/issues/19133?journal_id=101801 | 2017-11-03T13:44:20Z | Kefu Chai (tchaikov@gmail.com)
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>