https://tracker.ceph.com/https://tracker.ceph.com/favicon.ico2018-01-05T11:26:26ZCeph Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1042902018-01-05T11:26:26Zmingxin liuliumxnl@foxmail.com
<ul></ul><p>assert(repop_queue.front() == repop);</p> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1046222018-01-09T18:52:05ZGreg Farnumgfarnum@redhat.com
<ul></ul><p>Do you have logs or more about how this happened? There are a bunch of guards to prevent exactly this in cases where a connection reset happens. They might be leaky, but we'll need a little more to go on in identifying what went wrong.</p> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1046622018-01-10T14:59:58ZSage Weilsage@newdream.net
<ul><li><strong>Subject</strong> changed from <i>out of order caused by letting old msg from down peer be processed</i> to <i>RESETSESSION and OSD peer connections fundamentally racy</i></li><li><strong>Status</strong> changed from <i>New</i> to <i>12</i></li></ul><p>Last time I looked at this I came to the conclusion that (1) there was a fundamental problem, (2) the best hope for properly fixing it is moving peer connection management into the OSD and out of the messenger, and (3) the workaround in can_discard_request() (or whatever it is) is good enough for now.</p>
<p><a class="external" href="https://github.com/ceph/ceph/pull/17217#issuecomment-324997960">https://github.com/ceph/ceph/pull/17217#issuecomment-324997960</a></p>
<p>Note that mingxin's improvement merged: <a class="external" href="https://github.com/ceph/ceph/pull/19796">https://github.com/ceph/ceph/pull/19796</a></p>
<p>Leaving this ticket open for now.</p> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1046632018-01-10T15:00:24ZSage Weilsage@newdream.net
<ul><li><strong>Related to</strong> <i><a class="issue tracker-1 status-10 priority-6 priority-high2 closed" href="/issues/21143">Bug #21143</a>: bad RESETSESSION between OSDs?</i> added</li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1046652018-01-10T15:00:44ZSage Weilsage@newdream.net
<ul><li><strong>Subject</strong> changed from <i>RESETSESSION and OSD peer connections fundamentally racy</i> to <i>out of order caused by letting old msg from down peer be processed to RESETSESSION</i></li><li><strong>Status</strong> changed from <i>12</i> to <i>Resolved</i></li></ul><p>Actually, see existing ticket <a class="issue tracker-1 status-10 priority-6 priority-high2 closed" title="Bug: bad RESETSESSION between OSDs? (Duplicate)" href="https://tracker.ceph.com/issues/21143">#21143</a></p> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1047002018-01-11T02:23:30Zmingxin liuliumxnl@foxmail.com
<ul></ul><p>I wonder if <a class="external" href="http://tracker.ceph.com/issues/21287">http://tracker.ceph.com/issues/21287</a> is related.</p> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1318112019-03-12T23:16:24ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Project</strong> changed from <i>RADOS</i> to <i>Messengers</i></li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1501492019-10-31T16:24:23ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Pending Backport</i></li><li><strong>Backport</strong> set to <i>luminous, mimic</i></li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1501502019-10-31T16:25:36ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/42586">Backport #42586</a>: luminous: out of order caused by letting old msg from down peer be processed to RESETSESSION</i> added</li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1501542019-10-31T16:26:14ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Pull request ID</strong> set to <i>19796</i></li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1501552019-10-31T16:28:30ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Backport</strong> changed from <i>luminous, mimic</i> to <i>luminous</i></li></ul> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1501562019-10-31T16:29:54ZNathan Cutlerncutler@suse.cz
<ul></ul><p>"git describe" on the <a class="external" href="https://github.com/ceph/ceph/pull/19796">https://github.com/ceph/ceph/pull/19796</a> merge commit:</p>
<pre>
v13.0.1-845-ga7dc224536
</pre> Messengers - Bug #22570: out of order caused by letting old msg from down peer be processed to RESETSESSIONhttps://tracker.ceph.com/issues/22570?journal_id=1506092019-11-05T13:20:01ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul><p>While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".</p>