Ceph RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=831662016-12-19T22:53:50ZSamuel Justsjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Won't Fix</i></li></ul><p>This machinery was rewritten in jewel to make it work properly in this case. The fix can't really be backported to hammer, so I'm marking this one won't fix. Please reopen if reproducible on jewel.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=837302017-01-04T18:49:30ZSamuel Justsjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>Won't Fix</i> to <i>12</i></li><li><strong>Priority</strong> changed from <i>Normal</i> to <i>High</i></li></ul><p>The bug is still present and fairly straightforward: it's possible that the newest version isn't on an osd in the current up or acting sets. It might be as simple as removing this assert.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=837312017-01-04T18:50:58ZSamuel Justsjust@redhat.com
<ul></ul><p>If anyone else hits this, you can work around it by extracting the object from the osd which has it, using mark_unfound_lost delete, and using the rados tool to replace it.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=839472017-01-09T18:35:00ZSamuel Justsjust@redhat.com
<ul><li><strong>Priority</strong> changed from <i>High</i> to <i>Immediate</i></li></ul><p>I looked at it more closely. This is kind of weird. Really, missing_loc is supposed to be the location-of-record for the ephemeral state about which OSDs have which objects. We do update that in failed_push. The problem is that once it becomes empty, pick_newest_available goes back to using the missing sets directly. I suppose we can just update the missing sets as well in failed_push, but I'm a bit worried about letting the primary's copy of the replica's missing set diverge from the replica's. I guess there's nothing for it, though.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=839492017-01-09T18:35:14ZSamuel Justsjust@redhat.com
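<p>The failure mode described in the 2017-01-09 comment (failed_push updates missing_loc but not the missing sets, so the fallback path still trusts the stale missing set) can be illustrated with a toy model. This is only a sketch under stated assumptions: the dictionaries, the <code>pick_source</code> helper, and the optional <code>peer_missing</code> argument are hypothetical stand-ins chosen for illustration and do not reproduce Ceph's actual data structures or function signatures.</p>

```python
from typing import Dict, Iterable, Optional, Set


def failed_push(
    missing_loc: Dict[str, Set[int]],
    obj: str,
    peer: int,
    peer_missing: Optional[Dict[int, Set[str]]] = None,
) -> None:
    """Record that pulling `obj` from `peer` failed (toy model, not Ceph code)."""
    # The location-of-record is always updated.
    missing_loc.get(obj, set()).discard(peer)
    # Hypothetical variant of the idea in the comment: also mark the object
    # as missing on that peer so the fallback path cannot pick it again.
    if peer_missing is not None:
        peer_missing.setdefault(peer, set()).add(obj)


def pick_source(
    missing_loc: Dict[str, Set[int]],
    peer_missing: Dict[int, Set[str]],
    peers: Iterable[int],
    obj: str,
) -> Optional[int]:
    """Pick a peer believed to hold `obj` (hypothetical helper)."""
    locations = missing_loc.get(obj)
    if locations:
        return min(locations)
    # Fallback: once missing_loc is empty, consult the missing sets directly.
    for peer in sorted(peers):
        if obj not in peer_missing.get(peer, set()):
            return peer
    return None


# Peer 1 is known to be missing the object; peer 2 is believed to have it.
peers = [1, 2]

stale_loc = {"obj": {2}}
stale_missing = {1: {"obj"}, 2: set()}
failed_push(stale_loc, "obj", 2)  # only missing_loc is updated
stale_choice = pick_source(stale_loc, stale_missing, peers, "obj")

fixed_loc = {"obj": {2}}
fixed_missing = {1: {"obj"}, 2: set()}
failed_push(fixed_loc, "obj", 2, fixed_missing)  # missing set updated too
fixed_choice = pick_source(fixed_loc, fixed_missing, peers, "obj")
```

<p>In this model the first scenario selects the peer whose push just failed, because its missing set was never updated, while the second scenario reports no source. The trade-off raised in the comment still applies: the primary's copy of the replica's missing set can then differ from the replica's own.</p>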
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-7 priority-highest closed" href="/issues/18365">Bug #18365</a>: failed_push does not update missing set</i> added</li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=840082017-01-10T16:17:54ZSamuel Justsjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>7</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=841292017-01-12T19:56:53ZSamuel Justsjust@redhat.com
<ul><li><strong>Assignee</strong> set to <i>Samuel Just</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=841322017-01-12T20:18:35ZSage Weilsage@newdream.net
<ul><li><strong>Status</strong> changed from <i>7</i> to <i>Pending Backport</i></li><li><strong>Priority</strong> changed from <i>Immediate</i> to <i>Urgent</i></li><li><strong>Backport</strong> set to <i>kraken,jewel</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=841332017-01-12T20:39:09ZNathan Cutlerncutler@suse.cz
<ul></ul><p><strong>master PR</strong>: <a class="external" href="https://github.com/ceph/ceph/pull/12888">https://github.com/ceph/ceph/pull/12888</a></p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=841342017-01-12T21:00:58ZSamuel Justsjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=841362017-01-12T21:03:52ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Pending Backport</i></li></ul><p>Sam, this issue has "Backport: kraken, jewel" set. Have the backports been done already?</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=843082017-01-17T08:35:33ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-3 priority-4 priority-default closed" href="/issues/18567">Backport #18567</a>: kraken: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))</i> added</li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=843102017-01-17T08:35:36ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Copied to</strong> <i><a class="issue tracker-9 status-6 priority-4 priority-default closed" href="/issues/18568">Backport #18568</a>: jewel: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))</i> added</li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=847042017-01-25T00:25:36ZSamuel Justsjust@redhat.com
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>12</i></li></ul><p>Nope, that fix didn't work. Backfill doesn't put objects into the needs_recovery_map. Reverting.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=847052017-01-25T00:26:53ZSamuel Justsjust@redhat.com
<ul><li><strong>Assignee</strong> changed from <i>Samuel Just</i> to <i>David Zafman</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=914552017-05-23T19:37:08ZDavid Zafmandzafman@redhat.com
<ul><li><strong>Duplicated by</strong> <i><a class="issue tracker-1 status-10 priority-5 priority-high3 closed" href="/issues/16259">Bug #16259</a>: ceph osd processes crashes when doing a revert on unfound object</i> added</li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=914592017-05-23T20:24:22ZDavid Zafmandzafman@redhat.com
<ul><li><strong>Status</strong> changed from <i>12</i> to <i>In Progress</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=927812017-06-14T04:21:57ZGreg Farnumgfarnum@redhat.com
<ul><li><strong>Project</strong> changed from <i>Ceph</i> to <i>RADOS</i></li><li><strong>Category</strong> set to <i>Correctness/Safety</i></li><li><strong>Component(RADOS)</strong> <i>OSD</i> added</li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=939402017-06-28T15:43:05ZGreg Farnumgfarnum@redhat.com
<ul></ul><p>David, anything up with this? Is it an urgent bug?</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=940782017-06-29T21:18:50ZDavid Zafmandzafman@redhat.com
<ul></ul><p><a class="external" href="https://github.com/ceph/ceph/pull/14760">https://github.com/ceph/ceph/pull/14760</a></p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=940792017-06-29T21:19:19ZDavid Zafmandzafman@redhat.com
<ul><li><strong>Status</strong> changed from <i>In Progress</i> to <i>Resolved</i></li></ul> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=1079152018-02-23T11:54:32ZNathan Cutlerncutler@suse.cz
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Pending Backport</i></li></ul><p>This should not have been marked Resolved when one of the backports was still open.</p> RADOS - Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))https://tracker.ceph.com/issues/18165?journal_id=1084272018-03-02T22:40:46ZDavid Zafmandzafman@redhat.com
<ul><li><strong>Status</strong> changed from <i>Pending Backport</i> to <i>Resolved</i></li></ul>