Activity

From 12/12/2016 to 01/10/2017

01/10/2017

04:17 PM Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))
Samuel Just
01:57 PM Bug #18467: ceph ping mon.* can fail
That's with "ms inject socket failures: 500" which is unchanged. What's a reasonable higher value to try - 1000? 5000? Nathan Cutler

01/09/2017

10:42 PM Bug #18467: ceph ping mon.* can fail
This isn't a particularly frequent error: http://pulpito.ceph.com/sage-2017-01-09_21:59:24-rados-wip-sage-testing---b... Sage Weil
10:21 PM Bug #18467: ceph ping mon.* can fail
The offending code in ... Nathan Cutler
09:57 PM Bug #18467 (Resolved): ceph ping mon.* can fail
... Sage Weil
10:07 PM Bug #18368 (Resolved): bluestore: bluefs reclaim broken
Sage Weil
08:25 PM Bug #18445 (Fix Under Review): ceph: ping <mon.id> doesn't connect to cluster
Dan Mick
06:35 PM Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))
I looked at it more closely. This is kind of weird. Really, missing_loc is what's supposed to be the location-of-rec... Samuel Just

01/07/2017

04:55 AM Bug #18445 (Won't Fix): ceph: ping <mon.id> doesn't connect to cluster
I guess no one uses this feature. Misplaced 'connect()' call only applies to mon.* rados/singleton should have caug... Dan Mick

01/06/2017

02:24 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Here's a patch for jewel that, on top of David Zafman's patch for 13937 and Kefu Chai's for 17857, enables osds t... Alexandre Oliva

01/04/2017

06:50 PM Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))
If anyone else hits this, you can work around it by extracting the object from the osd which has it, using mark_unfou... Samuel Just
06:49 PM Bug #18165: OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targets(peer))
The bug is still present and fairly straightforward: it's possible that the newest version isn't on an osd in the cur... Samuel Just

01/03/2017

06:09 PM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
I don't think so, (0'0,234'1034] is the master log, 234'1034 isn't divergent. You should trace back in the log to tr... Samuel Just
07:11 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Never mind the patch; clearing backfills_in_flight seems to be wrong and dangerous; I think it might even make the obj... Alexandre Oliva
01:34 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
I've been trying to address the problem that backfills abort when encountering a read error or a corrupt file in jewe... Alexandre Oliva

12/30/2016

05:23 PM Bug #18368 (Fix Under Review): bluestore: bluefs reclaim broken
https://github.com/ceph/ceph/pull/12725 Sage Weil
05:22 PM Bug #18368 (Resolved): bluestore: bluefs reclaim broken
return extent is always offset 0, and only one extent Sage Weil

12/24/2016

02:35 AM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
it is a bit weird that the divergent entry is 234'1034
olog is (0'0,234'1034]
log is (0'0,234'1033]
so the lowe...
huang jun

12/22/2016

08:47 PM Bug #14115: crypto: race in nss init
https://github.com/ceph/ceph/pull/12624 Sage Weil
03:48 AM Bug #14115: crypto: race in nss init
Ah, I just discovered something. I was hitting this reliably and it was because I was leaking some objects, which pr... Sage Weil
03:32 AM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
Hi, we got this error again.
We unplugged the power cable during heavy writes;
after the machine booted, one osd crash...
huang jun
01:35 AM Bug #18329 (Can't reproduce): pure virtual method called in rocksdb from bluestore
... Sage Weil
12:21 AM Bug #18328 (Closed): crush: flaky unitest:
This failed:
https://jenkins.ceph.com/job/ceph-pull-requests/16045/
This succeeded:
https://jenkins.ceph.com...
Yehuda Sadeh

12/19/2016

10:53 PM Bug #18165 (Won't Fix): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_targe...
This machinery was rewritten in jewel to make it work properly in this case. The fix can't really be backported to h... Samuel Just
10:52 PM Bug #18178: Unfound objects lost after OSD daemons restarted
David: I know why it's losing the unfound status, but it should be losing the inconsistent pg state, right? Samuel Just
10:33 PM Bug #17257 (Can't reproduce): ceph_test_rados_api_lock fails LibRadosLockPP.LockExclusiveDurPP
Samuel Just
10:32 PM Bug #17945: ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
633e81f3a6343ef4d5b8b421e693356c17f698b9 will hexdump to help debug this next time it happens Sage Weil
10:10 PM Bug #18054 (Resolved): os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
Sage Weil

12/15/2016

08:32 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Aaron Ten Clay wrote:
> I'm running an experiment against the current kraken branch, preventing the ceph_abort() cal...
Aaron T
07:58 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
I'm running an experiment against the current kraken branch, preventing the ceph_abort() call which produces this err... Aaron T

12/14/2016

06:45 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Find all the "_failed_push: Read error" messages in the logs of crashing OSDs and note the erro...
Aaron T
06:09 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
David Zafman

12/13/2016

07:07 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
teuthology:/a/dzafman-2016-12-13_08:27:18-rados-master-distro-basic-smithi/630584...
David Zafman
02:07 PM Bug #18240 (Can't reproduce): Deep scrub errors running cephfs kernel client on jewel
I have no idea why this is showing up in kcephfs and knfs tests but not on rados tests. The good news is that at l...
John Spray
02:05 PM Bug #18239 (Duplicate): nan in ceph osd df again
This bug is back: http://tracker.ceph.com/issues/10695
With jewel 10.2.5 an out osd will show up with -nan in osd ...
Dan van der Ster
01:31 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Find all the "_failed_push: Read error" messages in the logs of crashing OSDs and note the errors={osd#(shard#)=-5}....
David Zafman