Activity

From 11/16/2016 to 12/15/2016

12/15/2016

08:32 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Aaron Ten Clay wrote:
> I'm running an experiment against the current kraken branch, preventing the ceph_abort() cal...
Aaron T
07:58 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
I'm running an experiment against the current kraken branch, preventing the ceph_abort() call which produces this err... Aaron T

12/14/2016

06:45 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Find all the "_failed_push: Read error" messages in the logs of crashing OSDs and note the erro...
Aaron T
06:09 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
David Zafman

12/13/2016

07:07 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")

teuthology:/a/dzafman-2016-12-13_08:27:18-rados-master-distro-basic-smithi/630584...
David Zafman
02:07 PM Bug #18240 (Can't reproduce): Deep scrub errors running cephfs kernel client on jewel

I have no idea why this is showing up in kcephfs and knfs tests but not on rados tests. The good news is that at l...
John Spray
02:05 PM Bug #18239 (Duplicate): nan in ceph osd df again
This bug is back: http://tracker.ceph.com/issues/10695
With jewel 10.2.5 an out osd will show up with -nan in osd ...
Dan van der Ster
01:31 AM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...

Find all the "_failed_push: Read error" messages in the logs of crashing OSDs and note the errors={osd#(shard#)=-5}....
David Zafman

12/09/2016

05:59 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not...
Aaron T
07:21 AM Bug #18209 (Duplicate): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
... Zheng Yan
01:31 AM Feature #18206 (Resolved): osd: osd_scrub_during_recovery only considers primary, not replicas
We should also avoid scrubbing if the replica is busy doing recovery. That means the scrub scheduler should decline t... Sage Weil
12:13 AM Bug #18204: jewel: finish_promote unexpected promote error (34) Numerical result out of range
samuelj@teuthology:/a/loic-2016-12-08_18:04:28-rados-jewel-distro-basic-smithi/617789/remote Samuel Just
12:12 AM Bug #18204: jewel: finish_promote unexpected promote error (34) Numerical result out of range
samuelj@teuthology:/a/loic-2016-12-08_18:04:28-rados-jewel-distro-basic-smithi/617794 Samuel Just
12:05 AM Bug #18204 (Can't reproduce): jewel: finish_promote unexpected promote error (34) Numerical resul...
This caused the run to fail with an incorrect read return. Samuel Just

12/08/2016

02:45 PM Bug #17743 (Need More Info): ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa r...
Loïc Dachary

12/07/2016

11:49 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
I have also seen this error during my testing. During a 4KB random write test using libRBD FIO, OSDs start to fail. I... Orlando Moreno
06:30 AM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
In one of the cores, we see that gift is around 2GB and the same amount is reserved in the allocator:
In the loop below:
...
Ramesh Chander
09:45 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not...
Aaron T
09:02 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not be impacted, but I tho... David Zafman
08:53 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...

There is an object with peer missing:
peer osd.20(0) missing {2:d0cc2acb:::10000080236.0000007d:head=0'0}
An ...
David Zafman
12:16 AM Bug #18162 (Resolved): osd/ReplicatedPG.cc: recover_replicas: object added to missing set for bac...
I encountered the bug in #13937. I wanted to help test PR12088, and may have encountered an unrelated bug as a result... Aaron T
05:28 PM Bug #18178 (Won't Fix): Unfound objects lost after OSD daemons restarted
Steps to reproduce in both Hammer and Jewel:
1. Create an EC pool (in my case named ecpool-01, with k=3, m=2);
2. Fill...
shawn y
09:26 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15414/ Loïc Dachary
07:21 AM Bug #18165 (Resolved): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_target...
... Pawel Sadowski

12/05/2016

09:38 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15371/console fails despite https://github.com/ceph/ceph/pull/12281
Loïc Dachary
07:06 PM Bug #18054 (Need More Info): os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed...
Sage Weil

12/04/2016

03:10 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-12-03_19:34:03-rados-wip-sage-testing---basic-smithi/599494 Sage Weil
02:53 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-12-03_19:33:05-rados-master---basic-smithi/599285 Sage Weil

12/02/2016

03:54 PM Bug #17743 (Fix Under Review): ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa...
https://github.com/ceph/ceph/pull/12281 Loïc Dachary
03:21 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15166/console... Loïc Dachary
02:15 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15167/console... Loïc Dachary
01:40 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
Repeating the failure as instructed by Kefu and trying to get a core, but cannot (3a9bcaa4aa6042587886c0eaae0ce4eeeb8f... Loïc Dachary
10:58 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
Updating the title so that it matches when looking for test_objectstore_memstore.sh failures Loïc Dachary

12/01/2016

04:42 AM Bug #17949: make check: unittest_bit_alloc get_used_blocks() >= 0
Seen again on master branch trying to reproduce other make check issues on rex001. David Zafman
03:14 AM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-11-30_17:15:54-rados-wip-sage-testing---basic-smithi/589941 Sage Weil
03:13 AM Bug #14115: crypto: race in nss init
/a/sage-2016-11-30_17:15:54-rados-wip-sage-testing---basic-smithi/590025 Sage Weil

11/29/2016

04:32 PM Bug #17968 (Need More Info): Ceph:OSD can't finish recovery+backfill process due to assertion fai...
This is due to the BALANCE_READS option, right? Sage Weil
08:09 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
I am able to reproduce the above failure using "ctest -R test_objectstore_memstore.sh -V --repeat-until-fail 400".
...
Kefu Chai

11/28/2016

08:22 PM Bug #18054 (Resolved): os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
plenty of space, but bitmap allocator fails...... Sage Weil

11/27/2016

04:19 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
"ceph_test_objectstore --gtest_filter=\*/0" also, see
https://jenkins.ceph.com/job/ceph-pull-requests/14959/consol...
Kefu Chai

11/25/2016

02:44 PM Bug #18043 (Closed): ceph-mon prioritizes public_network over mon_host address
Problem description:
Not using sections to declare ceph monitors results in monitors listening on the public_clust...
Sébastien Han

11/24/2016

04:54 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14911/console from https://github.com/ceph/ceph/pull/12081
timesou...
Loïc Dachary
03:52 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14906/console from https://github.com/ceph/ceph/pull/12061
It time...
Loïc Dachary
06:47 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
Loïc Dachary
06:27 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
Loïc Dachary
08:55 AM Bug #18021: Assertion "needs_recovery" fails when balance_read reaches a replica OSD where the ta...
In my test, when encountering a large number of "balance_reads", the OSDs can be so busy that they can't send heartbe... Xuehan Xu
08:43 AM Bug #18021 (Duplicate): Assertion "needs_recovery" fails when balance_read reaches a replica OSD ...
2016-10-25 19:00:00.626567 7f9a63bff700 -1 error_msg osd/ReplicatedPG.cc: In function 'void ReplicatedPG::wait_for_un... Xuehan Xu
08:27 AM Bug #17949: make check: unittest_bit_alloc get_used_blocks() >= 0
https://jenkins.ceph.com/job/ceph-pull-requests/14894/console
Loïc Dachary

11/22/2016

12:03 PM Bug #15653: crush: low weight devices get too many objects for num_rep > 1
Does this issue explain our uneven distribution? We have four racks, with 7, 8, 8, 4 hosts in each, respectively. The... Dan van der Ster
03:42 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
David Zafman

11/21/2016

04:02 PM Bug #17945 (Need More Info): ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
Sage Weil
06:15 AM Bug #17929: rados tool should bail out if you combine listing and setting the snap ID
PR https://github.com/ceph/ceph/pull/12092 Xinxin Shu

11/20/2016

09:41 AM Bug #17968 (Resolved): Ceph:OSD can't finish recovery+backfill process due to assertion failure
Under some conditions, an OSD could be aborted during the recovery process due to the following assertion failure:
201...
Xuehan Xu
12:10 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
More fixes and reenabled: https://github.com/ceph/ceph/pull/12072 David Zafman

11/18/2016

07:23 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
For the record, over 25 occurrences of failed make check restarted because of the eio failure. Loïc Dachary
06:55 AM Bug #17949 (Resolved): make check: unittest_bit_alloc get_used_blocks() >= 0
https://jenkins.ceph.com/job/ceph-pull-requests/14471/console... Loïc Dachary

11/17/2016

11:59 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14411 Loïc Dachary
11:08 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14400/console Loïc Dachary
11:05 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14402/ Loïc Dachary
11:03 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
I propose to temporarily disable it while it is worked on: https://github.com/ceph/ceph/pull/12058 Loïc Dachary
10:47 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14398/console Loïc Dachary
10:45 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14395/console Loïc Dachary
09:54 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14397/console Loïc Dachary
02:50 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14340/console Loïc Dachary
06:40 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11926 Kefu Chai
09:51 PM Bug #17945 (Need More Info): ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
... Sage Weil
10:27 AM Bug #17929 (New): rados tool should bail out if you combine listing and setting the snap ID
Hi,
I've found a problem/feature in pool snapshots:
when I delete some object from a pool which was previously s...
Jan Krcmar

11/16/2016

01:03 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/544085/
http://pulpito.ceph....
Kefu Chai
01:00 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
I just tested on ext4 and the problem disappears. It seems to be reproducible on btrfs. Kefu Chai
07:31 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11979/commits/8854cca4164f9184cc549ba0b90b44515933de8c disables osd-scrub-repair.sh... Loïc Dachary
07:16 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11926 Loïc Dachary
 
