Activity

From 11/10/2016 to 12/09/2016

12/09/2016

05:59 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not...
Aaron T
07:21 AM Bug #18209 (Duplicate): src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
... Zheng Yan
01:31 AM Feature #18206 (Resolved): osd: osd_scrub_during_recovery only considers primary, not replicas
We should also avoid scrubbing if the replica is busy doing recovery. That means the scrub scheduler should decline t... Sage Weil
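
The check requested in Feature #18206 could be sketched roughly as below. This is only an illustration in Python, not Ceph's scheduler code: `OsdState`, `may_scrub`, and the `scrub_during_recovery` parameter (standing in for the `osd_scrub_during_recovery` option) are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OsdState:
    """Hypothetical view of one OSD in a PG's acting set."""
    name: str
    recovering: bool


def may_scrub(primary: OsdState, replicas: List[OsdState],
              scrub_during_recovery: bool = False) -> bool:
    """Return True if a scrub may be scheduled.

    When scrubbing during recovery is not allowed, the scrub is declined
    if the primary OR any replica is busy with recovery, instead of
    looking at the primary alone.
    """
    if scrub_during_recovery:
        return True
    return not any(osd.recovering for osd in [primary, *replicas])


if __name__ == "__main__":
    primary = OsdState("osd.0", recovering=False)
    replicas = [OsdState("osd.1", recovering=True)]
    print(may_scrub(primary, replicas))
```

With a primary-only check the example above would allow the scrub; taking the replicas into account declines it.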
12:13 AM Bug #18204: jewel: finish_promote unexpected promote error (34) Numerical result out of range
samuelj@teuthology:/a/loic-2016-12-08_18:04:28-rados-jewel-distro-basic-smithi/617789/remote Samuel Just
12:12 AM Bug #18204: jewel: finish_promote unexpected promote error (34) Numerical result out of range
samuelj@teuthology:/a/loic-2016-12-08_18:04:28-rados-jewel-distro-basic-smithi/617794 Samuel Just
12:05 AM Bug #18204 (Can't reproduce): jewel: finish_promote unexpected promote error (34) Numerical resul...
This caused the run to fail with an incorrect read return. Samuel Just

12/08/2016

02:45 PM Bug #17743 (Need More Info): ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa r...
Loïc Dachary

12/07/2016

11:49 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
I have also seen this error during my testing. During a 4KB random write test using libRBD FIO, OSDs start to fail. I... Orlando Moreno
06:30 AM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
In one of the cores, we see that gift is around 2GB and the same is reserved in the allocator:
In the loop below:
...
Ramesh Chander
09:45 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
David Zafman wrote:
> Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not...
Aaron T
09:02 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...
Another commit has been added to PR12088 to fix a problem. The scenario in this bug might not be impacted, but I tho... David Zafman
08:53 PM Bug #18162: osd/ReplicatedPG.cc: recover_replicas: object added to missing set for backfill, but ...

There is an object with peer missing:
peer osd.20(0) missing {2:d0cc2acb:::10000080236.0000007d:head=0'0}
An ...
David Zafman
12:16 AM Bug #18162 (Resolved): osd/ReplicatedPG.cc: recover_replicas: object added to missing set for bac...
I encountered the bug in #13937. I wanted to help test PR12088, and may have encountered an unrelated bug as a result... Aaron T
05:28 PM Bug #18178 (Won't Fix): Unfound objects lost after OSD daemons restarted
Steps to reproduce in both Hammer and Jewel:
1. Create an EC pool, in my case, named ecpool-01, in k3+m2;
2. Fill...
shawn y
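
For context on the k3+m2 pool in this report: assuming an MDS erasure code, an object can be rebuilt as long as at least k of its k+m chunks are readable. The sketch below only shows that arithmetic; the function names are hypothetical and this is not Ceph's recovery logic.

```python
def is_recoverable(k: int, m: int, readable_chunks: int) -> bool:
    """True if an object can be rebuilt from the chunks still readable.

    Assumes an MDS code: any k of the k + m chunks are sufficient.
    """
    if not 0 <= readable_chunks <= k + m:
        raise ValueError("readable_chunks must be between 0 and k + m")
    return readable_chunks >= k


def raw_to_usable_ratio(k: int, m: int) -> float:
    """Raw space consumed per unit of user data."""
    return (k + m) / k


if __name__ == "__main__":
    k, m = 3, 2  # the k3+m2 profile from the report
    for readable in range(k + m, -1, -1):
        print(readable, is_recoverable(k, m, readable))
    print(raw_to_usable_ratio(k, m))
```

With k=3 and m=2, losing up to two chunks is tolerated; with only two chunks readable the object cannot be reconstructed.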
09:26 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15414/ Loïc Dachary
07:21 AM Bug #18165 (Resolved): OSD crash with osd/ReplicatedPG.cc: 8485: FAILED assert(is_backfill_target...
... Pawel Sadowski

12/05/2016

09:38 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15371/console fails despite https://github.com/ceph/ceph/pull/12281
Loïc Dachary
07:06 PM Bug #18054 (Need More Info): os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed...
Sage Weil

12/04/2016

03:10 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-12-03_19:34:03-rados-wip-sage-testing---basic-smithi/599494 Sage Weil
02:53 PM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-12-03_19:33:05-rados-master---basic-smithi/599285 Sage Weil

12/02/2016

03:54 PM Bug #17743 (Fix Under Review): ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa...
https://github.com/ceph/ceph/pull/12281 Loïc Dachary
03:21 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15166/console... Loïc Dachary
02:15 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
https://jenkins.ceph.com/job/ceph-pull-requests/15167/console... Loïc Dachary
01:40 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
Repeating the failure as instructed by Kefu and trying to get a core, but cannot (3a9bcaa4aa6042587886c0eaae0ce4eeeb8f... Loïc Dachary
10:58 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
Updating the title so that it matches when looking for test_objectstore_memstore.sh failures Loïc Dachary

12/01/2016

04:42 AM Bug #17949: make check: unittest_bit_alloc get_used_blocks() >= 0
Seen again on master branch trying to reproduce other make check issues on rex001. David Zafman
03:14 AM Bug #18054: os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
/a/sage-2016-11-30_17:15:54-rados-wip-sage-testing---basic-smithi/589941 Sage Weil
03:13 AM Bug #14115: crypto: race in nss init
/a/sage-2016-11-30_17:15:54-rados-wip-sage-testing---basic-smithi/590025 Sage Weil

11/29/2016

04:32 PM Bug #17968 (Need More Info): Ceph:OSD can't finish recovery+backfill process due to assertion fai...
This is due to the BALANCE_READS option, right? Sage Weil
08:09 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
I am able to reproduce the above failure using "ctest -R test_objectstore_memstore.sh -V --repeat-until-fail 400".
...
Kefu Chai
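
The repeat-until-fail approach used above can also be scripted when a test runner has no such option. The following is a minimal sketch, not part of the Ceph tree; `repeat_until_fail` is a hypothetical helper, and the demo command is a stand-in so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Optional


def repeat_until_fail(cmd: List[str], max_runs: int) -> Optional[int]:
    """Run cmd up to max_runs times.

    Returns the 1-based number of the first run that exits non-zero,
    or None if every run succeeded.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode != 0:
            return attempt
    return None


if __name__ == "__main__":
    # Stand-in command that always succeeds; in a build directory one could
    # pass ["ctest", "-R", "test_objectstore_memstore.sh", "-V"] instead.
    demo = [sys.executable, "-c", "import sys; sys.exit(0)"]
    print(repeat_until_fail(demo, 3))
```

Stopping at the first failure keeps the failing run's state (logs, core files) from being overwritten by later iterations.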

11/28/2016

08:22 PM Bug #18054 (Resolved): os/bluestore/BlueStore.cc: 3576: FAILED assert(0 == "allocate failed, wtf")
plenty of space, but bitmap allocator fails...... Sage Weil

11/27/2016

04:19 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
"ceph_test_objectstore --gtest_filter=\*/0" also, see
https://jenkins.ceph.com/job/ceph-pull-requests/14959/consol...
Kefu Chai

11/25/2016

02:44 PM Bug #18043 (Closed): ceph-mon prioritizes public_network over mon_host address
Problem description:
Not using sections to declare ceph monitors results in monitors listening on the public_clust...
Sébastien Han

11/24/2016

04:54 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14911/console from https://github.com/ceph/ceph/pull/12081
timesou...
Loïc Dachary
03:52 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14906/console from https://github.com/ceph/ceph/pull/12061
It time...
Loïc Dachary
06:47 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
Loïc Dachary
06:27 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
Loïc Dachary
08:55 AM Bug #18021: Assertion "needs_recovery" fails when balance_read reaches a replica OSD where the ta...
In my test, when encountering a large number of "balance_reads", the OSDs can be so busy that they can't send heartbe... Xuehan Xu
08:43 AM Bug #18021 (Duplicate): Assertion "needs_recovery" fails when balance_read reaches a replica OSD ...
2016-10-25 19:00:00.626567 7f9a63bff700 -1 error_msg osd/ReplicatedPG.cc: In function 'void ReplicatedPG::wait_for_un... Xuehan Xu
08:27 AM Bug #17949: make check: unittest_bit_alloc get_used_blocks() >= 0
https://jenkins.ceph.com/job/ceph-pull-requests/14894/console
Loïc Dachary

11/22/2016

12:03 PM Bug #15653: crush: low weight devices get too many objects for num_rep > 1
Does this issue explain our uneven distribution? We have four racks, with 7, 8, 8, 4 hosts in each, respectively. The... Dan van der Ster
03:42 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
David Zafman

11/21/2016

04:02 PM Bug #17945 (Need More Info): ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
Sage Weil
06:15 AM Bug #17929: rados tool should bail out if you combine listing and setting the snap ID
PR https://github.com/ceph/ceph/pull/12092 Xinxin Shu

11/20/2016

09:41 AM Bug #17968 (Resolved): Ceph:OSD can't finish recovery+backfill process due to assertion failure
Under some conditions, an OSD could be aborted during the recovery process due to the following assertion failure:
201...
Xuehan Xu
12:10 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
More fixes and reenabled: https://github.com/ceph/ceph/pull/12072 David Zafman

11/18/2016

07:23 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
For the record, over 25 occurrences of failed make check restarted because of the eio failure. Loïc Dachary
06:55 AM Bug #17949 (Resolved): make check: unittest_bit_alloc get_used_blocks() >= 0
https://jenkins.ceph.com/job/ceph-pull-requests/14471/console... Loïc Dachary

11/17/2016

11:59 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14411 Loïc Dachary
11:08 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14400/console Loïc Dachary
11:05 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14402/ Loïc Dachary
11:03 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
I propose to temporarily disable it while it is worked on: https://github.com/ceph/ceph/pull/12058 Loïc Dachary
10:47 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14398/console Loïc Dachary
10:45 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14395/console Loïc Dachary
09:54 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14397/console Loïc Dachary
02:50 PM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://jenkins.ceph.com/job/ceph-pull-requests/14340/console Loïc Dachary
06:40 AM Bug #17830 (Resolved): osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11926 Kefu Chai
09:51 PM Bug #17945 (Need More Info): ceph_test_rados_api_tier: failed to decode hitset in HitSetWrite test
... Sage Weil
10:27 AM Bug #17929 (New): rados tool should bail out if you combine listing and setting the snap ID
Hi,
I've found a problem/feature in pool snapshots.
When I delete some object from a pool which was previously s...
Jan Krcmar

11/16/2016

01:03 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
http://pulpito.ceph.com/kchai-2016-11-13_07:03:13-rados-wip-kefu-testing---basic-smithi/544085/
http://pulpito.ceph....
Kefu Chai
01:00 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
I just tested on ext4 and the problem disappears. It seems to be reproducible on btrfs. Kefu Chai
07:31 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11979/commits/8854cca4164f9184cc549ba0b90b44515933de8c disables osd-scrub-repair.sh... Loïc Dachary
07:16 AM Bug #17830: osd-scrub-repair.sh is failing (intermittently?) on Jenkins
https://github.com/ceph/ceph/pull/11926 Loïc Dachary

11/15/2016

09:25 AM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
Sage, I will try to fix this if you don't have enough bandwidth today. Kefu Chai

11/14/2016

07:45 AM Bug #16279: assert(objiter->second->version > last_divergent_update) failed
Yao Ning wrote:
> Hi, we got the crash because of the same reason in Ceph 0.94.5
>
> I think it is possible that ...
Honggang Yang

11/11/2016

07:55 PM Documentation #17871 (Closed): crush-map document could use clearer warning about impact of chang...
In http://docs.ceph.com/docs/jewel/rados/operations/crush-map/ this section is the closest thing to documenting the i... Jason Jensen

11/10/2016

10:58 PM Bug #17862: manager: add high level summary of pending scheduled and forced scrubs
It's in the stats already I think (so we don't forget forced scrubs between intervals)? If so, the manager is alread... Samuel Just
10:56 PM Bug #17862 (New): manager: add high level summary of pending scheduled and forced scrubs
Samuel Just
07:32 PM Bug #17743: ceph_test_objectstore & test_objectstore_memstore.sh crashes in qa run (kraken)
I've tried a few different machines now but I can't reproduce this.
Can you generate a filestore = 20 log for me?
Sage Weil
06:25 PM Bug #17830 (In Progress): osd-scrub-repair.sh is failing (intermittently?) on Jenkins

Here is the portion of the test that ran. When osd.0 went down to perform the ceph-objectstore-tool list-attrs, th...
David Zafman
02:48 PM Bug #12659: Can't delete cache pool
That didn't work. At. All.
I could not delete the alt images (OSDs kept crashing). I finally decided to just rip o...
Christian Theune
12:00 PM Bug #12659: Can't delete cache pool
As my development environment is down anyway, I'm now trying to:
* rename all images (mv foo foo.alt)
* copy them...
Christian Theune
11:42 AM Bug #12659: Can't delete cache pool
Ok, this is weird. I deleted all snapshots. This means effectively there *can't* be any clones any longer as I can on... Christian Theune
11:38 AM Bug #12659: Can't delete cache pool
Ah, and I misread. It's not about snapshots, it's about clones. Right. So I do have clones, but all of them have been... Christian Theune
11:37 AM Bug #12659: Can't delete cache pool
The specific check that triggers is the one from here:
http://tracker.ceph.com/issues/8003
I'm still trying to ...
Christian Theune
11:30 AM Bug #12659: Can't delete cache pool
I'm also being bitten by this. I shut down all VMs and supposedly all clients that talk to our ceph cluster, but I st... Christian Theune
 
