Activity

From 05/24/2017 to 06/22/2017

06/22/2017

09:44 PM Bug #20388: combination of kvm using librbd from kraken and online resize leads to data corruption
Just to add some confusion, I'm unable to reproduce this issue on an Ubuntu-based machine with librbd from kraken.
So...
Yann Dupont
08:38 PM Bug #20388 (Closed): combination of kvm using librbd from kraken and online resize leads to data ...
Hi everybody. We recently experienced serious data corruption. I've been able to reproduce it and I suspect librbd from ... Yann Dupont
11:33 AM Bug #18844 (Need More Info): import-diff failed: (33) Numerical argument out of domain - if image...
There have been several related fixes in v10.2.6 [1], one of them fixed the crashes you observed [2]
So I believe ...
Mykola Golub

06/21/2017

11:43 PM Bug #20054 (Need More Info): librbd memory overhead when used with KVM
If anyone can provide an example job reproducing this with fio utilizing the direct rbd engine (i.e. take QEMU out-of... Jason Dillaman
11:30 PM Bug #20054: librbd memory overhead when used with KVM
I am seeing the same issue with my Jewel cluster. Plenty of VMs with over 100% memory overhead.
Can we get the pri...
Brendan Moloney

06/20/2017

08:51 PM Bug #12018 (Resolved): rbd and pool quota do not go well together
Nathan Cutler
08:50 PM Backport #14824 (Rejected): hammer: rbd and pool quota do not go well together
Nathan Cutler
08:50 PM Backport #14824 (New): hammer: rbd and pool quota do not go well together
Attempted backport https://github.com/ceph/ceph/pull/10871 was closed Nathan Cutler

06/19/2017

10:37 PM Bug #20333 (Rejected): RBD bench in EC pool w/ overwrites overwhelms OSDs
Sorry, I heard today from Josh that this report involved a vstart cluster and wasn't unique to EC pools in any case. Greg Farnum
11:58 AM Bug #20333 (Need More Info): RBD bench in EC pool w/ overwrites overwhelms OSDs
I'm not really sure what RBD can do in this situation. That test was only 16 concurrent IOs in-flight, so when you ha... Jason Dillaman
08:52 PM Backport #20351 (Resolved): kraken: test_librbd_api.sh fails in upgrade test
https://github.com/ceph/ceph/pull/16195 Nathan Cutler
08:06 PM Backport #19957: jewel: rbd: Lock release requests not honored after watch is re-acquired
@Jason - is this something you want to tackle? Nathan Cutler

06/18/2017

08:23 AM Bug #20333: RBD bench in EC pool w/ overwrites overwhelms OSDs
Hopefully the RBD client can do something to be a little friendlier? Tracking OSD throttling improvements in the orig... Greg Farnum
08:22 AM Bug #20333 (Rejected): RBD bench in EC pool w/ overwrites overwhelms OSDs
When running "rbd bench-write" using an RBD image stored in an EC pool, some OSD threads start to time out and eve... Greg Farnum

06/16/2017

02:48 PM Bug #20175 (Pending Backport): test_librbd_api.sh fails in upgrade test
Kefu Chai
02:19 PM Bug #20175: test_librbd_api.sh fails in upgrade test
@Kefu: thanks -- that's a different issue. I'll take care of that today. Jason Dillaman
01:47 PM Bug #20175: test_librbd_api.sh fails in upgrade test
Jason, the test still fails with this fix. see http://tracker.ceph.com/issues/20175#note-12 Kefu Chai
01:01 PM Bug #20175 (Pending Backport): test_librbd_api.sh fails in upgrade test
Jason Dillaman

06/14/2017

05:03 PM Subtask #18786: rbd-mirror A/A: create simple image distribution policy
Splitting into multiple PRs. https://github.com/ceph/ceph/pull/15691 introduces simple policy for image distribution ... Venky Shankar
11:24 AM Subtask #18786: rbd-mirror A/A: create simple image distribution policy
I'm rebasing my branch (https://github.com/vshankar/ceph/commits/rbd-mirror-image-distribution) with master now. Will... Venky Shankar
01:41 PM Feature #10037 (Resolved): cache-tier: Optimise RBD image removal
RBD only issues remove ops against all possible objects -- and with object map enabled it only issues them against ob... Jason Dillaman
06:26 AM Feature #10037: cache-tier: Optimise RBD image removal
I think the proxy changes have fixed this for deletes on the RADOS side; does rbd do anything which would force promo... Greg Farnum
04:18 AM Bug #18122: unittest_journal TestJournalTrimmer.RemoveObjectsWithOtherClient (intermitent)
Journal failures belong to rbd. Greg Farnum

06/13/2017

01:51 AM Bug #20175: test_librbd_api.sh fails in upgrade test
Jason, could you help take a look? by inspecting @qa/suites/upgrade/client-upgrade/jewel-client-x/basic@, i think it'... Kefu Chai
01:47 AM Bug #20175: test_librbd_api.sh fails in upgrade test
tested at http://qa-proxy.ceph.com/teuthology/kchai-2017-06-12_12:19:18-upgrade-wip-20175-kefu---basic-mira/1279912/t... Kefu Chai

06/12/2017

08:34 PM Backport #20267 (Resolved): jewel: [api] is_exclusive_lock_owner shouldn't return -EBUSY
https://github.com/ceph/ceph/pull/16296 Nathan Cutler
08:34 PM Backport #20266 (Resolved): kraken: [api] is_exclusive_lock_owner shouldn't return -EBUSY
https://github.com/ceph/ceph/pull/16187 Nathan Cutler
08:34 PM Backport #20265 (Resolved): jewel: [cli] ensure positional arguments exist before casting
https://github.com/ceph/ceph/pull/16295 Nathan Cutler
08:34 PM Backport #20264 (Resolved): kraken: [cli] ensure positional arguments exist before casting
https://github.com/ceph/ceph/pull/16186 Nathan Cutler
07:26 AM Feature #18984: RFE: let rbd export write directly to a block device
Jason Dillaman wrote:
> Couldn't you just run "rbd export <image-spec> - | dd of=/dev/someblockdevice" and achieve t...
Ruben Kerkhof
07:25 AM Feature #18984: RFE: let rbd export write directly to a block device
Mykola Golub wrote:
> Note, write serialization is not the only difference between writing to a file and to stdout. ...
Ruben Kerkhof

06/11/2017

08:48 PM Feature #18984: RFE: let rbd export write directly to a block device
Couldn't you just run "rbd export <image-spec> - | dd of=/dev/someblockdevice" and achieve the desired outcome? Jason Dillaman
01:58 PM Feature #18984: RFE: let rbd export write directly to a block device
Note, write serialization is not the only difference between writing to a file and to stdout. Another difference is t... Mykola Golub

06/10/2017

06:31 PM Backport #19611 (In Progress): kraken: Issues with C API image metadata retrieval functions
Nathan Cutler
05:11 PM Bug #19942 (Duplicate): "[ FAILED ] TestLibRBD.Metadata" in upgrade:client-upgrade-kraken-distr...
Mykola Golub
04:59 PM Bug #19942: "[ FAILED ] TestLibRBD.Metadata" in upgrade:client-upgrade-kraken-distro-basic-smithi
This test exposes a real bug that was fixed in master (#19588); the test was extended at that time to catch it. I believe it ... Mykola Golub
03:24 PM Bug #20223 (Resolved): rbd.ImageNotFound: __init__() takes exactly 3 positional arguments, 1 given
Mykola Golub

06/09/2017

04:07 PM Bug #20175 (Fix Under Review): test_librbd_api.sh fails in upgrade test
https://github.com/ceph/ceph/pull/15602 Kefu Chai
03:29 PM Bug #20175: test_librbd_api.sh fails in upgrade test
Because user applications do not link against libcommon, I think it'd be fine to backport the change of @prio_adj... Kefu Chai
03:23 PM Bug #20175: test_librbd_api.sh fails in upgrade test
The layout of @PerfCounters@ was changed in luminous: a new field, @prio_adjust@, was added to it. So the ... Kefu Chai
12:59 PM Bug #20175: test_librbd_api.sh fails in upgrade test
... Kefu Chai
12:09 PM Bug #20175: test_librbd_api.sh fails in upgrade test
... Kefu Chai
11:34 AM Bug #20175: test_librbd_api.sh fails in upgrade test
... Kefu Chai
03:26 PM Bug #19889 (Need More Info): rbd/compatibility: rbd import fails with Jewel client, Kraken OSDs (...
Jason Dillaman

06/08/2017

02:53 PM Bug #20175: test_librbd_api.sh fails in upgrade test
It's not relevant, and jewel's "ceph_test_librbd_api" crashed in a different way when dynamically linked against libr... Kefu Chai
01:32 PM Bug #20223 (Fix Under Review): rbd.ImageNotFound: __init__() takes exactly 3 positional arguments...
*PR*: https://github.com/ceph/ceph/pull/15574 Jason Dillaman
01:29 PM Bug #20223 (In Progress): rbd.ImageNotFound: __init__() takes exactly 3 positional arguments, 1 g...
Jason Dillaman
01:29 PM Bug #20223: rbd.ImageNotFound: __init__() takes exactly 3 positional arguments, 1 given
OpenAttic serializes the exception via multiprocessing pipes. The new OSError exception cannot be properly serialized... Jason Dillaman
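The serialization failure described above can be reproduced in plain Python without librbd. The sketch below is only an illustration: @BrokenError@ and @FixedError@ are hypothetical stand-ins, not the real rbd exception classes, and the actual fix in the linked PR may differ. It shows the general mechanism: an exception whose @__init__@ requires more arguments than it forwards to @Exception.__init__@ pickles fine but cannot be rebuilt on the receiving side.

```python
import pickle


class BrokenError(Exception):
    """Hypothetical exception: requires two arguments but stores only one in args."""

    def __init__(self, message, errno):
        # Only `message` ends up in self.args, so unpickling will call
        # BrokenError(message) and fail with a TypeError.
        super().__init__(message)
        self.errno = errno


class FixedError(Exception):
    """Hypothetical exception: forwards every required argument to Exception."""

    def __init__(self, message, errno):
        # self.args now matches the constructor signature, so the
        # exception can be rebuilt after crossing a pipe.
        super().__init__(message, errno)
        self.errno = errno


def round_trips(exc):
    """Return True if `exc` survives pickle serialization and deserialization."""
    try:
        restored = pickle.loads(pickle.dumps(exc))
    except TypeError:
        return False
    return type(restored) is type(exc) and restored.args == exc.args


if __name__ == "__main__":
    print("broken round-trips:", round_trips(BrokenError("image not found", 2)))
    print("fixed round-trips:", round_trips(FixedError("image not found", 2)))
```

Because @multiprocessing@ pipes rely on pickle, any exception sent between processes has to pass this kind of round-trip check.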
01:25 PM Bug #20223: rbd.ImageNotFound: __init__() takes exactly 3 positional arguments, 1 given
<jdillaman> sebastian-w: I wonder if the issue is that the new rbd.OSError is not picklable for the multiprocessing.P... Sebastian Wagner
11:46 AM Bug #20223 (Resolved): rbd.ImageNotFound: __init__() takes exactly 3 positional arguments, 1 given
While investigating https://tracker.openattic.org/browse/OP-2311, I came across these lines:... Sebastian Wagner

06/07/2017

01:48 PM Feature #3499 (Resolved): qemu-rbd: support bdrv_has_zero_init
Addressed in upstream commit 3ac21627 Jason Dillaman
01:34 PM Feature #18917: rbd: show the latest snapshot in rbd info
Note that the most recent snapshot might not be what the HEAD revision of the image is based upon if a rollback was u... Jason Dillaman
06:25 AM Subtask #18789 (Resolved): rbd-mirror A/A: coordinate image syncs with leader
Mykola Golub

06/06/2017

10:49 AM Bug #20185 (Pending Backport): [cli] ensure positional arguments exist before casting
Mykola Golub
10:49 AM Bug #20182 (Pending Backport): [api] is_exclusive_lock_owner shouldn't return -EBUSY
Mykola Golub
07:39 AM Bug #18963 (Resolved): rbd-mirror: forced failover does not function when peer is unreachable
Mykola Golub

06/05/2017

05:19 PM Bug #20185 (Fix Under Review): [cli] ensure positional arguments exist before casting
*PR*: https://github.com/ceph/ceph/pull/15492 Jason Dillaman
05:01 PM Bug #20185 (Resolved): [cli] ensure positional arguments exist before casting
For example: "rbd feature enable --image xyz" will crash since the feature name positional was not specified. Jason Dillaman
03:16 PM Backport #20023 (In Progress): jewel: rbd-mirror replay fails on attempting to reclaim data to lo...
Jason Dillaman
12:32 PM Backport #20023 (New): jewel: rbd-mirror replay fails on attempting to reclaim data to local site...
Jason agreed to do this one. Nathan Cutler
01:51 PM Backport #20022 (In Progress): kraken: rbd-mirror replay fails on attempting to reclaim data to l...
Jason Dillaman
12:41 PM Support #20183 (New): Ceph RBD image-feature
How can I run an image with all the features?
I am running:
cephuser@ceph01u:~$ ceph -v
ceph version 11.2.0 (f...
Jorge Pinilla
12:33 PM Bug #19811: rbd-mirror replay fails on attempting to reclaim data to local site (LS) from distant...
OK, #20023 reassigned to Jason. Nathan Cutler
11:41 AM Bug #19811: rbd-mirror replay fails on attempting to reclaim data to local site (LS) from distant...
@Nathan: feel free to re-assign this one to me and I'll make the necessary changes. Jason Dillaman
12:31 PM Bug #19907 (Resolved): rbd-mirror: admin socket path names collision
Nathan Cutler
11:40 AM Bug #19907: rbd-mirror: admin socket path names collision
@Nathan: since this is more of a nice-to-have, I am also fine just dropping the need for a backport to jewel (and kra... Jason Dillaman
12:31 PM Backport #20009 (Rejected): jewel: rbd-mirror: admin socket path names collision
Backport is complicated and not worth the effort for a mere "nice-to-have" backport. Nathan Cutler
12:31 PM Backport #20008 (Rejected): kraken: rbd-mirror: admin socket path names collision
Backport is complicated and not worth the effort for a mere "nice-to-have" backport. Nathan Cutler
12:18 PM Bug #20182 (Fix Under Review): [api] is_exclusive_lock_owner shouldn't return -EBUSY
*PR*: https://github.com/ceph/ceph/pull/15483 Jason Dillaman
12:06 PM Bug #20182 (Resolved): [api] is_exclusive_lock_owner shouldn't return -EBUSY
This error code indicates that another client owns the exclusive lock. Instead, it should return 0 with the boolean s... Jason Dillaman

06/04/2017

09:33 AM Backport #20153 (In Progress): jewel: Potential IO hang if image is flattened while read request ...
Nathan Cutler
09:12 AM Backport #19808 (In Progress): jewel: [test] remove hard-coded image name from TestLibRBD.Mirror
Nathan Cutler
08:23 AM Backport #19808 (Need More Info): jewel: [test] remove hard-coded image name from TestLibRBD.Mirror
Nathan Cutler
08:19 AM Backport #19808 (In Progress): jewel: [test] remove hard-coded image name from TestLibRBD.Mirror
Nathan Cutler
08:58 AM Backport #20017 (In Progress): jewel: rbd-nbd: kernel reported invalid device size (0, expected 1...
Nathan Cutler
08:40 AM Bug #19907: rbd-mirror: admin socket path names collision
@Jason There is no PoolReplayer::print_status in jewel, so the backport would have to be done manually in src/tools/r... Nathan Cutler
08:40 AM Backport #20009 (Need More Info): jewel: rbd-mirror: admin socket path names collision
There is no PoolReplayer::print_status in jewel, so the backport would have to be done manually in src/tools/rbd_mirr... Nathan Cutler
08:34 AM Backport #20023 (Need More Info): jewel: rbd-mirror replay fails on attempting to reclaim data to...
@Jason The test in the master commit does not backport cleanly to jewel. It appears to depend on 19dd5a82bb8 but the ... Nathan Cutler
08:33 AM Bug #19811: rbd-mirror replay fails on attempting to reclaim data to local site (LS) from distant...
@Jason The test in the master commit does not backport cleanly to jewel. It appears to depend on 19dd5a82bb8 but the ... Nathan Cutler
08:25 AM Backport #19957 (Need More Info): jewel: rbd: Lock release requests not honored after watch is re...
Non-trivial backport Nathan Cutler
08:13 AM Backport #19795 (In Progress): jewel: [test] test_notify.py: assert(not image.is_exclusive_lock_o...
Nathan Cutler
08:11 AM Backport #17843 (In Progress): jewel: object-map: batch updates during trim operation
Nathan Cutler
05:58 AM Bug #20175: test_librbd_api.sh fails in upgrade test
TracepointProvider.cc was in libcommon.a which is in turn included by librbd.so in jewel, but in luminous, it is move... Kefu Chai
05:51 AM Bug #20175 (Resolved): test_librbd_api.sh fails in upgrade test
... Kefu Chai

06/03/2017

06:32 PM Backport #18137 (Need More Info): jewel: rbd-mirror: image sync should send NOCACHE advise flag
Nathan Cutler

06/02/2017

03:01 PM Bug #20168 (Resolved): IO work queue does not process failed lock request
If the attempt to request the exclusive lock fails, the IO work queue will not attempt to recover. For example, in Je... Jason Dillaman
07:36 AM Backport #20154 (Resolved): kraken: Potential IO hang if image is flattened while read request is...
https://github.com/ceph/ceph/pull/16184 Nathan Cutler
07:36 AM Backport #20153 (Resolved): jewel: Potential IO hang if image is flattened while read request is ...
https://github.com/ceph/ceph/pull/15464 Nathan Cutler
07:36 AM Backport #20152 (Rejected): hammer: Potential IO hang if image is flattened while read request is...
https://github.com/ceph/ceph/pull/15980 Nathan Cutler

06/01/2017

05:19 PM Bug #18963 (Fix Under Review): rbd-mirror: forced failover does not function when peer is unreach...
Jason Dillaman
12:29 AM Bug #18367: Zombie image snapshot problem
Jason Dillaman wrote:
> @daolong: can you please provide the output from "rados -p volumes listomapvals rbd_header.2...
daolong zhang

05/31/2017

11:18 PM Bug #20111 (Rejected): Python RBD: diff_iterate_cb() does not acquire GIL before calling user-pro...
Indeed -- the callback cdef functions have the "with gil" suffix to tell Cython to re-acquire the GIL. Jason Dillaman
09:03 AM Bug #20111: Python RBD: diff_iterate_cb() does not acquire GIL before calling user-provided callb...
Cython internally ensures (and acquires) the GIL in such functions. Please close the bug. I verified that the GIL is locked ... Марк Коренберг
08:21 PM Support #20120: libvirt creat volume io very slow
Definitely not enough information to attempt to address. Jason Dillaman
09:22 AM Support #20120 (Closed): libvirt creat volume io very slow
Using the ceph client to create a volume for testing: bw=260442KB/s, iops=65110
but using cloudstack to create a volume for testing: bw=9523.2KB/s...
ming li
06:12 PM Bug #18367 (Need More Info): Zombie image snapshot problem
@daolong: can you please provide the output from "rados -p volumes listomapvals rbd_header.2eb9e622cdd48" and "rados ... Jason Dillaman
01:05 PM Bug #20110: RBD aio_ API does not provide awaiting of any completion from a list.
@Марк: can you provide an example? Your ticket description clearly states "next I want to wait until any of them is c... Jason Dillaman
12:42 PM Bug #20110: RBD aio_ API does not provide awaiting of any completion from a list.
So, how should I wait for the whole transfer to complete?
What is the difference between rbd_read2() and rbd_aio_read2(...
Марк Коренберг
12:08 PM Bug #20110 (Need More Info): RBD aio_ API does not provide awaiting of any completion from a list.
@Марк: I am probably not understanding your goal, but since you can associate a callback with a completion, and said ... Jason Dillaman
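As a rough illustration of how per-operation callbacks can be turned into "wait for whichever finishes first", the sketch below has each callback push its result onto a thread-safe queue that the caller drains. It does not use the RBD bindings: @fake_aio_read@ is a hypothetical stand-in that simply invokes its callback on another thread, and the callback signature is an assumption made for the example.

```python
import queue
import threading


def fake_aio_read(offset, length, oncomplete):
    """Hypothetical stand-in for an asynchronous read.

    Runs `oncomplete(offset, data)` on a separate thread, mimicking an
    API that delivers completions from a thread other than the caller's.
    """
    def worker():
        oncomplete(offset, b"\0" * length)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def read_all(offsets, length):
    """Start one read per offset and collect results in completion order."""
    done = queue.Queue()

    def on_complete(offset, data):
        # Called on the worker thread; Queue.put is thread-safe.
        done.put((offset, data))

    threads = [fake_aio_read(off, length, on_complete) for off in offsets]

    results = {}
    for _ in offsets:
        # Blocks until any one outstanding read has completed.
        offset, data = done.get()
        results[offset] = data

    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    chunks = read_all([0, 4096, 8192], 4096)
    print(sorted(chunks))
```

In a copy loop, the caller could issue the next read each time @done.get()@ returns, which keeps a fixed number of operations in flight.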
09:45 AM Documentation #20119: Documentation of Python RBD API does not say that aio_* functions call thei...
It also does not say that the `data` argument may be None, which signals an error in the read operation (I'm not sure, figured ou... Марк Коренберг
09:35 AM Documentation #20119: Documentation of Python RBD API does not say that aio_* functions call thei...
Also, exceptions raised in these callbacks are silently ignored! Марк Коренберг
09:19 AM Documentation #20119 (Closed): Documentation of Python RBD API does not say that aio_* functions ...
Documentation of Python RBD API does not say that aio_* functions call their callbacks in a DIFFERENT (dummy) thread
...
Марк Коренберг

05/30/2017

10:13 PM Bug #20111 (Rejected): Python RBD: diff_iterate_cb() does not acquire GIL before calling user-pro...
rbd_diff_iterate2() is called with the GIL released. The callback it calls must acquire the GIL.
Bug was not detected since Cy...
Марк Коренберг
09:37 PM Bug #20110 (Closed): RBD aio_ API does not provide awaiting of any completion from a list.
Suppose I want to copy an RBD image in 10 parallel streams. Well, I can run 10 aio_read() functions and associate them w... Марк Коренберг

05/29/2017

07:49 AM Feature #18430 (In Progress): Transparently support migrating images with minimal/zero downtime
Mykola Golub

05/26/2017

02:12 PM Bug #19832 (Pending Backport): Potential IO hang if image is flattened while read request is in-f...
Jason Dillaman
02:12 PM Bug #19962 (Resolved): Discard related IO should skip op if object map marks object as non-existent
Jason Dillaman

05/24/2017

11:52 AM Feature #20070 (Rejected): [qemu] implement bdrv_co_write_zeroes via discard
Jason Dillaman
 
