Activity

From 12/09/2012 to 01/07/2013

01/07/2013

03:35 PM Subtask #2854: krbd: write path
rbd write path.. 'guard' in the sense that the write has a check to verify the object already exists. Sage Weil
03:22 PM Subtask #2854: krbd: write path
Pretty sure this is about the rbd locking and fencing. Greg Farnum
03:11 PM Subtask #2854: krbd: write path
I'm about to mark bug 3418 as a duplicate of this one.
I'm adding the following from that bug here first.
I did...
Alex Elder
03:11 PM Subtask #2854: krbd: write path
I'm not sure what "guard writes" is supposed to mean.
But I'm going to interpret it as simply implementing the
writ...
Alex Elder
03:14 PM Feature #3419 (Duplicate): krbd: copy-up on write to clone
This is a duplicate of http://tracker.newdream.net/issues/2855. Alex Elder
03:14 PM Subtask #2855: krbd: copy-up on write to clone
I don't know how to change the one-line bug description or I
would.
I need some clarification about the intended ...
Alex Elder
03:12 PM Feature #3418 (Duplicate): krbd: write path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2854. Alex Elder
03:07 PM Feature #3417 (Duplicate): krbd: read path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2853. Alex Elder
03:06 PM Tasks #2853: krbd: read path
I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work o...
Alex Elder
03:05 PM Feature #3416 (Duplicate): krbd: open parent on open
Marking this as a duplicate of http://tracker.newdream.net/issues/2852. Alex Elder
02:51 PM Bug #3743: krbd: errors on submitted requests are ignored
If I could figure out how, I'd change the title of this
to say "krbd" rather than "rbd" to help make it clear
which...
Alex Elder
02:27 PM Bug #3743 (Won't Fix): krbd: errors on submitted requests are ignored
When a Linux request comes down to the rbd driver via rbd_rq_fn(),
rbd_dev_do_request() is called after validating t...
Alex Elder
02:50 PM Bug #3745 (Rejected): krbd: individual response errors are ignored
A Linux I/O request on an rbd image is broken into one or
more rbd requests, one request directed to each osd object...
Alex Elder
12:07 PM Subtask #3741: krbd: rework request tracking code
... Alex Elder
11:54 AM Subtask #3741 (Resolved): krbd: rework request tracking code
This is actually work that's mostly complete, but it never
got a bug assigned to it.
In order to handle layering ...
Alex Elder
11:09 AM Subtask #2852: krbd: open parent on open
This work is essentially done, and has been since
October 2012 (or even earlier). However I held off
posting it fo...
Alex Elder
06:30 AM Bug #3737 (Resolved): Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Hi Josh,
as per our short conversation in IRC-#ceph there is an issue with latency/responsiveness with rbd_cache e...
Oliver Francke

01/04/2013

06:11 PM Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
commit:0978dc4963fe441fb67afecb074bc7b01798d59d Dan Mick
03:12 PM Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre...
Tamilarasi muthamizhan
02:38 PM Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5 Josh Durgin
11:31 AM Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
in branch wip-rbd-watch Josh Durgin
11:23 AM Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
Josh Durgin
10:32 AM Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734... Tamilarasi muthamizhan

01/03/2013

02:32 PM Bug #3697: rbd copy.sh test failing in nightly
When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te... Dan Mick
09:17 AM Bug #3685: xfs test 186 fails in the nightlies
I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,...
Alex Elder

01/02/2013

09:56 PM Bug #3697: rbd copy.sh test failing in nightly
Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ... Dan Mick
09:39 AM Bug #3697 (In Progress): rbd copy.sh test failing in nightly
Sage Weil
05:23 PM Bug #3685: xfs test 186 fails in the nightlies
It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u...
Alex Elder
02:34 PM Bug #3685: xfs test 186 fails in the nightlies
OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera...
Alex Elder
12:06 PM Bug #3685: xfs test 186 fails in the nightlies
Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co...
Alex Elder
11:56 AM Bug #3685: xfs test 186 fails in the nightlies
I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x...
Alex Elder
03:08 PM Bug #3619: librbd: read_iterate sparse behavior broken
Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed.
Dan Mick
01:34 PM Feature #3456 (Closed): make exit code of ceph status commands status dependent
Josh Durgin
01:29 PM Documentation #2992 (Resolved): doc: RBD parent/child snapshot
Josh Durgin
01:26 PM Documentation #2992: doc: RBD parent/child snapshot
This should be resolved. John Wilkins
09:41 AM Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ... Sage Weil

12/31/2012

06:35 PM Bug #3697: rbd copy.sh test failing in nightly
FWIW I ran this in a loop and reproduced it after 7 iterations (well, a slightly different error actually, when it re... Sage Weil
05:42 PM Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
Dan Mick
05:08 PM Bug #3697: rbd copy.sh test failing in nightly
Hm, doesn't reproduce on local vstart cluster. Pondering possible failure modes. Dan Mick
04:23 PM Bug #3697: rbd copy.sh test failing in nightly
Trying to reproduce now Dan Mick
05:38 PM Bug #3703: osd: crash while encrypting
This is an osd crash.... Josh Durgin
02:55 PM Bug #3703 (Can't reproduce): osd: crash while encrypting
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_19:00:03-regression-next-testing-basic/32113... Tamilarasi muthamizhan
08:37 AM Bug #3701 (Can't reproduce): qemu xfstest hung BUG: unable to handle kernel NULL pointer derefere...
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_03:00:06-regression-master-testing-gcov/31929... Tamilarasi muthamizhan

12/29/2012

08:37 AM Bug #3697 (Duplicate): rbd copy.sh test failing in nightly
... Sage Weil

12/28/2012

12:18 PM Bug #2689 (In Progress): qemu iozone test hangs
This seems to still be a problem. I'll try to get more information about what's going on. It looks like there's an er... Josh Durgin
12:08 PM Bug #3692: OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
Chronology of events (UTC) in the latest example of this happening, in case it's relevant:
15:50:46 mon.b is s...
Justin Lott
12:01 PM Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
I've seen this happen twice:
- Reboot a node running a number of OSD's
- Within a short period of time, seemingly...
Justin Lott
10:34 AM Bug #3600: rbd: assert in objectcacher destructor after flatten
recent log: ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28713 Tamilarasi muthamizhan

12/27/2012

04:52 PM Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
ceph version : 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
rbd allows images created with zero ...
Tamilarasi muthamizhan
12:32 PM Bug #3427: krbd: unmap does not remove block device properly
I am going to assume that the racing open is the cause of
the problem reported by Nikola Kotur.
To fix it, I will...
Alex Elder
12:17 PM Bug #3427: krbd: unmap does not remove block device properly
> For RBD, wasn't the use_count something we just added? Would it cover this situation?
No.
The first warning i...
Alex Elder
08:53 AM Bug #3427: krbd: unmap does not remove block device properly
For cephfs, the vfs normally handles that.
For RBD, wasn't the use_count something we *just* added? Would it cove...
Sage Weil
08:37 AM Bug #3427: krbd: unmap does not remove block device properly
I also note, having taken a little closer look at Nikola Kotur's
kernel log that both an open and a close appear to ...
Alex Elder
08:31 AM Bug #3427: krbd: unmap does not remove block device properly
It looks to me like the osd client code has nothing in place
to protect itself from one of its users (ceph client, m...
Alex Elder
11:28 AM Bug #2689: qemu iozone test hangs
Testing again since some possible causes were fixed. Josh Durgin
10:54 AM Bug #3685 (Closed): xfs test 186 fails in the nightlies
logs: ubuntu@teuthology:/a/teuthology-2012-12-26_19:00:03-regression-next-testing-basic/28039
...
Tamilarasi muthamizhan

12/26/2012

02:38 PM Bug #3427: krbd: unmap does not remove block device properly
I haven't spent time on this in almost a month so wanted to just
provide an update. We have been looking at and try...
Alex Elder

12/24/2012

02:56 PM Feature #3580 (Resolved): rbd import from stdin could try harder to sparsify images
Sage Weil
09:22 AM Bug #3654 (Fix Under Review): libvirt: colons in ipv6 monitor addresses are not escaped when sent...
Sage Weil
08:45 AM Fix #3665: librbd: deadlock during flatten
the problem is that we are holding the snap_lock and then waiting for io. but we mostly use snap_lock as a tight inne... Sage Weil

12/21/2012

10:03 AM Fix #3665 (Resolved): librbd: deadlock during flatten
Ran into this trying to reproduce #3631.
The test_librbd_fsx process is still running on plana34 for debugging.
...
Josh Durgin
08:32 AM Bug #3664 (Resolved): osdc/ObjectCacher.cc: 517: FAILED assert(!i->size())
... Sage Weil

12/20/2012

03:49 PM Bug #3524: test_librbd_fsx: crash after flatten
Sam saw this come up again in: ubuntu@teuthology:/a/sam-ooo3/19022
It's a different cause of the same symptom. In t...
Josh Durgin

12/19/2012

04:38 PM Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
Given xml like:... Josh Durgin
11:47 AM Bug #3611 (Resolved): rbd.py: segfault with many snapshots
This was caused by c3107009f66bc06b5e14c465142e14120f9a4412. Reverting it fixes the problem. There is a corrected imp... Josh Durgin

12/18/2012

01:26 PM Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
Instead of using an assert version op, establish the watch before reading the header. This hasn't actually caused any... Josh Durgin
08:54 AM Bug #3611: rbd.py: segfault with many snapshots
This survived overnight testing (with the python librbd tests) with 56 passes. Josh Durgin

12/17/2012

10:59 PM Bug #3611 (Fix Under Review): rbd.py: segfault with many snapshots
wip-3611 contains a respin of the bad commit. It's passing test_stress_watch with failure injection and the python te... Josh Durgin
11:09 AM Bug #3611: rbd.py: segfault with many snapshots
also, ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16289 Tamilarasi muthamizhan
11:08 AM Bug #3611: rbd.py: segfault with many snapshots
recent log: ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16281 Tamilarasi muthamizhan
10:53 PM Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Thanks for the logs. All the differences there are zeroes where actual data should be, but the librbd debug log shows... Josh Durgin
12:40 PM Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
Trivial change. Biggest decision is which libc routine to use to spawn the command... Dan Mick
11:42 AM Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
The rbd command line interface creates mappings by sending
output to the /sys/bus/rbd/add file system entry, and rem...
Alex Elder
10:53 AM Bug #3600: rbd: assert in objectcacher destructor after flatten
Tried to reproduce this behavior to no avail.
There are operations on the test that do hang for a long time, but a...
Joao Eduardo Luis

12/16/2012

12:44 PM Bug #2872 (Resolved): RBD resize command allows image size -1
Sage Weil
11:02 AM Bug #2689: qemu iozone test hangs
let's retest this with all of the recent caching fixes? Sage Weil

12/15/2012

09:45 AM Fix #3588 (In Progress): rbd.py's clone should take stripe parms, call rbd_clone2
Sage Weil
09:45 AM Feature #2601 (Resolved): rbd: Show image size with an "ls"
Sage Weil
09:44 AM Feature #2634 (Resolved): teuthology: add networking to qemu task
Sage Weil
09:43 AM Bug #3619 (In Progress): librbd: read_iterate sparse behavior broken
Sage Weil
09:42 AM Bug #2689: qemu iozone test hangs
Sage Weil

12/14/2012

07:15 PM Bug #3611: rbd.py: segfault with many snapshots
It looks like the op in the objecter has been corrupted, similar to #3613. In this case, op->objver ends up pointing ... Josh Durgin
04:07 PM Bug #3589 (Resolved): rbd.py should check for method existence before calling new methods
Josh Durgin
07:28 AM Bug #2410: hung xfstest #68
This appears to be an XFS problem, where the file
system is having trouble getting space in its
journal. I inquire...
Alex Elder
07:13 AM Bug #2608: rbd: hung xfstest 270
We should re-evaluate this with XFS found in newer kernels.
Maybe this should just be closed and re-opened (or open
...
Alex Elder
06:08 AM Feature #3418: krbd: write path (layering)
I did a little research on this before starting on the write
path. This work will require the kernel rbd client, th...
Alex Elder
06:00 AM Feature #3417: krbd: read path (layering)
Work on this really started in November 2012.
In October there were a number of cleanup tasks we agreed
should ge...
Alex Elder

12/13/2012

11:48 PM Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Attached files as requested.
Compare was stopped early to save on file size.
Matt Anderson
05:03 PM Bug #3611: rbd.py: segfault with many snapshots
Finally got a backtrace. It seems something is overwriting a Mutex::Locker on the stack:... Josh Durgin
01:31 PM Bug #3619 (Resolved): librbd: read_iterate sparse behavior broken
Instead of getting a NULL for a hole, we get a zeroed buffer.
Reported on the ML
Sage Weil

12/12/2012

04:13 PM Bug #3611: rbd.py: segfault with many snapshots
Downgrading priority since this isn't an actual bug. Josh Durgin
03:41 PM Bug #3611: rbd.py: segfault with many snapshots
Without lockdep, I could not reproduce a crash.
Running with lockdep enabled results in this backtrace:...
Josh Durgin
11:52 AM Bug #3611 (Resolved): rbd.py: segfault with many snapshots
From the nightly python api tests, test_many_snaps failed with a segfault in all runs.
Logs are in:...
Josh Durgin

12/11/2012

05:04 PM Bug #3413: rbd bench-write fails with assert when rbd caching turned on
Josh Durgin
04:58 PM Bug #3589 (Fix Under Review): rbd.py should check for method existence before calling new methods
wip-rbdpy-compat Josh Durgin
04:53 PM Feature #2568 (Resolved): qa: run xfstests on qemu+rbd
Josh Durgin
03:34 PM Bug #3600: rbd: assert in objectcacher destructor after flatten
The hang on selfmanaged_snap_create seems like a monitor issue; bouncing the monitor, or hanging out in gdb long enou... Dan Mick

12/10/2012

07:57 PM Bug #3600: rbd: assert in objectcacher destructor after flatten
Hm #2. Both runs are deadlocked on the mutex IoCtxImpl::selfmanaged_snap_create::mylock. Will look for owner next. Dan Mick
07:23 PM Bug #3600: rbd: assert in objectcacher destructor after flatten
Looking at the code, I don't really see a clean "stop caching" mechanism. While I look, hacked fsx to do only write/... Dan Mick
11:40 AM Bug #3600 (Duplicate): rbd: assert in objectcacher destructor after flatten
From ubuntu@teuthology:/a/teuthology-2012-12-09_19:00:03-regression-master-testing-gcov/10977:... Josh Durgin
11:40 AM Bug #3524 (Resolved): test_librbd_fsx: crash after flatten
That's a different bug. Created #3600 to track it. Josh Durgin
11:36 AM Bug #3524 (In Progress): test_librbd_fsx: crash after flatten
recent log:
ubuntu@teuthology:/a/teuthology-2012-12-09_19:00:03-regression-master-testing-gcov/10977
Tamilarasi muthamizhan
11:11 AM Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Since the size isn't an issue, it'd be great if you could:

1) generate a log of qemu-img convert with 'rbd cache ...
Josh Durgin