Activity

From 12/17/2012 to 01/15/2013

01/15/2013

10:07 PM Revision c8a9a9a8 (ceph): Add cram task
This runs cram tests, which are an easy way to test output
stays consistent. We already use cram for basic cli tests ...
Josh Durgin
09:39 PM Bug #3811 (Fix Under Review): rados.cc getomapval implementation is broken, should use omap_get_v...
Samuel Just
09:21 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
Samuel Just
09:38 PM Bug #3812 (Fix Under Review): rados.cc listomapvals usage is wrong, <key> <val> are ignored and n...
Samuel Just
09:22 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
Samuel Just
08:53 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
08:41 PM Revision cf149c8c (ceph): Merge branch 'wip-rpm-update'
Clean-up the handling of ceph java bindings in the rpm specfile and
configure.ac.
Gary Lowell
08:38 PM CephFS Feature #3730: Support replication factor in Hadoop
pool ids are currently exposed via libcephfs from ceph_file_layout, which uses a 32bit integer for pool id. However, ... Noah Watkins
08:34 PM CephFS Feature #3730: Support replication factor in Hadoop
Someone could toss a 'ceph osd pool set size' Hadoop's way, so a static mapping between pg pool size and pool name co... Noah Watkins
07:51 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
I'm not sure yet whether the problem has to do with this
or whether it's in the existing "new request" code. But
I...
Alex Elder
06:23 PM Documentation #3808: Block device quick start page need update
Fixed description formatting. Also, 3784 is in master now (e94b06a19218decaf7d2d7b009bd862040f20285) Dan Mick
04:46 PM Documentation #3808: Block device quick start page need update
The current writeup also assumes that the mount is local to the cluster so it hides (for the beginner) important deta... Ken Franklin
03:38 PM Documentation #3808: Block device quick start page need update
-c and --secret aren't needed if you're using the default ceph.conf and your keyring can be found based on your ceph.... Josh Durgin
03:30 PM Documentation #3808 (Resolved): Block device quick start page need update
The instructions don't match well with the bobtail release.
- should include a note that ceph-common needs to be ins...
Ken Franklin
06:21 PM Feature #3805: log: detect dup messages
I tend to think there aren't very many dups we could usefully compress. It's pretty easy to add a one-string buffer ... Dan Mick
02:25 PM Feature #3805: log: detect dup messages
What kind of dups are we trying to detect?
This sounds to me like a wishlist item that requires much more work to...
Greg Farnum
02:17 PM Feature #3805 (New): log: detect dup messages
If a log message comes through and is a dup of the previous, increment a counter or something and only log it once wi... Sage Weil
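A minimal sketch of the idea in this request, not Ceph's logging code: keep the previous message, count consecutive repeats, and emit one summary line when a different message arrives. The class name `DedupLog`, the `sink` callable, and the summary wording are all hypothetical.

```python
class DedupLog:
    """Collapse consecutive duplicate messages into one line plus a repeat count."""

    def __init__(self, sink):
        self.sink = sink      # callable that receives each emitted line
        self.last = None      # most recent message passed to log()
        self.repeats = 0      # duplicates of self.last not yet reported

    def log(self, msg):
        if msg == self.last:
            self.repeats += 1  # swallow the duplicate and only count it
            return
        self.flush()           # report any pending repeats first
        self.sink(msg)
        self.last = msg

    def flush(self):
        # Emit the pending repeat count, if any, and reset it.
        if self.repeats:
            self.sink(f"last message repeated {self.repeats} times")
        self.repeats = 0


lines = []
log = DedupLog(lines.append)
for message in ["osd.1 down", "osd.1 down", "osd.1 down", "osd.1 up"]:
    log.log(message)
log.flush()
```

Only a single string is buffered, so the cost per message is one comparison; duplicates that are not adjacent are not detected.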
05:35 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
No. So far I've focused on stabilizing basic fs functionality for multi-MDS setups and have completely ignored snapshots. Zheng Yan
03:28 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
Hmm, did this get fixed by some of Zheng's later patches? I remember things about snaprealms and migration... Greg Farnum
05:33 PM Bug #3810 (Resolved): btrfs corrupts file size on 3.7
After creating a new ceph cluster pg's become inconsistent after using the qemu client. Logs indicate that the prima... Mike Lowe
04:54 PM Bug #3809 (Won't Fix): crush compiler errors are not helpful
Small, or large, errors in the CRUSH input are apparently all treated the same by crushtool -c:
error: parse error a...
Dan Mick
04:44 PM CephFS Feature #3289: ceph-fuse: somehow exert pressure on the VFS to remove dentries from the cache
#3575 should be kept in mind while doing this/instead of this — there's a forget_multi as well. Greg Farnum
04:44 PM CephFS Bug #3601 (New): client: With multiple clients, file remove doesn't free up space
Whoops, didn't mean to change that status. Greg Farnum
04:43 PM CephFS Bug #3601 (Duplicate): client: With multiple clients, file remove doesn't free up space
The LRU actually already exists; check out Client::lru. (Unless I'm misunderstanding something?) So we might want to ... Greg Farnum
04:37 PM CephFS Bug #925: mds: update replica snaprealm on rename
De-prioritizing multi-MDS issues... Greg Farnum
04:34 PM CephFS Bug #1117: mds: rename rollback broken on slaves during replay
De-prioritizing multi-mds issues for now. Greg Farnum
04:27 PM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I'm guessing we want to move this up the queue; will discuss in bug scrub tomorrow! Greg Farnum
04:23 PM CephFS Bug #1511: fsstress failure with 3 active mds
De-prioritizing multi-mds failures at this time. Greg Farnum
04:23 PM CephFS Bug #1535: concurrent creating and removing directories crashes cmds
De-prioritizing multi-MDS bugs at this time. Greg Farnum
03:51 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
Fair enough, but if I can just make a suggestion, perhaps you might want to explain these procedures somewhere in the... Florian Haas
03:45 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
I agree it's a bug, but given the procedures we have now (ack! changing procedures coming alert!) I don't think we wa... Greg Farnum
03:43 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
No, please. A write pretending to succeed while actually not writing data _is_ a bug. The filesystem _not lying to it... Florian Haas
03:33 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
This is a great suggestion but falls into feature rather than bug-fix category. My initial thought is keeping a list ... Greg Farnum
03:42 PM CephFS Bug #1675 (Can't reproduce): mds: failed rstat assert
The logs are long gone. This will presumably pop up again; it's a pretty common failure mode, but there's nothing in ... Greg Farnum
03:38 PM CephFS Bug #1938: mds: snaptest-2 doesn't pass with 3 MDS system
De-prioritizing all multi-MDS bugs for now. Greg Farnum
03:27 PM CephFS Bug #3267: Multiple active MDSes stall when listing freshly created files
Currently de-prioritizing multi-MDS bugs. Greg Farnum
03:23 PM Bug #3537: Logs can run root out of space and crash ceph cluster (need more aggressive log rotation)
Not an FS bug, and #3775 has a lot more conversation on this subject. Greg Farnum
03:22 PM Bug #3552: After ceph-deploy installation a reboot breaks OSDs
Whoops, not an FS bug!
I've put this in the main Ceph project for now, but it might also belong in devops. We need...
Greg Farnum
03:18 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
I know you guys did a couple rounds on this one, what's the status? Greg Farnum
02:39 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Yes, the question is why they're 'getting unlucky'. Josh Durgin
02:22 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Haven't looked into this, but my guess is a couple PGs are getting unlucky with their replica selection. I assume you... Greg Farnum
02:17 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
Small 3 node cluster running 0.56.1-1~bpo60+1 on Debian/Squeeze, with "tuneables" enabled
I recently changed the r...
Ben Poliakoff
02:27 PM RADOS Feature #3807 (Resolved): crush: simple commands to create common rules
These should be in CrushWrapper or similar, and available via crushtool and via some 'ceph osd crush ...' commands.
...
Sage Weil
02:16 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
I agree. If there are lots of log messages at the default levels, that is the problem. I don't think there is much ... Sage Weil
01:59 PM Feature #3775 (Need More Info): log: stop logging in statfs reports usage above some threshold
So I suggest we split this into two issues:
1) the documentation examples show an awfully-high logging value for s...
Dan Mick
12:03 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
so, a couple ideas of what can be done.
if we do set size and frequency (or inform the user how to), then it could...
Anonymous
11:39 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
So a couple of thoughts:
1) changing size in logrotate.conf doesn't help unless we also change frequency
2) with ...
Dan Mick
02:15 PM Documentation #3804 (Resolved): Logging section recommends fairly high levels, doesn't stress how...
3775 introduced the observation that logs can fill very quickly and bury a small root disk.
Our documentation could ...
Dan Mick
02:03 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
commit:15bb00cafc31305cacf3c4684a429c2c9ee6f804 in master
Dan Mick
02:03 PM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
02:02 PM rbd Feature #3784: rbd: issue modprobe when rbd map is called
commit:e94b06a19218decaf7d2d7b009bd862040f20285 in master
Dan Mick
02:01 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
Dan Mick
01:47 PM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
nevermind.. this is fixed in v0.48.3argonaut too. Sage Weil
01:45 PM Bug #3803: rados parsing error with hostnames in mon_host
Responded to the upstream bug. This is fixed in master and bobtail, but not backported to argonaut. Should we? Sage Weil
08:37 AM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
In /etc/ceph/ceph.conf, if I set hostnames in the mon_host variable and separate them with spaces, the parsing algori... Ian Colle
01:25 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
Sage has a different proposed fix than what's in the branch. Still needs to be tested. Sam Lang
12:50 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
I don't remember where this ended up. Was the proposed fix problematic, or did it never get looked at? Greg Farnum
01:16 PM Bug #3770: OSD crashes on boot
Yeah, I just pushed a work-around branch (which I haven't tested much, so ideally you would try it on a node you can ... Samuel Just
12:08 PM rbd Subtask #3741: krbd: rework request tracking code
I found the source of my trouble, and in the process understood
a little more about some subtlety in bio reference c...
Alex Elder
11:39 AM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
This apparently is only a problem under re-export, which I believe we are not focusing on right now. Greg Farnum
11:35 AM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
Given what we know so far (the Op got sent to the wrong OSD) this is a bug in the Objecter, not the MDS. Or possibly ... Greg Farnum
11:17 AM Bug #3771: ceph does not have startup scripts in Centos
Not an FS bug! :) Greg Farnum
10:17 AM Bug #3771 (In Progress): ceph does not have startup scripts in Centos
Anonymous
11:16 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
Whoops, this was never an FS bug. :) Greg Farnum
10:15 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
Anonymous
10:54 AM Bug #3747: PGs stuck in active+remapped
No I didn't, just the CRUSH rule. Faidon Liambotis
10:46 AM Bug #3747 (Need More Info): PGs stuck in active+remapped
Faidon: did you also change the replication level of pool 3 (.rgw.buckets) ? Samuel Just
10:18 AM Feature #3505 (In Progress): default to libnss
This may already have been done. Will double check. Anonymous
10:16 AM Feature #3733 (In Progress): osd: update leveldb submodule
Anonymous
10:10 AM Bug #3797 (Need More Info): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest ...
Ian Colle
07:09 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Can you try reupgrading one of the nodes and start it with debug file store = 20? That will tell us what it is writing. Sage Weil
02:49 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just downgraded to 0.48.2argonaut and everything seems to be running normally again now:
Before downgrade:
ii ...
Corin Langosch
02:28 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Here's the output of dstat http://pastie.org/5687470.text
I'm not sure why it is writing so much now, before the ...
Corin Langosch
02:17 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just noticed the second osd is now consuming 100% cpu too. Before it was properly running for around 15 minutes. Gu... Corin Langosch
02:14 AM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
I just upgraded one of my production servers (2 osds) from 0.48.2argonaut to the latest 0.48.3argonaut and now of the... Corin Langosch
08:33 AM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
When copying an object the x-amz-acl header is ignored. To replicate; copy a private object and send the 'x-amz-acl' ... JuanJose Galvez
07:43 AM Bug #3801 (Won't Fix): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED a...
0.48.2argonaut
Relevant logs are attached. Core dumps are available if needed....
Justin Lott
07:25 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
You're right, as long as you are using matching
code it's fine.
If it occurred, it's a serious problem. It just
...
Alex Elder
07:17 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
Is this really a problem? It seems like this could only bite someone building mixed versions out of tree. Sage Weil
06:57 AM Linux kernel client Bug #3800 (Resolved): libceph: check compatibility between ceph modules
It's possible for semantic changes to occur in one of the
ceph modules (fs/ceph, net/libceph, or block/rbd) that is
...
Alex Elder
06:58 AM Linux kernel client Bug #3799: libceph/rbd: bio refs are messed up
Because this suggests a semantically-incompatible change
between modules, this should probably be completed first:
...
Alex Elder
06:56 AM Linux kernel client Bug #3799 (Resolved): libceph/rbd: bio refs are messed up
There is an ugly reference counting dance that occurs with bio
pointers in the kernel osd I/O path, and it needs to ...
Alex Elder
06:57 AM Linux kernel client Bug #3798: libceph/rbd: take reference to all bio's in list
The other bug related to this is:
http://tracker.newdream.net/issues/3799
Alex Elder
06:56 AM Linux kernel client Bug #3798 (Resolved): libceph/rbd: take reference to all bio's in list
In a separate bug ("libceph/rbd: bio refs are messed up") I
describe how reference counting of bio's interact betwee...
Alex Elder
03:20 AM Revision d56af797 (ceph): osd: note must_scrub* flags in PG operator<<
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:20 AM Revision 26a63df9 (ceph): osd: fix scrub scheduling for 0.0
The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
Signed-...
Sage Weil
03:20 AM Revision 2baf1253 (ceph): osd: based INCONSISTENT pg state on persistent scrub errors
This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we r...
Sage Weil
02:24 AM Revision 16d67c79 (ceph): osd/PG: remove useless osd_scrub_min_interval check
This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we ...
Sage Weil
02:24 AM Revision 29954802 (ceph): osd: change scrub min/max thresholds
The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' wou...
Sage Weil
02:24 AM Revision 6f6a4193 (ceph): osd: fix object_stat_sum_t dump signedness
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision d7383284 (ceph): osd: add last_clean_scrub_stamp to pg_stat_t, pg_history_t
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 2475066c (ceph): osd: add num_scrub_errors to object_stat_t
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 389bed5d (ceph): osd: note last_clean_scrub_stamp, last_scrub_errors
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 796907e2 (ceph): osd/PG: move scrub schedule registration into a helper
Simplifies callers, and will let us easily modify the decision of when
to schedule the PG for scrub.
Signed-off-by: ...
Sage Weil
02:24 AM Revision 1441095d (ceph): osd/PG: introduce flags to indicate explicitly requested scrubs
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 62ee6e09 (ceph): osd/PG: trigger scrub via scrub schedule, must_ flags
When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. ...
Sage Weil
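The scheduling change described in this revision can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the schedule is modelled as a list sorted by (stamp, pgid), and an explicitly requested scrub is re-registered with stamp 0 so that it sorts first. The class `ScrubSchedule`, its method names, and the use of 0 as the "front of the queue" stamp are hypothetical.

```python
import bisect


class ScrubSchedule:
    """Order PGs by last-scrub stamp; requested scrubs jump to the front."""

    def __init__(self):
        self._entries = []   # sorted list of (stamp, pgid)
        self._stamp = {}     # pgid -> stamp currently registered

    def register(self, pgid, last_scrub_stamp, must_scrub=False):
        # Re-registering replaces any existing entry for this PG.
        self.unregister(pgid)
        stamp = 0 if must_scrub else last_scrub_stamp
        bisect.insort(self._entries, (stamp, pgid))
        self._stamp[pgid] = stamp

    def unregister(self, pgid):
        stamp = self._stamp.pop(pgid, None)
        if stamp is not None:
            self._entries.remove((stamp, pgid))

    def next_pg(self):
        # The entry with the smallest stamp is the next scrub candidate.
        return self._entries[0][1] if self._entries else None


sched = ScrubSchedule()
sched.register("1.a", last_scrub_stamp=100)
sched.register("1.b", last_scrub_stamp=50)
sched.register("1.a", last_scrub_stamp=100, must_scrub=True)
```

Routing requested scrubs through the same schedule means they pass through the same selection path as periodic scrubs instead of bypassing it.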
02:24 AM Revision a1481207 (ceph): osd: move scrub schedule random backoff to seperate helper
Separate this from the load check, which will soon vary depending on the
PG.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
12:25 AM Revision 123a2dc4 (ceph): rados: adjust socket injection rate down
See #3795. Sage Weil
12:14 AM Revision 71097b7b (ceph): Revert "task/kclient: chmod root to 1777."
This reverts commit f17847e537802671c6f90bd1a0cdaa0e9d1e6f7a. It had
a typo and we hopefully don't need it.
Signed-o...
Greg Farnum

01/14/2013

10:11 PM Revision be0c4b34 (ceph): ac_prog_javah.m4: Use AC_CANONICAL_TARGET instead of AC_CANONICAL_SYSTEM.
Gary Lowell
10:07 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
oh *fine*. :) Dan Mick
10:04 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
Funny you should mention it: that is step #1 (or maybe 2 or 3) for the management API work, IMHO. :) Sage Weil
09:41 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
I sorta think we ought to clean up how the various output channels are used in this code in general. This fixes the ... Dan Mick
09:23 PM Revision e182c1fd (ceph): Merge branch 'wip-java-sync'
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <jbbuck@gmail.com>
Noah Watkins
09:11 PM Revision fb8a488e (ceph): java: remove create/release synchronization
The constructor calls create, and finalize() calls release. Since each
of these can only happen once (enforced by Jav...
Noah Watkins
09:11 PM Revision 2b9da45d (ceph): java: remove unnecessary synchronization
The body of ceph_unmount is a call to a synchronized method.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
09:11 PM Revision 85c10357 (ceph): java: remove all intrinsic locks
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
09:11 PM Revision 13cb196e (ceph): java: add fine grained synchronization
Adds r/w lock to protect against some races.
1. Mutual exclusion for mount/unmount prevents races between the two in...
Noah Watkins
08:02 PM rbd Subtask #3741: krbd: rework request tracking code
OK, I ran a test and got a crash. The bio built for
an object request gets handed off to an osd request.
I need to...
Alex Elder
07:32 PM rbd Subtask #3741: krbd: rework request tracking code
I spent the day trying to find the memory leak and finally
found it. The structure being leaked was a bio. It was
...
Alex Elder
06:48 AM rbd Subtask #3741: krbd: rework request tracking code
For some reason my tests started hanging on Friday when
I added memory debug code for catching leaks and reuses.
I ...
Alex Elder
07:49 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
Is this still an issue? Noah Watkins
04:54 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Josh just pinged me that there was a typo in the chmod patch, and nobody's noticed so apparently it still hasn't been... Greg Farnum
04:24 PM Bug #3795: loadgen task gets into msgr loop
I looked a bit more and I see some failures before that, and also some passes after, e.g. teuthology-2013-01-11_07:00... Sage Weil
11:35 AM Bug #3795: loadgen task gets into msgr loop
taking a look again at the nightly runs, looks like this issue has been happening on next branch from 01-01-2013 whic... Tamilarasi muthamizhan
08:13 AM Bug #3795: loadgen task gets into msgr loop
going to see if the recent msgr changes are to blame.. bisecting! Sage Weil
08:04 AM Bug #3795: loadgen task gets into msgr loop
This appears to be a simple cycle:
- objecter has lots of requests outstanding
- there is a fault (msgr failure i...
Sage Weil
03:37 PM Revision 017b6d63 (ceph): Revert "osdmap: spread replicas across hosts with default crush map"
This reverts commit 7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6.
This breaks teuthology and vstart both in its current ...
Sage Weil
03:04 PM CephFS Documentation #3796 (Resolved): FUSE mount documentation needs some corrections for v0,56x
The FUSE instructions need to be updated for v0.56 and later
currently:
> http://ceph.com/docs/master/cephfs/fuse...
Anonymous
01:35 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
I also don't seem to be able to reproduce on bobtail, marking can't reproduce. Samuel Just
12:58 PM Bug #3772 (New): osd: osd_disk_threads = 5 seems to hang recovery
I don't seem to be able to reproduce this on master. Samuel Just
10:37 AM Bug #3772: osd: osd_disk_threads = 5 seems to hang recovery
didn't reproduce with simple test, trying something more complicated. (roles/8882.yaml + osd disk threads : 10, teste... Samuel Just
01:28 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
Noah Watkins
12:57 PM Feature #3769 (Fix Under Review): osd: scrub should verify snap collection existence, membership
wip_snap_scrub Samuel Just
11:55 AM rbd Bug #2871 (Resolved): rbd export command hangs when trying to export an image of size 0 to a loca...
Not certain which recent fix resolved this, but it works now.
Dan Mick
11:32 AM rbd Bug #3585 (Closed): Image import via QEMU-IMG results in a corrupt rbd
Great, glad to hear it's fixed. Josh Durgin
11:09 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Patch posted for review. I'm not sure I'll be able to test
the scenario very well but hopefully it can be seen by
...
Alex Elder
09:56 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Implementing the change I described now. Alex Elder
11:01 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
for reference, ubuntu@teuthology:/a/teuthology-2013-01-10_07:00:03-regression-argonaut-master-basic/38145 Tamilarasi muthamizhan
10:50 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
This has shown up once in argonaut, probably not worth backporting unless it becomes more of a problem? Samuel Just
09:42 AM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
commit:3610e72e4f9117af712f34a2e12c5e9537a5746f Joao Eduardo Luis
07:00 AM CephFS Bug #2187: pjd chown/00.t failed test 97
Happened again on Friday. Time to add the delay injection to the nightlies?
2013-01-11T07:32:37.489 INFO:teutholo...
Sam Lang
06:52 AM Revision 92a9d9c2 (ceph): ceph.conf: separate replicas across osds
ceph.git master now separates across crush hosts without this setting.
For teuthology clusters, we don't want that (u...
Sage Weil
05:43 AM Bug #3770: OSD crashes on boot
So, my (very basic) understanding of this suggests that the fix is that the trim wouldn't happen in the first place.
...
Faidon Liambotis

01/13/2013

10:11 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
Nope.. which leads me to realize that that setting needs to go in teuthology's ceph.conf. Doing that now, and then I... Sage Weil
10:01 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
*sigh*
This also looks good to me, and I like it better (should have suggested this the first time around). But no...
Greg Farnum
10:05 PM Bug #3774 (Fix Under Review): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
wip-scrub Sage Weil
10:05 PM Bug #3786 (Fix Under Review): osd: scrub is deferred indefinitely if load is high
wip-scrub Sage Weil
07:04 AM Revision 410906e0 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis

01/12/2013

11:05 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
commit:410906e04936c935903526f26fb7db16c412a711 Sage Weil
11:03 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
... Sage Weil
11:01 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
der, broke vstart. can you review wip-3785? Sage Weil
08:01 AM CephFS Feature #3749: Remove forced synchronization from Java bindings
In libcephfs, mount/unmount race against each other and against the rest of the API (e.g. unmount racing against write). In C... Noah Watkins
01:10 AM Revision 7ea5d84f (ceph): osdmap: spread replicas across hosts with default crush map
This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi...
Sage Weil
01:09 AM Revision 3610e72e (ceph): mon: OSDMonitor: only share osdmap with up OSDs
Try to share the map with a randomly picked OSD; if the picked monitor is
not 'up', then try to find the nearest 'up'...
Joao Eduardo Luis
12:25 AM Revision 1f721804 (ceph): rbd: Fix tabs
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick

01/11/2013

11:56 PM Revision 34138993 (ceph): doc: Updates to CRUSH paper.
fixes: 3329, 3707, 3711, 3389
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
10:28 PM Revision 15bb00ca (ceph): rbd: call udevadm settle on map/unmap
When we map/unmap devices, udev gets called to manage device nodes;
this will allow the command to wait for those man...
Dan Mick
10:28 PM Revision e94b06a1 (ceph): rbd: make 'add' modprobe rbd so it has a chance of success
Check for existence of /sys/bus/rbd first to avoid unnecessary calls
Fixes: #3784
Signed-off-by: Dan Mick <dan.mick@...
Dan Mick
08:17 PM Revision 66eb93b8 (ceph): OSD: only trim up to the oldest map still in use by a pg
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin...
Samuel Just
08:15 PM Revision 8cf79f25 (ceph): OSD: check for empty command in do_command
Fixes: #3878
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
Samuel Just
08:09 PM Revision 3e147295 (ceph): Merge pull request #32 from imjustmatthew/imjustmatthew_docs
Correct typo in mon docs 'ceph.com' to 'ceph.conf' John Wilkins
07:59 PM Revision 0f161f1e (ceph): Correct typo in mon docs 'ceph.com' to 'ceph.conf'
Matthew Roy
06:49 PM Revision aeb02061 (ceph): qa/run_xfstests.sh: use cloned xfstests repository
Use our own copy of the xfstests repository rather than hitting
the upstream one repeatedly.
Signed-off-by: Alex Eld...
Alex Elder
06:15 PM Revision 8d0fa15e (ceph): mon: Monitor: only schedule a timecheck after election if we are not alone
Fixes: #3790
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
05:51 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
Merged to master in commit:7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6 and cherry-picked to bobtail in commit:503917f004... Greg Farnum
05:45 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
good question. let's start with bobtail. Sage Weil
05:39 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
Looks good to me. Which branches do we want to cherry-pick it to? Greg Farnum
05:24 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
wip-3785 Sage Weil
01:59 PM Bug #3785 (New): ceph: default crush rule does not suit multi-OSD deployments
dang! wrong bug. opening this one back up.
sorry all!
Anonymous
12:34 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I think maybe Deb's comments and closure were meant for another bug (perhaps 3789?) Dan Mick
11:34 AM Bug #3785 (Won't Fix): ceph: default crush rule does not suit multi-OSD deployments
This comment should have been in bug 3789
caused by a lack of resources on the system.
have increased the memory fro...
Anonymous
11:32 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
This comment should have been in bug 3789
upping the memory on these VMs from 512M to 2G
since it appears it was a...
Anonymous
10:55 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I agree with Ian, I have seen *very bad things* happen when crush chooses two OSDs on one host, rather than distribute... Anonymous
10:11 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
The issue here is that CRUSH maps which behave well on multi-host deployments behave quite poorly on one or two host ... Greg Farnum
05:46 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Yes, Greg. The test passed in the recent runs. Tamilarasi muthamizhan
05:34 PM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
This appears to be passing now, right Tamil?
Since I'm not seeing anything else breaking I'm inclined to leave the...
Greg Farnum
04:25 PM Bug #3772 (In Progress): osd: osd_disk_threads = 5 seems to hang recovery
Samuel Just
03:53 PM Documentation #3330 (In Progress): doc: How to troubleshoot unbalanced CRUSH
John Wilkins
03:51 PM Documentation #3329 (In Progress): doc: What metrics should be used to set node weight
John Wilkins
02:45 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
That makes this sound like a simple fix... we need to swap the frsize and bsize fields. Except that right now we ar... Sage Weil
02:39 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
I spent a bit of time with gregaf trying to find authoritative sources for what the different values denote. While `... David McBride
01:40 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
This coreutils commit may have useful data:
http://git.savannah.gnu.org/cgit/coreutils.git/commit/src?id=0863f018f0f...
Greg Farnum
01:38 PM CephFS Bug #3793 (Resolved): wrong size reported in some distributions/toolchains
In ceph_statfs we set f_bsize to be 1MB in order to report very large available spaces. However, nowadays it is appar... Greg Farnum
02:38 PM CephFS Feature #3749: Remove forced synchronization from Java bindings
This needs more thought than just removing synchronization. We'd like to be segfault free in Java, even though you co... Noah Watkins
02:26 PM Bug #3789: OSD core dump and down OSD on CentOS cluster
There is 'ceph health', and a nagios plugin that runs it. A similarly trivial plugin can probably be written for oth... Sage Weil
02:01 PM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
dmesg shows it was a lack of resources.
upping the memory on these VMs from 512M to 2G
since it appears it ...
Anonymous
10:28 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
Deb Barba wrote:
> all core files have similar backtrace.
> again, Sage, looks like you are right, low resources
>...
Anonymous
10:27 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
all core files have similar backtrace.
again, Sage, looks like you are right, low resources
dmesg:
hrtimer: inte...
Anonymous
10:23 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
looks from dmesg, you are right Sage, low on resources
centos1 core# gdb /usr/bin/ceph-osd core.0.26177
Core wa...
Anonymous
10:16 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
backtrace of core.0.14401 from centos3:
Core was generated by `/usr/bin/ceph-osd -i 8 --pid-file /var/run/ceph/osd....
Anonymous
09:37 AM Bug #3789 (Need More Info): OSD core dump and down OSD on CentOS cluster
check dmesg, or VM responsiveness. this triggers when a call to sync(2) takes more than... 2 minutes? i forget how l... Sage Weil
09:13 AM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
Running a CentOS VM cluster. Running v0.56.1
I had written a bit of data, and stopped writing about 4pm yesterday...
Anonymous
02:17 PM rbd Subtask #3741: krbd: rework request tracking code
Unfortunately my system crashed after an hour or so. The
crash was in the network driver, and a little analysis
su...
Alex Elder
10:45 AM rbd Subtask #3741: krbd: rework request tracking code
My full test run isn't complete but I seem to have resolved
whatever problem I was hitting yesterday. I have not ye...
Alex Elder
01:39 PM CephFS Bug #3794 (Resolved): uclient: reports sizes wrong in some cases
This is the counterpart to kernel bug #3793. See Client::statfs, in which we set f_bsize to 1MB but f_frsize to 4KB. ... Greg Farnum
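The size mismatch discussed in #3793 and #3794 can be illustrated with a small sketch. POSIX defines `f_blocks` in units of `f_frsize`, so a consumer that multiplies by `f_bsize` instead gets a different total whenever the two fields differ. The 1MB and 4KB values come from the description above; the `FakeStat` type and the helper names are hypothetical and exist only for illustration.

```python
import os
from collections import namedtuple

# Hypothetical stand-in for os.statvfs_result, so the arithmetic can be
# shown without a mounted Ceph filesystem.
FakeStat = namedtuple("FakeStat", "f_bsize f_frsize f_blocks")


def total_bytes(st):
    """Total size as POSIX defines it: f_blocks counted in f_frsize units."""
    return st.f_frsize * st.f_blocks


def total_bytes_using_bsize(st):
    """Total size as computed by a tool that multiplies by f_bsize instead."""
    return st.f_bsize * st.f_blocks


def mismatch_factor(st):
    """How far apart the two interpretations are for this filesystem."""
    return total_bytes_using_bsize(st) / total_bytes(st)
```

On a real mount, `total_bytes(os.statvfs("/"))` applies the same calculation. With `f_bsize` at 1MB and `f_frsize` at 4KB the two interpretations differ by a factor of 256, which is the kind of discrepancy these two bugs describe.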
12:22 PM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
8cf79f252a1bcea5713065390180a36f31d66dfd Samuel Just
11:12 AM Bug #3787 (Fix Under Review): Ceph OSD crashes on ceph tell osd.x
wip_3787 Samuel Just
09:33 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
verified this happens on master. should be an easy fix. thanks for the report! Sage Weil
12:17 AM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
I recently set up a small test cluster with 2 nodes to test the 0.48.3 -> 0.56.1 upgrade. After upgrading one of the ... Seb Mel
12:22 PM Bug #3770 (Resolved): OSD crashes on boot
66eb93b83648b4561b77ee6aab5b484e6dba4771 Samuel Just
11:16 AM Bug #3770 (Fix Under Review): OSD crashes on boot
wip_3770 Samuel Just
11:03 AM Bug #3770: OSD crashes on boot
The fault is in OSD::handle_osd_map where we trim old maps. Prior to 0.50, the pgs would have processed up to the cu... Samuel Just
09:59 AM Bug #3770: OSD crashes on boot
I'm seeing this same assert failure when trying to startup 3 of my OSDs. Happy to provide feedback for the debugging ... Mike Dawson
09:43 AM Bug #3770: OSD crashes on boot
sjust said that we're done collecting information and that I could rm the pg directory/log/info, which I did. Unfortu... Faidon Liambotis
09:41 AM Bug #3770: OSD crashes on boot
Ian Colle
12:04 PM Bug #3788: debian source packages are missing
Gary Lowell wrote:
> It looks like the Sources file has been zero length in past releases as well. Still investigat...
Loïc Dachary
12:03 PM Bug #3788: debian source packages are missing
My favorite use case when source packages are available would be... Loïc Dachary
11:33 AM Bug #3788: debian source packages are missing
I think we should build source packages too (in addition to tarballs, etc.). Sage Weil
10:47 AM Bug #3788: debian source packages are missing
We are not currently building debian or rpm source packages. We do put out a source tarball corresponding to the rel... Anonymous
09:56 AM Bug #3788 (In Progress): debian source packages are missing
It looks like the Sources file has been zero length in past releases as well. Still investigating. Anonymous
02:20 AM Bug #3788: debian source packages are missing
Proposed fix at https://github.com/ceph/ceph-build/pull/1 Loïc Dachary
01:44 AM Bug #3788: debian source packages are missing
http://ceph.com/debian/conf/distributions is created from https://github.com/ceph/ceph-build/blob/master/gen_reprepro... Loïc Dachary
01:35 AM Bug #3788 (Resolved): debian source packages are missing
Following the instructions at http://ceph.com/docs/master/install/debian/ to add the ... Loïc Dachary
10:52 AM CephFS Bug #3773: mds crashed at LogEvent::decode
Sure Sage. I was running bonnie from client during upgrade.
I had debug ms=1 set, i will try to reproduce this with...
Tamilarasi muthamizhan
09:41 AM CephFS Bug #3773 (Need More Info): mds crashed at LogEvent::decode
Tamil, I wonder if you can try to reproduce this with mds logging turned up from the start (debug mds = 20, debug ms ... Sage Weil
10:34 AM Messengers Bug #2569: msgr: connect_rank crash
yes, you are right, Greg. I just wanted to put a note of this somewhere, so chose to update the bug itself :) Tamilarasi muthamizhan
10:23 AM Bug #3748 (Fix Under Review): ceph osd dump --format=json includes non-JSON line
wip-3748 has a fix, commit:0edb53f02231fb83f33d3bc5f58b37b14cd5df82 Joao Eduardo Luis
10:20 AM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
Ian Colle
10:16 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
looks good, merged into master. commit:8d0fa15e6aa3847e89de5d5adfca0a863e8da976 Sage Weil
10:06 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
Had a redundant check on the previous commit; fixed and rebased it and the new commit can be found on wip-3790 commit... Joao Eduardo Luis
10:02 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
This patch fixes it. Joao Eduardo Luis
09:31 AM Bug #3790 (In Progress): Mon crash after update to ceph version 0.56-209-g310112f
My fault. Forgot a check on win_election().
Any chance you can test 6104629d95207f3dfd3a744d81b011b6a714070e on wi...
Joao Eduardo Luis
09:18 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
Previous installed version was .56-193. Ken Franklin
09:14 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
I have a single node cluster on burnupi60 updated each morning to the latest Master branch. After the update this mo... Ken Franklin
09:16 AM Bug #3774 (In Progress): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
09:16 AM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
wip-scrub-sched for the argonaut version. should look very similar for master/bobtail. Sage Weil
02:05 AM Revision 310112f7 (ceph): Merge remote-tracking branch 'gh/wip-3633'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
02:04 AM Revision 9e4a3f03 (ceph): Merge remote-tracking branch 'gh/wip-3633'
Sage Weil
02:03 AM Revision 305cb54a (ceph): suites: rados: multimon: add mon clock skews task yaml files
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
12:58 AM Revision 2fa5d23b (ceph): test: Hadoop cluster and task config.
Add a 3-node cluster specification and a
task for running wordcount with Hadoop on Ceph.
Signed-off-by: Joe Buck <jb...
Joe Buck
12:44 AM Revision aa40de90 (ceph): messages: add MTimeCheck
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
12:44 AM Revision 684d4ba2 (ceph): mon: Monitor: add timecheck infrastructure to detect clock skews
Fixes: #3633
Fixes: #3695
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inkt...
Joao Eduardo Luis
12:44 AM Revision ff1c254b (ceph): mon: Monitor: reduce indentation level; make code more readable
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:44 AM Revision 7a7fff57 (ceph): mon: Monitor: move a couple of if's together on handle_command()
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:44 AM Revision bc57c7a9 (ceph): mon: Monitor: use 'else if' on handle_command instead of bunches of 'if'
... when the options are mutually exclusive.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
12:44 AM Revision 58e03ecb (ceph): mon: Monitor: unify 'ceph health' and 'ceph status'; add json output
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:03 AM Revision e6f284e9 (ceph): doc: Added -a option. Should work without from server, as described.
fixes: #3750
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins

01/10/2013

11:59 PM Revision de6633f9 (ceph): doc: Normalized to term "drive" rather than disk. Changed "(Manual)" en...
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:06 PM Revision 7a8ec194 (ceph): Merge branch 'next'
Samuel Just
09:54 PM Revision 988f3597 (ceph): rados: add truncate support
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Samuel Just
09:04 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
If the load is above the threshold, we will never scrub. For some environments, this is normal (e.g., mixed OSD and ... Sage Weil
08:23 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
This seems to be fixed in QEMU 1.3.0 and Ceph 0.56.1
I've tried QED -> Raw -> Ceph -> Raw then QED -> Ceph -> Raw an...
Matt Anderson
07:56 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
Version: 0.48.2-0ubuntu2~cloud0
Our Ceph deployments typically involve multiple OSDs per host with no disk redunda...
Ian Colle
07:10 PM rbd Feature #3635 (In Progress): rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
07:10 PM Revision 44625d44 (ceph): config_opts.h: default osd_recovery_delay_start to 0
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd...
Samuel Just
07:09 PM rbd Feature #3784 (In Progress): rbd: issue modprobe when rbd map is called
Dan Mick
06:04 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
rbd map will not work unless the rbd kernel module is loaded, and this must be done manually. Add code to rbd to cau... Dan Mick
07:02 PM Revision 830b8ffa (ceph): ReplicatedPG: fix snapdir trimming
The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
Samuel Just
06:34 PM Revision 0f42c373 (ceph): ReplicatedPG: fix snapdir trimming
The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
Samuel Just
06:24 PM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
06:14 PM Revision 035caac5 (ceph): Revert "rgw: fix handler leak in handle_request"
This reverts commit eba314a811cd98a79f483dc7a9128fe76c722c78.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
06:11 PM rgw Feature #3402 (Fix Under Review): rgw: improve tests for multipart upload
caleb miles
06:10 PM rgw Feature #3634 (Fix Under Review): rgw: improve teuthology radosgw-admin test
caleb miles
06:09 PM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
commit:310112f702d14294e6ba48f8af41a306288cba65 Sage Weil
06:09 PM Revision eb997e25 (ceph): Merge pull request #31 from chrisglass/expose_cluster_stats_to_python
Added python wrapper to rados_cluster_stat Greg Farnum
05:59 PM rbd Bug #3518 (Can't reproduce): rbd import file --format 2 creates an image named '--format'
Dan Mick
05:59 PM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
It seems that this no longer happens as of e6f284e945f45e39c57921149d4551d9e78557a5,
so closing non-reproducible.
Dan Mick
05:06 PM CephFS Bug #3773: mds crashed at LogEvent::decode
Okay, I gathered up a core file, a high-debug MDS log, and the log with the bad event (and the bad event itself) in t... Greg Farnum
02:05 PM CephFS Bug #3773: mds crashed at LogEvent::decode
I'll at least start this off. Greg Farnum
04:54 PM Revision c8f3fd6e (ceph): marginal: Remove broken symlinks
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
04:47 PM Messengers Bug #2569: msgr: connect_rank crash
I believe this was caused by some issues which we decided not to backport the fixes for due to their size; Sage can c... Greg Farnum
04:43 PM Messengers Bug #2569: msgr: connect_rank crash
hit this on a mixed cluster running argonaut v0.48.3 and v0.56 [ ceph version 0.56-193-g00898c1]
monitors,mds,osds...
Tamilarasi muthamizhan
04:37 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
I claim that zero-sized images are legal, if not particularly useful in that size...but one might well want to create... Dan Mick
04:15 PM Bug #3770: OSD crashes on boot
root@ms-be1003:/var/lib/ceph/osd/ceph-27# find current/meta/ | tee ~/ceph-osd.27.meta | wc -l
42992
Attached.
Faidon Liambotis
04:02 PM Bug #3770: OSD crashes on boot
root@ms-be1003:/var/lib/ceph/osd/ceph-27/current/4.f9_head# attr -lq $PWD | while read attr; do echo $attr; attr -q -... Faidon Liambotis
02:27 PM Bug #3770 (Need More Info): OSD crashes on boot
From the backtrace:
pgid = {m_pool = 4, m_seed = 249, m_preferred = -1}
Based on the info attr, we try to...
Samuel Just
04:04 PM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
Documentation described making the call from the server console, which should work as described. Added -a so that it ... John Wilkins
03:52 PM Bug #3780 (Won't Fix): pg_num inappropriately low on new pools
Version: 0.48.2-0ubuntu2~cloud0
On a Ceph cluster with 18 OSDs, new object pools are being created with a pg_num o...
Ian Colle
03:08 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
The documentation should note that the
@rgw dns name = {hostname}@
option must be set in the
@[client.radosgw.g...
caleb miles
11:13 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
The process for setting up a server that handles subdomain API requests is not documented. If possible we should add ... caleb miles
03:07 PM Documentation #3711 (In Progress): crush-map.rst: choose firstn talks about "N", but does not cle...
John Wilkins
03:05 PM devops Documentation #2886 (In Progress): doc: crush location tricks, ceph.conf, automatic host=
John Wilkins
02:23 PM rbd Subtask #3741: krbd: rework request tracking code
I am leaving shortly for a few hours. In reviewing this
new code I find a few things that make it a little hard
ma...
Alex Elder
01:00 PM rbd Subtask #3741: krbd: rework request tracking code
I did some testing yesterday and found that I got I/O errors
while running xfstests. This was unexpected; I thought...
Alex Elder
01:43 PM Revision 797b3db3 (ceph): Added python wrapper to rados_cluster_stat
The new get_cluster_stats() method on the rados.Rados object calls
the rados_cluster_stat() function in the librados ...
Chris Glass
12:51 PM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
Ian Colle
12:48 PM Feature #3769: osd: scrub should verify snap collection existence, membership
Written, just needs to be ported to Bobtail Ian Colle
09:40 AM Feature #3769 (In Progress): osd: scrub should verify snap collection existence, membership
Sage Weil
12:47 PM Bug #3736 (In Progress): kernel build: failures starting in 3.8-rc1
Ian Colle
12:02 PM Bug #3736: kernel build: failures starting in 3.8-rc1
The remaining issue is that the patch we apply to scripts/package/builddeb to build the perf tools is out of date. I... Anonymous
12:45 PM Bug #3702 (New): OSD SIGABRT during startup
Ian Colle
12:40 PM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
Ian Colle
09:35 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
How's the testing coming along, Sage? Greg Farnum
12:39 PM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
Believed fixed by patch to 3633
684d4ba242b26828bd7927860226bfc8a0cfcc2b
Ian Colle
12:35 PM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
Looked into the core dump, can't see how this happened. Samuel Just
12:30 PM Bug #3591 (Closed): auth: could not find secret_id=0
Ian Colle
12:30 PM Bug #3591 (Resolved): auth: could not find secret_id=0
Resolved by Sage's fix above. Ian Colle
12:29 PM Bug #3563 (Closed): osd crashed with error "auth: could not find secret_id=2"
Ian Colle
12:29 PM Bug #3563 (Resolved): osd crashed with error "auth: could not find secret_id=2"
Resolved by fix to 3591 Ian Colle
12:20 PM Bug #3467 (Closed): osd: bad state machine event in start_recoverY_ops
Ian Colle
12:20 PM Bug #3467 (Won't Fix): osd: bad state machine event in start_recoverY_ops
If encountered, restart OSD. Ian Colle
12:13 PM Bug #3300: ceph::buffer::end_of_buffer isn't caught
Josh - Is this just a case where the documentation needs to be updated? Ian Colle
11:46 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
The same issue exists with the debian packages. We have an explicit dependency on python, but not on perl. I don't ... Anonymous
10:55 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
Can we check to ensure perl is not used elsewhere?
Are there guidelines that are provided to the developers that spe...
Anonymous
10:06 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
I hate to see a dependency like perl get added for a oneliner perl regex. Is this the only place perl is used? Can ... Sam Lang
09:43 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
backport to bobtail Ian Colle
11:26 AM Tasks #3779 (Resolved): update osd config ref as appropriate
I'm not sure what our update policies on the docs are, but the defaults named in http://ceph.com/docs/master/rados/co... Greg Farnum
11:11 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
Similar to the issue that triggered #3735 Yehuda Sadeh
10:25 AM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
Sage Weil
10:19 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Ian Colle
10:19 AM rgw Bug #3735 (Resolved): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Ian Colle
10:00 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
commit:e1da85f286838cdd3a6329840cec748c6a11fd26 Sage Weil
09:57 AM Bug #3747: PGs stuck in active+remapped
Sage Weil wrote:
> commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
oops, wrong bug!
Sage Weil
09:45 AM Bug #3747 (Resolved): PGs stuck in active+remapped
commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854 Sage Weil
09:55 AM CephFS Feature #3621 (Closed): qa: add knfsd reexport tests to qa suite
Ian Colle
09:52 AM CephFS Feature #3621: qa: add knfsd reexport tests to qa suite
commit:aaa03bbcd2549a38f962a61fc63be16cca3a6d90 in teuthology.git Sage Weil
09:34 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
If a user has a small to moderate size of root disk, they will probably have to modify the log rotation process for c... Anonymous
09:32 AM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
Sage Weil
08:34 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
Sam,
That is a cool idea. I will open a doc bug for that. Providing instructions for those with smaller root dri...
Anonymous
06:32 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
The easiest solution for this might be to adjust the default logrotate script (src/logrotate.conf) to use the size pa... Sam Lang
03:52 AM Revision 59aad347 (ceph): configure.ac: check for org.junit.rules.ExternalResource
Check for org.junit.rules.ExternalResource if built with
--enable-cephfs-java and --with-debug. Checking for junit4
i...
Danny Al-Gaaf
01:13 AM Revision 12af11a1 (ceph): src/java/Makefile.am: fix default java dir
Fix default javadir in src/java/Makefile.am to $(datadir)/java
since this is the common data dir for java files.
Sig...
Danny Al-Gaaf
01:13 AM Revision 9b167b46 (ceph): ceph.spec.in: fix handling of java files
Fix handling of JAVA (jar) files. Don't move the files around in the install
section since the related Makefile is fi...
Danny Al-Gaaf
01:13 AM Revision f027d025 (ceph): ceph.spec.in: rename libcephfs-java package to cephfs-java
Rename the libcephfs-java package to cephfs-java since the package
contains no (classic) library and RPMLINT complain...
Danny Al-Gaaf
01:13 AM Revision d8c4fc5e (ceph): ceph.spec.in: fix libcephfs-jni package name
Rename libcephfs-jni to libcephfs_jni1 to reflect the SO name/version of
the library and to prevent RPMLINT to compla...
Danny Al-Gaaf
01:13 AM Revision aedbb97f (ceph): configure.ac: remove AC_PROG_RANLIB
Remove already commented-out AC_PROG_RANLIB to get rid of warning:
libtoolize: `AC_PROG_RANLIB' is rendered obsolete b...
Danny Al-Gaaf
01:13 AM Revision 61437ee2 (ceph): configure.ac: change junit4 handling
Change handling of --with-debug and junit4. Add a new conditional HAVE_JUNIT4
to be able to build ceph-test package a...
Danny Al-Gaaf
12:11 AM Revision 00898c18 (ceph): rbd: allow copy of zero-length images. Includes simple test.
Fixes: #3765
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick
12:10 AM Revision 1c3d6840 (ceph): doc/install/debian.rst: fix typo in link ref; broke doc build
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick

01/09/2013

11:11 PM Revision 133e4e34 (ceph): Merge branch 'next'
Want to get various rbd-related fixes together for upgrade testing Dan Mick
10:40 PM Revision 48f13946 (ceph): ReplicatedPG: increment scrubber.errors rather than errors
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
05:37 PM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
commit:5b12b514b047a8a46cc5549bd94b398289b9b5f6 Sage Weil
05:08 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
I'm calling this fixed, then. Dan Mick
04:54 PM rbd Bug #3766: rbd resize command fails on a mixed node cluster when it is a copied rbd image and whe...
This works fine on the master branch that has a fix for it :
ceph version 0.56-193-g00898c1 (00898c1860e8ae95b52192...
Tamilarasi muthamizhan
01:44 PM rbd Bug #3766 (Need More Info): rbd resize command fails on a mixed node cluster when it is a copied ...
I think this might be e1776809031c6dad441cfb2b9fac9612720b9083, which is still in next. Can you try an rbd client fr... Dan Mick
04:35 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
Deb Barba <deb.barba@inktank.com>
3:13 PM (1 hour ago)
to Dan
so, as I explained in chat.
i am again seeing ...
Anonymous
04:34 PM Feature #3775 (New): log: stop logging in statfs reports usage above some threshold
Add a 'log stop on utilization = .95' option that will make the log code print one last line like
--- suspending l...
Anonymous
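The utilization check proposed in #3775 could look roughly like the sketch below. It assumes the threshold is compared against used/total for the filesystem holding the logs; the function names are hypothetical, and only the 0.95 value comes from the proposal. This is not how the Ceph log code is implemented.

```python
import shutil


def should_suspend_logging(used, total, threshold=0.95):
    """Return True once the used fraction reaches the threshold."""
    if total <= 0:
        raise ValueError("total must be positive")
    return used / total >= threshold


def log_partition_nearly_full(path, threshold=0.95):
    """Apply the check to the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return should_suspend_logging(usage.used, usage.total, threshold)
```

A logger could call `log_partition_nearly_full("/var/log/ceph")` before each write, emit one final "suspending" line when it first returns True, and resume once it returns False again.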
04:31 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
These should get put at the top of the scrub queue in a way that still honors all the scheduling.
The problem is t...
Sage Weil
04:27 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
Dan Mick
04:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
Fixed, test added, in master:
commit:00898c1860e8ae95b5219257d1635b15ccdce5c1
Dan Mick
11:44 AM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
Dan Mick
02:58 PM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
I had a cluster [burnupi06, burnupi07, burnupi08] ...
Tamilarasi muthamizhan
02:32 PM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
I believe this to have been fixed by the fix for #3744. Dan Mick
01:47 PM rbd Bug #3753: rbd copy command reports error even though copy is successful on a mixed node cluster
Tamil, does this still happen with the fix in wip-no-cls-lock (and now in next) for 3744? Dan Mick
02:14 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
reported on IRC, should be easy to reproduce.
we may want to change the default to 2 in order to avoid hiding thes...
Samuel Just
01:51 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
unable to reproduce so far Dan Mick
12:05 PM CephFS Feature #3570 (In Progress): teuthology: mds thrasher
Sam Lang
11:47 AM rbd Feature #2256: rbd: parallelize deletions
Dan Mick
11:46 AM rbd Feature #2297: ObjectCacher: mark buffers mergeable for ksm
Dan Mick
11:46 AM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
Dan Mick
11:46 AM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
11:42 AM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
commit:4483285c9fb16f09986e2e48b855cd3db869e33c in next Dan Mick
11:28 AM Bug #3771: ceph does not have startup scripts in Centos
Gary found that the installation script was commented out 2011-10-17
> commit 9baf5ef4f35c38d7fbaa70bde8f2c9383b2f...
Anonymous
11:13 AM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
I did a basic ceph v0.56 installation on Centos 6.3
I have rebooted my nodes, and find that ceph is not starting up a...
Anonymous
10:58 AM CephFS Bug #3681: kclient fsx fails nightly
Proposed fix to set i_size before the setattr request:
This will resolve the above issue, because the cap flush on...
Sam Lang
09:59 AM Bug #3683 (Can't reproduce): mon: leak of MMonPaxos
Joao Eduardo Luis
09:58 AM Bug #3683: mon: leak of MMonPaxos
I can't for the life of me reproduce this leak. In the meantime, Sage submitted a patch to msg/Pipe.cc [1] tha... Joao Eduardo Luis
07:17 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
I've been unable to reproduce this bug, but the cause was pretty obvious, so I pushed a fix that should deal with thi... Joao Eduardo Luis
03:39 AM Revision 62e721a9 (ceph): librados: add aio stat tests
Implement simple write-stat test, and a write-stat-remove-stat test cycle.
Signed-off-by: Filippos Giannakos <philip...
Filippos Giannakos
03:38 AM Revision 879578c1 (ceph): librados: implement aio_stat
Implement aio stat and also export this functionality to the C API.
Signed-off-by: Filippos Giannakos <philipgian@gr...
Filippos Giannakos
02:32 AM Revision 5b12b514 (ceph): osd: make missing head non-fatal during scrub
If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca...
Sage Weil
02:29 AM Revision e1da85f2 (ceph): rgw: Fix crash when FastCGI frontend doesn't set SCRIPT_URI
Fixes: #3735
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sylvain Munaut
02:28 AM Revision eba314a8 (ceph): rgw: fix handler leak in handle_request
Fixes: #3682
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
caleb miles
02:25 AM Revision 4483285c (ceph): librbd: Allow get_lock_info to fail
If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need...
Dan Mick
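As a rough illustration of the error handling this commit message describes, a caller could treat both return codes as a sign that the lock class may be absent. This is a sketch under assumptions, not the actual librbd change: the helper name is hypothetical, and it assumes the call returns a negative errno value on failure.

```python
import errno

# Codes the commit message associates with a missing lock class:
# EOPNOTSUPP on newer OSDs, EIO on older ones.
_LOCK_CLASS_MISSING = {errno.EOPNOTSUPP, errno.EIO}


def lock_class_probably_missing(ret):
    """Return True if a negative-errno result matches either code."""
    return ret < 0 and -ret in _LOCK_CLASS_MISSING
```

Because EIO can also signal a genuine I/O failure, a check like this is only a heuristic; a caller that continues without lock information should expect other operations to surface real errors.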
02:21 AM Revision 77ddf276 (ceph): doc/release-notes: v0.48.3argonaut
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:23 AM Bug #3770 (Resolved): OSD crashes on boot
One of my 0.56.1 OSDs crashed and couldn't boot: it was reaching tp_op heartbeats, and even after increasing that I w... Faidon Liambotis

01/08/2013

10:21 PM Feature #3769 (Resolved): osd: scrub should verify snap collection existence, membership
and, hopefully, backport this to argonaut Sage Weil
09:39 PM Feature #3651 (In Progress): osd: deep scrub should hash omap
David Zafman
07:57 PM Revision 573f5315 (ceph): marginal/multiclient: Matching tests for kclient
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
07:54 PM Revision 14385a66 (ceph): marginal/multiclient: Add three client cluster
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
07:51 PM Revision a4df5238 (ceph): marginal/multiclient: Adding ior test to marginal
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:36 PM Revision 1e03fe18 (ceph): marginal/multiclient: Add a test for fsx-mpi
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:23 PM Revision c07a4cb6 (ceph): marginal/multiclient: New task to run mdtest
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:11 PM Revision f17847e5 (ceph): task/kclient: chmod root to 1777.
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
05:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
I looked into this; it happens because clip_io() (called from read_iterate()) tries to validate
that writing at offs...
Dan Mick
03:23 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
while I am still surprised about why we ne...
Tamilarasi muthamizhan
04:45 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
logrotate for ceph (/etc/logrotate.d/ceph) uses perl commands
if perl is not installed, logrotate fails
if logrotat...
Anonymous
04:29 PM CephFS Bug #3597: ceph-fuse: denying root access
Is root actually a member of the fuse group? If not that would be correct behavior. Greg Farnum
04:07 PM Revision f8958463 (ceph): task/mpi: Allow working directory to be specified
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:46 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
ubuntu@burnupi24:/var/log/ceph$ ceph -v
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
...
Tamilarasi muthamizhan
03:42 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
I think so.
But first let's verify it passes.
Sage Weil
12:43 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Should we revert that teuthology commit, then? Greg Farnum
12:31 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
There was a bug in the kernel for o_creat permissions checking for non-root users. It's fixed in the testing branch. ... Sage Weil
10:49 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
This is weird. Tamil says this one has never passed, but we can both run it locally fine and it passes in the ceph-fu... Greg Farnum
09:39 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
I made a change to the cfuse task to chmod 1777 the ceph root dir after it's mounted. I think we should do the same f... Sam Lang
09:21 AM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
log: ubuntu@teuthology:/a/teuthology-2013-01-05_22:28:52-regression-next-testing-basic/35949
35949: (190s) collect...
Tamilarasi muthamizhan
03:34 PM Revision 16248121 (ceph): task: A task to setup mpi
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:33 PM Revision e88c0fc8 (ceph): task/ceph-fuse: chmod root to 1777
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:32 PM Revision 4ed20ae8 (ceph): task/pexec: Add barrier capability
This patch adds the ability to barrier between
parallel exec tasks so that all tasks will perform
the following step ...
Sam Lang
03:31 PM Revision 35320083 (ceph): task/pexec: More fixes for all case, exec on hosts
We don't want to do an exec per role, but per-host. We
were already doing an exec per host, but the names were confu...
Sam Lang
03:29 PM Revision 081a80f8 (ceph): task/pexec: Fix when 'all' is used
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:25 PM Revision d44fb147 (ceph): radosgw-admin.py: Increase test coverage to current admin feature set.
Signed-off-by: caleb miles <caleb.miles@inktank.com> caleb miles
12:58 PM Feature #3760: osd: maintain checksum on collection contents
It wasn't clear to me from the description, but we are of course talking about maintaining in the HashIndex a checksu... Greg Farnum
12:13 PM Feature #3760 (Rejected): osd: maintain checksum on collection contents
Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ... Samuel Just
12:33 PM RADOS Feature #3764 (New): osd: async replicas
The following is more a topic for conversation than a feature:
Currently, latency on any operation is limited by t...
Samuel Just
12:23 PM rbd Feature #3763 (Resolved): krbd: handle flattening of mapped image
An rbd client receives notice if the snapshot context for
a mapped rbd image has changed. It is possible for the
s...
Alex Elder
12:19 PM Linux kernel client Bug #3762 (Duplicate): kernel osd client: verify support for multiple ops per request
In order to support layered rbd images, the osd client needs
to support multiple ops in a single osd request.
Loo...
Alex Elder
12:15 PM rbd Feature #3761 (Resolved): kernel messenger: need to support multiple ops per request
The kernel messenger currently gets message data from either
a bio list or a page vector. That is one or the other,...
Alex Elder
12:13 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
Samuel Just
12:11 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ... Samuel Just
12:08 PM rbd Tasks #2853: krbd: read path
This task depends on the completion of the following others
before it can be completed:
3741 krbd: rework request ...
Alex Elder
12:07 PM Feature #3758 (Rejected): osd: incremental object checksumming
Currently, scrub can only compare the checksums between replicas. If an inconsistency is found between two replicas,... Samuel Just
12:07 PM rbd Subtask #2854: krbd: write path
Work on this won't really begin until the read path work
has completed (http://tracker.newdream.net/issues/2853).
Alex Elder
12:06 PM rbd Subtask #2854: krbd: write path
OK, I'm going to interpret this as:
Any write operation on a layered image will be preceded
by an existence c...
Alex Elder
12:04 PM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
Added tests to the _marginal_ qa suite that run IOR, mdtest, and fsx-mpi. Sam Lang
11:48 AM Feature #3756 (Duplicate): Watch/Notify cleanup
Samuel Just
11:41 AM Feature #3756 (Duplicate): Watch/Notify cleanup
The current design is rather fragile, particularly with respect to the locking and ref counting.
The result of this...
Samuel Just
11:47 AM Feature #3757 (Resolved): osd: Watch/Notify cleanup
The current design is rather fragile, particularly with respect to the locking and ref counting.
The result of this...
Samuel Just
11:24 AM Bug #3744: librbd: need to handle older OSDs that don't have cls_lock
Actually, rados lock list should continue to fail. Dan Mick
11:10 AM Documentation #3322: doc: Explain multi-tenant CephFS
Where is this located? I wasn't able to find it. Greg Farnum
11:00 AM rbd Tasks #3755 (Resolved): krbd: use new request tracking code for sync object operations
The last request type still using the old request tracking code
is for handling synchronous operations. There are t...
Alex Elder
10:58 AM rbd Feature #3754 (Closed): krbd: use new request tracking code for notify ack
Two request types remain that still use the old request
tracking mechanism. One of them is sending acknowledgements...
Alex Elder
09:54 AM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
On a mixed node cluster running argonaut[burnupi21,...
Tamilarasi muthamizhan
09:39 AM CephFS Feature #3543: mds: new encoding
I'm going to get started on this (mostly just figuring out current state, probably) today. Greg Farnum
09:28 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
Joao Eduardo Luis
06:54 AM Bug #3695 (In Progress): monitor crashed after an upgrade in Monitor::timecheck
Joao Eduardo Luis
08:47 AM Linux kernel client Bug #3751: krbd: fix type of snap_id local variable
I have a fix for this and I'll post it for review later
today....
Alex Elder
08:47 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
The type of the snap_id local variable in rbd_dev_v2_snap_info()
is defined with the wrong byte order.
Alex Elder
06:43 AM Bug #3748: ceph osd dump --format=json includes non-JSON line
One other option would be to provide "standard" fields for status output when using json, regardless of any other exp... Joao Eduardo Luis
05:08 AM Revision 920f82e8 (ceph): v0.48.3argonaut
Gary Lowell
04:51 AM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
I believe that the Ceph quick start guide should specify
@sudo service ceph -a start@
instead of the current
@...
caleb miles
04:51 AM Revision f07921be (ceph): doc/install: new URLs for argonaut vs bobtail
Also restructure the document a bit to make the choice of packages more
clear.
Signed-off-by: Sage Weil <sage@inktan...
Sage Weil
04:46 AM Revision 72674ad4 (ceph): doc/release-notes: v0.56.1
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:40 AM Bug #3747: PGs stuck in active+remapped
I did a "ceph osd out 0; sleep 30; ceph osd in 0" and out of those 61 active+remapped pgs, 5 went into active+remappe... Faidon Liambotis
12:14 AM Revision 1b194b25 (ceph): Merge branch 'wip-stripe-gran'
Reviewed-by: Greg Farnum <greg@inktank.com> Noah Watkins

01/07/2013

11:50 PM Revision 26e8438a (ceph): test: enforce -ENOTCONN contract in libcephfs
Tests all relevant calls for -ENOTCONN when used with an unmounted
ceph_mount_info param.
Signed-off-by: Noah Watkin...
Noah Watkins
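The contract being tested here (calls on an unmounted handle return -ENOTCONN) might look roughly like the sketch below. MountHandle is a hypothetical stand-in written for this example and is not the libcephfs API; only the errno constant comes from the standard library.

```python
import errno

class MountHandle:
    """Hypothetical stand-in for a mount handle; not the libcephfs API."""

    def __init__(self):
        self.mounted = False

    def mount(self):
        self.mounted = True
        return 0

    def stat(self, path):
        # Follow the C convention of returning a negative errno on failure.
        if not self.mounted:
            return -errno.ENOTCONN
        return 0
```

A test in this style checks the error return before mounting and the success return afterwards.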
11:49 PM Revision 5c58aa96 (ceph): libcephfs: return -ENOTCONN when call unmounted
Adds -ENOTCONN return value for stat, fchmod, fchown, lchown.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
11:16 PM Revision f83fcf63 (ceph): PG: set DEGRADED in Active AdvMap handler based on pool size
Otherwise, if the acting set does not change, the pg might
not show up as degraded if the pool size now exceeds the
a...
Samuel Just
11:04 PM Revision c4121093 (ceph): libcephfs: clarify interface return value
Document that ceph_get_stripe_unit_granularity may return an error code
(e.g. -ENOTCONN). The interface requires a mo...
Noah Watkins
09:33 PM Revision e4a54162 (ceph): v0.56.1
Gary Lowell
09:12 PM Revision c8f8c7e6 (ceph): Merge branch 'next'
Sage Weil
09:08 PM Revision 9aecacda (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim...
Sage Weil
09:08 PM Revision 299dbad4 (ceph): msgr: update Message envelope in encode, not write_message
Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi...
Sage Weil
09:08 PM Revision 35d2f583 (ceph): msg/Pipe: encode message inside pipe_lock
This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen...
Sage Weil
09:08 PM Revision 9b23f195 (ceph): msg/Pipe: associate sending msgs to con inside lock
Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes...
Sage Weil
09:08 PM Revision 6229b5a0 (ceph): msg/Pipe: fix msg leak in requeue_sent()
The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from comm...
Sage Weil
09:04 PM Revision 1b39b316 (ceph): Merge branch 'wip-3678-b' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Sage Weil
09:02 PM Revision 40706afc (ceph): msgr: update Message envelope in encode, not write_message
Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi...
Sage Weil
09:02 PM Revision d16ad926 (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim...
Sage Weil
09:01 PM Revision 6a00ce0d (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc...
Sage Weil
08:58 PM Revision 62586884 (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc...
Sage Weil
06:39 PM Revision 213e3559 (ceph): osd: fix race in do_recovery()
Verify that the PG is still RECOVERING or BACKFILL when we take the pg
lock in the recovery thread. This prevents a ...
Sage Weil
06:38 PM Revision e410d1a0 (ceph): ReplicatedPG: requeue waiting_for_ondisk in apply_and_flush_repops
Fixes: #3722
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
06:34 PM Revision 4c9f4c3c (ceph): ceph-fuse: rename ceph_ll_* to fuse_ll_*
To not conflict with future linuxbox pull for nfs-ganesha.
Signed-off-by: David Zafman <david.zafman@inktank.com>
Re...
David Zafman
04:04 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
Remove "synchronized" keyword from native interface. This was originally added when we were seeing some pthread mutex... Noah Watkins
03:58 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
ceph osd dump --format=json includes the non-JSON "dumped osdmap epoch N" at the top of the output, which of course b... Dan Mick
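A consumer that has to cope with this kind of output could skip any leading status lines before parsing. The sketch below is one possible workaround, not a fix for the tool itself; parse_json_after_status is a made-up helper, and the sample string is invented for illustration rather than real command output.

```python
import json

def parse_json_after_status(text):
    """Parse JSON from output that may start with non-JSON status lines."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # Treat the first line that opens an object or array as the JSON start.
        if line.lstrip().startswith(("{", "[")):
            return json.loads("\n".join(lines[i:]))
    raise ValueError("no JSON document found in output")

# Made-up sample; real output has many more fields.
sample = 'dumped osdmap epoch 12\n{"epoch": 12, "pools": []}\n'
```

The heuristic would misfire if a status line itself began with "{" or "[", which is one reason to prefer clean JSON on stdout.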
03:42 PM Bug #3747 (Closed): PGs stuck in active+remapped
About a week ago I doubled the number of OSDs in my cluster from 24 to 48 and, on the same day, adjusted CRUSH's defa... Faidon Liambotis
03:35 PM rbd Subtask #2854: krbd: write path
rbd write path. 'Guard' in the sense that the write has a check to verify the object already exists. Sage Weil
03:22 PM rbd Subtask #2854: krbd: write path
Pretty sure this is about the rbd locking and fencing. Greg Farnum
03:11 PM rbd Subtask #2854: krbd: write path
I'm about to mark bug 3418 as a duplicate of this one.
I'm adding the following from that bug here first.
I did...
Alex Elder
03:11 PM rbd Subtask #2854: krbd: write path
I'm not sure what "guard writes" is supposed to mean.
But I'm going to interpret it as simply implementing the
writ...
Alex Elder
03:26 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
Error coming from fsx:
INFO:teuthology.orchestra.run.out:Mapped Write: non-zero data past EOF (0xb826) page offset...
Sam Lang
03:14 PM rbd Feature #3419 (Duplicate): krbd: copy-up on write to clone
This is a duplicate of http://tracker.newdream.net/issues/2855. Alex Elder
03:14 PM rbd Subtask #2855: krbd: copy-up on write to clone
I don't know how to change the one-line bug description or I
would.
I need some clarification about the intended ...
Alex Elder
03:12 PM rbd Feature #3418 (Duplicate): krbd: write path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2854. Alex Elder
03:07 PM rbd Feature #3417 (Duplicate): krbd: read path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2853. Alex Elder
03:06 PM rbd Tasks #2853: krbd: read path
I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work o...
Alex Elder
03:05 PM rbd Feature #3416 (Duplicate): krbd: open parent on open
Marking this as a duplicate of http://tracker.newdream.net/issues/2852. Alex Elder
02:51 PM rbd Bug #3743: krbd: errors on submitted requests are ignored
If I could figure out how, I'd change the title of this
to say "krbd" rather than "rbd" to help make it clear
which...
Alex Elder
02:27 PM rbd Bug #3743 (Won't Fix): krbd: errors on submitted requests are ignored
When a Linux request comes down to the rbd driver via rbd_rq_fn(),
rbd_dev_do_request() is called after validating t...
Alex Elder
02:50 PM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
A Linux I/O request on an rbd image is broken into one or
more rbd requests, one request directed to each osd object...
Alex Elder
02:41 PM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
Older OSDs didn't have libcls_lock, and will fail lock operations; this means
virtually all rbd operations and rados...
Dan Mick
01:22 PM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
commit:e410d1a066b906cad3103a5bbfa5b4509be9ac37 Sage Weil
01:22 PM Bug #3736: kernel build: failures starting in 3.8-rc1
Sure enough, this is the commit that causes the problem:
af3df2c perf tools: Try to build Documentation when insta...
Alex Elder
11:48 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Looks like commit 6ca2a9c is the first one in that branch
that fails. It has a parent ce37f40 that succeeds.
I'v...
Alex Elder
10:24 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Heard back from Neil as well as Vlad Yasevich about my
proposed fix and they both ack'd it. Linus was in on
the di...
Alex Elder
09:07 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Despite a working build of the *kernel*, the package build
overall is still failing. It has something to do with bu...
Alex Elder
08:52 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Neil Horman sent a response to my message and suggested
three possible alternatives to fix the underlying problem,
...
Alex Elder
05:42 AM Bug #3736: kernel build: failures starting in 3.8-rc1
I changed our config file, found in the git repository
autobuild-ceph in the file "kernel-config" in the way
descri...
Alex Elder
05:40 AM Bug #3736: kernel build: failures starting in 3.8-rc1
I'm retroactively updating this so a bit about what's been
done gets documented.
The problem was in the Kconfig f...
Alex Elder
05:35 AM Bug #3736 (Resolved): kernel build: failures starting in 3.8-rc1
Kernels as of version 3.8-rc1 are not properly building in
autobuilder. The initial symptom was that the config pha...
Alex Elder
01:16 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
Sage Weil
01:16 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
commit:1b39b31678aea8c5bbdb38811b3919525228d10f Sage Weil
01:01 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
Sage Weil
12:19 PM CephFS Cleanup #3742 (Resolved): Remove old Hadoop wrappers and configuration options
I think it's likely that the current Hadoop shim is at least at feature parity with the old wrappers. Noah Watkins
12:16 PM Bug #3702: OSD SIGABRT during startup
Dan Mick wrote:
> Is this related to rbd, or should it be in category 'ceph'?
Ah, yes, it should. Thank you for c...
Justin Lott
11:31 AM Bug #3702: OSD SIGABRT during startup
Is this related to rbd, or should it be in category 'ceph'? Dan Mick
12:07 PM rbd Subtask #3741: krbd: rework request tracking code
... Alex Elder
11:54 AM rbd Subtask #3741 (Resolved): krbd: rework request tracking code
This is actually work that's mostly complete, but it never
got a bug assigned to it.
In order to handle layering ...
Alex Elder
11:26 AM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
this is probably #3734, now fixed. Sage Weil
11:09 AM rbd Subtask #2852: krbd: open parent on open
This work is essentially done, and has been since
October 2012 (or even earlier). However I held off
posting it fo...
Alex Elder
11:00 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
Our current ceph-client tree is based on Linux 3.6.
That is fairly old code (late September, 2012). We
should upda...
Alex Elder
10:12 AM Feature #3739 (Resolved): osd: repair object size vs object_info_t mismatches
If the object_info_t size doesn't match the on-disk file/object size, we need to repair it. This means proposing a s... Sage Weil
10:02 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
Noah Watkins
10:02 AM CephFS Bug #3726 (Closed): Enforce Ceph's minimum stripe size in the java bindings
Noah Watkins
09:21 AM CephFS Bug #3738 (Resolved): kclient fsx truncate/write multi-client race
This bug is similar to #3681, but occurs only in the non-exclusive case (multiple clients), where a truncate doesn'...
Sam Lang
09:09 AM CephFS Bug #3681: kclient fsx fails nightly
The race here is between a truncate down, and completion of osd write ops triggering a cap flush. The exact order th... Sam Lang
06:30 AM rbd Bug #3737 (Resolved): Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Hi Josh,
as per our short conversation in IRC-#ceph there is an issue with latency/responsiveness with rbd_cache e...
Oliver Francke
04:38 AM Revision 4cfc4903 (ceph): msg/Pipe: encode message inside pipe_lock
This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen...
Sage Weil
04:38 AM Revision a058f161 (ceph): msg/Pipe: associate sending msgs to con inside lock
Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes...
Sage Weil
04:38 AM Revision 2a1eb466 (ceph): msg/Pipe: fix msg leak in requeue_sent()
The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:18 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Here's the fix I used on my system to fix the problem. The S3 service is set at the root of the virtual server so "" ... Sylvain Munaut
03:07 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
I'm using lighttpd as a FastCGI front end for radosgw and it doesn't set the SCRIPT_URI environment variable.
So the ...
Sylvain Munaut

01/06/2013

10:50 PM Bug #3734 (Fix Under Review): osd/objecter: misdirected op in librados api tests
wip-3734 Sage Weil
10:41 PM Bug #3734: osd/objecter: misdirected op in librados api tests
epoch 328:... Sage Weil
10:15 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
... Sage Weil
03:10 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
this was #3731 Sage Weil
02:38 PM Bug #3722: osd: indefinitely hung request on stable cluster
Sage Weil
02:34 PM Bug #3678 (Fix Under Review): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MN...
YAY, wip-3678 is consistently passing now. Sage Weil
05:37 AM Revision a10950f9 (ceph): os/FileJournal: include limits.h
Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ce49968938ca3636f48fe5431...
Sage Weil
04:54 AM Revision ce499689 (ceph): os/FileJournal: include limits.h
Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil

01/05/2013

09:32 PM Feature #3733 (Closed): osd: update leveldb submodule
Sage Weil
07:17 PM Revision e9efa332 (ceph): java: add stripe unit granularity tests
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:12 PM Revision ececcf57 (ceph): java: update javadoc comments
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:10 PM Revision cdd138da (ceph): java: fix whitespace
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:08 PM Revision abcda95b (ceph): libcephfs: expose stripe unit granularity
Assists clients in choosing layout parameters.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
07:08 PM Revision 6954bf33 (ceph): java: add support for get_stripe_unit_granularity
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Joe Buck
06:47 PM Documentation #3389 (In Progress): doc: crush docs could use a full example crushmap
John Wilkins
10:02 AM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Do we have a test that checks our interfaces to
automatically catch inadvertent protocol changes?
If not, we should.
Alex Elder
09:04 AM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
commit:988a52173522e9a410ba975a4e8b7c25c7801123 Sage Weil
09:04 AM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
commit:28d59d374b28629a230d36b93e60a8474c902aa5 Sage Weil
09:03 AM Bug #3698 (Resolved): filestore: ENOENT on clone
commit:e89b6ade63cdad315ab754789de24008cfe42b37 Sage Weil
08:27 AM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
Report the rate of recovery (objects and bytes per second) via the monitor, presumably via 'ceph -w' and similar inte... Sage Weil
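A rate like the one requested here can be derived from two cumulative counter samples. The sketch below assumes samples of the form (time in seconds, objects recovered, bytes recovered); that tuple layout and the recovery_rate name are assumptions for illustration, not how the OSD or monitor actually tracks recovery.

```python
def recovery_rate(prev, curr):
    """Return (objects/sec, bytes/sec) between two (time, objects, bytes) samples."""
    t0, objs0, bytes0 = prev
    t1, objs1, bytes1 = curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (objs1 - objs0) / elapsed, (bytes1 - bytes0) / elapsed
```

For example, 40 objects and 8192 bytes recovered over 2 seconds gives 20 objects per second and 4096 bytes per second.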
04:48 AM Revision 415294c0 (ceph): Merge branch 'next'
Sage Weil
04:47 AM Revision cd194ef3 (ceph): osd: special case CALL op to not have RD bit effects
In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ...
Sage Weil
04:47 AM Revision 921e06de (ceph): Revert "OSD: remove RD flag from CALL ops"
This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati...
Sage Weil
04:46 AM Revision 988a5217 (ceph): osd: special case CALL op to not have RD bit effects
In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ...
Sage Weil
04:46 AM Revision d3abd0fe (ceph): Revert "OSD: remove RD flag from CALL ops"
This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati...
Sage Weil
03:51 AM Revision 3a940874 (ceph): libcephfs: delete client after messenger shutdown
Prevents race between messages being dispatched to the client after the
client has been free'd.
Signed-off-by: Noah ...
Noah Watkins
02:02 AM Revision 0978dc49 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-...
Dan Mick
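The pattern this commit describes (report failure on error, report completion only on success) can be sketched as follows. ProgressContext here is a hypothetical Python stand-in that only mirrors the finish()/fail() idea from the commit message; it is not the rbd tool's class, and the message strings are invented.

```python
class ProgressContext:
    """Hypothetical progress reporter; only mirrors the finish()/fail() idea."""

    def __init__(self):
        self.messages = []

    def update(self, percent):
        self.messages.append(f"{percent}% complete")

    def finish(self):
        self.messages.append("100% complete...done.")

    def fail(self):
        self.messages.append("failed.")

def do_copy(copy_chunks, pc):
    """Run copy_chunks(pc); report success or failure, never both."""
    try:
        copy_chunks(pc)
    except OSError as exc:
        pc.fail()           # error path: do not claim completion
        return -(exc.errno or 1)
    pc.finish()             # success path only
    return 0
```

Keeping finish() out of the error path is what stops a failed copy from printing a 100% completion message.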

01/04/2013

09:45 PM Revision 7513e971 (ceph): ReplicatedPG: remove old-head optization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:44 PM Revision e89b6ade (ceph): ReplicatedPG: remove old-head optization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:37 PM Revision 6a3d475c (ceph): Merge remote branch 'origin/wip-rbd-watch'
Reviewed-by: Dan Mick <dan.mick@inktank.com> Josh Durgin
08:32 PM Revision cd5f2bfd (ceph): ObjectCacher: fix off-by-one error in split
This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which wou...
Josh Durgin
07:54 PM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
commit:3a9408742a8a6cbc870cba543a208285f1a6cec1 Sage Weil
03:25 PM CephFS Bug #3666: Segfault running test_libcephfs
I pushed a new wip-client-shutdown. This switches the clean-up order of client/messenger in libcephfs, rather than mo... Noah Watkins
01:36 PM CephFS Bug #3666: Segfault running test_libcephfs
Right, I think your fix will work, but it breaks the interface abstraction (messenger is created above the client, de... Sam Lang
01:16 PM CephFS Bug #3666: Segfault running test_libcephfs
This is what I'm running to reproduce the error. It's been running now for an hour on wip-client-shutdown without any... Noah Watkins
12:57 PM CephFS Bug #3666: Segfault running test_libcephfs
Rather than moving messenger shutdown into client shutdown? Noah Watkins
12:48 PM CephFS Bug #3666: Segfault running test_libcephfs
A similar issue was just handled in the ceph_fuse.cc code. There we just delay deleting the client till the end. Yo... Sam Lang
10:41 AM CephFS Bug #3666: Segfault running test_libcephfs
During unmount, the client is shut down and freed before the messenger. If any messages are delivered after the clien... Noah Watkins
07:07 PM Revision 802c486f (ceph): config: change default log_max_recent to 10,000
Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: S...
Sage Weil
06:18 PM Revision d6496abf (ceph): remove rbd_header_race test
This no longer works since export does not do a watch, and the race is
being closed a different way not detectable by...
Josh Durgin
06:16 PM Revision 620dd551 (ceph): task: mon_clock_skew_check.py: Check for clock skews on the monitors
Will run for as long as teuthology runs. By default, fails if any clock
skews higher than 0.05 seconds are detected, ...
Joao Eduardo Luis
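The threshold check described in this commit message could be expressed along these lines. The 0.05 second default comes from the message above; the find_excessive_skews name and the dict-of-skews input are assumptions for the sketch, not the teuthology task's actual interface.

```python
def find_excessive_skews(skews, max_skew=0.05):
    """Return monitors whose absolute clock skew (seconds) exceeds max_skew."""
    return {mon: s for mon, s in skews.items() if abs(s) > max_skew}
```

A caller would fail the run whenever the returned dict is non-empty; using abs() treats a monitor running behind the same as one running ahead.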
06:11 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
commit:0978dc4963fe441fb67afecb074bc7b01798d59d Dan Mick
03:12 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre...
Tamilarasi muthamizhan
06:06 PM Bug #3702: OSD SIGABRT during startup
Sage Weil wrote:
> Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
...
Justin Lott
09:42 AM Bug #3702 (Need More Info): OSD SIGABRT during startup
Sage Weil
05:54 PM Revision 1a878611 (ceph): regression: include nfs suite
Sage Weil
05:50 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
got msgr logs in ubuntu@teuthology:/a/sage-a3/34724, but the crash looked different from the earlier ones (whose logs... Sage Weil
05:40 PM Bug #3731 (Fix Under Review): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompati...
see wip-3731 Sage Weil
05:19 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Agreed. And let's make sure it's fixed for 0.56.1.
Sage Weil
05:15 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Discussed this with Dan and Sam and I think we just want to roll this patch back and tell people not to use v0.56 for... Greg Farnum
04:34 PM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
CEPH_OSD_OP_CALL changed to remove the CEPH_OSD_OP_MODE_RD bit in
91e941aef9f55425cc12204146f26d79c444cfae; however,...
Dan Mick
05:03 PM Revision e88b909a (ceph): task: ceph_manager: add 'get_mon_health' function
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
03:29 PM CephFS Feature #3730 (Closed): Support replication factor in Hadoop
In order to support per-file replication values in Hadoop we need to specify that a new file should be generated in a... Noah Watkins
02:38 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5 Josh Durgin
11:31 AM rbd Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
in branch wip-rbd-watch Josh Durgin
01:54 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
Also, name it something along the lines of get_stripe_granularity() and not .._min(imum)_ as that isn't entirely accu... Anonymous
01:40 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
After a discussion on jabber, the decision is to go with exposing a function call in libcephfs and then using that in... Anonymous
11:09 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
The Hadoop bindings are using the blocksize as the stripe size. If a block size is explicitly passed down, it ends up... Anonymous
01:00 PM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
Heads up, Zheng Yan's patches on the mds fix issues related to running multiclient dbench tests. Sam Lang
12:24 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hmm, okay. I wasn't real clear on the previous bugs so I'll need to look at it more if I end up taking this, but soun... Greg Farnum
11:46 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thin...
Sage Weil
11:35 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thing set_trace() does. Ex... Greg Farnum
11:17 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Am I reading it correctly that this is just going to be doing the config and wrapper work to no...
Sage Weil
09:01 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Am I reading it correctly that this is just going to be doing the config and wrapper work to not call set_trace() in ... Greg Farnum
12:20 PM CephFS Feature #3543: mds: new encoding
Sage Weil
12:20 PM CephFS Feature #3728: mds: draft design for lookup by ino
Sage Weil
12:14 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
12:20 PM CephFS Feature #3570: teuthology: mds thrasher
Sage Weil
12:06 PM CephFS Feature #3727 (Resolved): mds: refactor EMetablob encoding paths
Right now, the EMetaBlob sub-structures — for performance reasons — use an encoding pattern that doesn't match anythi... Sage Weil
11:42 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
Greg Farnum wrote:
> I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should e...
Sage Weil
09:34 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should encompass. Are you talk... Greg Farnum
11:41 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
This was a whiteboard discussion 2 years ago. Nothing was written down. We should reopen new and more detailed issu... Sage Weil
09:29 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
Where are the results of this bug? It's marked resolved but I don't see any fsck references in the git tree, and ther... Greg Farnum
11:39 AM Feature #685: libcephmon: interact with ceph monitors via a library
BTW it may make sense to push the client command stuff in the ceph tool into MonClient, and then wrap that in libceph... Sage Weil
11:38 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Greg Farnum wrote:
> Do we have a separate bug for the library calls this needs?
#685, which would take the clien...
Sage Weil
09:27 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Do we have a separate bug for the library calls this needs? Greg Farnum
11:36 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
Greg Farnum wrote:
> And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the...
Sage Weil
09:24 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the Ceph shim or anything... Greg Farnum
11:35 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Greg Farnum wrote:
> Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph moun...
Sage Weil
09:24 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph mount and then running tes... Greg Farnum
11:34 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Greg Farnum wrote:
> Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a...
Sage Weil
09:22 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a cluster going with vs... Greg Farnum
11:32 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
Greg Farnum wrote:
> What all does this encompass? Design? Implementation? Does it need to be an online switch or ca...
Sage Weil
09:13 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
What all does this encompass? Design? Implementation? Does it need to be an online switch or can it be an offline job? Greg Farnum
11:30 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Greg Farnum wrote:
> Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect ...
Sage Weil
09:12 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect so, but the presence o... Greg Farnum
11:23 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
Josh Durgin
10:32 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734... Tamilarasi muthamizhan
11:23 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Greg Farnum wrote:
> Do we have any kind of design for this? We've talked about it some and it's conceptually simple...
Sage Weil
09:08 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Do we have any kind of design for this? We've talked about it some and it's conceptually simple, but splitting up the... Greg Farnum
11:15 AM CephFS Feature #626 (In Progress): qa: add IOR, rompio, or other parallel workloads suite
Yeah, that's what slang's working on to enable this. Assigning this to him. Sage Weil
08:57 AM CephFS Feature #626: qa: add IOR, rompio, or other parallel workloads suite
SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sur... Greg Farnum
11:14 AM Bug #3722: osd: indefinitely hung request on stable cluster
the trigger is a brief osd reset due to an intermittent network outage. no actual ceph-osd daemons restart.
<pr...
Sage Weil
09:39 AM Bug #3722 (Need More Info): osd: indefinitely hung request on stable cluster
Sage Weil
08:36 AM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
0.48.2argonaut, rbd workload.
occasional requests are blocked indefinitely.
*may* be osd down/up cycles (due to...
Sage Weil
11:13 AM CephFS Feature #3621 (Resolved): qa: add knfsd reexport tests to qa suite
Sage Weil
10:53 AM Bug #3723: ceph osd down command reports incorrectly
similarly for "ceph osd in" command as well
ubuntu@burnupi06:/etc/ceph$ sudo ceph osd in 2 -k /etc/ceph/ceph.key...
Tamilarasi muthamizhan
09:33 AM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
issuing the command: "sudo ceph osd down 2" reports osd.2 is already down but sudo ceph osd stat reports all are up.
...
Ken Franklin
10:21 AM Bug #3698 (In Progress): filestore: ENOENT on clone
Sage Weil
09:43 AM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
commit:4ae4dce5c5bb547c1ff54d07c8b70d287490cae9 Sage Weil
09:43 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
Oh, those are librados specific numbers, aren't they. So this bug is to create and expose a libceph version, then. Wh... Greg Farnum
09:35 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
In libcephfs there is a call to get Ceph version (yes, just expose this). But, I recall Sage mentioning that it might... Noah Watkins
09:19 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
This is just exposing the librados version() function to Java, right? Greg Farnum
09:41 AM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
The only radosgw-admin API calls currently are *get usage* and *trim usage*. The docs at
http://ceph.com/doc...
caleb miles
09:41 AM CephFS Cleanup #660: mds: use helpers in mknod, mkdir, openc paths
What kind of helpers are you talking about with this? inode fetchers and lock grabbers? In a quick scan over handle_c... Greg Farnum
09:36 AM CephFS Feature #603: mds: repair directory hierarchy
This is part of #82 fsck, right? Do we have a more detailed algorithm anywhere? Greg Farnum
05:02 AM Revision 39a734fb (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
04:17 AM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> Nat, you should be able to install either of libtcmalloc-minimal or libgoogle-perftools — are...
Nat Makarevitch
03:40 AM Revision c63c6646 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
03:00 AM Revision acfa0c9a (ceph): mds: optimize C_MDC_RetryOpenRemoteIno
When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish
context for discovering remote inode. When it...
Yan, Zheng
02:45 AM Revision b03eab22 (ceph): mds: forbid creating file in deleted directory
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 59953257 (ceph): mds: keep dentry lock in sync state as much as possible
Unlike locks of other types, dentry lock in unreadable state can block path
traverse, so it should be in sync state a...
Yan, Zheng
02:45 AM Revision f9280cb6 (ceph): mds: fix replica state for LOCK_MIX_LOCK
LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state
should be LOCK_MIX.
Signed-off-by: Yan, ...
Yan, Zheng
02:45 AM Revision 248e4ab8 (ceph): mds: fix cap mask for ifile lock
ifile lock has 8 cap bits, so its cap mask should be 0xff
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Yan, Zheng
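The relationship between a bit count and its mask, as stated in the commit message above, can be checked with a short computation. This is a minimal sketch; `cap_mask` is a hypothetical helper name and not a function from the Ceph tree.

```python
def cap_mask(num_bits: int) -> int:
    """Return an integer with the lowest num_bits bits set."""
    if num_bits < 0:
        raise ValueError("num_bits must be non-negative")
    return (1 << num_bits) - 1


# 8 cap bits give a mask of 0xff, matching the commit message.
ifile_mask = cap_mask(8)
print(hex(ifile_mask))
```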
02:45 AM Revision 420f3355 (ceph): mds: rdlock prepended dest trace when handling rename
rdlock prepended dest trace to prevent them from being xlocked by
someone else.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision ea2fd127 (ceph): mds: check null context in CDir::fetch()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 3705c7ca (ceph): mds: drop locks when opening remote dentry
Opening remote dentry while holding locks may cause deadlock. For example,
'discover' is blocked by a xlocked dentry...
Yan, Zheng
02:45 AM Revision ca4dc4db (ceph): mds: check if stray dentry is needed
The necessity of stray dentry can change before the request acquires
all locks.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision acbe6d97 (ceph): mds: don't issue caps while inode is exporting caps
If we issue caps while the inode is exporting caps, the client will drop the
caps soon when it receives the CAP_OP_EXPORT me...
Yan, Zheng
02:45 AM Revision d379ac8e (ceph): mds: disable concurrent remote locking
Current code allows multiple MDRequests to concurrently acquire a
remote lock. But a lock ACK message wakes all reque...
Yan, Zheng
01:15 AM Revision 28d59d37 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
12:23 AM Revision 49416619 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
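The difference between signalling a single waiter and broadcasting to all of them can be illustrated with Python's `threading.Condition`. This is only an analogy for the behaviour the commit message describes, not the Ceph log code; `wake_waiters` and its parameters are hypothetical names chosen for the sketch.

```python
import threading
import time


def wake_waiters(num_waiters: int, broadcast: bool) -> int:
    """Park num_waiters threads on one condition, signal it once, and
    return how many of them were woken by that signal (not by timeout)."""
    cond = threading.Condition()
    state = {"waiting": 0, "woken": 0}

    def waiter() -> None:
        with cond:
            state["waiting"] += 1
            # wait() releases the lock atomically and returns True only
            # if the thread was notified before the timeout expired.
            if cond.wait(timeout=0.5):
                state["woken"] += 1

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()

    # Signal only once every waiter is parked inside wait().
    while True:
        with cond:
            if state["waiting"] == num_waiters:
                if broadcast:
                    cond.notify_all()   # broadcast: wake every waiter
                else:
                    cond.notify()       # signal: wake a single waiter
                break
        time.sleep(0.01)

    for t in threads:
        t.join()
    return state["woken"]


print(wake_waiters(3, broadcast=False))
print(wake_waiters(3, broadcast=True))
```

With a single signal, the remaining waiters only return when their timeout expires; with a broadcast, every waiter gets a chance to re-check its own condition.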
12:13 AM Revision f1e0305f (ceph): doc: Removed the --without-tcmalloc flag until further advised.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:07 AM Revision 19df2086 (ceph): Merge pull request #30 from rca/master
Minor clarification in docs. Sage Weil

01/03/2013

11:04 PM Revision 5ce47c2a (ceph): ssh_keys.py: pull the keys out of targets entry
rather than the hosts known hosts file.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@i...
Joe Buck
10:51 PM Revision 88af7d18 (ceph): doc: Added defaults for PGs, links to recommended settings, and updated...
Fixes: #3555
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
10:32 PM Revision b8f061dc (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
10:18 PM Revision 4ae4dce5 (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
09:30 PM Revision 73bc8ffc (ceph): doc: Added comments on --without-tcmalloc option when building Ceph.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:30 PM Revision 37b57cdf (ceph): Update doc/rados/configuration/filesystem-recommendations.rst
Clarified when it's necessary to use the setting:
filestore xattr use omap = true
rca
09:29 PM Revision 43ef6772 (ceph): doc: Added some packages to the copyable line.
Fixes: #3686
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
09:28 PM Revision 333ae82c (ceph): doc: Fixed syntax error.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:57 PM Revision aaa03bbc (ceph): qa: Add knfsd reexport suite
Feature http://tracker.newdream.net/issues/3621
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman
08:55 PM Revision 67968d11 (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
08:54 PM Revision 34266e6b (ceph): osd: let pgs process map advances before booting
The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
Sage Weil
08:53 PM Revision 4034f6c8 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
08:53 PM Revision 7e94f6f1 (ceph): Merge remote-tracking branch 'gh/wip-3714-b' into next
Signed-off-by: Samuel Just <sam.just@inktank.com> Sage Weil
08:44 PM Revision 224a33bb (ceph): qa/workunit: Add dbench-short.sh for nfs suite
A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed...
David Zafman
08:13 PM Documentation #3709 (In Progress): crush-map.rst: claims 'types' are default, not true (must be s...
John Wilkins
02:32 PM Documentation #3709: crush-map.rst: claims 'types' are default, not true (must be specified); spe...
These are "defaults" in the sense that they're generated as part of the default OSD Map. Apparently that needs to be ... Greg Farnum
07:57 PM Documentation #3707 (In Progress): crush-map.rst: syntax error in example
John Wilkins
05:54 PM Bug #3702: OSD SIGABRT during startup
Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else? Sage Weil
05:45 PM Bug #3721: filestore: op_seq written in wrong order on non-btrfs
Sage Weil
04:02 PM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
see wip-fsync Sage Weil
05:23 PM Revision f8bb4814 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
05:14 PM Revision eee795c0 (ceph): rbd_xfstests.yaml: drop test 186
Stop running test 186. It keeps failing in nightly runs, unable
to unmount the scratch file system during setup. As...
Alex Elder
04:47 PM rgw Documentation #2993 (Resolved): doc: write quick RGW guide (if feasible)
John Wilkins
04:45 PM devops Feature #2884: doc: osd hotplugging
I believe the hotplug event was added, but will confirm. John Wilkins
04:43 PM devops Documentation #2974: doc: update chef docs for mon key distribution
I believe this is done. Will verify. John Wilkins
04:13 PM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> John, can you remove that --without-tcmalloc bit until we hear more?
>
> Nat, you should be ...
John Wilkins
02:48 PM devops Documentation #3686 (In Progress): install prerequisites (Debian)
John, can you remove that --without-tcmalloc bit until we hear more?
Nat, you should be able to install either of ...
Greg Farnum
02:45 PM devops Documentation #3686: install prerequisites (Debian)
Eek. We really, really want people to be using tcmalloc (memory behavior without it is astonishingly atrocious). I kn... Greg Farnum
01:31 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
Added packages to the copyable lines. Modified the build page to include --without-tcmalloc. John Wilkins
03:50 PM Bug #3698: filestore: ENOENT on clone
Ok. The recovery_qos stuff can allow a client op to reorder past a push. This is a problem since the push might be ... Samuel Just
07:53 AM Bug #3698: filestore: ENOENT on clone
another instance with logs: ubuntu@teuthology:/a/sage-a2/33879 Sage Weil
02:52 PM Documentation #3555 (Resolved): {page-num} in ceph osd pool create is not optional
Updated the document to add "required," the default values, a link to calculating PG values, clarification about PGP,... John Wilkins
02:49 PM Bug #3633: mon: clock drift errors not reported by ceph status
The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based enti... Greg Farnum
10:12 AM Bug #3633: mon: clock drift errors not reported by ceph status
The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might ... Joao Eduardo Luis
08:46 AM Bug #3633: mon: clock drift errors not reported by ceph status
Reading the patch it looks like only the clocks of the mons are checked. So the clocks of the osds are not important to ce... Corin Langosch
02:34 PM Bug #3720: Ceph Reporting Negative Number of Degraded objects
Per Josh D's suggestion, I set the tunables and it resolved the issue.
# ceph osd getcrushmap -o /tmp/crush
# cru...
Mike Dawson
01:02 PM Bug #3720 (Duplicate): Ceph Reporting Negative Number of Degraded objects
Changed the replication of two pools from 2x to 3x. Cluster rebalanced to nearly HEALTH_OK but got stuck at:
HEALT...
Mike Dawson
02:32 PM rbd Bug #3697: rbd copy.sh test failing in nightly
When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te... Dan Mick
01:59 PM CephFS Bug #3597: ceph-fuse: denying root access
I believe that we can reproduce this error. We are running Ubuntu 12.04 LTS Server on both the client and on the Cep... Graham Hemingway
12:56 PM CephFS Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
logs: ubuntu@teuthology:/a/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33621... Tamilarasi muthamizhan
12:53 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
commit:7e94f6f1a7b7a865433edacd6a521f6ea1170eac Sage Weil
10:28 AM Bug #3714 (Fix Under Review): osd: new peering code does not consume osdmaps prior to booting
Sage Weil
12:48 PM CephFS Bug #3718 (Rejected): multi-client dbench gets stuck over NFS exported cephfs
When running qa/workunit dbench.sh the dbench 1 passes, but the dbench 10 gets hung up.
We should check this with ...
David Zafman
12:28 PM CephFS Feature #3621 (In Progress): qa: add knfsd reexport tests to qa suite
David Zafman
09:49 AM RADOS Feature #3717 (New): osd: Make Rebalancing Smarter
From Corin Langosch - During recovery/rebalancing it can happen that an osd receives lots of new data before data tha... Ian Colle
09:45 AM Bug #3716: recovery should take osd usage into account
1. My cluster already uses the tuned crushmap "crushtool -i /tmp/crush --set-choose-local-tries 0 --set-choose-local-... Corin Langosch
09:36 AM Bug #3716 (Closed): recovery should take osd usage into account
#1: this is a matter of adjusting the crush tunables. see http://ceph.com/docs/master/rados/operations/crush-map/?hig... Sage Weil
09:08 AM Bug #3716 (Closed): recovery should take osd usage into account
Using argonaut 0.48.2. Yesterday one osd crashed (disk io error) and recovery started as expected. All osds had an us... Corin Langosch
09:44 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
Joao,
thanks for the update.
Since mine came about due to a testing environment built on DHCP, I did not have the ...
Anonymous
09:32 AM CephFS Bug #3681: kclient fsx fails nightly
It's most likely all the same bug, but fsx fails in different ways each time (always because of a truncate down).  The... Sam Lang
09:27 AM CephFS Feature #3543: mds: new encoding
right. about 80% complete, see wip-mds-encoding. Sage Weil
09:22 AM CephFS Feature #3543: mds: new encoding
What is this task? Switching to use our versioned encoding scheme? Greg Farnum
09:17 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,...
Alex Elder
06:39 AM Revision a32d6c5d (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
06:20 AM Revision 0bfad8ef (ceph): osd: let pgs process map advances before booting
The OSD deliberately consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
Sage Weil
06:04 AM Revision 5fc94e89 (ceph): osd: drop oldest_last_clean from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:04 AM Revision 67f7ee67 (ceph): osd: drop unused variables from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:09 AM Revision a14a36ed (ceph): OSDMap: fix modifed -> modified typo
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:44 AM Revision 9ca69e73 (ceph): ceph: malloc check =3 means we hear on stderr too
Sage Weil
03:58 AM Revision 2141454e (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
02:13 AM Revision 6b5a89d2 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:01 AM Revision 43cba617 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil

01/02/2013

11:59 PM Revision 29ff87a5 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
11:58 PM Revision 64d2760a (ceph): doc: Added a memory profiling section. Ported from the wiki.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:57 PM Revision 5066abf1 (ceph): doc: Added memory profiling to the index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:08 PM Revision 0e9a0cd7 (ceph): qa/workunit: Update pjd script to use new tarball
The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-of...
Sam Lang
11:07 PM Bug #3715: Crash during 0.55 -> 0.56 upgrade
is someone sending an MOSDOp that has no ops? init_op_flags() is called before can_*(), so this sounds like an empty... Sage Weil
10:05 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
I started upgrading my 0.55.1 cluster to 0.56 and at one point in the middle of the upgrade, all 0.55.1 OSDs started ... Faidon Liambotis
10:38 PM Revision d8940d15 (ceph): fuse: Fix cleanup code path on init failure
With the changes from 856f32ab, the cfuse.init call returns
a _positive_ errno, which was getting ignored. Also, if ...
Sam Lang
10:15 PM Revision c4370ff0 (ceph): librbd: establish watch before reading header
This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou...
Josh Durgin
09:56 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ... Dan Mick
09:39 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
Sage Weil
09:42 PM Revision 93656013 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 483c6f76adf960017614a8641c4dcdbd7902ce33)
Sage Weil
09:42 PM Revision be0473bb (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c461e7fc1e34fdddd8ff8833693d067451df906b)
Sage Weil
09:42 PM Revision de619327 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
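The batching rule in the commit message above (at most IOV_MAX-1 segments per submission, with only the last submission carrying the seq) can be sketched as follows. This is an illustration under stated assumptions, not the FileJournal code: `split_submission` is a hypothetical name, and the value 1024 for `IOV_MAX` is an assumed, platform-dependent limit.

```python
from typing import List, Optional, Tuple

IOV_MAX = 1024  # assumed value; the real limit depends on the platform

Batch = Tuple[List[bytes], Optional[int]]


def split_submission(segments: List[bytes], seq: int,
                     iov_max: int = IOV_MAX) -> List[Batch]:
    """Split segments into batches of at most iov_max - 1 entries.

    Each batch is (segments, seq_or_None); only the final batch carries
    the seq, so completion is attributed to the last submission alone.
    """
    limit = iov_max - 1
    if limit < 1:
        raise ValueError("iov_max must be at least 2")

    batches: List[Batch] = []
    for start in range(0, len(segments), limit):
        batches.append((segments[start:start + limit], None))

    if batches:
        last_segments, _ = batches[-1]
        batches[-1] = (last_segments, seq)
    return batches


batches = split_submission([b"x"] * 2500, seq=7)
print([len(segs) for segs, _ in batches])
```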
09:42 PM Revision ded454c6 (ceph): os/FileJournal: logger is optional
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 076b418c7f03c5c62f811fdc566e4e2b776389b7)
Sage Weil
09:42 PM Revision 9a1cf518 (ceph): Merge branch 'wip-journal-aio' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
Backport: bobtail
Sage Weil
09:39 PM Revision dda7b651 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
09:39 PM Revision c461e7fc (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:39 PM Revision 483c6f76 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:34 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
Previously when we handled the old osdmaps catching up (pre-MOSDBoot) we'd do advance_map and the pgs would update th... Sage Weil
08:32 PM Revision e0858fa8 (ceph): Revert "librbd: ensure header is up to date after initial read"
Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th...
Josh Durgin
08:31 PM Revision 06310994 (ceph): ceph: enable malloc debugging for ceph-osd
Sage Weil
07:49 PM Revision 3686371e (ceph): rados: add test_filejournal
This writes to /tmp by default; should be ok on plana, since it's / and not
tmpfs.
Sage Weil
07:24 PM Revision 82297706 (ceph): doc: Minor edits.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:15 PM Revision d3b9803e (ceph): doc: Fixed typo, clarified usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
05:23 PM rbd Bug #3685: xfs test 186 fails in the nightlies
It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u...
Alex Elder
02:34 PM rbd Bug #3685: xfs test 186 fails in the nightlies
OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera...
Alex Elder
12:06 PM rbd Bug #3685: xfs test 186 fails in the nightlies
Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co...
Alex Elder
11:56 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x...
Alex Elder
05:15 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
reproduced this on burnupi21. Tamilarasi muthamizhan
05:00 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
with glibc malloc and debug enabled:... Sage Weil
08:57 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another one with full osd logs:... Sage Weil
04:13 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
This has been ported. I haven't added a valgrind use case yet. John Wilkins
01:20 PM Documentation #3687 (In Progress): Documentation needs a "memory profiling" section
John Wilkins
03:51 PM Feature #3713 (Rejected): ceph osd tree should show disk usage
As ceph seems to already monitor the disk usage of each osd it'd be great to have it displayed in "ceph osd tree". Corin Langosch
03:08 PM rbd Bug #3619: librbd: read_iterate sparse behavior broken
Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed.
Dan Mick
02:11 PM devops Feature #3712 (New): Ceph Commands should provide appropriate responses, when Ceph Service is not...
When ceph service is not running, running other ceph commands should give a response that makes sense instead of just ... Anonymous
02:02 PM Cleanup #2078: ceph tool: only output response data to stdout
i think we need to phase out all of the first-line nonsense. Sage Weil
01:48 PM Cleanup #2078: ceph tool: only output response data to stdout
This also affects things like ceph pg dump --format=json. You can't pipe it to a pretty printer without ignoring the ... Josh Durgin
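Until the tool writes only response data to stdout, a consumer has to strip the extra output itself. The sketch below assumes the extra output is a single leading status line, which the "first-line" remark above suggests but the truncated comment does not confirm; `parse_after_first_line` and the sample text are hypothetical.

```python
import json


def parse_after_first_line(raw: str):
    """Drop the first line of raw and parse the remainder as JSON."""
    _status, _, payload = raw.partition("\n")
    return json.loads(payload)


# Hypothetical tool output: one status line followed by a JSON document.
sample = 'hypothetical status line\n{"pgs": [], "version": 1}'
print(json.dumps(parse_after_first_line(sample), indent=2))
```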
01:52 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
The implication is that 'N' is "the number of buckets of type 'type' available", but Sam believes it must really be "... Dan Mick
01:40 PM Bug #3684 (Resolved): filejournal: aio vector size is not limited
Sage Weil
01:34 PM rbd Feature #3456 (Closed): make exit code of ceph status commands status dependent
Josh Durgin
01:29 PM rbd Documentation #2992 (Resolved): doc: RBD parent/child snapshot
Josh Durgin
01:26 PM rbd Documentation #2992: doc: RBD parent/child snapshot
This should be resolved. John Wilkins
01:24 PM Documentation #3710 (Closed): crush-map.rst: talks about 'step choose' but does not document it
Dan Mick
01:23 PM Documentation #3411 (Resolved): doc: add introductory detail to the main doc page (index.rst)
John Wilkins
01:21 PM rgw Feature #3207 (In Progress): qa: swift functional tests in nightly
Sage Weil
01:21 PM rgw Feature #3366 (In Progress): rgw: dr: define management api
Sage Weil
01:18 PM Documentation #2980 (Resolved): doc: write upgrading Ceph version
This was checked in and also reviewed by Josh and Sage. John Wilkins
01:16 PM Documentation #3322 (Resolved): doc: Explain multi-tenant CephFS
This has been added to the end of the Ceph Configuration file section. It may benefit from review, as I believe the... John Wilkins
01:12 PM Feature #647 (Duplicate): mon: refactor paxos interaction
Sage Weil
01:11 PM Feature #183 (Resolved): qa: xfstests workunit
Sage Weil
01:10 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
crush-map.rst claims that the bucket type defaults are as they appear in the table, but they're
not defaults; they must b...
Dan Mick
01:09 PM Feature #3376 (Duplicate): use external leveldb package for default builds
Sage Weil
01:08 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
example includes:
item ceph-osd-server-1 2.00
this must have 'weight' explicitly in the line:
...
Dan Mick
01:03 PM Feature #3425 (Resolved): mon workload generator
Sage Weil
12:39 PM Bug #3702: OSD SIGABRT during startup
Attempting to start osd.131 (which was down due to the above noted problems) today resulted in quorum loss. Essential... Justin Lott
12:03 PM rgw Bug #3706 (Resolved): rgw functional test testSlashInName failed in nightly
logs: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33224... Tamilarasi muthamizhan
11:25 AM Revision a79493da (ceph): mds: skip frozen inode when assimilating dirty inodes' rstat
CDir::assimilate_dirty_rstat_inodes() may encounter frozen inodes that
are being renamed. Skip these frozen inodes be...
Yan, Zheng
11:25 AM Revision 2f96b472 (ceph): mds: fix anchor table commit race
Anchor table updates for a given inode are fully serialized on client side.
But due to network latency, two commit req...
Yan, Zheng
11:25 AM Revision 7e04504d (ceph): mds: fix on-going two-phase commits tracking
The slaves for two-phase commit should be mdr->more()->witnessed
instead of mdr->more()->slaves. mdr->more()->slaves...
Yan, Zheng
11:25 AM Revision b3796f46 (ceph): mds: introduce DROPLOCKS slave request
In some rare cases, Locker::acquire_locks() drops all acquired locks
in order to auth pin new objects. But Locker::dro...
Yan, Zheng
11:25 AM Revision b2d5005a (ceph): mds: fix lock state transition check
Locker::simple_excl() and Locker::scatter_mix() miss is_rdlocked
check; Locker::file_excl() miss is_rdlocked check an...
Yan, Zheng
11:25 AM Revision fe5936b1 (ceph): mds: remove unnecessary is_xlocked check
Locker::foo_eval() is always called for stable locks, so no need to
check if the lock is xlocked.
Signed-off-by: Yan...
Yan, Zheng
11:25 AM Revision f5ea5c36 (ceph): mds: don't defer processing caps if inode is auth pinned
We should not defer processing caps if the inode is auth pinned by MDRequest,
because the MDRequest may change lock s...
Yan, Zheng
11:25 AM Revision 5e8642a8 (ceph): mds: call maybe_eval_stray after removing a replica dentry
MDCache::handle_cache_expire() processes dentries after inodes, so the
MDCache::maybe_eval_stray() in MDCache::inode_...
Yan, Zheng
11:25 AM Revision 84224743 (ceph): mds: fix rename inode exporter check
Using "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is
the inode exporter of a rename operation is not reli...
Yan, Zheng
11:25 AM Revision 26279574 (ceph): mds: don't trigger assertion when discover races with rename
Discover reply that adds replica dentry and inode can race with rename
if slave request for rename sends discover and...
Yan, Zheng
11:25 AM Revision 5ae715be (ceph): mds: xlock stray dentry when handling rename or unlink
This prevents MDS from reintegrating stray before rename/unlink finishes
Signed-off-by: Yan, Zheng <zheng.z.yan@inte...
Yan, Zheng
11:25 AM Revision 7a520168 (ceph): mds: don't journal null dentry for overwritten remote linkage
Server::_rename_prepare() adds null dest dentry to the EMetaBlob if
the rename operation overwrites a remote linkage....
Yan, Zheng
11:25 AM Revision fcb9f988 (ceph): mds: use null dentry to find old parent of renamed directory
When replaying a directory rename operation, the MDS needs to find the old parent of
the renamed directory to adjust auth sub...
Yan, Zheng
11:25 AM Revision d9d71473 (ceph): mds: don't trim ambiguous imports in MDCache::trim_non_auth_subtree
Trimming ambiguous imports in MDCache::trim_non_auth_subtree() confuses
MDCache::disambiguate_imports() and causes in...
Yan, Zheng
11:25 AM Revision 3b13d3dc (ceph): mds: only export directory fragments in stray to their auth MDS
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
11:25 AM Revision 61da9b18 (ceph): mds: mark rename inode as ambiguous auth on all involved MDS
When handling cross authority rename, the master first sends OP_RENAMEPREP
slave requests to witness MDS, then sends ...
Yan, Zheng
11:09 AM Linux kernel client Bug #2764 (Closed): xfstest hang; osd socket closed messages
The fix for the warning messages is:
28362986f8743124b3a0fda20a8ed3e80309cce1
libceph: report connection ...
Alex Elder
10:54 AM Bug #3698: filestore: ENOENT on clone
recent log: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33152 Tamilarasi muthamizhan
09:45 AM CephFS Bug #3700: mds: FAILED assert(!item_session_list.is_on_list())
fixed by revert of bad fix, see commit:6711a4c4038dbdf843f9dfe42c7809c5c37ae534 Sage Weil
09:37 AM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
Sage Weil
09:41 AM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ... Sage Weil
09:09 AM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
... Sage Weil
08:28 AM Feature #3704 (Resolved): mon: add min log level to send cluster msgs to syslog
e.g., WARN and above only, but not INFO. This is for the mon/LogMonitor.cc submission path, not log/Log.cc (for debu... Sage Weil
05:55 AM Revision e10267b5 (ceph): mds: fix Locker::simple_eval()
Locker::simple_eval() checks if the loner wants CEPH_CAP_GEXCL to
decide if it should change the lock to EXCL state, ...
Yan, Zheng
05:54 AM Revision 7e23321b (ceph): mds: don't renew revoking lease
The MDS may receive a lease renew request while the lease is being revoked;
just ignore the renew request.
Signed-off-by: Yan...
Yan, Zheng

01/01/2013

06:36 PM Revision eb02eaed (ceph): Merge remote-tracking branch 'gh/wip-bobtail-docs'
Sage Weil
05:35 AM Revision f1196c7e (ceph): Merge branch 'master' of https://github.com/ceph/ceph
Gary Lowell
05:31 AM Revision 5dd6b199 (ceph): Merge branch 'next'
Gary Lowell
02:37 AM Revision 8f77ec7d (ceph): Merge branch 'next'
Sage Weil
02:36 AM Revision 94a5dd6b (ceph): Merge remote-tracking branch 'gh/wip-3675'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
01:10 AM Revision 1a32f0a0 (ceph): v0.56
Gary Lowell

12/31/2012

11:28 PM Revision 49ebe1ee (ceph): client: fix _create created ino condition
We get 8 bytes back for the created ino.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
11:26 PM Revision a10054bc (ceph): libcephfs: choose more unique nonce
We were using a per-process counter combined with the pid. A short
running process can easily loop through and reuse...
Sage Weil
11:26 PM Revision e2fef38d (ceph): client: fix _create
make_request() clears out req->reply and frees req; we can't inspect
it here.
Instead, just assume that extra_bl is t...
Sage Weil
06:35 PM rbd Bug #3697: rbd copy.sh test failing in nightly
FWIW I ran this in a loop and reproduced it after 7 iterations (well, a slightly different error actually, when it re... Sage Weil
05:42 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
Dan Mick
05:08 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Hm, doesn't reproduce on local vstart cluster. Pondering possible failure modes. Dan Mick
04:23 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Trying to reproduce now Dan Mick
06:17 PM Revision 7d70dd11 (ceph): Revert "kernel: move fsync test to marginal suite until it works"
This reverts commit acb91f7d0d4882d7393a99b142aec8687b9b4bb7.
Now fixed in master branch, commit b4d3bd06d4083d78075...
Sage Weil
06:16 PM Revision b4d3bd06 (ceph): Merge remote-tracking branch 'gh/wip-3625'
Sage Weil
05:38 PM rbd Bug #3703: osd: crash while encrypting
This is an osd crash.... Josh Durgin
02:55 PM rbd Bug #3703 (Can't reproduce): osd: crash while encrypting
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_19:00:03-regression-next-testing-basic/32113... Tamilarasi muthamizhan
04:11 PM Revision ed586c1b (ceph): task: ceph: don't wait for 'healthy' if 'wait-for-healthy' is false.
This new config option obviously defaults to 'true' in order to not only
maintain compatibility, but because it makes...
Joao Eduardo Luis
02:58 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
Bringing the marked-out osd.1 back in on burnupi06 while running I/O hit the following:
2012-12-31 14:26:26.6...
Tamilarasi muthamizhan
02:30 PM Messengers Feature #3509 (Resolved): msgr: delay injection
Sage Weil
10:18 AM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
Sage Weil
09:06 AM Bug #3702 (Can't reproduce): OSD SIGABRT during startup
After conversion of OSD's from btrfs to XFS, some OSD's SIGABRT during their first startup on XFS:
2012-12-29 05:0...
Justin Lott
08:55 AM Bug #3683: mon: leak of MMonPaxos
recent logs: ubuntu@teuthology:/a/teuthology-2012-12-29_19:00:03-regression-next-testing-basic/31414 Tamilarasi muthamizhan
08:37 AM rbd Bug #3701 (Can't reproduce): qemu xfstest hung BUG: unable to handle kernel NULL pointer derefere...
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_03:00:06-regression-master-testing-gcov/31929... Tamilarasi muthamizhan

12/30/2012

11:29 PM Revision ec5288a3 (ceph): Merge remote-tracking branch 'gh/wip-rbd-unprotect' into next
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
07:18 PM Revision 82cec48e (ceph): doc: add-or-rm-mons.rst: Add 'Changing Monitor's IPs' section
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
Joao Eduardo Luis
07:17 PM Revision 379f0792 (ceph): doc: add-or-rm-mons.rst: Clarify what the monitor name/id is.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
06:08 PM CephFS Fix #3630: mds: broken closed connection cleanup
... Sage Weil
06:06 PM CephFS Fix #3630: mds: broken closed connection cleanup
The con re-use looks like this:
- client connects
- mds ms_verify_authorizer creates a new session
- msgr see ex...
Sage Weil
06:04 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
see #3630..let's fix this properly. Sage Weil
08:06 AM Revision 85e9d4f0 (ceph): cls_rbd: get_children does not need write permission
This prevented a read-only user from being able to unprotect a
snapshot without write permission on all pools. This w...
Josh Durgin
08:06 AM Revision 91e941ae (ceph): OSD: remove RD flag from CALL ops
20496b8d2b2c3779a771695c6f778abbdb66d92a forgot to do this. Without
this change, all class methods required regular r...
Josh Durgin
08:06 AM Revision c67c789d (ceph): librbd: add {rbd_}open_read_only()
Since 58890cfad5f7bee933baa599a68e6c65993379d4, regular {rbd_}open()
would fail with -EPERM if the user did not have ...
Josh Durgin
08:06 AM Revision 47bf5195 (ceph): librbd: open parent as read-only during clone
We never write to the parent, and don't need to watch it during this process.
Signed-off-by: Josh Durgin <josh.durgi...
Josh Durgin
08:06 AM Revision 958addc0 (ceph): rbd: open (source) image as read-only
This allows users without write access to copy, export and list
information about an image.
Signed-off-by: Josh Durg...
Josh Durgin
08:06 AM Revision d0a14d11 (ceph): librbd: fix race between unprotect and clone
Clone needs to actually re-read the header to make sure the image is
still protected before returning. Additionally, ...
Josh Durgin
08:06 AM Revision 8bbb4a36 (ceph): doc: fix rbd permissions for unprotect
Unprotect examines all pools, so use blanket x before 0.54. After
that, use class-read restricted by object_prefix to...
Josh Durgin
05:00 AM Revision 7b0dbeb0 (ceph): doc/install/upgrading: edits to upgrade document
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:00 AM Revision 4aa6af76 (ceph): doc/release-notes: link to upgrade doc
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/29/2012

08:04 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
Possibly the same bug in teuthology:/a/joshd-3631-12-28-12_08.55/30739... Josh Durgin
07:45 PM Bug #3698: filestore: ENOENT on clone
Sage Weil
07:44 PM Bug #3698: filestore: ENOENT on clone
Can you add 'debug osd = 20' to the job you're running?
Sage Weil
04:30 PM Bug #3698: filestore: ENOENT on clone
Happened again in teuthology:/a/joshd-3631-12-28-12_08.53/30681 Josh Durgin
09:50 AM Bug #3698 (Resolved): filestore: ENOENT on clone
Full logs in teuthology:/a/joshd-3631-12-27-12_22.21/29826... Josh Durgin
04:38 PM Revision 6711a4c4 (ceph): Revert "mds: replace closed sessions on connect"
This reverts commit 8b599083705c2495810c00f9f5fd5bb8ace7f32e.
This fix is not correct. See #3696.
Sage Weil
04:28 PM Revision bb4a2c55 (ceph): rgw: enable logging in ceph.conf
Sage Weil
02:39 PM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
logs: ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30039... Tamilarasi muthamizhan
02:32 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30036 Tamilarasi muthamizhan
09:43 AM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
reverted the broken fix, reproducing the original problem again. Sage Weil
02:19 PM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
cluster: burnupi06 [running osd.1 on v0.55.1] , burnupi07[running osd.3, osd.4, mon.b on argonaut], burnupi08[running... Tamilarasi muthamizhan
08:37 AM rbd Bug #3697 (Duplicate): rbd copy.sh test failing in nightly
... Sage Weil
01:21 AM Revision a5d692a7 (ceph): msgr: inject delays at inconvenient times
Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-...
Sage Weil
01:21 AM Revision 82f8bcdd (ceph): msg/Pipe: use state_closed atomic_t for _lookup_pipe
We shouldn't look at Pipe::state in SimpleMessenger::_lookup_pipe() without
holding pipe_lock. Instead, use an atomi...
Sage Weil
01:21 AM Revision 7bf0b085 (ceph): msgr: atomically queue first message with connect_rank
Atomically queue the first message on the new pipe, without dropping
and retaking pipe_lock.
Signed-off-by: Sage Wei...
Sage Weil
01:21 AM Revision 6339c5d4 (ceph): msgr: don't queue message on closed pipe
If we have a con that refs a pipe but it is closed, don't use it. If
the ref is still there, it is only because we a...
Sage Weil
01:21 AM Revision e99b4a30 (ceph): msgr: fix race on Pipe removal from hash
When a pipe is faulting and shutting down, we have to drop pipe_lock to
take msgr lock and then remove the entry. Th...
Sage Weil
01:19 AM Revision 83c8025d (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:19 AM Revision c2a75253 (ceph): test: mon: workloadgen: debug when message fsid != monmap fsid
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
01:19 AM Revision b30ab517 (ceph): test: mon: workloadgen: assert if monmap's fsid is zero after authenticate
Fixes: #3629
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
01:19 AM Revision 35836847 (ceph): doc: update Hadoop documentation
Updates configuration option names, and adds object.size,
localize.reads, and root.dir control options.
Signed-off-b...
Noah Watkins
01:12 AM Revision 942c7145 (ceph): init-ceph: ok, 8K files
16K might be a bit many.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
01:10 AM Revision 0a5d6d87 (ceph): msg/Pipe: remove broken cephs signing requirement check
Remove the special-case check, which does not inform the peer what
protocol features are missing. It also enforces t...
Sage Weil
12:00 AM Revision 65b787ea (ceph): msg/Pipe: include remote socket addr in debug output
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/28/2012

11:55 PM Revision 9e5e08f8 (ceph): doc: Added a new upgrade document.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:55 PM Revision 1553267e (ceph): doc: Minor edit.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:54 PM Revision 02b8bcd0 (ceph): doc: Added upgrade link to index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:44 PM Revision 076b418c (ceph): os/FileJournal: logger is optional
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:14 PM Revision 3debf0cf (ceph): client: fix fh leak in non-create case
We may take the O_CREAT path and get an fh from _create, but created can
still be false. In that case, skip the fina...
Sage Weil
11:10 PM Revision 7f35e5dd (ceph): client: Make ll_create use _create
This is a fix for bug #3625, where multiple clients race to create a
file, and the loser returns EEXIST instead of a ...
Sam Lang
11:10 PM Revision 67bc849c (ceph): mds: Return created inode in mds reply to create
If multiple clients race to create a file, multiple clients will send a
create request and get back a valid dentry+in...
Sam Lang
11:08 PM Revision 813787af (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
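The difference between signalling one waiter and broadcasting, as described in the entry above, can be illustrated with a minimal Python sketch. This is not Ceph's logging code: the names FlushGate, wait_for_flush, mark_flushed, and run_demo are hypothetical, and the sketch only assumes that several threads wait on one shared condition.

```python
import threading


class FlushGate:
    """Hypothetical stand-in for one condition shared by several waiters."""

    def __init__(self):
        self._cond = threading.Condition()
        self._flushed = False

    def wait_for_flush(self, timeout=5.0):
        # Each waiter re-checks the predicate under the lock, so a wakeup
        # that happens before the wait starts is not lost.
        with self._cond:
            return self._cond.wait_for(lambda: self._flushed, timeout=timeout)

    def mark_flushed(self):
        # Change the shared state and wake every waiter while holding the
        # lock. A plain notify() would wake only one of them.
        with self._cond:
            self._flushed = True
            self._cond.notify_all()


def run_demo(num_waiters=4):
    """Start several waiters, broadcast once, and collect their results."""
    gate = FlushGate()
    results = []
    results_lock = threading.Lock()

    def waiter():
        ok = gate.wait_for_flush()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()
    gate.mark_flushed()
    for t in threads:
        t.join()
    return results
```

With notify_all() every waiter returns promptly; with a single notify() the remaining waiters would sit until their timeout expires.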
11:03 PM Revision ca34fc4d (ceph): osd: allow RecoveryDone self-transition in RepNotRecovering
In a mixed cluster where some OSDs support the recovery reservations and
some don't, the replica may be new code in R...
Sage Weil
10:15 PM Revision 0f5383f4 (ceph): Merge remote-tracking branch 'origin/wip-gl-docs'
Update release process documentation. Gary Lowell
10:05 PM Revision 1867b818 (ceph): docs: fix typo in release-process doc
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
10:00 PM Linux kernel client Bug #3519: rbd map hang during system startup
yay! Dan Mick
06:08 AM Linux kernel client Bug #3519 (Resolved): rbd map hang during system startup
Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
Alex Elder
09:47 PM Revision 3a8bf3af (ceph): doc/release-notes: document new 'max open files' default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:11 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
Sage Weil
06:42 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
This occurred shortly after startup when trying to reproduce another bug on the master branch:... Josh Durgin
08:34 PM Revision ea13ecc2 (ceph): osd: less noise about inefficient tmap updates
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
08:12 PM Revision 9483a032 (ceph): init-ceph: fix status version check across machines
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
Sage Weil
08:12 PM Revision 8fef9360 (ceph): init-ceph: use SSH in "service ceph status -a" to get version
When running "service ceph status -a", a version number was never
returned for remote hosts, only for the local. Thi...
Travis Rhoden
08:11 PM Revision 672c56b1 (ceph): init-ceph: default to 16K max_open_files
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
07:58 PM Revision 948e7524 (ceph): ceph-fuse: Avoid doing handle cleanup in dtor
The CephFuse::Handle class needs the client
pointer to be valid for finalizing, so don't finalize
in the destructor (...
Sam Lang
07:10 PM Revision ff2d4abb (ceph): ceph-fuse: Pass client handle as userdata
The fuse lowlevel API isn't getting the client
handle when it gets initialized, resulting
in a null pointer for ...
Sam Lang
06:21 PM CephFS Fix #3630: mds: broken closed connection cleanup
Sage Weil
05:57 PM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
ceph version : 0.55.1-329-g01376d4 (01376d44d73189080d207f701fc7e38cf55c738d)
cluster:
burnupi15[running osd.1, ...
Tamilarasi muthamizhan
05:09 PM Bug #3675 (Resolved): osd: hang during intial peering
Sage Weil
04:55 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
the problem was old ceph-osd daemons on other hosts trying to connect. running code that didn't include commit:4d20b... Sage Weil
12:27 PM Bug #3690: osd crashed in FileStore::_do_transaction
made the default fd limit much higher in commit:672c56b18de3b02606e47013edfc2e8b679d8797 Sage Weil
10:39 AM Bug #3690: osd crashed in FileStore::_do_transaction
... Sage Weil
10:32 AM Bug #3690: osd crashed in FileStore::_do_transaction
... Tamilarasi muthamizhan
04:40 PM Bug #3684 (Fix Under Review): filejournal: aio vector size is not limited
wip-journal-aio Sage Weil
04:09 PM Revision acb91f7d (ceph): kernel: move fsync test to marginal suite until it works
Sage Weil
04:08 PM Revision 02e4eeff (ceph): kernel: move fsx to marginal suite until it passese
Sage Weil
04:06 PM Documentation #3694 (Closed): doc: how to use the admin socket interface
A couple pages in the docs mention specific commands, but there's no overall explanation of what it is, and how you c... Josh Durgin
01:56 PM Bug #3691: Lock issue in librados resulting in application hang
This affects small clusters more because a single osd is a larger proportion of the whole cluster. In bobtail, there ... Josh Durgin
01:46 PM Bug #3691: Lock issue in librados resulting in application hang
Well, this is an even worse issue. We are adding new osds (just 8 now), and the cluster has been staying "unhealthy" ... Xiaopong Tran
01:04 PM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
You're calling the synchronous version of write, and the spot where it's 'hung' is just waiting for the response from... Josh Durgin
04:53 AM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
We ran into a nasty lock issue in librados: it's trying to write some data, and hangs there for many seconds unt... Xiaopong Tran
12:46 PM RADOS Bug #3693 (Duplicate): crushtool compile fails with unhelpful message, diagnosis quite difficult
A user tried to create his own crushmap as follows:... Dan Mick
12:18 PM rbd Bug #2689 (In Progress): qemu iozone test hangs
This seems to still be a problem. I'll try to get more information about what's going on. It looks like there's an er... Josh Durgin
12:12 PM devops Documentation #2774: doc: ceph-disk man page
These would be useful. Someone on irc was confused earlier by the undocumented requirement to set --cluster-uuid (or ... Josh Durgin
12:08 PM rbd Bug #3692: OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
Chronology of events (UTC) in the latest example of this happening, in case it's relevant:
15:50:46 mon.b is s...
Justin Lott
12:01 PM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
I've seen this happen twice:
- Reboot a node running a number of OSD's
- Within a short period of time, seemingly...
Justin Lott
11:42 AM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
wip-3689 has a fix; please test! Sage Weil
10:58 AM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28728 Tamilarasi muthamizhan
10:53 AM Bug #3631: osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during librbd_fsx
ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662... Tamilarasi muthamizhan
10:34 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
recent log: ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28713 Tamilarasi muthamizhan
06:10 AM Bug #3657 (Resolved): rbd: crash mapping image
Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
Alex Elder
06:08 AM Revision 9967cf24 (ceph): release-notes: rgw logging now off by default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:03 AM Revision 1c3e12a2 (ceph): doc: warn about using caching without QEMU knowing
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
06:02 AM Revision f6ce5dda (ceph): rgw: disable ops and usage logging by default
Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned ...
Sage Weil
04:28 AM Bug #3683 (In Progress): mon: leak of MMonPaxos
Joao Eduardo Luis
04:26 AM Bug #3633: mon: clock drift errors not reported by ceph status
wip-3633 now has a couple of patches that introduce a mechanism to keep track of clock skews on the monitors.
If s...
Joao Eduardo Luis
01:24 AM Revision 64b845f6 (ceph): features is uint64_t
This won't bite us for a while yet (we're on bit 26), but it will soon!
Signed-off-by: Sage Weil <sage@inktank.com>
...
Sage Weil
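Why the width of a feature mask matters can be shown with a small Python sketch. It is only an illustration under stated assumptions: the helper names (feature_bit, store_as_u32, store_as_u64, has_feature) are hypothetical and do not come from the Ceph source, and Python integers are masked by hand to emulate fixed-width storage.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def feature_bit(n):
    """Return a feature mask with only bit n set."""
    return 1 << n


def store_as_u32(features):
    """Emulate storing a feature mask in a 32-bit unsigned integer."""
    return features & MASK32


def store_as_u64(features):
    """Emulate storing a feature mask in a 64-bit unsigned integer."""
    return features & MASK64


def has_feature(features, n):
    """Check whether bit n is set in the stored mask."""
    return bool(features & feature_bit(n))
```

Bit 26 survives a 32-bit store, but any bit numbered 32 or higher is silently dropped, which is why the wider type is needed before the bit count grows that far.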
01:15 AM Revision 2fbe3e17 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:55 AM Revision 856f32ab (ceph): ceph-fuse: Split main into init/main/finalize
With the invalidate callback enabled for fuse, the Client::unmount
call requires the fuse channel and session objects...
Sam Lang
12:39 AM Revision c0fe3815 (ceph): java: remove deprecated libcephfs
Removes ceph_set_default_*
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
12:32 AM Revision 6c7b667b (ceph): init-ceph: fix status version check across machines
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
Sage Weil

12/27/2012

11:39 PM Revision 774a54cb (ceph): docs: update release process documentation.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
10:23 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
This looks like a compatibility issue with recovery queueing:... Sage Weil
05:26 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
Log from crashing osd with greater debug level https://dl.dropbox.com/u/5820195/ceph-osd.1.log.gz. Maciej Galkiewicz
05:09 PM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
Reported by mgalkiewicz in #ceph. https://gist.github.com/raw/4393494/f3ae88406350b74ac6d608b8b75960f85435e85e/gist... Sage Weil
09:40 PM Revision af37cc3a (ceph): Merge remote-tracking branch 'gh/wip-mds'
Sage Weil
09:26 PM Revision 63567392 (ceph): osd: fix recovery assert for pg repair case
In the case of PG repair, this assert is not valid. Disable it for now.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:09 PM Revision 1fa8c83d (ceph): Merge branch 'wip-osd-flags'
Sage Weil
09:07 PM Revision 207e93ab (ceph): Merge remote-tracking branch 'gh/wip-mds-pool'
Reviewed-by: Sam Lang <sam.lang@inktank.com> Sage Weil
08:12 PM Revision 03f6dfa4 (ceph): osd: move rmw_flags to OpRequest, out of MOSDOp
It was very sloppy to put a server-side processing state inside the
message. Move it to the OpRequestRef instead.
...
Sage Weil
08:12 PM Revision f1dfd64f (ceph): messages/MOSDOpReply: remove misleading may_read/may_write
These are OpRequest properties, calculated/enforced at the OSD. They don't
belong in the MOSDOp or MOSDOpReply messa...
Sage Weil
08:12 PM Revision f2306038 (ceph): osd: only calculate OpRequest rmw flags once
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
08:04 PM Linux kernel client Bug #3519: rbd map hang during system startup
Nick reports:
I have some exciting news. After 215 test runs, no hung processes
were detected. I think we may...
Alex Elder
07:58 PM Bug #3657: rbd: crash mapping image
I'm currently testing two patches related to this bug, and
while I haven't pushed them to the testing branch yet I
...
Alex Elder
07:27 PM Revision 998f7194 (ceph): dropping xfs test 186 due to bug: 3685
Signed-off-by: tamil <tamil.muthamizhan@inktank.com> Tamilarasi muthamizhan
07:14 PM Revision 98e7b598 (ceph): docs: remove extra release-process2 file.
This file mostly duplicated the existing release documentation. Differences
have been merged into the primary file.
...
Gary Lowell
07:12 PM Revision 82c71716 (ceph): osd: drop 'osd recovery max active' back to previous default (5)
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tes...
Sage Weil
07:11 PM Revision 6f1f03c7 (ceph): journal: reduce journal max queue size
Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower t...
Sage Weil
07:09 PM Revision 0d2ad2f2 (ceph): mds: use set to store MDSMap data pools
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:53 PM Revision 80bcaa29 (ceph): rados: add filestore_idempotent test with journal aio = true
Sage Weil
05:55 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
this reproduced once out of ~60 runs on the fsx task.... Sage Weil
05:36 PM Revision 2137d5cd (ceph): mds: wait for client's mdsmap when specifying data pool
The client may have a newer map than we do; make sure we wait for it lest
we inadvertently reply because we think the...
Sage Weil
05:33 PM Bug #3690: osd crashed in FileStore::_do_transaction
leaving the cluster as it is for someone to take a look at it. Tamilarasi muthamizhan
05:33 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
ceph version: 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
I had a cluster[burnupi15, burnupi19,...
Tamilarasi muthamizhan
05:33 PM Revision 9da6d882 (ceph): doc: document mds config options
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:52 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
ceph version : 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
rbd allows images created with zero ...
Tamilarasi muthamizhan
04:45 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
While debugging what I thought was a Ceph memory leak, I was pointed to
http://ceph.com/deprecated/Memory_Profiling
...
Faidon Liambotis
04:37 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
On http://ceph.com/docs/master/install/build-prerequisites/ , in the "On Debian/Squeeze, execute aptitude install ...... Nat Makarevitch
12:32 PM rbd Bug #3427: krbd: unmap does not remove block device properly
I am going to assume that the racing open is the cause of
the problem reported by Nikola Kotur.
To fix it, I will...
Alex Elder
12:17 PM rbd Bug #3427: krbd: unmap does not remove block device properly
> For RBD, wasn't the use_count something we just added? Would it cover this situation?
No.
The first warning i...
Alex Elder
08:53 AM rbd Bug #3427: krbd: unmap does not remove block device properly
For cephfs, the vfs normally handles that.
For RBD, wasn't the use_count something we *just* added? Would it cove...
Sage Weil
08:37 AM rbd Bug #3427: krbd: unmap does not remove block device properly
I also note, having taken a little closer look at Nikola Kotur's
kernel log that both an open and a close appear to ...
Alex Elder
08:31 AM rbd Bug #3427: krbd: unmap does not remove block device properly
It looks to me like the osd client code has nothing in place
to protect itself from one of its users (ceph client, m...
Alex Elder
12:01 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
There aren't known leaks in argonaut. If you can reproduce with valgrind massif and see where the heap is going, tha... Sage Weil
11:28 AM rbd Bug #2689: qemu iozone test hangs
Testing again since some possible causes were fixed. Josh Durgin
10:54 AM rbd Bug #3685 (Closed): xfs test 186 fails in the nightlies
logs: ubuntu@teuthology:/a/teuthology-2012-12-26_19:00:03-regression-next-testing-basic/28039
...
Tamilarasi muthamizhan
09:19 AM Bug #3684 (Resolved): filejournal: aio vector size is not limited
FileJournal::write_aio_bl does not limit the size of the iov to IOV_MAX. Sage Weil
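One general way to respect a per-call vector limit such as IOV_MAX is to split the buffer list into batches before submitting them. The Python sketch below shows that idea only; it is not the FileJournal fix, the function name split_iov is hypothetical, and the limit of 1024 used in the usage note is an assumed value (the real limit is platform-defined).

```python
def split_iov(buffers, iov_max):
    """Split a list of buffers into batches of at most iov_max entries.

    Order is preserved, so submitting the batches one after another
    writes the same bytes as a single oversized vector would have.
    """
    if iov_max < 1:
        raise ValueError("iov_max must be at least 1")
    return [buffers[i:i + iov_max] for i in range(0, len(buffers), iov_max)]
```

For example, split_iov(buffers, 1024) on a list of 2500 buffers yields three batches, each small enough for one vectored write.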
08:23 AM Bug #3683 (Resolved): mon: leak of MMonPaxos
ubuntu@teuthology:/a/teuthology-2012-12-22_19:00:02-regression-next-testing-basic/22989
saw it a few days earlier,...
Sage Weil
01:34 AM Revision 916d1cf6 (ceph): doc: journaler config options
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/26/2012

10:27 PM Revision c34e38bc (ceph): log: 10,000 recent log entries
This is what we were (wrongly) doing before, so there are no memory
utilization surprises.
Signed-off-by: Sage Weil ...
Sage Weil
10:27 PM Revision 4daede79 (ceph): log: fix log_max_recent config
<facepalm>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4de7748b72d4f90eb1197a70015c199c15...
Sage Weil
10:27 PM Revision fdae0552 (ceph): log: fix flush/signal race
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav...
Sage Weil
08:54 PM Revision cedea139 (ceph): docs: Merge changes from release-process2 document.
Gary Lowell
07:58 PM Revision 850a056b (ceph): mds: add waiting_for_mdsmap queue
Defer events until we get a specific MDSMap epoch.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
07:58 PM Revision c764935d (ceph): mds: do not check for pool existence in osdmap
We don't have a wait mechanism to ensure the MDSMap has the latest osdmap
here. Just trust the MDSMap.
Signed-off-b...
Sage Weil
06:55 PM Revision 4929fc7d (ceph): qa: remove xfstests 172 and 173 from qemu testing
These seem to require newer xfs.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
05:42 PM Revision f5403f94 (ceph): doc/man/8/mkcephfs: update --mkfs a bit
Document that 'devs' and 'osd mkfs type' must be defined.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:23 PM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
log: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27925 Tamilarasi muthamizhan
04:20 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
Logs: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27924
ubuntu@teuthology:/...
Tamilarasi muthamizhan
03:48 PM Bug #3378 (Can't reproduce): common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
The suicide timeout is the symptom only. Usually it means the thread is blocked by a hung syscall. In your case, Ma... Sage Weil
02:38 PM rbd Bug #3427: krbd: unmap does not remove block device properly
I haven't spent time on this in almost a month so wanted to just
provide an update. We have been looking at and try...
Alex Elder
01:02 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
We are using 0.48.2 for the OSDs and our plan is to upgrade to 0.56 (or the next stable release) when it comes out. Kevin Scheunemann
11:45 AM Bug #3546 (Won't Fix): CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
The crash is a known problem with pre-3.4 kernels. Fixes have been backported to 3.4 stable and 3.6 stable kernels, ... Sage Weil
11:36 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
At the time, the clients were running 3.2.0-32, but we have since upgraded to 3.6.9 per another ceph bug.
We have...
Kevin Scheunemann
11:25 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
What kernel version are you running? Sage Weil
11:05 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another run:... Sage Weil
09:38 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another one:... Sage Weil
09:59 AM CephFS Bug #3681 (Resolved): kclient fsx fails nightly
... Sage Weil
08:44 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
Xiaopong Tran wrote:
> I'm using xfs, with no specific mount options, just the default.
>
> I added the debug set...
Sage Weil
02:22 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
I'm using xfs, with no specific mount options, just the default.
I added the debug settings, and got a large log f...
Xiaopong Tran
08:39 AM CephFS Feature #3679 (Closed): Any API to get metadata?
Yep! See libcephfs. There is... Sage Weil
01:08 AM CephFS Feature #3679 (Closed): Any API to get metadata?
Hello there.
I am wondering if there is any API to get the metadata of a file.
I have the ceph file system run by ...
lollipop king
07:20 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
err... that should have been each monitor's ip and port.
as in...
Joao Eduardo Luis
01:10 AM CephFS Tasks #3680 (Rejected): deduplication in ceph
I am wondering how to do deduplication in ceph...the big problem is how to get the metadata of a file
and how to mod...
lollipop king

12/25/2012

08:35 PM Bug #3378: common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
Saw this show up during parametric sweep testing on EXT4 with 8 concurrent OSD disk threads. Ceph build is from gitb... Mark Nelson

12/24/2012

02:58 PM CephFS Feature #1448 (In Progress): test hadoop on sepia
Sage Weil
02:58 PM CephFS Cleanup #814 (Resolved): hadoop: refactor hadoop shim in terms of java libceph bindings
Sage Weil
02:56 PM rbd Feature #3580 (Resolved): rbd import from stdin could try harder to sparsify images
Sage Weil
02:54 PM rgw Feature #1950: rgw: create S3/Swift ACL interoperability suite
Sage Weil
12:27 PM Bug #3676 (Need More Info): osd keeps crashing at ReplicatedPG::scan_range()
... Sage Weil
12:04 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
... Sage Weil
09:22 AM rbd Bug #3654 (Fix Under Review): libvirt: colons in ipv6 monitor addresses are not escaped when sent...
Sage Weil
08:45 AM rbd Fix #3665: librbd: deadlock during flatten
the problem is that we are holding the snap_lock and then waiting for io. but we mostly use snap_lock as a tight inne... Sage Weil
04:01 AM Revision d18f3c2d (ceph): mds: don't force in->first == dn->first
The fullbit sets it now. For multiversion inodes, its "first" can be in
the future, since this dentry may not have ...
Sage Weil
04:01 AM Revision 8b599083 (ceph): mds: replace closed sessions on connect
If a connection comes and there is a closed session attached, remove it.
This is probably a failure of an old session...
Sage Weil
04:01 AM Revision a3e70aed (ceph): mds: always send discover if want_xlocked is true
If want_xlocked is true, we can not rely on previously sent discover
because it's likely the previous discover is blo...
Yan, Zheng
04:01 AM Revision 96f48aa0 (ceph): mds: re-issue caps after importing caps
The imported caps may prevent unstable locks from entering stable
states. So we should call Locker::eval_gather() wit...
Yan, Zheng
04:01 AM Revision dd441576 (ceph): mds: take export lock set before sending MExportDirDiscover
Migrator::export_dir() only checks whether it can lock the export lock set
but does not take the lock set. So someone else can c...
Yan, Zheng
04:01 AM Revision 1174dd31 (ceph): mds: don't retry readdir request after issuing caps
If remote linkage without inode is encountered after some caps are
issued, Server::handle_client_readdir() should sen...
Yan, Zheng
04:01 AM Revision f5e86ecb (ceph): mds: delay processing cache expire when state >= EXPORT_EXPORTING
It's possible that MDS receives cache expire in EXPORT_LOGGINGFINISH
and EXPORT_NOTIFYING states.
Signed-off-by: Yan...
Yan, Zheng
04:01 AM Revision efbca31d (ceph): mds: fix file existing check in Server::handle_client_openc()
Creating new file needs to be handled by directory fragment's auth
MDS, opening existing file in write mode needs to ...
Yan, Zheng
04:01 AM Revision 00025462 (ceph): mds: fix race between send_dentry_link() and cache expire
MDentryLink message can race with cache expire. When it arrives at
the target MDS, it's possible there is no correspo...
Yan, Zheng
04:01 AM Revision a1485f95 (ceph): mds: compare sessionmap version before replaying imported sessions
Otherwise we may wrongly increase mds->sessionmap.version, which
will confuse future journal replays that involve s...
Yan, Zheng
04:01 AM Revision 48d8ae58 (ceph): mds: allow handle_client_readdir() fetching freezing dir.
At that point, the request already auth pins and locks some objects.
So CDir::fetch() should ignore the can_auth_pin ...
Yan, Zheng
04:01 AM Revision 0ab0744e (ceph): mds: properly mark dirfrag dirty
If predirty_journal_parents() does not propagate changes in dir's
fragstat into corresponding inode's dirstat, it sho...
Yan, Zheng
04:01 AM Revision b7e698a5 (ceph): mds: no bloom filter for replica dir
We should delete dir fragment's bloom filter after exporting the dir
fragment to other MDS. Otherwise the residual bl...
Yan, Zheng
04:01 AM Revision e6b8f0a6 (ceph): mds: set want_base_dir to false for MDCache::discover_ino()
When frozen inode is encountered, MDCache::handle_discover() sends
reply immediately if the reply message is not empt...
Yan, Zheng
04:01 AM Revision 69f9f024 (ceph): mds: fix error handling in MDCache::handle_discover_reply()
The error handling code in MDCache::handle_discover_reply() has two
main issues. MDCache::handle_discover_reply() doe...
Yan, Zheng
03:59 AM Revision d9673ca3 (ceph): Merge branch 'wip-create-layout'
Reviewed-by: Greg Farnum <greg@inktank.com>
The functional tests for the create operations should add and specify no...
Sage Weil
03:39 AM Revision d2f5890f (ceph): client, libcephfs: add method to get the pool name for an open file
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:39 AM Revision 8efcf54d (ceph): mds: *_pg_pool -> *_pool
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:39 AM Revision 697ed23c (ceph): client: remove set_default_*() methods
This is a poor interface. The hadoop stuff is shifting to specify this
information on file creation instead.
Signed...
Sage Weil
03:39 AM Revision 99d9e1da (ceph): mds: allow data pool to be specified on create
Reuse old preferred_pg field. Only use if the new CREATEPOOLID feature
is present, and the value is >= 0.
Verify th...
Sage Weil
03:39 AM Revision 3f458217 (ceph): mds: verify that the pool id is valid on SET[DIR]LAYOUT
Make sure the data pool exists and is part of the MDSMap data pools list.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
03:39 AM Revision 32ab274a (ceph): client: specify data pool on create operations
Fill in the data pool field if specified by the client, or set to -1.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil

12/23/2012

11:21 PM Revision 61d43af7 (ceph): osd: make MOSDFailure output more sensible
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:21 PM Revision 850d1d54 (ceph): osd: fix dup failure cancellations
If we had a pending failure report, and send a cancellation, take it
out of our pending list so that we don't keep re...
Sage Weil
11:11 PM Revision 9df522e9 (ceph): mon: make osd failure report log msgs sensible
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:42 PM Revision 1290671f (ceph): Merge branch 'wip-scrub' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Conflicts:
src/osd/PG.cc
Sage Weil
09:53 PM Revision 8362e640 (ceph): monclient: fix get_monmap_privately retry interval
Use mon_client_hunt_interval (default 3) instead of hardcoding 1 second.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:53 PM Revision d843a64a (ceph): Makefile: fix 'base' rule
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:12 PM CephFS Cleanup #3677 (Closed): libcephfs, mds: test creation/addition of data pools, create policy
the create data pool argument is tested only with the default pools. once a lib is in place for the unit/functional... Sage Weil
09:06 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
No worries. Let us know if you do come across behavior that looks like a bug! Sage Weil
08:59 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hi Sage,
I am very sorry for taking your time with this issue, I feel like an idiot :(
The buggy client is runnin...
Roman Hlynovskiy
07:19 PM Revision 00b89c3f (ceph): Merge branch 'next'
Sage Weil
07:18 PM Revision a09f5b1b (ceph): init-ceph,mkcephfs: default inode64 for mounting xfs
According to hch this is now the default for new kernels.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
03:22 PM Bug #3675 (Fix Under Review): osd: hang during initial peering
wip-3675 Sage Weil

12/22/2012

09:39 PM Bug #3675: osd: hang during initial peering
ok, this is actually also a race that can cause the register_pipe assert. the locking needs to be reworked here. pu... Sage Weil
09:28 PM Bug #3675: osd: hang during initial peering
... Sage Weil
08:54 PM Bug #3675: osd: hang during initial peering
it took about 1500 iterations of this job to reproduce the hang:... Sage Weil
08:53 PM Bug #3675: osd: hang during initial peering
ubuntu@teuthology:/a/sage-peer1/21827 Sage Weil
08:52 PM Bug #3675: osd: hang during initial peering
this is a messenger bug. if there is a socket error at the end of accept(), after the register_pipe(), we then fail ... Sage Weil
08:11 AM Bug #3675 (Resolved): osd: hang during initial peering
the initial wait for healthy blocked on 2 pgs. ms inject socket failures = 500. everything was up.
no logs, so it...
Sage Weil
07:10 PM Revision 5f25f9f8 (ceph): init-ceph: default osd_data path
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:40 PM Bug #3657 (In Progress): rbd: crash mapping image
I got a response from Ugis. The patches I supplied to him
did stop the crashes he was seeing. So we'll want to get...
Alex Elder
09:22 AM Bug #3676 (Can't reproduce): osd keeps crashing at ReplicatedPG::scan_range()
This specific osd (osd.17) keeps crashing at the same location, as I tried to bring it back. It would start peering a... Xiaopong Tran
07:04 AM Documentation #3674 (Resolved): Deployment documentation is confusing
As a new user who spent hours googling and reading source code to decipher what each tool does, I thought of giving s... Faidon Liambotis
06:29 AM devops Feature #3255: ceph-disk: allow prepare without activate (for spares)
Couldn't ceph-disk-prepare take a lock by e.g. writing a file (or even flock()ing it) in /var/lib/ceph/ before it sta... Faidon Liambotis
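The locking idea raised in the #3255 comment above could be sketched roughly as follows. This is an illustration only, not ceph-disk's actual code: the `prepare_lock` helper and the lock file location are hypothetical, and the sketch assumes a POSIX system where `flock()` is available.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def prepare_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    lock_path is a hypothetical location (for example a file somewhere under
    /var/lib/ceph/); a real tool may choose something else entirely.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor releases the flock.
        os.close(fd)
```

A caller would wrap the prepare step in `with prepare_lock(path): ...`, so that a concurrent activate step taking the same lock waits until preparation has finished.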
04:37 AM Revision ad9bcc70 (ceph): PG: don't use a self-transition for WaitRemoteRecoveryReserved
Previously, using the state on active worked, but now we might
go back through WaitRemoteRecoveryReserved without res...
Samuel Just
04:37 AM Revision f6b2ca8b (ceph): OSD: always do a deep scrub when repairing
Otherwise, errors turned up in a deep-scrub will be
swept under the rug without being repaired.
Signed-off-by: Samue...
Samuel Just
04:35 AM Revision 2e96bb18 (ceph): PG: Handle repair once in scrub_finish
We don't want to change missing sets during a chunky
scrub since it would cause !is_clean() and derail
the rest of th...
Samuel Just
02:47 AM devops Feature #3673 (Rejected): ceph-disk-prepare should provide an option for SSD alignment
ceph-disk-prepare takes an option to use an external disk as a journal. It is commonly suggested that the journal is ... Faidon Liambotis
01:12 AM Revision bdcf6647 (ceph): .gitignore: Add ar-lib to ignore list
Gary Lowell
01:03 AM Revision 4a558048 (ceph): librbd: move buf_is_zero() to new common/util.cc and include/util.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Dan Mick
01:03 AM Revision 410903fe (ceph): rbd: check for all-zero buf in export, seek output if so
Use buf_is_zero in common/util.cc
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durg...
Dan Mick
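The sparse-export idea in revision 410903fe (check whether a buffer is all zeros and, if so, seek the output instead of writing) can be illustrated with a small sketch. The real change is C++ inside rbd; the Python below is only a stand-in under that assumption, and `sparse_copy` and its `block_size` parameter are hypothetical names. `buf_is_zero` here merely mirrors the name of the helper mentioned in the commit.

```python
import os


def buf_is_zero(buf):
    """Return True if every byte in buf is zero (an empty buffer counts as zero)."""
    return not any(buf)


def sparse_copy(src, dst, block_size=4096):
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Both arguments are binary file objects; dst must be a seekable real file.
    Returns the number of bytes consumed from src.
    """
    total = 0
    while True:
        buf = src.read(block_size)
        if not buf:
            break
        if buf_is_zero(buf):
            # Skip ahead; on filesystems that support it this leaves a hole.
            dst.seek(len(buf), os.SEEK_CUR)
        else:
            dst.write(buf)
        total += len(buf)
    # A trailing seek does not extend the file by itself, so set the final
    # length explicitly to the current position.
    dst.truncate()
    return total
```

The final `truncate()` matters when the input ends in zeros: without it the output would come out shorter than the source.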
01:03 AM Revision 5905d7fa (ceph): rbd: harder-working sparse import from stdin
Try to accumulate image-sized blocks when importing from stdin, even if
each read is shorter than requested; if we ge...
Dan Mick
01:03 AM Revision 6325a480 (ceph): import_export.sh: sparse import export
Add tests for:
- sparse import makes expected sparse images
- sparse export makes expected sparse files
- sp...
Dan Mick
12:55 AM Revision 51a900cf (ceph): autogen.sh: Create m4 directory for leveldb
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:47 AM Revision 8f5de156 (ceph): osd: fix pg stat msgs vs timeout
We can get a pattern like so:
- new mon session
- after say 120 seconds, we decide to send a stats msg
- outstanding...
Sage Weil
12:19 AM Revision 74473bb6 (ceph): leveldb: Update submodule
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:14 AM Revision 2bf4f42b (ceph): doc: Added new journaler page to CephFS section. Needs descriptions.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:14 AM Revision 53afac1a (ceph): doc: Added Journaler Configuration to toc tree.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:09 AM Revision 757902d6 (ceph): doc: Added --mkfs options.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:08 AM Revision 46d03344 (ceph): doc: Added running multiple clusters. Per Tommi.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:07 AM Revision e3d07566 (ceph): doc: Updated the Configuration File section.
- Replaced ceph.conf with Ceph configuration to clarify
when running multiple clusters on the same hardware.
- Adde...
John Wilkins

12/21/2012

11:20 PM Revision 00ed6657 (ceph): PG::scrub_compare_maps increment scrubber.fixed for missing repairs
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
11:16 PM Revision c9e05174 (ceph): PG::_compare_scrubmaps: increment scrubber.errors on missing object
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
11:15 PM Revision b564fdb8 (ceph): release-notes: remove warning about osd caps
This was only an issue from 0.49-0.52 upgrading to 0.53+
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
11:15 PM Revision 3076e459 (ceph): release-notes: pgnum is required now
This should have been in the 0.55 release notes.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
11:15 PM Revision 048567e0 (ceph): release-notes: fix typos
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
11:15 PM Revision b39928df (ceph): release-notes: remove bug fix that does not affect argonaut
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
11:15 PM Revision 4a039393 (ceph): release-notes: add more user-visible changes
These are from looking through the shortlog from 0.48.2..next.
The description of the min_size defaults could probabl...
Josh Durgin
10:54 PM Revision 09d4f036 (ceph): doc: Added sudo to the ceph health for when cephx is on.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:53 PM Revision 085992f6 (ceph): doc: minor fix to syntax.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:23 PM Revision 206ffcd8 (ceph): mkcephfs: error out if 'devs' defined but 'osd fs type' not defined
We can infer btrfs if they use btrfs devs, but if they use devs there is
no default fs.
Signed-off-by: Sage Weil <sa...
Sage Weil
10:04 PM Revision 4a40067d (ceph): doc: update ceph.conf examples about btrfs default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:00 PM Revision 677a7a5a (ceph): rgw: add swift tasks
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
09:56 PM Revision 11fb3141 (ceph): Merge remote-tracking branch 'gh/wip-scrub' into next
Sage Weil
09:45 PM Revision 47145d80 (ceph): Merge remote-tracking branch 'gh/wip-3643' into next
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
09:44 PM Revision 999ba1b2 (ceph): monc: only warn about missing keyring if we fail to authenticate
This avoids the situation where a librados or other user with the default
of 'cephx,none' and no keyring is authentic...
Sage Weil
09:10 PM Revision 5d5a42bc (ceph): osd: clear CLEAN on exit from Clean state
This means we can drop the scrub repair state_clear() call. We probably
can drop others, but let's leave that for ano...
Sage Weil
08:19 PM Revision b3e62ad6 (ceph): auth: use none auth if keyring not found
If both cephx and none are accepted auth methods, and
cephx keyring cannot be found then resort to using
none, instea...
Yehuda Sadeh
07:37 PM Revision ae044e64 (ceph): osd: allow transition from Clean -> WaitLocalRecoveryReserved for repair
If we do a scrub repair, we need to go from clean to recovery again to
copy objects around.
This fixes a simple repa...
Sage Weil
07:36 PM Revision 7c56d8fa (ceph): PG::sched_scrub: return true if scrub newly kicked off
The previous return value wasn't really what OSD::sched_scrub
wanted to know.
Signed-off-by: Samuel Just <sam.just@i...
Samuel Just
07:36 PM Revision 4d661e0d (ceph): PG::sched_scrub: only set PG_STATE_DEEP_SCRUB once reserved
Otherwise we would have +DEEP before we have +SCRUB.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
07:29 PM Revision 19e44bff (ceph): osd: clear scrub state if queued scrub doesn't start
We set SCRUBBING when we queue a pg for scrub. If we dequeue and
call scrub() but abort for some reason (!active, de...
Sage Weil
07:29 PM Revision 670afc6c (ceph): PG: in sched_scrub() set PG_STATE_DEEP_SCRUB not scrubber.deep
scrubber.deep gets reset in scrub() to match
state_test(PG_STATE_DEEP_SCRUB).
Signed-off-by: Samuel Just <sam.just@i...
Samuel Just
06:20 PM Revision c02d34dc (ceph): task/swift: change upstream repository url
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
06:20 PM Revision 2f829870 (ceph): task/swift: change upstream repository url
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
06:15 PM Revision feb0aad2 (ceph): doc: Moved path to individual OSD entries.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
04:41 PM Bug #3661: mon: idle/empty osds marked down after 15 min
commit:8f5de156056de78c90f1dc7bf7c5a131c32c1bb8 Sage Weil
03:58 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
ubuntu@ceph3:/etc/ceph$ ceph -m ip:port -s
server name not found: ip (servname not supported for ai_socktype)
unabl...
Anonymous
11:10 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Pat, just a little triage before I dive into this full head on, could you please try the following for each monitor?
...
Joao Eduardo Luis
02:39 PM CephFS Documentation #3672 (Resolved): doc: how to mount ceph-fuse from fstab
There's a new mount helper in bobtail for this. It contains these comments:... Josh Durgin
02:36 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
Anywhere I inserted a hash "#", this lovely program made them into numbered columns, so if you see a block with
x...
Anonymous
02:23 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
ok, looks like my conf settings got munged.
let's try this again
i was trying to get the mkcephfs to create a defau...
Anonymous
02:26 PM rgw Feature #3671 (Resolved): Request for x-amz-grant-full-control support
DH is requesting support for x-amz-grant-full-control:
"With Amazon S3, you can do specific grants like
x-amz-g...
JuanJose Galvez
02:23 PM rgw Feature #3670 (Resolved): Request for bucket-owner-read and bucket-owner-full-control grants
From DH, they'd like to see two types of requests which we currently ignore.
"Amazon has bucket-owner-read and buc...
JuanJose Galvez
01:59 PM Linux kernel client Bug #1492 (Can't reproduce): fsx failure on kclient
Sage Weil
01:55 PM rgw Feature #3669 (Resolved): rgw: support acl grants through http headers
support x-amz-grant-* http header fields. Yehuda Sadeh
01:48 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
commit:47145d800951db396785560df4e6d5d344af97dd Sage Weil
12:17 PM Bug #3643 (Fix Under Review): default authentication on the client does not work without a config...
Yehuda Sadeh
11:18 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
pretty sure this was caused by the log bug and 'log max new = 1', fixed by commit:50914e7a429acddb981bc3344f51a793280... Sage Weil
11:03 AM Bug #3657: rbd: crash mapping image
hmm. yeah, it probably means we should set the required features during negotiation to include MSG_AUTH instead of ... Sage Weil
10:56 AM Bug #3657: rbd: crash mapping image
There is another thing that came from the two crash logs Ugis
just supplied. They both contained lines like this:
...
Alex Elder
10:47 AM Bug #3657: rbd: crash mapping image
Ugis supplied two more images containing captured crash
stack traces. Both contained lines like this:
[ 32...
Alex Elder
10:27 AM rgw Feature #3668 (Resolved): rgw: support CORS
Yehuda Sadeh
10:21 AM rgw Feature #3667 (Resolved): rgw: support extra canned acl params
bucket-owner-read, bucket-owner-full-control Yehuda Sadeh
10:20 AM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
... Noah Watkins
10:18 AM Bug #3650 (In Progress): osd: crash in Reset state -> start_peering_interval -> on_change -> proc...
Sage Weil
09:36 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
Sage Weil
10:03 AM rbd Fix #3665 (Resolved): librbd: deadlock during flatten
Ran into this trying to reproduce #3631.
The test_librbd_fsx process is still running on plana34 for debugging.
...
Josh Durgin
08:36 AM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
I ran this test throughout the day yesterday and couldn't reproduce it, with message delays enabled. Marking as can'... Sam Lang
08:32 AM rbd Bug #3664 (Resolved): osdc/ObjectCacher.cc: 517: FAILED assert(!i->size())
... Sage Weil
07:52 AM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hi Roman-
The logging levels are right, but in both mds logs neither mds was ever active; both were in the up:stan...
Sage Weil
05:45 AM Revision e765dcb4 (ceph): osd: only dec_scrubs_active if we were active
This fixes a bug that puts scrubs_active negative.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:44 AM Revision ada3e27f (ceph): osd: reintroduce inc_scrubs_active helper
This mostly generates nice debug output. It also slightly simplifies
code and makes things symmetric.
Signed-off-by...
Sage Weil
01:43 AM Revision ae26432d (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:49 AM Revision bc4f74c7 (ceph): ceph.spec.in: Fedora builds debuginfo by default.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:24 AM Revision accce830 (ceph): Merge remote-tracking branch 'upstream/wip_notify' into next
Reviewed-by: Sage Weil <sage@inktank.com> Samuel Just

12/20/2012

11:51 PM Revision 129a49ad (ceph): cephtool: mention ceph osd ls, fix ceph osd tell N bench
Add ceph osd ls to help; make help for ceph osd tell N bench look
more like injectargs, which says <osd-id or *> to m...
Dan Mick
11:32 PM Revision a36d1db1 (ceph): rgw: remove noisy log message
No need for that log message.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
11:30 PM Revision 5b5a19ac (ceph): rgw: fix daemonize initialization
Just call the common daemonize function. Otherwise we end up
not initializing stdout / stderr correctly.
Signed-off-b...
Yehuda Sadeh
11:10 PM Revision 754fc200 (ceph): release notes: Mention new cephtool commands
ceph osd ls and ceph tell osd.N version are new. Mention their use
for verifying that all OSDs are upgraded in the n...
Dan Mick
10:19 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hello Sage,
added 4 logs:
screen output from console of the laggy client. it ends up on 'jroger@pr02:~/data$ cp...
Roman Hlynovskiy
09:07 PM CephFS Bug #3663 (Need More Info): ceph kernel client is getting stuck on xstat* operations
Hmm. It's actually just saying it's the oldest client; it's not actually too old (yet). The looping connect attempts... Sage Weil
08:48 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
there are 2 kernel clients happily working with ceph. as soon as I try mounting ceph from the third client, it's gett... Roman Hlynovskiy
09:54 PM Bug #3661 (In Progress): mon: idle/empty osds marked down after 15 min
Sage Weil
04:57 PM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
wip-mon Sage Weil
09:48 PM Revision 50914e7a (ceph): log: fix flush/signal race
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav...
Sage Weil
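The rule stated in revision 50914e7a (signal the condition within the same interval in which the lock is held and the queue is modified) can be illustrated with a generic producer/consumer sketch. This is not Ceph's log code; the `FlushQueue` class and its method names are hypothetical. Python's `threading.Condition` already refuses to notify without the lock held, so the sketch mainly shows the modification and the notification sharing one critical section.

```python
import collections
import threading


class FlushQueue:
    """Hypothetical queue used only to illustrate the locking pattern."""

    def __init__(self):
        self._cond = threading.Condition()
        self._items = collections.deque()

    def submit(self, item):
        # Modify the queue and notify inside one critical section, so a
        # consumer cannot check the queue, miss the update, and then go to
        # sleep after the notification has already been sent.
        with self._cond:
            self._items.append(item)
            self._cond.notify()

    def take(self, timeout=None):
        """Return the next item, or None if the timeout expires first."""
        with self._cond:
            # wait_for re-checks the predicate while holding the lock.
            if not self._cond.wait_for(lambda: self._items, timeout=timeout):
                return None
            return self._items.popleft()
```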
09:29 PM Revision c0e23712 (ceph): ReplicatedPG::remove_notify : don't leak the notify object
Following remove_notify, there are no other references to
notif, delete it.
Signed-off-by: Samuel Just <sam.just@ink...
Samuel Just
09:27 PM Revision b5031a22 (ceph): OSD,ReplicatedPG: do not track notifies on the session
handle_notify_timeout and remove_notify currently do not clean up this
state leaving dangling Notification*. Further...
Samuel Just
08:59 PM Revision 719679ea (ceph): doc: Added package and repo links for Apache and FastCGI. Added SSL ena...
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:59 PM Revision 04eb1e73 (ceph): doc: Fixed restructuredText usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:42 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
The algorithm appears to be
1) if 'devs' is not defined, look for 'btrfs devs'; if that's defined, use those for ...
Dan Mick
05:00 PM Bug #3662 (Won't Fix): mkcephfs --mkfs is not inserting any default settings
It was my understanding that "sudo mkcephfs -a -c ceph.conf -k ceph.keyring --mkfs" would format a device with btrfs ... Anonymous
07:39 PM Revision ea9fc87d (ceph): doc: Removed foo. Apparently myimage was added and foo not removed.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:07 PM Revision 9f67c450 (ceph): Merge branch 'next'
Sage Weil
07:04 PM Revision 17c627b5 (ceph): Merge remote-tracking branch 'gh/wip-cephtool' into next
Sage Weil
06:58 PM Revision 0953ce53 (ceph): rados: add cephtool test
Sage Weil
06:49 PM Revision f38d8911 (ceph): Merge branch 'wip-build-fixes' into next
Sage Weil
06:46 PM Bug #3627 (Resolved): osd: segfault in ~MOSDSubOp during thrashing+rbd_fsx
accce830514c6b099eb0e00a8ae34396d14565a3 should fix it. Samuel Just
06:45 PM Bug #3659 (Resolved): complete_notify crash
accce830514c6b099eb0e00a8ae34396d14565a3 should take care of it. Samuel Just
12:24 PM Bug #3659: complete_notify crash
Saw on alexandria Samuel Just
12:23 PM Bug #3659 (Resolved): complete_notify crash
l=0).accept connect_seq 58 vs existing 57 state standby
2012-12-19 18:20:42.186013 7f3b1afe7700 0 <cls> cls/rgw/cls...
Samuel Just
06:13 PM Revision a803159b (ceph): rgw: configurable exit timeout
Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s...
Yehuda Sadeh
06:07 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
I just did an "scp" to burnupi40.front.sepia.ceph.com:/home/ubuntu/3647.vm.tgz Anonymous
04:32 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
I am in Sunnyvale and the VMs reside on my desktop. I have snapshotted and created a tar file of my 3 node cluster. ... Anonymous
07:32 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Pat, do you still have the VMs in this state? If so, can I take a look? Joao Eduardo Luis
05:45 PM Revision 92b59e90 (ceph): rgw: don't try to assign content type if not found
Fixes: #3648
Cannot assign a NULL pointer into stl string. This is only
relevant to swift, when uploading an object w...
Yehuda Sadeh
04:53 PM Revision c02e9062 (ceph): Merge remote-tracking branch 'gh/wip-crushtool' into next
Reviewed-by: Caleb Miles <caleb.miles@inktank.com> Sage Weil
03:49 PM rbd Bug #3524: test_librbd_fsx: crash after flatten
Sam saw this come up again in: ubuntu@teutholog:/a/sam-ooo3/19022
It's a different cause of the same symptom. In t...
Josh Durgin
02:52 PM Bug #3660 (Resolved): osd: marking objects lost invalidates pg stats
If you lose an object, the pg stats become invalid, and the next scrub will report a problem.
We could mark the st...
Sage Weil
02:20 PM Linux kernel client Bug #3519: rbd map hang during system startup
We've learned a few things since my last update, but the main
thing is that Nick tried the latest thing I offered an...
Alex Elder
11:41 AM Bug #3496 (Resolved): doc: have old URLs redirect to new ones
John Wilkins
11:41 AM Documentation #3564 (Resolved): doc: many broken links since rearrangement
John Wilkins
11:40 AM rgw Documentation #2989 (Resolved): doc: write RGW troubleshooting
John Wilkins
11:40 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
Apparently foo was the image name, and myimage was added and foo not removed. John Wilkins
11:36 AM Bug #3656 (In Progress): docs: "foo" doesn't mean anything in rbd example
John Wilkins
05:29 AM Bug #3656: docs: "foo" doesn't mean anything in rbd example
Whoops, I forgot to assign it.
Alex Elder
05:28 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
Someone named "Ugis" on the mailing list was having trouble
with the rbd command. One of the things this person men...
Alex Elder
10:10 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
Fixed, commit:04e7a5ca1364166a6b93e6cd0fcf58faf629a01c Yehuda Sadeh
09:47 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
Fixed, commit:92b59e90590aee501ae090adebf58978912f9dd3. Yehuda Sadeh
09:42 AM Bug #3658: osd/mon: stops processing pg stat messages
see /a/sage-ooo2, /a/sam-ooo3 Sage Weil
09:42 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
... Sage Weil
09:37 AM Feature #3622 (Rejected): RADOS pools should support more than 65535 PGs
kernel limit only Sage Weil
07:40 AM Bug #3633: mon: clock drift errors not reported by ceph status
'HEALTH_OK' and 'HEALTH_WARN' are assessed in a way that makes it non-trivial to leverage the existing way of doing t... Joao Eduardo Luis
06:03 AM Revision 799c59ae (ceph): rgw: remove useless configurable, fix swift auth error handling
Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been speci...
Yehuda Sadeh
06:03 AM Revision 08c64249 (ceph): rgw: don't initialize keystone if not set up
Fixes: #3653
No need to initialize keystone, including the keystone
revocation thread which was verbose if keystone ...
Yehuda Sadeh
05:56 AM Bug #3657 (Resolved): rbd: crash mapping image
I'm just creating this to track some activity from someone
on the mailing list reporting kernel crashes when attempt...
Alex Elder
01:07 AM Revision 3ed2d59e (ceph): rgw: fix error handling with swift
Fixes: #3649
verify_swift_token returns a bool and not an int.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
12:51 AM Revision 9a9778fb (ceph): Merge remote-tracking branch 'upstream/wip_pg_temp' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
Samuel Just

12/19/2012

11:19 PM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
fsstress stuck in _read_sync()
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_6...
Sam Lang
10:22 PM Revision 5497d228 (ceph): doc: Modified the demo configuration file for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:02 PM Revision 40fdd773 (ceph): doc: Added Gateway Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:02 PM Revision 5281ee24 (ceph): doc: Added Gateway Quick Start configuration file.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:01 PM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
Fixed, commit:799c59ae89c9a70f08d9bf2e7624d25e6641d41f. Yehuda Sadeh
05:13 PM rgw Bug #3649 (Fix Under Review): rgw: swift list buckets returns empty result
Yehuda Sadeh
05:02 PM rgw Bug #3649: rgw: swift list buckets returns empty result
A backport is required for the bad error handling that triggered the symptoms. Yehuda Sadeh
04:59 PM rgw Bug #3649: rgw: swift list buckets returns empty result
This was happening when trying to use keystone, but without specifying 'rgw swift use keystone'. Ended up shortcuttin... Yehuda Sadeh
07:39 AM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
Yehuda Sadeh
10:01 PM Revision 84fb371d (ceph): Updated Getting Started index to include Gateway Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:01 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
done, commit:08c64249eb8cd7922de5c398a9426538918db77c. Yehuda Sadeh
05:13 PM rgw Bug #3653 (Fix Under Review): In bobtail, turn off keystone errors in radosgw.log when not applic...
Yehuda Sadeh
01:30 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
In bobtail, when radosgw is installed and configured on a cluster node, we see the following errors in radosgw.log, w... Tamilarasi muthamizhan
10:00 PM Revision 5e955103 (ceph): doc: Added REST Gateway link to 5-minute Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:52 PM Revision c2b231e4 (ceph): doc: Updated the 5-minute Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:47 PM Revision f596cee7 (ceph): doc: Updated Block Device Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:46 PM Revision 60b2857d (ceph): doc: Updated CephFS Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:45 PM Revision d17bd384 (ceph): doc: Added authentication and mkcephfs settings for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:36 PM Revision cd5c82db (ceph): doc: Added javascript code block tag.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
06:33 PM Revision 6122a9f6 (ceph): OSDMonitor: remove temp pg mappings with no up pgs
Otherwise, the pg won't be validly mapped until one of the temp
pgs comes back up.
Signed-off-by: Samuel Just <sam.j...
Samuel Just
06:32 PM Revision 2395af9f (ceph): OSDMap: make apply_incremental take a const argument
This requires us to copy bufferlists in two cases since bufferlist
does not have a const iterator at this time.
Sig...
Samuel Just
05:17 PM rgw Tasks #3152 (Resolved): rgw: document usage testing
Done, commit:2f73c07511dce200b5dd298c6f86e03fbb9b3dd1 Yehuda Sadeh
05:16 PM rgw Feature #3494 (Closed): ceph S3 upload slowly
Closing, need more info about the specific user problem. Yehuda Sadeh
05:15 PM rgw Bug #3620 (Fix Under Review): rgw:improve multiple user access keys scalability
Yehuda Sadeh
05:13 PM rgw Bug #3648 (Fix Under Review): rgw: swift put object with empty mime type crashes
Yehuda Sadeh
07:39 AM rgw Bug #3648: rgw: swift put object with empty mime type crashes
[[https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1092137]] Yehuda Sadeh
07:38 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
Yehuda Sadeh
04:38 PM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
Given xml like:... Josh Durgin
04:37 PM Revision 2e49d5c4 (ceph): cephtool: add qa workunit
A few basic sanity checks, including a tell on a down osd.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:03 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
Proposed fix in wip-3637. The client's max size request in MClientCaps gets dropped if the file lock is in a non-sta... Sam Lang
03:10 PM Linux kernel client Bug #3519: rbd map hang during system startup
I found a possible explanation of the problem, and have
created and pushed a fix on top of the code that I most
rec...
Alex Elder
09:06 AM Linux kernel client Bug #3519: rbd map hang during system startup
Nick provided more information:
https://gist.github.com/raw/4330223/2f131ee312ee43cb3d8c307a9bf2f454a7edfe57/rbd...
Alex Elder
03:00 PM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
Dave Chinner has confirmed my explanation. The bug no
longer exists (in its current form) in the latest code,
so w...
Alex Elder
10:46 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
I'm fairly sure this is an XFS problem, so as suggested by
Ian I'm marking this "Won't Fix" (again). If new evidenc...
Alex Elder
06:27 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
Dave Chinner responded to my note with a few questions
requesting more information. I spent some time this
morning...
Alex Elder
12:30 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
Pushed fixes to wip-3625 (ceph and ceph-client repos) that implement #3 (mds sends back the created flag in reply to ... Sam Lang
12:29 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
David and I have posted comments on github about the fix to allow multiple
clients opening the same file to get a va...
Sam Lang
12:11 PM Bug #3652 (Duplicate): split should not mess up stats
this will be replaced with bugs corresponding to a design of some kind Samuel Just
12:10 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
Samuel Just
11:47 AM rbd Bug #3611 (Resolved): rbd.py: segfault with many snapshots
This was caused by c3107009f66bc06b5e14c465142e14120f9a4412. Reverting it fixes the problem. There is a corrected imp... Josh Durgin
11:44 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
This still occurs with the wip-3611 branch, so it is a different problem. Josh Durgin
11:15 AM Bug #3633: mon: clock drift errors not reported by ceph status
Here's my config: http://pastie.org/5554031
I'm pretty sure there was no warning when I did 'ceph -w', because I w...
Corin Langosch
08:25 AM Bug #3633 (In Progress): mon: clock drift errors not reported by ceph status
I'm looking into an adequate way to make 'ceph -s' return a warning when the clocks have drifted.
However, 'ceph -...
Joao Eduardo Luis
10:29 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Below works, but "ceph -s" does not
ubuntu@ceph1:~$ ceph health
2012-12-19 18:27:51.090414 mon <- [health]
2012-...
Anonymous
10:22 AM Bug #2784 (Resolved): osd hit suicide timeout
Not actually a bug in the renzhi case. Samuel Just
08:18 AM rgw Feature #3207: qa: swift functional tests in nightly
from James' last bug report:... Sage Weil
08:02 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
that line of code is... Sage Weil
07:52 AM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
... Sage Weil
05:00 AM Revision d9c2396b (ceph): ceph.spec.in: Improve finding location of jni.h for sles11.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
04:08 AM Revision b2eb8bd2 (ceph): osd: implement 'version' tell command
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:40 AM Revision 46344105 (ceph): ceph.spec.in: Add packages for libcephfs-jni and libcephfs-java
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
03:21 AM Revision 85763f09 (ceph): ceph: report error string to stderr, not stdout
If we return an error, send the message to stderr. This makes things
more easily scriptable because error messages w...
Sage Weil
03:20 AM Revision 5f24e23b (ceph): ceph: fix error reporting when tell target is invalid or down
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:11 AM Revision b00eb6fd (ceph): mon: 'ceph osd ls'
List osd ids that exist.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
01:00 AM Revision 212f6b56 (ceph): OSDMap::dump: tag pg_temp mappings with pgid
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Samuel Just

12/18/2012

10:12 PM Revision 04e7a5ca (ceph): rgw: configurable exit timeout
Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s...
Yehuda Sadeh
08:59 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
The hang occurs because a client requests a max size increase, but doesn't have write caps, so the mds puts it on the... Sam Lang
07:53 AM CephFS Bug #3637 (Resolved): client: not issuing caps for with clients doing shared writes
With 3 clients running ceph-fuse, running the ior command:
/tmp/cephtest/binary/usr/local/bin/ior -e -w -r -W -b 1...
Sam Lang
07:40 PM Bug #3646: pg_temp with two down/out osds
sounds exactly right Sage Weil
07:29 PM Bug #3646: pg_temp with two down/out osds
It actually does that already. OSDMonitor::remove_redundant_pg_temp(). I'll hook in around there for the fix, doing ... Samuel Just
06:42 PM Bug #3646: pg_temp with two down/out osds
Good point. We can also remove mappings that match the crush result. Although that is a more expensive scan by the m... Sage Weil
04:41 PM Bug #3646 (Resolved): pg_temp with two down/out osds
Encountered on MassEffect, osdmap is attached.
{ "pgid": "2.25",
"osds": [
30,...
Samuel Just
06:08 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
added output for dmesg on ceph1 Anonymous
06:04 PM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
Seeing errors when setting up ceph from scratch with the options in the ceph.conf file. I forgot the auth options f... Anonymous
04:23 PM CephFS Feature #3645 (Resolved): Requesting the ability to rename CephFS snapshots inside the ".snap"-di...
I believe the ability to rename CephFS snapshots can come in handy in many cases. For example, if one wants to imple... Oliver Daudey
02:26 PM Bug #3644 (Resolved): ObjectCacher: discard_set ignores waiters
IO in flight contained entirely in a discarded section will not be acked to the caller, since the waiters are removed... Josh Durgin
01:41 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
On a single-node bobtail cluster, the ceph-auth setting is as shown below:
ubuntu@burnupi09:/etc/ceph$ sudo cat...
Tamilarasi muthamizhan
01:26 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
Instead of using an assert version op, establish the watch before reading the header. This hasn't actually caused any... Josh Durgin
12:01 PM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely

Moved to #3641
Sam Lang
10:56 AM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely
Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
Sam Lang
12:00 PM CephFS Bug #3641 (Resolved): kclient: hit EOF prematurely

Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
Sam Lang
11:57 AM CephFS Bug #3640 (Duplicate): kclient: hang and kernel panic

Creating a placeholder for the following issue reported by Eric Renfro on the mailing list:
http://thread.gmane....
Sam Lang
11:17 AM rgw Feature #2941 (Fix Under Review): rgw: improve streaming read performance
Yehuda Sadeh
10:36 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
Currently exit timeout is 5 seconds, we should make it configurable, and probably have a higher default. Yehuda Sadeh
09:46 AM Bug #2784: osd hit suicide timeout
This bug popped again on v0.55.1
renzhi on IRC stumbled upon it after upgrading from v0.48.2, and has been unable ...
Joao Eduardo Luis
08:54 AM rbd Bug #3611: rbd.py: segfault with many snapshots
This survived overnight testing (with the python librbd tests) with 56 passes. Josh Durgin
08:03 AM Linux kernel client Bug #3519: rbd map hang during system startup
I looked through the latest log message supplied by Nick
Bartos. I scanned through it to look only at the rbd
acti...
Alex Elder
07:39 AM Linux kernel client Bug #3519: rbd map hang during system startup
There has been quite a lot of activity on this bug but it's
all been recorded on the mailing list rather than here.
...
Alex Elder
06:40 AM Bug #3624 (In Progress): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last ...
Answer to my question, based on evidence in this bug:
The control (yaml) file contains this:
overrides:
...
Alex Elder
04:42 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
Note how this was on a cluster with *very* few OSDs (4 at the time!) as I originally mentioned and this may play a fa... Faidon Liambotis
01:12 AM Revision dbe6fb72 (ceph): crushtool: only dump usage on -h|--help
Instead, output a useful error message.
Fix error code to be a success.
Add test for the output usage.
Signed-off-...
Sage Weil
01:12 AM Revision 6c7ec2d4 (ceph): crushtool: nicer error message on extra args
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:51 AM Revision 0dd13025 (ceph): Merge remote-tracking branch 'gh/testing' into next
Sage Weil
12:38 AM Revision fd482a27 (ceph): ceph.spec.in: Update pre-reqs for ceph-fuse package.
Gary Lowell
12:29 AM Revision 1b67a438 (ceph): Revert "objecter: don't use new tid when retrying notifies"
This reverts commit c3107009f66bc06b5e14c465142e14120f9a4412.
This appears to be causing problems in the objecter by...
Sage Weil
12:14 AM Feature #3288: docs: document the chooseleaf command in crush
Commit 9f0510 added docs for multiple crush hierarchies and the examples use chooseleaf, which is still undocumented. Faidon Liambotis

12/17/2012

10:59 PM rbd Bug #3611 (Fix Under Review): rbd.py: segfault with many snapshots
wip-3611 contains a respin of the bad commit. It's passing test_stress_watch with failure injection and the python te... Josh Durgin
11:09 AM rbd Bug #3611: rbd.py: segfault with many snapshots
also, ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16289 Tamilarasi muthamizhan
11:08 AM rbd Bug #3611: rbd.py: segfault with many snapshots
recent log: ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16281 Tamilarasi muthamizhan
10:53 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Thanks for the logs. All the differences there are zeroes where actual data should be, but the librbd debug log shows... Josh Durgin
10:41 PM Revision bdc998ef (ceph): mon: OSDMonitor: add option 'mon_max_pool_pg_num' and limit 'pg_num' ac...
Instead of having a hardcoded default, use a configurable one. It is
limited to 65536 until future testing guarantees...
Joao Eduardo Luis
10:39 PM Revision 21c47c6a (ceph): osd: debug EMSGSIZE / OSD_WRITETOOBIG
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:39 PM Revision f81ca898 (ceph): doc/release-notes: don't use format 2 rbd images until after osds upgrade
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
07:14 PM Revision 3c246226 (ceph): crushtool: add --set-chooseleaf-descend-once to help
We forgot to update this in 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
06:53 PM Revision 874b2732 (ceph): doc/release-notes: 'mon max pool pg num'
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:30 PM Feature #1655: gitbuilder aggregator page
... Sage Weil
04:25 PM Feature #1655: gitbuilder aggregator page
I'm not sure if anyone else has asked, but any chance of sharing the updated server side cgi script which now has aja... Jimmy Tang
04:10 PM Bug #3617 (In Progress): Ceph doesn't support > 65536 PGs(?) and fails silently
Looking closer, I have a feeling this was a large # of pgs making a different bug surface. Jim has been running his ... Sage Weil
04:03 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
Note how your commit changed the (default) limit from 65535 to 65536. Faidon Liambotis
04:01 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
The default is now 65536, and can be adjusted using the option 'mon max pool pg num' if higher values are desired. Joao Eduardo Luis
03:57 PM Revision e8b8531e (ceph): doc: fix typo in config file
The option is host, not hostname
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
02:35 PM Bug #3636 (Resolved): sub_op_modify assert(!missing.is_missing(soid));
Encountered in Alexandria, fixed in 047aecd90f1dbfb172f48f9d10b67e82b3a8ce15, may it rest in peace. Samuel Just
12:40 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
Trivial change. Biggest decision is which libc routine to use to spawn the command... Dan Mick
11:42 AM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
The rbd command line interface creates mappings by sending
output to the /sys/bus/rbd/add file system entry, and rem...
Alex Elder
11:14 AM rgw Feature #3634 (Resolved): rgw: improve teuthology radosgw-admin test
Yehuda Sadeh
11:10 AM rgw Bug #3620: rgw:improve multiple user access keys scalability
Possibly impacts interoperability Ian Colle
11:09 AM rgw Bug #3628: rgw: leak of object parts on partial upload
Appears to only be in Argonaut Ian Colle
09:31 AM rgw Bug #3628: rgw: leak of object parts on partial upload
Actually, per user, this affects older versions (argonaut), but does not happen in newer versions. Looking at the code... Yehuda Sadeh
10:53 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
Tried to reproduce this behavior to no avail.
There are operations on the test that do hang for a long time, but a...
Joao Eduardo Luis
09:49 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
When XFS gets an I/O error, there is not a lot it can do.
If it happens to involve user data blocks it could continu...
Alex Elder
09:42 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
XFS bug Ian Colle
09:42 AM Bug #3599 (Resolved): mkcephfs should fail out when ceph.conf has an error
Sage Weil
09:38 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
Possibly related to 3611 Ian Colle
09:14 AM Bug #3629: test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
I've gone through the logs again and again, as well as through the code. The logs only show the last couple hundred l... Joao Eduardo Luis
08:08 AM Linux kernel client Bug #2764: xfstest hang; osd socket closed messages
I have posted a fix for the "socket closed" messages, and it has
been reviewed and will fairly soon be pushed to the...
Alex Elder
07:36 AM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
Using argonaut 0.48.2. Today all ceph commands were randomly slow. So I checked all hosts, all monitors (3) and osds (... Corin Langosch
 
