Activity

From 12/06/2012 to 01/04/2013

01/04/2013

09:45 PM Revision 7513e971 (ceph): ReplicatedPG: remove old-head optization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:44 PM Revision e89b6ade (ceph): ReplicatedPG: remove old-head optization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:37 PM Revision 6a3d475c (ceph): Merge remote branch 'origin/wip-rbd-watch'
Reviewed-by: Dan Mick <dan.mick@inktank.com> Josh Durgin
08:32 PM Revision cd5f2bfd (ceph): ObjectCacher: fix off-by-one error in split
This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which wou...
Josh Durgin
07:54 PM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
commit:3a9408742a8a6cbc870cba543a208285f1a6cec1 Sage Weil
03:25 PM CephFS Bug #3666: Segfault running test_libcephfs
I pushed a new wip-client-shutdown. This switches the clean-up order of client/messenger in libcephfs, rather than mo... Noah Watkins
01:36 PM CephFS Bug #3666: Segfault running test_libcephfs
Right, I think your fix will work, but it breaks the interface abstraction (messenger is created above the client, de... Sam Lang
01:16 PM CephFS Bug #3666: Segfault running test_libcephfs
This is what I'm running to reproduce the error. It's been running now for an hour on wip-client-shutdown without any... Noah Watkins
12:57 PM CephFS Bug #3666: Segfault running test_libcephfs
Rather than moving messenger shutdown into client shutdown? Noah Watkins
12:48 PM CephFS Bug #3666: Segfault running test_libcephfs
A similar issue was just handled in the ceph_fuse.cc code. There we just delay deleting the client till the end. Yo... Sam Lang
10:41 AM CephFS Bug #3666: Segfault running test_libcephfs
During unmount, the client is shut down and freed before the messenger. If any messages are delivered after the clien... Noah Watkins
07:07 PM Revision 802c486f (ceph): config: change default log_max_recent to 10,000
Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: S...
Sage Weil
06:18 PM Revision d6496abf (ceph): remove rbd_header_race test
This no longer works since export does not do a watch, and the race is
being closed a different way not detectable by...
Josh Durgin
06:16 PM Revision 620dd551 (ceph): task: mon_clock_skew_check.py: Check for clock skews on the monitors
Will run for as long as teuthology runs. By default, fails if any clock
skews higher than 0.05 seconds are detected, ...
Joao Eduardo Luis
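The threshold rule in the entry above (fail if any monitor clock skew exceeds 0.05 seconds) can be sketched as follows. This is an illustration only, not the code in mon_clock_skew_check.py: the function names and the dictionary of per-monitor skews are hypothetical, and only the 0.05 second default comes from the commit message.

```python
from typing import Dict, List

# Default threshold quoted in the commit message above (seconds).
MAX_SKEW_SECONDS = 0.05


def find_skewed_monitors(skews: Dict[str, float],
                         max_skew: float = MAX_SKEW_SECONDS) -> List[str]:
    """Return the sorted names of monitors whose absolute skew exceeds max_skew.

    `skews` maps a (hypothetical) monitor name to its measured skew in seconds.
    """
    return sorted(name for name, skew in skews.items() if abs(skew) > max_skew)


def check_clock_skews(skews: Dict[str, float],
                      max_skew: float = MAX_SKEW_SECONDS) -> None:
    """Raise RuntimeError if any monitor is skewed beyond the threshold."""
    offenders = find_skewed_monitors(skews, max_skew)
    if offenders:
        raise RuntimeError("clock skew detected on: " + ", ".join(offenders))
```

Calling `check_clock_skews({"a": 0.01, "b": -0.2})` would raise, because the skew of `b` is larger than 0.05 seconds in absolute value.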
06:11 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
commit:0978dc4963fe441fb67afecb074bc7b01798d59d Dan Mick
03:12 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre...
Tamilarasi muthamizhan
06:06 PM Bug #3702: OSD SIGABRT during startup
Sage Weil wrote:
> Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
...
Justin Lott
09:42 AM Bug #3702 (Need More Info): OSD SIGABRT during startup
Sage Weil
05:54 PM Revision 1a878611 (ceph): regression: include nfs suite
Sage Weil
05:50 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
got msgr logs in ubuntu@teuthology:/a/sage-a3/34724, but the crash looked different from the earlier ones (whose logs... Sage Weil
05:40 PM Bug #3731 (Fix Under Review): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompati...
see wip-3731 Sage Weil
05:19 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Agreed. And let's make sure it's fixed for 0.56.1.
Sage Weil
05:15 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Discussed this with Dan and Sam and I think we just want to roll this patch back and tell people not to use v0.56 for... Greg Farnum
04:34 PM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
CEPH_OSD_OP_CALL changed to remove the CEPH_OSD_OP_MODE_RD bit in
91e941aef9f55425cc12204146f26d79c444cfae; however,...
Dan Mick
05:03 PM Revision e88b909a (ceph): task: ceph_manager: add 'get_mon_health' function
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
03:29 PM CephFS Feature #3730 (Closed): Support replication factor in Hadoop
In order to support per-file replication values in Hadoop we need to specify that a new file should be generated in a... Noah Watkins
02:38 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5 Josh Durgin
11:31 AM rbd Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
in branch wip-rbd-watch Josh Durgin
01:54 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
Also, name it something along the lines of get_stripe_granularity() and not .._min(imum)_ as that isn't entirely accu... Anonymous
01:40 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
After a discussion on jabber, the decision is to go with exposing a function call in libcephfs and then using that in... Anonymous
11:09 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
The Hadoop bindings are using the blocksize as the stripe size. If a block size is explicitly passed down, it ends up... Anonymous
01:00 PM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
Heads up, Zheng Yan's patches on the mds fix issues related to running multiclient dbench tests. Sam Lang
12:24 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hmm, okay. I wasn't real clear on the previous bugs so I'll need to look at it more if I end up taking this, but soun... Greg Farnum
11:46 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thin...
Sage Weil
11:35 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thing set_trace() does. Ex... Greg Farnum
11:17 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Am I reading it correctly that this is just going to be doing the config and wrapper work to no...
Sage Weil
09:01 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Am I reading it correctly that this is just going to be doing the config and wrapper work to not call set_trace() in ... Greg Farnum
12:20 PM CephFS Feature #3543: mds: new encoding
Sage Weil
12:20 PM CephFS Feature #3728: mds: draft design for lookup by ino
Sage Weil
12:14 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
12:20 PM CephFS Feature #3570: teuthology: mds thrasher
Sage Weil
12:06 PM CephFS Feature #3727 (Resolved): mds: refactor EMetablob encoding paths
Right now, the EMetaBlob sub-structures — for performance reasons — use an encoding pattern that doesn't match anythi... Sage Weil
11:42 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
Greg Farnum wrote:
> I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should e...
Sage Weil
09:34 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should encompass. Are you talk... Greg Farnum
11:41 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
This was a whiteboard discussion 2 years ago. Nothing was written down. We should reopen new and more detailed issu... Sage Weil
09:29 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
Where are the results of this bug? It's marked resolved but I don't see any fsck references in the git tree, and ther... Greg Farnum
11:39 AM Feature #685: libcephmon: interact with ceph monitors via a library
BTW it may make sense to push the client command stuff in the ceph tool into MonClient, and then wrap that in libceph... Sage Weil
11:38 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Greg Farnum wrote:
> Do we have a separate bug for the library calls this needs?
#685, which would take the clien...
Sage Weil
09:27 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Do we have a separate bug for the library calls this needs? Greg Farnum
11:36 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
Greg Farnum wrote:
> And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the...
Sage Weil
09:24 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the Ceph shim or anything... Greg Farnum
11:35 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Greg Farnum wrote:
> Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph moun...
Sage Weil
09:24 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph mount and then running tes... Greg Farnum
11:34 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Greg Farnum wrote:
> Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a...
Sage Weil
09:22 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a cluster going with vs... Greg Farnum
11:32 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
Greg Farnum wrote:
> What all does this encompass? Design? Implementation? Does it need to be an online switch or ca...
Sage Weil
09:13 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
What all does this encompass? Design? Implementation? Does it need to be an online switch or can it be an offline job? Greg Farnum
11:30 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Greg Farnum wrote:
> Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect ...
Sage Weil
09:12 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect so, but the presence o... Greg Farnum
11:23 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
Josh Durgin
10:32 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734... Tamilarasi muthamizhan
11:23 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Greg Farnum wrote:
> Do we have any kind of design for this? We've talked about it some and it's conceptually simple...
Sage Weil
09:08 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Do we have any kind of design for this? We've talked about it some and it's conceptually simple, but splitting up the... Greg Farnum
11:15 AM CephFS Feature #626 (In Progress): qa: add IOR, rompio, or other parallel workloads suite
Yeah, that's what slang's working on to enable this. Assigning this to him. Sage Weil
08:57 AM CephFS Feature #626: qa: add IOR, rompio, or other parallel workloads suite
SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sur... Greg Farnum
11:14 AM Bug #3722: osd: indefinitely hung request on stable cluster
the trigger is a brief osd reset due to an intermittent network outage. no actual ceph-osd daemons restart.
<pr...
Sage Weil
09:39 AM Bug #3722 (Need More Info): osd: indefinitely hung request on stable cluster
Sage Weil
08:36 AM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
0.48.2argonaut, rbd workload.
occasional requests are blocked indefinitely.
*may* be osd down/up cycles (due to...
Sage Weil
11:13 AM CephFS Feature #3621 (Resolved): qa: add knfsd reexport tests to qa suite
Sage Weil
10:53 AM Bug #3723: ceph osd down command reports incorrectly
similarly for "ceph osd in" command as well
ubuntu@burnupi06:/etc/ceph$ sudo ceph osd in 2 -k /etc/ceph/ceph.key...
Tamilarasi muthamizhan
09:33 AM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
issuing the command: "sudo ceph osd down 2" reports osd.2 is already down but sudo ceph osd stat reports all are up.
...
Ken Franklin
10:21 AM Bug #3698 (In Progress): filestore: ENOENT on clone
Sage Weil
09:43 AM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
commit:4ae4dce5c5bb547c1ff54d07c8b70d287490cae9 Sage Weil
09:43 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
Oh, those are librados specific numbers, aren't they. So this bug is to create and expose a libceph version, then. Wh... Greg Farnum
09:35 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
In libcephfs there is a call to get Ceph version (yes, just expose this). But, I recall Sage mentioning that it might... Noah Watkins
09:19 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
This is just exposing the librados version() function to Java, right? Greg Farnum
09:41 AM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
The only radosgw-admin API calls currently are *get usage* and *trim usage* The docs at
http://ceph.com/doc...
caleb miles
09:41 AM CephFS Cleanup #660: mds: use helpers in mknod, mkdir, openc paths
What kind of helpers are you talking about with this? inode fetchers and lock grabbers? In a quick scan over handle_c... Greg Farnum
09:36 AM CephFS Feature #603: mds: repair directory hierarchy
This is part of #82 fsck, right? Do we have a more detailed algorithm anywhere? Greg Farnum
05:02 AM Revision 39a734fb (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
04:17 AM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> Nat, you should be able to install either of libtcmalloc-minimal or libgoogle-perftools — are...
Nat Makarevitch
03:40 AM Revision c63c6646 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
03:00 AM Revision acfa0c9a (ceph): mds: optimize C_MDC_RetryOpenRemoteIno
When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish
context for discovering remote inode. When it...
Yan, Zheng
02:45 AM Revision b03eab22 (ceph): mds: forbid creating file in deleted directory
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 59953257 (ceph): mds: keep dentry lock in sync state as much as possible
Unlike locks of other types, dentry lock in unreadable state can block path
traverse, so it should be in sync state a...
Yan, Zheng
02:45 AM Revision f9280cb6 (ceph): mds: fix replica state for LOCK_MIX_LOCK
LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state
should be LOCK_MIX.
Signed-off-by: Yan, ...
Yan, Zheng
02:45 AM Revision 248e4ab8 (ceph): mds: fix cap mask for ifile lock
ifile lock has 8 cap bits, so its cap mask should be 0xff
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Yan, Zheng
02:45 AM Revision 420f3355 (ceph): mds: rdlock prepended dest trace when handling rename
rdlock prepended dest trace to prevent them from being xlocked by
someone else.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision ea2fd127 (ceph): mds: check null context in CDir::fetch()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 3705c7ca (ceph): mds: drop locks when opening remote dentry
Opening remote dentry while holding locks may cause dead lock. For example,
'discover' is blocked by a xlocked dentry...
Yan, Zheng
02:45 AM Revision ca4dc4db (ceph): mds: check if stray dentry is needed
The necessity of stray dentry can change before the request acquires
all locks.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision acbe6d97 (ceph): mds: don't issue caps while inode is exporting caps
If issue caps while inode is exporting caps, the client will drop the
caps soon when it receives the CAP_OP_EXPORT me...
Yan, Zheng
02:45 AM Revision d379ac8e (ceph): mds: disable concurrent remote locking
Current code allows multiple MDRequests to concurrently acquire a
remote lock. But a lock ACK message wakes all reque...
Yan, Zheng
01:15 AM Revision 28d59d37 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
12:23 AM Revision 49416619 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
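The difference between signalling one waiter and broadcasting to all of them, as described in the entry above, can be sketched with Python's `threading.Condition`. This is an analogy under stated assumptions, not Ceph's C++ logging code: `wake_waiters` and the shared `state` dictionary are hypothetical names used for illustration.

```python
import threading


def wake_waiters(num_waiters: int) -> int:
    """Start num_waiters threads that block on one condition, then wake them all.

    Returns the number of threads that woke up and finished.
    """
    cond = threading.Condition()
    state = {"ready": False, "woken": 0}

    def waiter() -> None:
        with cond:
            # wait_for re-checks the predicate, so a thread that arrives
            # after the broadcast still sees ready=True and does not block.
            cond.wait_for(lambda: state["ready"])
            state["woken"] += 1

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()

    with cond:
        state["ready"] = True
        # notify_all() is the broadcast: every thread waiting on the shared
        # condition is woken. notify() would wake only one of them and leave
        # the others blocked until some later signal.
        cond.notify_all()

    for t in threads:
        t.join()
    return state["woken"]
```

With a single condition shared by several kinds of waiters, a broadcast means no waiter depends on being the one thread that a single signal happens to pick.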
12:13 AM Revision f1e0305f (ceph): doc: Removed the --without-tcmalloc flag until further advised.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:07 AM Revision 19df2086 (ceph): Merge pull request #30 from rca/master
Minor clarification in docs. Sage Weil

01/03/2013

11:04 PM Revision 5ce47c2a (ceph): ssh_keys.py: pull the keys out of targets entry
rather than the hosts known hosts file.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@i...
Joe Buck
10:51 PM Revision 88af7d18 (ceph): doc: Added defaults for PGs, links to recommended settings, and updated...
Fixes: #3555
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
10:32 PM Revision b8f061dc (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
10:18 PM Revision 4ae4dce5 (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
09:30 PM Revision 73bc8ffc (ceph): doc: Added comments on --without-tcmalloc option when building Ceph.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:30 PM Revision 37b57cdf (ceph): Update doc/rados/configuration/filesystem-recommendations.rst
Clarified when it's necessary to use the setting:
filestore xattr use omap = true
rca
09:29 PM Revision 43ef6772 (ceph): doc: Added some packages to the copyable line.
Fixes: #3686
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
09:28 PM Revision 333ae82c (ceph): doc: Fixed syntax error.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:57 PM Revision aaa03bbc (ceph): qa: Add knfsd reexport suite
Feature http://tracker.newdream.net/issues/3621
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman
08:55 PM Revision 67968d11 (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
08:54 PM Revision 34266e6b (ceph): osd: let pgs process map advances before booting
The OSD deliberate consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
Sage Weil
08:53 PM Revision 4034f6c8 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
08:53 PM Revision 7e94f6f1 (ceph): Merge remote-tracking branch 'gh/wip-3714-b' into next
Signed-off-by: Samuel Just <sam.just@inktank.com> Sage Weil
08:44 PM Revision 224a33bb (ceph): qa/workunit: Add dbench-short.sh for nfs suite
A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed...
David Zafman
08:13 PM Documentation #3709 (In Progress): crush-map.rst: claims 'types' are default, not true (must be s...
John Wilkins
02:32 PM Documentation #3709: crush-map.rst: claims 'types' are default, not true (must be specified); spe...
These are "defaults" in the sense that they're generated as part of the default OSD Map. Apparently that needs to be ... Greg Farnum
07:57 PM Documentation #3707 (In Progress): crush-map.rst: syntax error in example
John Wilkins
05:54 PM Bug #3702: OSD SIGABRT during startup
Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else? Sage Weil
05:45 PM Bug #3721: filestore: op_seq written in wrong order on non-btrfs
Sage Weil
04:02 PM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
see wip-fsync Sage Weil
05:23 PM Revision f8bb4814 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
05:14 PM Revision eee795c0 (ceph): rbd_xfstests.yaml: drop test 186
Stop running test 186. It keeps failing in nightly runs, unable
to unmount the scratch file system during setup. As...
Alex Elder
04:47 PM rgw Documentation #2993 (Resolved): doc: write quick RGW guide (if feasible)
John Wilkins
04:45 PM devops Feature #2884: doc: osd hotplugging
I believe the hotplug event was added, but will confirm. John Wilkins
04:43 PM devops Documentation #2974: doc: update chef docs for mon key distribution
I believe this is done. Will verify. John Wilkins
04:13 PM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> John, can you remove that --without-tcmalloc bit until we hear more?
>
> Nat, you should be ...
John Wilkins
02:48 PM devops Documentation #3686 (In Progress): install prerequisites (Debian)
John, can you remove that --without-tcmalloc bit until we hear more?
Nat, you should be able to install either of ...
Greg Farnum
02:45 PM devops Documentation #3686: install prerequisites (Debian)
Eek. We really, really want people to be using tcmalloc (memory behavior without it is astonishingly atrocious). I kn... Greg Farnum
01:31 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
Added packages to the copyable lines. Modified the build page to include --without-tcmalloc. John Wilkins
03:50 PM Bug #3698: filestore: ENOENT on clone
Ok. The recovery_qos stuff can allow a client op to reorder past a push. This is a problem since the push might be ... Samuel Just
07:53 AM Bug #3698: filestore: ENOENT on clone
another instance with logs: ubuntu@teuthology:/a/sage-a2/33879 Sage Weil
02:52 PM Documentation #3555 (Resolved): {page-num} in ceph osd pool create is not optional
Updated the document to add "required," the default values, a link to calculating PG values, clarification about PGP,... John Wilkins
02:49 PM Bug #3633: mon: clock drift errors not reported by ceph status
The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based enti... Greg Farnum
10:12 AM Bug #3633: mon: clock drift errors not reported by ceph status
The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might ... Joao Eduardo Luis
08:46 AM Bug #3633: mon: clock drift errors not reported by ceph status
Reading the patch, it looks like only the clocks of the mons are checked. So the clocks of the osds are not important to ce... Corin Langosch
02:34 PM Bug #3720: Ceph Reporting Negative Number of Degraded objects
Per Josh D's suggestion, I set the tunables and it resolved the issue.
# ceph osd getcrushmap -o /tmp/crush
# cru...
Mike Dawson
01:02 PM Bug #3720 (Duplicate): Ceph Reporting Negative Number of Degraded objects
Changed the replication of two pools from 2x to 3x. Cluster rebalanced to nearly HEALTH_OK but got stuck at:
HEALT...
Mike Dawson
02:32 PM rbd Bug #3697: rbd copy.sh test failing in nightly
When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te... Dan Mick
01:59 PM CephFS Bug #3597: ceph-fuse: denying root access
I believe that we can reproduce this error. We are running Ubuntu 12.04 LTS Server on both the client and on the Cep... Graham Hemingway
12:56 PM CephFS Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
logs: ubuntu@teuthology:/a/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33621... Tamilarasi muthamizhan
12:53 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
commit:7e94f6f1a7b7a865433edacd6a521f6ea1170eac Sage Weil
10:28 AM Bug #3714 (Fix Under Review): osd: new peering code does not consume osdmaps prior to booting
Sage Weil
12:48 PM CephFS Bug #3718 (Rejected): multi-client dbench gets stuck over NFS exported cephfs
When running qa/workunit dbench.sh the dbench 1 passes, but the dbench 10 gets hung up.
We should check this with ...
David Zafman
12:28 PM CephFS Feature #3621 (In Progress): qa: add knfsd reexport tests to qa suite
David Zafman
09:49 AM RADOS Feature #3717 (New): osd: Make Rebalancing Smarter
From Corin Langosch - During recovery/rebalancing it can happen that an osd receives lots of new data before data tha... Ian Colle
09:45 AM Bug #3716: recovery should take osd usage into account
1. My cluster already uses the tuned crushmap "crushtool -i /tmp/crush --set-choose-local-tries 0 --set-choose-local-... Corin Langosch
09:36 AM Bug #3716 (Closed): recovery should take osd usage into account
#1: this is a matter of adjusting the crush tunables. see http://ceph.com/docs/master/rados/operations/crush-map/?hig... Sage Weil
09:08 AM Bug #3716 (Closed): recovery should take osd usage into account
Using argonaut 0.48.2. Yesterday one osd crashed (disk io error) and recovery started as expected. All osds had an us... Corin Langosch
09:44 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
Joao,
thanks for the update.
Since mine came about due to a testing environment build on DHCP, I did not have the ...
Anonymous
09:32 AM CephFS Bug #3681: kclient fsx fails nightly
It's most likely all the same bug, but fsx fails in different ways each time (always because of a truncate down). The... Sam Lang
09:27 AM CephFS Feature #3543: mds: new encoding
right. about 80% complete, see wip-mds-encoding. Sage Weil
09:22 AM CephFS Feature #3543: mds: new encoding
What is this task? Switching to use our versioned encoding scheme? Greg Farnum
09:17 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,...
Alex Elder
06:39 AM Revision a32d6c5d (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
06:20 AM Revision 0bfad8ef (ceph): osd: let pgs process map advances before booting
The OSD deliberate consumes and processes most OSDMaps from while it
was down before it marks itself up, as this is c...
Sage Weil
06:04 AM Revision 5fc94e89 (ceph): osd: drop oldest_last_clean from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:04 AM Revision 67f7ee67 (ceph): osd: drop unused variables from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:09 AM Revision a14a36ed (ceph): OSDMap: fix modifed -> modified typo
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:44 AM Revision 9ca69e73 (ceph): ceph: malloc check =3 means we hear on stderr too
Sage Weil
03:58 AM Revision 2141454e (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
02:13 AM Revision 6b5a89d2 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:01 AM Revision 43cba617 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil

01/02/2013

11:59 PM Revision 29ff87a5 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
11:58 PM Revision 64d2760a (ceph): doc: Added a memory profiling section. Ported from the wiki.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:57 PM Revision 5066abf1 (ceph): doc: Added memory profiling to the index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:08 PM Revision 0e9a0cd7 (ceph): qa/workunit: Update pjd script to use new tarball
The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-of...
Sam Lang
11:07 PM Bug #3715: Crash during 0.55 -> 0.56 upgrade
is someone sending an MOSDOp that has no ops? init_op_flags() is called before can_*(), so this sounds like an empty... Sage Weil
10:05 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
I started upgrading my 0.55.1 cluster to 0.56 and at one point in the middle of the upgrade, all 0.55.1 OSDs started ... Faidon Liambotis
10:38 PM Revision d8940d15 (ceph): fuse: Fix cleanup code path on init failure
With the changes from 856f32ab, the cfuse.init call returns
a _positive_ errno, which was getting ignored. Also, if ...
Sam Lang
10:15 PM Revision c4370ff0 (ceph): librbd: establish watch before reading header
This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou...
Josh Durgin
09:56 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ... Dan Mick
09:39 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
Sage Weil
09:42 PM Revision 93656013 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 483c6f76adf960017614a8641c4dcdbd7902ce33)
Sage Weil
09:42 PM Revision be0473bb (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c461e7fc1e34fdddd8ff8833693d067451df906b)
Sage Weil
09:42 PM Revision de619327 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
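The batching idea in the entry above (cap each submission at IOV_MAX-1 segments and attach the seq only to the last one) can be sketched as below. This is a simplified model, not the FileJournal implementation: `split_submission` is a hypothetical helper, and the default of 1024 for IOV_MAX is an assumption based on a common Linux value rather than anything stated in the commit.

```python
from typing import List, Optional, Sequence, Tuple

# Assumption: 1024 is a common IOV_MAX on Linux; real code should ask the platform.
ASSUMED_IOV_MAX = 1024


def split_submission(segments: Sequence[bytes], seq: int,
                     iov_max: int = ASSUMED_IOV_MAX
                     ) -> List[Tuple[List[bytes], Optional[int]]]:
    """Split segments into batches of at most iov_max - 1 entries.

    Each result item is (batch, marker). The marker is `seq` for the final
    batch and None for every earlier batch, so the sequence number is only
    associated with the submission that finishes the write.
    """
    limit = iov_max - 1
    if limit < 1:
        raise ValueError("iov_max must be at least 2")
    batches = [list(segments[i:i + limit])
               for i in range(0, len(segments), limit)]
    last = len(batches) - 1
    return [(batch, seq if idx == last else None)
            for idx, batch in enumerate(batches)]
```

For example, ten segments with `iov_max=5` yield batches of 4, 4, and 2 segments, and only the third batch carries the seq.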
09:42 PM Revision ded454c6 (ceph): os/FileJournal: logger is optional
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 076b418c7f03c5c62f811fdc566e4e2b776389b7)
Sage Weil
09:42 PM Revision 9a1cf518 (ceph): Merge branch 'wip-journal-aio' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
Backport: bobtail
Sage Weil
09:39 PM Revision dda7b651 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
09:39 PM Revision c461e7fc (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:39 PM Revision 483c6f76 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:34 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
Previously when we handled the old osdmaps catching up (pre-MOSDBoot) we'd do advance_map and the pgs would update th... Sage Weil
08:32 PM Revision e0858fa8 (ceph): Revert "librbd: ensure header is up to date after initial read"
Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th...
Josh Durgin
08:31 PM Revision 06310994 (ceph): ceph: enable malloc debugging for ceph-osd
Sage Weil
07:49 PM Revision 3686371e (ceph): rados: add test_filejournal
This writes to /tmp by default; should be ok on plana, since it's / and not
tmpfs.
Sage Weil
07:24 PM Revision 82297706 (ceph): doc: Minor edits.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:15 PM Revision d3b9803e (ceph): doc: Fixed typo, clarified usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
05:23 PM rbd Bug #3685: xfs test 186 fails in the nightlies
It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u...
Alex Elder
02:34 PM rbd Bug #3685: xfs test 186 fails in the nightlies
OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera...
Alex Elder
12:06 PM rbd Bug #3685: xfs test 186 fails in the nightlies
Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co...
Alex Elder
11:56 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x...
Alex Elder
05:15 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
reproduced this on burnupi21. Tamilarasi muthamizhan
05:00 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
with glibc malloc and debug enabled:... Sage Weil
08:57 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another one with full osd logs:... Sage Weil
04:13 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
This has been ported. I haven't added a valgrind use case yet. John Wilkins
01:20 PM Documentation #3687 (In Progress): Documentation needs a "memory profiling" section
John Wilkins
03:51 PM Feature #3713 (Rejected): ceph osd tree should show disk usage
As ceph seems to already monitor the disk usage of each osd, it'd be great to have it displayed in "ceph osd tree". Corin Langosch
03:08 PM rbd Bug #3619: librbd: read_iterate sparse behavior broken
Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed.
Dan Mick
02:11 PM devops Feature #3712 (New): Ceph Commands should provide appropriate responses, when Ceph Service is not...
When ceph service is not running, running other ceph command should give a response that makes sense instead of just ... Anonymous
02:02 PM Cleanup #2078: ceph tool: only output response data to stdout
I think we need to phase out all of the first-line nonsense. Sage Weil
01:48 PM Cleanup #2078: ceph tool: only output response data to stdout
This also affects things like ceph pg dump --format=json. You can't pipe it to a pretty printer without ignoring the ... Josh Durgin
01:52 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
The implication is that 'N' is "the number of buckets of type 'type' available", but Sam believes it must really be "... Dan Mick
01:40 PM Bug #3684 (Resolved): filejournal: aio vector size is not limited
Sage Weil
01:34 PM rbd Feature #3456 (Closed): make exit code of ceph status commands status dependent
Josh Durgin
01:29 PM rbd Documentation #2992 (Resolved): doc: RBD parent/child snapshot
Josh Durgin
01:26 PM rbd Documentation #2992: doc: RBD parent/child snapshot
This should be resolved. John Wilkins
01:24 PM Documentation #3710 (Closed): crush-map.rst: talks about 'step choose' but does not document it
Dan Mick
01:23 PM Documentation #3411 (Resolved): doc: add introductory detail to the main doc page (index.rst)
John Wilkins
01:21 PM rgw Feature #3207 (In Progress): qa: swift functional tests in nightly
Sage Weil
01:21 PM rgw Feature #3366 (In Progress): rgw: dr: define management api
Sage Weil
01:18 PM Documentation #2980 (Resolved): doc: write upgrading Ceph version
This was checked in and also reviewed by Josh and Sage. John Wilkins
01:16 PM Documentation #3322 (Resolved): doc: Explain multi-tenant CephFS
This has been added to the end of the Ceph Configuration file section. It may benefit from review, as I believe the... John Wilkins
01:12 PM Feature #647 (Duplicate): mon: refactor paxos interaction
Sage Weil
01:11 PM Feature #183 (Resolved): qa: xfstests workunit
Sage Weil
01:10 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
crush-map.rst claims that the bucket type defaults are as they appear in the table, but they're
not defaults; they must b...
Dan Mick
01:09 PM Feature #3376 (Duplicate): use external leveldb package for default builds
Sage Weil
01:08 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
example includes:
item ceph-osd-server-1 2.00
this must have 'weight' explicitly in the line:
...
Dan Mick
01:03 PM Feature #3425 (Resolved): mon workload generator
Sage Weil
12:39 PM Bug #3702: OSD SIGABRT during startup
Attempting to start osd.131 (which was down due to the above noted problems) today resulted in quorum loss. Essential... Justin Lott
12:03 PM rgw Bug #3706 (Resolved): rgw functional test testSlashInName failed in nightly
logs: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33224... Tamilarasi muthamizhan
11:25 AM Revision a79493da (ceph): mds: skip frozen inode when assimilating dirty inodes' rstat
CDir::assimilate_dirty_rstat_inodes() may encounter frozen inodes that
are being renamed. Skip these frozen inodes be...
Yan, Zheng
11:25 AM Revision 2f96b472 (ceph): mds: fix anchor table commit race
Anchor table updates for a given inode are fully serialized on the client side.
But due to network latency, two commit req...
Yan, Zheng
11:25 AM Revision 7e04504d (ceph): mds: fix on-going two phase commits tracking
The slaves for two phase commit should be mdr->more()->witnessed
instead of mdr->more()->slaves. mdr->more()->slaves...
Yan, Zheng
11:25 AM Revision b3796f46 (ceph): mds: introduce DROPLOCKS slave request
In some rare cases, Locker::acquire_locks() drops all acquired locks
in order to auth pin new objects. But Locker::dro...
Yan, Zheng
11:25 AM Revision b2d5005a (ceph): mds: fix lock state transition check
Locker::simple_excl() and Locker::scatter_mix() are missing the is_rdlocked
check; Locker::file_excl() is missing the is_rdlocked check an...
Yan, Zheng
11:25 AM Revision fe5936b1 (ceph): mds: remove unnecessary is_xlocked check
Locker::foo_eval() is always called for stable locks, so no need to
check if the lock is xlocked.
Signed-off-by: Yan...
Yan, Zheng
11:25 AM Revision f5ea5c36 (ceph): mds: don't defer processing caps if inode is auth pinned
We should not defer processing caps if the inode is auth pinned by MDRequest,
because the MDRequest may change lock s...
Yan, Zheng
11:25 AM Revision 5e8642a8 (ceph): mds: call maybe_eval_stray after removing a replica dentry
MDCache::handle_cache_expire() processes dentries after inodes, so the
MDCache::maybe_eval_stray() in MDCache::inode_...
Yan, Zheng
11:25 AM Revision 84224743 (ceph): mds: fix rename inode exporter check
Using "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is
the inode exporter of a rename operation is not reli...
Yan, Zheng
11:25 AM Revision 26279574 (ceph): mds: don't trigger assertion when discover races with rename
Discover reply that adds replica dentry and inode can race with rename
if slave request for rename sends discover and...
Yan, Zheng
11:25 AM Revision 5ae715be (ceph): mds: xlock stray dentry when handling rename or unlink
This prevents MDS from reintegrating stray before rename/unlink finishes
Signed-off-by: Yan, Zheng <zheng.z.yan@inte...
Yan, Zheng
11:25 AM Revision 7a520168 (ceph): mds: don't journal null dentry for overwritten remote linkage
Server::_rename_prepare() adds null dest dentry to the EMetaBlob if
the rename operation overwrites a remote linkage....
Yan, Zheng
11:25 AM Revision fcb9f988 (ceph): mds: use null dentry to find old parent of renamed directory
When replaying a directory rename operation, the MDS needs to find the old parent of
the renamed directory to adjust auth sub...
Yan, Zheng
11:25 AM Revision d9d71473 (ceph): mds: don't trim ambiguous imports in MDCache::trim_non_auth_subtree
Trimming ambiguous imports in MDCache::trim_non_auth_subtree() confuses
MDCache::disambiguate_imports() and causes in...
Yan, Zheng
11:25 AM Revision 3b13d3dc (ceph): mds: only export directory fragments in stray to their auth MDS
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
11:25 AM Revision 61da9b18 (ceph): mds: mark rename inode as ambiguous auth on all involved MDS
When handling cross authority rename, the master first sends OP_RENAMEPREP
slave requests to witness MDS, then sends ...
Yan, Zheng
11:09 AM Linux kernel client Bug #2764 (Closed): xfstest hang; osd socket closed messages
The fix for the warning messages is:
28362986f8743124b3a0fda20a8ed3e80309cce1
libceph: report connection ...
Alex Elder
10:54 AM Bug #3698: filestore: ENOENT on clone
recent log: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33152 Tamilarasi muthamizhan
09:45 AM CephFS Bug #3700: mds: FAILED assert(!item_session_list.is_on_list())
fixed by revert of bad fix, see commit:6711a4c4038dbdf843f9dfe42c7809c5c37ae534 Sage Weil
09:37 AM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
Sage Weil
09:41 AM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ... Sage Weil
09:09 AM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
... Sage Weil
08:28 AM Feature #3704 (Resolved): mon: add min log level to send cluster msgs to syslog
e.g., WARN and above only, but not INFO. This is for the mon/LogMonitor.cc submission path, not log/Log.cc (for debu... Sage Weil
05:55 AM Revision e10267b5 (ceph): mds: fix Locker::simple_eval()
Locker::simple_eval() checks if the loner wants CEPH_CAP_GEXCL to
decide if it should change the lock to EXCL state, ...
Yan, Zheng
05:54 AM Revision 7e23321b (ceph): mds: don't renew revoking lease
The MDS may receive a lease renew request while the lease is being revoked;
just ignore the renew request.
Signed-off-by: Yan...
Yan, Zheng

01/01/2013

06:36 PM Revision eb02eaed (ceph): Merge remote-tracking branch 'gh/wip-bobtail-docs'
Sage Weil
05:35 AM Revision f1196c7e (ceph): Merge branch 'master' of https://github.com/ceph/ceph
Gary Lowell
05:31 AM Revision 5dd6b199 (ceph): Merge branch 'next'
Gary Lowell
02:37 AM Revision 8f77ec7d (ceph): Merge branch 'next'
Sage Weil
02:36 AM Revision 94a5dd6b (ceph): Merge remote-tracking branch 'gh/wip-3675'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
01:10 AM Revision 1a32f0a0 (ceph): v0.56
Gary Lowell

12/31/2012

11:28 PM Revision 49ebe1ee (ceph): client: fix _create created ino condition
We get 8 bytes back for the created ino.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
11:26 PM Revision a10054bc (ceph): libcephfs: choose more unique nonce
We were using a per-process counter combined with the pid. A short
running process can easily loop through and reuse...
Sage Weil
11:26 PM Revision e2fef38d (ceph): client: fix _create
make_request() clears out req->reply and frees req; we can't inspect
it here.
Instead, just assume that extra_bl is t...
Sage Weil
06:35 PM rbd Bug #3697: rbd copy.sh test failing in nightly
FWIW I ran this in a loop and reproduced it after 7 iterations (well, a slightly different error actually, when it re... Sage Weil
05:42 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
Dan Mick
05:08 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Hm, doesn't reproduce on local vstart cluster. Pondering possible failure modes. Dan Mick
04:23 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Trying to reproduce now Dan Mick
06:17 PM Revision 7d70dd11 (ceph): Revert "kernel: move fsync test to marginal suite until it works"
This reverts commit acb91f7d0d4882d7393a99b142aec8687b9b4bb7.
Now fixed in master branch, commit b4d3bd06d4083d78075...
Sage Weil
06:16 PM Revision b4d3bd06 (ceph): Merge remote-tracking branch 'gh/wip-3625'
Sage Weil
05:38 PM rbd Bug #3703: osd: crash while encrypting
This is an osd crash.... Josh Durgin
02:55 PM rbd Bug #3703 (Can't reproduce): osd: crash while encrypting
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_19:00:03-regression-next-testing-basic/32113... Tamilarasi muthamizhan
04:11 PM Revision ed586c1b (ceph): task: ceph: don't wait for 'healthy' if 'wait-for-healthy' is false.
This new config option obviously defaults to 'true' in order to not only
maintain compatibility, but because it makes...
Joao Eduardo Luis
02:58 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
bringing back the marked out osd.1 in on burnupi06 while running the io hit the following,
2012-12-31 14:26:26.6...
Tamilarasi muthamizhan
02:30 PM Messengers Feature #3509 (Resolved): msgr: delay injection
Sage Weil
10:18 AM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
Sage Weil
09:06 AM Bug #3702 (Can't reproduce): OSD SIGABRT during startup
After conversion of OSDs from btrfs to XFS, some OSDs SIGABRT during their first startup on XFS:
2012-12-29 05:0...
Justin Lott
08:55 AM Bug #3683: mon: leak of MMonPaxos
recent logs: ubuntu@teuthology:/a/teuthology-2012-12-29_19:00:03-regression-next-testing-basic/31414 Tamilarasi muthamizhan
08:37 AM rbd Bug #3701 (Can't reproduce): qemu xfstest hung BUG: unable to handle kernel NULL pointer derefere...
logs: ubuntu@teuthology:/a/teuthology-2012-12-30_03:00:06-regression-master-testing-gcov/31929... Tamilarasi muthamizhan

12/30/2012

11:29 PM Revision ec5288a3 (ceph): Merge remote-tracking branch 'gh/wip-rbd-unprotect' into next
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
07:18 PM Revision 82cec48e (ceph): doc: add-or-rm-mons.rst: Add 'Changing Monitor's IPs' section
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
Joao Eduardo Luis
07:17 PM Revision 379f0792 (ceph): doc: add-or-rm-mons.rst: Clarify what the monitor name/id is.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
06:08 PM CephFS Fix #3630: mds: broken closed connection cleanup
... Sage Weil
06:06 PM CephFS Fix #3630: mds: broken closed connection cleanup
The con re-use looks like this:
- client connects
- mds ms_verify_authorizer creates a new session
- msgr see ex...
Sage Weil
06:04 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
see #3630..let's fix this properly. Sage Weil
08:06 AM Revision 85e9d4f0 (ceph): cls_rbd: get_children does not need write permission
This prevented a read-only user from being able to unprotect a
snapshot without write permission on all pools. This w...
Josh Durgin
08:06 AM Revision 91e941ae (ceph): OSD: remove RD flag from CALL ops
20496b8d2b2c3779a771695c6f778abbdb66d92a forgot to do this. Without
this change, all class methods required regular r...
Josh Durgin
08:06 AM Revision c67c789d (ceph): librbd: add {rbd_}open_read_only()
Since 58890cfad5f7bee933baa599a68e6c65993379d4, regular {rbd_}open()
would fail with -EPERM if the user did not have ...
Josh Durgin
08:06 AM Revision 47bf5195 (ceph): librbd: open parent as read-only during clone
We never write to the parent, and don't need to watch it during this process.
Signed-off-by: Josh Durgin <josh.durgi...
Josh Durgin
08:06 AM Revision 958addc0 (ceph): rbd: open (source) image as read-only
This allows users without write access to copy, export and list
information about an image.
Signed-off-by: Josh Durg...
Josh Durgin
08:06 AM Revision d0a14d11 (ceph): librbd: fix race between unprotect and clone
Clone needs to actually re-read the header to make sure the image is
still protected before returning. Additionally, ...
Josh Durgin
08:06 AM Revision 8bbb4a36 (ceph): doc: fix rbd permissions for unprotect
Unprotect examines all pools, so use blanket x before 0.54. After
that, use class-read restricted by object_prefix to...
Josh Durgin
05:00 AM Revision 7b0dbeb0 (ceph): doc/install/upgrading: edits to upgrade document
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:00 AM Revision 4aa6af76 (ceph): doc/release-notes: link to upgrade doc
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/29/2012

08:04 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
Possibly the same bug in teuthology:/a/joshd-3631-12-28-12_08.55/30739... Josh Durgin
07:45 PM Bug #3698: filestore: ENOENT on clone
Sage Weil
07:44 PM Bug #3698: filestore: ENOENT on clone
Can you add 'debug osd = 20' to the job you're running?
Sage Weil
04:30 PM Bug #3698: filestore: ENOENT on clone
Happened again in teuthology:/a/joshd-3631-12-28-12_08.53/30681 Josh Durgin
09:50 AM Bug #3698 (Resolved): filestore: ENOENT on clone
Full logs in teuthology:/a/joshd-3631-12-27-12_22.21/29826... Josh Durgin
04:38 PM Revision 6711a4c4 (ceph): Revert "mds: replace closed sessions on connect"
This reverts commit 8b599083705c2495810c00f9f5fd5bb8ace7f32e.
This fix is not correct. See #3696.
Sage Weil
04:28 PM Revision bb4a2c55 (ceph): rgw: enable logging in ceph.conf
Sage Weil
02:39 PM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
logs: ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30039... Tamilarasi muthamizhan
02:32 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
ubuntu@teuthology:/a/teuthology-2012-12-29_03:00:03-regression-master-testing-gcov/30036 Tamilarasi muthamizhan
09:43 AM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
reverted the broken fix, reproducing the original problem again. Sage Weil
02:19 PM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
cluster: burnupi06 [running osd.1 on v0.55.1] , burnupi07[running osd.3, osd.4, mon.b on argonaut], burnupi08[running... Tamilarasi muthamizhan
08:37 AM rbd Bug #3697 (Duplicate): rbd copy.sh test failing in nightly
... Sage Weil
01:21 AM Revision a5d692a7 (ceph): msgr: inject delays at inconvenient times
Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-...
Sage Weil
01:21 AM Revision 82f8bcdd (ceph): msg/Pipe: use state_closed atomic_t for _lookup_pipe
We shouldn't look at Pipe::state in SimpleMessenger::_lookup_pipe() without
holding pipe_lock. Instead, use an atomi...
Sage Weil
01:21 AM Revision 7bf0b085 (ceph): msgr: atomically queue first message with connect_rank
Atomically queue the first message on the new pipe, without dropping
and retaking pipe_lock.
Signed-off-by: Sage Wei...
Sage Weil
01:21 AM Revision 6339c5d4 (ceph): msgr: don't queue message on closed pipe
If we have a con that refs a pipe but it is closed, don't use it. If
the ref is still there, it is only because we a...
Sage Weil
01:21 AM Revision e99b4a30 (ceph): msgr: fix race on Pipe removal from hash
When a pipe is faulting and shutting down, we have to drop pipe_lock to
take msgr lock and then remove the entry. Th...
Sage Weil
01:19 AM Revision 83c8025d (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:19 AM Revision c2a75253 (ceph): test: mon: workloadgen: debug when message fsid != monmap fsid
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
01:19 AM Revision b30ab517 (ceph): test: mon: workloadgen: assert if monmap's fsid is zero after authenticate
Fixes: #3629
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
01:19 AM Revision 35836847 (ceph): doc: update Hadoop documentation
Updates configuration option names, and adds object.size,
localize.reads, and root.dir control options.
Signed-off-b...
Noah Watkins
01:12 AM Revision 942c7145 (ceph): init-ceph: ok, 8K files
16K might be a bit many.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
01:10 AM Revision 0a5d6d87 (ceph): msg/Pipe: remove broken cephx signing requirement check
Remove the special-case check, which does not inform the peer what
protocol features are missing. It also enforces t...
Sage Weil
12:00 AM Revision 65b787ea (ceph): msg/Pipe: include remote socket addr in debug output
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/28/2012

11:55 PM Revision 9e5e08f8 (ceph): doc: Added a new upgrade document.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:55 PM Revision 1553267e (ceph): doc: Minor edit.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:54 PM Revision 02b8bcd0 (ceph): doc: Added upgrade link to index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:44 PM Revision 076b418c (ceph): os/FileJournal: logger is optional
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:14 PM Revision 3debf0cf (ceph): client: fix fh leak in non-create case
We may take the O_CREAT path and get an fh from _create, but created can
still be false. In that case, skip the fina...
Sage Weil
11:10 PM Revision 7f35e5dd (ceph): client: Make ll_create use _create
This is a fix for bug #3625, where multiple clients race to create a
file, and the loser returns EEXIST instead of a ...
Sam Lang
11:10 PM Revision 67bc849c (ceph): mds: Return created inode in mds reply to create
If multiple clients race to create a file, multiple clients will send a
create request and get back a valid dentry+in...
Sam Lang
11:08 PM Revision 813787af (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
11:03 PM Revision ca34fc4d (ceph): osd: allow RecoveryDone self-transition in RepNotRecovering
In a mixed cluster where some OSDs support the recovery reservations and
some don't, the replica may be new code in R...
Sage Weil
10:15 PM Revision 0f5383f4 (ceph): Merge remote-tracking branch 'origin/wip-gl-docs'
Update release process documentation. Gary Lowell
10:05 PM Revision 1867b818 (ceph): docs: fix typo in release-process doc
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
10:00 PM Linux kernel client Bug #3519: rbd map hang during system startup
yay! Dan Mick
06:08 AM Linux kernel client Bug #3519 (Resolved): rbd map hang during system startup
Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
Alex Elder
09:47 PM Revision 3a8bf3af (ceph): doc/release-notes: document new 'max open files' default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:11 PM CephFS Bug #3696: mds: FAILED assert(session_map.count(s->inst.name) == 0)
Sage Weil
06:42 PM CephFS Bug #3696 (Resolved): mds: FAILED assert(session_map.count(s->inst.name) == 0)
This occurred shortly after startup when trying to reproduce another bug on the master branch:... Josh Durgin
08:34 PM Revision ea13ecc2 (ceph): osd: less noise about inefficient tmap updates
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
08:12 PM Revision 9483a032 (ceph): init-ceph: fix status version check across machines
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
Sage Weil
08:12 PM Revision 8fef9360 (ceph): init-ceph: use SSH in "service ceph status -a" to get version
When running "service ceph status -a", a version number was never
returned for remote hosts, only for the local. Thi...
Travis Rhoden
08:11 PM Revision 672c56b1 (ceph): init-ceph: default to 16K max_open_files
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
07:58 PM Revision 948e7524 (ceph): ceph-fuse: Avoid doing handle cleanup in dtor
The CephFuse::Handle class needs the client
pointer to be valid for finalizing, so don't finalize
in the destructor (...
Sam Lang
07:10 PM Revision ff2d4abb (ceph): ceph-fuse: Pass client handle as userdata
The fuse lowlevel API isn't getting the client
handle when it gets initialized, resulting
in a null pointer for ...
Sam Lang
06:21 PM CephFS Fix #3630: mds: broken closed connection cleanup
Sage Weil
05:57 PM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
ceph version : 0.55.1-329-g01376d4 (01376d44d73189080d207f701fc7e38cf55c738d)
cluster:
burnupi15[running osd.1, ...
Tamilarasi muthamizhan
05:09 PM Bug #3675 (Resolved): osd: hang during initial peering
Sage Weil
04:55 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
the problem was old ceph-osd daemons on other hosts trying to connect. running code that didn't include commit:4d20b... Sage Weil
12:27 PM Bug #3690: osd crashed in FileStore::_do_transaction
made the default fd limit much higher in commit:672c56b18de3b02606e47013edfc2e8b679d8797 Sage Weil
10:39 AM Bug #3690: osd crashed in FileStore::_do_transaction
... Sage Weil
10:32 AM Bug #3690: osd crashed in FileStore::_do_transaction
... Tamilarasi muthamizhan
04:40 PM Bug #3684 (Fix Under Review): filejournal: aio vector size is not limited
wip-journal-aio Sage Weil
04:09 PM Revision acb91f7d (ceph): kernel: move fsync test to marginal suite until it works
Sage Weil
04:08 PM Revision 02e4eeff (ceph): kernel: move fsx to marginal suite until it passes
Sage Weil
04:06 PM Documentation #3694 (Closed): doc: how to use the admin socket interface
A couple pages in the docs mention specific commands, but there's no overall explanation of what it is, and how you c... Josh Durgin
01:56 PM Bug #3691: Lock issue in librados resulting in application hang
This affects small clusters more because a single osd is a larger proportion of the whole cluster. In bobtail, there ... Josh Durgin
01:46 PM Bug #3691: Lock issue in librados resulting in application hang
Well, this is an even worse issue. We are adding new osds (just 8 now), and the cluster has been staying "unhealthy" ... Xiaopong Tran
01:04 PM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
You're calling the synchronous version of write, and the spot where it's 'hung' is just waiting for the response from... Josh Durgin
04:53 AM Bug #3691 (Rejected): Lock issue in librados resulting in application hang
We ran into a nasty lock issue in librados: it's trying to write some data, and hangs there for many seconds unt... Xiaopong Tran
12:46 PM RADOS Bug #3693 (Duplicate): crushtool compile fails with unhelpful message, diagnosis quite difficult
A user tried to create his own crushmap as follows:... Dan Mick
12:18 PM rbd Bug #2689 (In Progress): qemu iozone test hangs
This seems to still be a problem. I'll try to get more information about what's going on. It looks like there's an er... Josh Durgin
12:12 PM devops Documentation #2774: doc: ceph-disk man page
These would be useful. Someone on irc was confused earlier by the undocumented requirement to set --cluster-uuid (or ... Josh Durgin
12:08 PM rbd Bug #3692: OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
Chronology of events (UTC) in the latest example of this happening, in case it's relevant:
15:50:46 mon.b is s...
Justin Lott
12:01 PM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
I've seen this happen twice:
- Reboot a node running a number of OSD's
- Within a short period of time, seemingly...
Justin Lott
11:42 AM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
wip-3689 has a fix; please test! Sage Weil
10:58 AM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28728 Tamilarasi muthamizhan
10:53 AM Bug #3631: osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during librbd_fsx
ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662... Tamilarasi muthamizhan
10:34 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
recent log: ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28713 Tamilarasi muthamizhan
06:10 AM Bug #3657 (Resolved): rbd: crash mapping image
Done, pushed to master, and soon to be included in a pull request
to Linus for 3.8.
Alex Elder
06:08 AM Revision 9967cf24 (ceph): release-notes: rgw logging now off by default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:03 AM Revision 1c3e12a2 (ceph): doc: warn about using caching without QEMU knowing
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
06:02 AM Revision f6ce5dda (ceph): rgw: disable ops and usage logging by default
Most users don't need this, and having it on will just fill their clusters
with objects that will need to be cleaned ...
Sage Weil
04:28 AM Bug #3683 (In Progress): mon: leak of MMonPaxos
Joao Eduardo Luis
04:26 AM Bug #3633: mon: clock drift errors not reported by ceph status
wip-3633 now has a couple of patches that introduce a mechanism to keep track of clock skews on the monitors.
If s...
Joao Eduardo Luis
01:24 AM Revision 64b845f6 (ceph): features is uint64_t
This won't bite us for a while yet (we're on bit 26), but it will soon!
Signed-off-by: Sage Weil <sage@inktank.com>
...
Sage Weil
01:15 AM Revision 2fbe3e17 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:55 AM Revision 856f32ab (ceph): ceph-fuse: Split main into init/main/finalize
With the invalidate callback enabled for fuse, the Client::unmount
call requires the fuse channel and session objects...
Sam Lang
12:39 AM Revision c0fe3815 (ceph): java: remove deprecated libcephfs
Removes ceph_set_default_*
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
12:32 AM Revision 6c7b667b (ceph): init-ceph: fix status version check across machines
The local state isn't propagated into the backtick shell, resulting in
'unknown' for all remote daemons. Avoid backt...
Sage Weil

12/27/2012

11:39 PM Revision 774a54cb (ceph): docs: update release process documentation.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
10:23 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
This looks like a compatibility issue with recovery queueing:... Sage Weil
05:26 PM Bug #3689: osd: bad peering state machine event with mixed v0.52 and next cluster
Log from crashing osd with greater debug level https://dl.dropbox.com/u/5820195/ceph-osd.1.log.gz. Maciej Galkiewicz
05:09 PM Bug #3689 (Resolved): osd: bad peering state machine event with mixed v0.52 and next cluster
Reported by mgalkiewicz in #ceph. https://gist.github.com/raw/4393494/f3ae88406350b74ac6d608b8b75960f85435e85e/gist... Sage Weil
09:40 PM Revision af37cc3a (ceph): Merge remote-tracking branch 'gh/wip-mds'
Sage Weil
09:26 PM Revision 63567392 (ceph): osd: fix recovery assert for pg repair case
In the case of PG repair, this assert is not valid. Disable it for now.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:09 PM Revision 1fa8c83d (ceph): Merge branch 'wip-osd-flags'
Sage Weil
09:07 PM Revision 207e93ab (ceph): Merge remote-tracking branch 'gh/wip-mds-pool'
Reviewed-by: Sam Lang <sam.lang@inktank.com> Sage Weil
08:12 PM Revision 03f6dfa4 (ceph): osd: move rmw_flags to OpRequest, out of MOSDOp
It was very sloppy to put a server-side processing state inside the
message. Move it to the OpRequestRef instead.
...
Sage Weil
08:12 PM Revision f1dfd64f (ceph): messages/MOSDOpReply: remove misleading may_read/may_write
These are OpRequest properties, calculated/enforced at the OSD. They don't
belong in the MOSDOp or MOSDOpReply messa...
Sage Weil
08:12 PM Revision f2306038 (ceph): osd: only calculate OpRequest rmw flags once
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
08:04 PM Linux kernel client Bug #3519: rbd map hang during system startup
Nick reports:
I have some exciting news. After 215 test runs, no hung processes
were detected. I think we may...
Alex Elder
07:58 PM Bug #3657: rbd: crash mapping image
I'm currently testing two patches related to this bug, and
while I haven't pushed them to the testing branch yet I
...
Alex Elder
07:27 PM Revision 998f7194 (ceph): dropping xfs test 186 due to bug: 3685
Signed-off-by: tamil <tamil.muthamizhan@inktank.com> Tamilarasi muthamizhan
07:14 PM Revision 98e7b598 (ceph): docs: remove extra release-process2 file.
This file mostly duplicated the existing release documentation. Differences
have been merged into the primary file.
...
Gary Lowell
07:12 PM Revision 82c71716 (ceph): osd: drop 'osd recovery max active' back to previous default (5)
Having this too large means that queues get too deep on the OSDs during
backfill and latency is very high. In my tes...
Sage Weil
07:11 PM Revision 6f1f03c7 (ceph): journal: reduce journal max queue size
Keep the journal queue size smaller than the filestore queue size.
Keeping this small also means that we can lower t...
Sage Weil
07:09 PM Revision 0d2ad2f2 (ceph): mds: use set to store MDSMap data pools
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:53 PM Revision 80bcaa29 (ceph): rados: add filestore_idempotent test with journal aio = true
Sage Weil
05:55 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
this reproduced once out of ~60 runs on the fsx task.... Sage Weil
05:36 PM Revision 2137d5cd (ceph): mds: wait for client's mdsmap when specifying data pool
The client may have a newer map than we do; make sure we wait for it lest
we inadvertently reply because we think the...
Sage Weil
05:33 PM Bug #3690: osd crashed in FileStore::_do_transaction
leaving the cluster as it is for someone to take a look at it. Tamilarasi muthamizhan
05:33 PM Bug #3690 (Resolved): osd crashed in FileStore::_do_transaction
ceph version: 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
I had a cluster[burnupi15, burnupi19,...
Tamilarasi muthamizhan
05:33 PM Revision 9da6d882 (ceph): doc: document mds config options
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:52 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
ceph version : 0.55.1-360-g6356739 (635673928a6b4dae6d4712cacad81cbac6412dc3)
rbd allows images created with zero ...
Tamilarasi muthamizhan
04:45 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
While debugging what I thought was a Ceph memory leak, I was pointed to
http://ceph.com/deprecated/Memory_Profiling
...
Faidon Liambotis
04:37 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
On http://ceph.com/docs/master/install/build-prerequisites/ , in the "On Debian/Squeeze, execute aptitude install ...... Nat Makarevitch
12:32 PM rbd Bug #3427: krbd: unmap does not remove block device properly
I am going to assume that the racing open is the cause of
the problem reported by Nikola Kotur.
To fix it, I will...
Alex Elder
12:17 PM rbd Bug #3427: krbd: unmap does not remove block device properly
> For RBD, wasn't the use_count something we just added? Would it cover this situation?
No.
The first warning i...
Alex Elder
08:53 AM rbd Bug #3427: krbd: unmap does not remove block device properly
For cephfs, the vfs normally handles that.
For RBD, wasn't the use_count something we *just* added? Would it cove...
Sage Weil
08:37 AM rbd Bug #3427: krbd: unmap does not remove block device properly
I also note, having taken a little closer look at Nikola Kotur's
kernel log that both an open and a close appear to ...
Alex Elder
08:31 AM rbd Bug #3427: krbd: unmap does not remove block device properly
It looks to me like the osd client code has nothing in place
to protect itself from one of its users (ceph client, m...
Alex Elder
12:01 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
There aren't known leaks in argonaut. If you can reproduce with valgrind massif and see where the heap is going, tha... Sage Weil
11:28 AM rbd Bug #2689: qemu iozone test hangs
Testing again since some possible causes were fixed. Josh Durgin
10:54 AM rbd Bug #3685 (Closed): xfs test 186 fails in the nightlies
logs: ubuntu@teuthology:/a/teuthology-2012-12-26_19:00:03-regression-next-testing-basic/28039
...
Tamilarasi muthamizhan
09:19 AM Bug #3684 (Resolved): filejournal: aio vector size is not limited
FileJournal::write_aio_bl does not limit the size of the iov to IOV_MAX. Sage Weil
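One generic way to respect a per-call vector limit is to split the entries into batches. The sketch below is an assumption about the general technique, not the fix that was applied in Ceph: `split_iov` is a hypothetical helper, and real code would pass the platform's `IOV_MAX` as `max_iov` and submit one vectored write per batch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split `n` iovec entries into (offset, count)
// batches so that no batch holds more than `max_iov` entries.
std::vector<std::pair<std::size_t, std::size_t>>
split_iov(std::size_t n, std::size_t max_iov) {
  assert(max_iov > 0);  // a zero limit would never make progress
  std::vector<std::pair<std::size_t, std::size_t>> batches;
  for (std::size_t off = 0; off < n; off += max_iov) {
    // The last batch may be shorter than max_iov.
    batches.emplace_back(off, std::min(max_iov, n - off));
  }
  return batches;
}
```

With a limit of 1024, 2500 entries would be submitted as three batches, the last one holding the remaining 452 entries.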
08:23 AM Bug #3683 (Resolved): mon: leak of MMonPaxos
ubuntu@teuthology:/a/teuthology-2012-12-22_19:00:02-regression-next-testing-basic/22989
saw it a few days earlier,...
Sage Weil
01:34 AM Revision 916d1cf6 (ceph): doc: journaler config options
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/26/2012

10:27 PM Revision c34e38bc (ceph): log: 10,000 recent log entries
This is what we were (wrongly) doing before, so there are no memory
utilization surprises.
Signed-off-by: Sage Weil ...
Sage Weil
10:27 PM Revision 4daede79 (ceph): log: fix log_max_recent config
<facepalm>
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 4de7748b72d4f90eb1197a70015c199c15...
Sage Weil
10:27 PM Revision fdae0552 (ceph): log: fix flush/signal race
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav...
Sage Weil
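The rule stated in this commit message (signal the condition while holding the lock and modifying the queue) can be sketched with standard C++ primitives. This is a minimal illustration, assuming `std::mutex` and `std::condition_variable` in place of Ceph's own locking classes; `EntryQueue` and its methods are hypothetical names.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical queue for illustration; not the Ceph log implementation.
class EntryQueue {
 public:
  // Push an entry and signal the consumer inside the same critical
  // section, so the queue change and the wakeup cannot be separated.
  void submit(std::string entry) {
    std::lock_guard<std::mutex> guard(lock_);
    queue_.push(std::move(entry));
    cond_.notify_one();
  }

  // Block until an entry is available, then remove and return it.
  // The predicate is rechecked under the lock after every wakeup.
  std::string wait_and_pop() {
    std::unique_lock<std::mutex> guard(lock_);
    cond_.wait(guard, [this] { return !queue_.empty(); });
    std::string entry = std::move(queue_.front());
    queue_.pop();
    return entry;
  }

 private:
  std::mutex lock_;
  std::condition_variable cond_;
  std::queue<std::string> queue_;
};
```

Because the consumer checks the queue under the same lock the producer holds while pushing and notifying, it either sees the new entry before sleeping or is asleep when the notification arrives.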
08:54 PM Revision cedea139 (ceph): docs: Merge changes from release-process2 document.
Gary Lowell
07:58 PM Revision 850a056b (ceph): mds: add waiting_for_mdsmap queue
Defer events until we get a specific MDSMap epoch.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
07:58 PM Revision c764935d (ceph): mds: do not check for pool existence in osdmap
We don't have a wait mechanism to ensure the MDSMap has the latest osdmap
here. Just trust the MDSMap.
Signed-off-b...
Sage Weil
06:55 PM Revision 4929fc7d (ceph): qa: remove xfstests 172 and 173 from qemu testing
These seem to require newer xfs.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
05:42 PM Revision f5403f94 (ceph): doc/man/8/mkcephfs: update --mkfs a bit
Document that 'devs' and 'osd mkfs type' must be defined.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:23 PM rgw Bug #3682: valgrind errors seen when running rgw tests in nightlies
log: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27925 Tamilarasi muthamizhan
04:20 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
Logs: ubuntu@teuthology:/a/teuthology-2012-12-26_03:00:10-regression-master-testing-gcov/27924
ubuntu@teuthology:/...
Tamilarasi muthamizhan
03:48 PM Bug #3378 (Can't reproduce): common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
The suicide timeout is the symptom only. Usually it means the thread is blocked by a hung syscall. In your case, Ma... Sage Weil
02:38 PM rbd Bug #3427: krbd: unmap does not remove block device properly
I haven't spent time on this in almost a month so wanted to just
provide an update. We have been looking at and try...
Alex Elder
01:02 PM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
We are using 0.48.2 for the OSDs and our plan is to upgrade to 0.56 (or the next stable release) when it comes out. Kevin Scheunemann
11:45 AM Bug #3546 (Won't Fix): CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
The crash is a known problem with pre-3.4 kernels. Fixes have been backported to 3.4 stable and 3.6 stable kernels, ... Sage Weil
11:36 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
At the time, the clients were running 3.2.0-32, but we have since upgraded to 3.6.9 per another ceph bug.
We have...
Kevin Scheunemann
11:25 AM Bug #3546: CEPH 0.48.2 OSD crashed causing kernel RBD clients to reboot
What kernel version are you running? Sage Weil
11:05 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another run:... Sage Weil
09:38 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another one:... Sage Weil
09:59 AM CephFS Bug #3681 (Resolved): kclient fsx fails nightly
... Sage Weil
08:44 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
Xiaopong Tran wrote:
> I'm using xfs, with no specific mount options, just the default.
>
> I added the debug set...
Sage Weil
02:22 AM Bug #3676: osd keeps crashing at ReplicatedPG::scan_range()
I'm using xfs, with no specific mount options, just the default.
I added the debug settings, and got a large log f...
Xiaopong Tran
08:39 AM CephFS Feature #3679 (Closed): Any API to get metadata?
Yep! See libcephfs. There is... Sage Weil
01:08 AM CephFS Feature #3679 (Closed): Any API to get metadata?
Hello there.
I am wondering if there is any API to get the metadata of a file.
I have the ceph file system run by ...
lollipop king
07:20 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
err... that should have been each monitor's ip and port.
as in...
Joao Eduardo Luis
01:10 AM CephFS Tasks #3680 (Rejected): deduplication in ceph
I am wondering how to do deduplication in ceph...the big problem is how to get the metadata of a file
and how to mod...
lollipop king

12/25/2012

08:35 PM Bug #3378: common/HeartbeatMap.cc: 78: FAILED assert(0 == "hit suicide timeout")
Saw this show up during parametric sweep testing on EXT4 with 8 concurrent OSD disk threads. Ceph build is from gitb... Mark Nelson

12/24/2012

02:58 PM CephFS Feature #1448 (In Progress): test hadoop on sepia
Sage Weil
02:58 PM CephFS Cleanup #814 (Resolved): hadoop: refactor hadoop shim in terms of java libceph bindings
Sage Weil
02:56 PM rbd Feature #3580 (Resolved): rbd import from stdin could try harder to sparsify images
Sage Weil
02:54 PM rgw Feature #1950: rgw: create S3/Swift ACL interoperability suite
Sage Weil
12:27 PM Bug #3676 (Need More Info): osd keeps crashing at ReplicatedPG::scan_range()
... Sage Weil
12:04 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
... Sage Weil
09:22 AM rbd Bug #3654 (Fix Under Review): libvirt: colons in ipv6 monitor addresses are not escaped when sent...
Sage Weil
08:45 AM rbd Fix #3665: librbd: deadlock during flatten
the problem is that we are holding the snap_lock and then waiting for io. but we mostly use snap_lock as a tight inne... Sage Weil
04:01 AM Revision d18f3c2d (ceph): mds: don't force in->first == dn->first
The fullbit sets it now. For multiversion inodes, its "first" can be in
the future, since this dentry may not have ...
Sage Weil
04:01 AM Revision 8b599083 (ceph): mds: replace closed sessions on connect
If a connection comes and there is a closed session attached, remove it.
This is probably a failure of an old session...
Sage Weil
04:01 AM Revision a3e70aed (ceph): mds: always send discover if want_xlocked is true
If want_xlocked is true, we can not rely on previously sent discover
because it's likely the previous discover is blo...
Yan, Zheng
04:01 AM Revision 96f48aa0 (ceph): mds: re-issue caps after importing caps
The imported caps may prevent unstable locks from entering stable
states. So we should call Locker::eval_gather() wit...
Yan, Zheng
04:01 AM Revision dd441576 (ceph): mds: take export lock set before sending MExportDirDiscover
Migrator::export_dir() only checks whether it can lock the export lock set
but does not take the lock set. So someone else can c...
Yan, Zheng
04:01 AM Revision 1174dd31 (ceph): mds: don't retry readdir request after issuing caps
If remote linkage without inode is encountered after some caps are
issued, Server::handle_client_readdir() should sen...
Yan, Zheng
04:01 AM Revision f5e86ecb (ceph): mds: delay processing cache expire when state >= EXPORT_EXPORTING
It's possible that MDS receives cache expire in EXPORT_LOGGINGFINISH
and EXPORT_NOTIFYING states.
Signed-off-by: Yan...
Yan, Zheng
04:01 AM Revision efbca31d (ceph): mds: fix file existing check in Server::handle_client_openc()
Creating new file needs to be handled by directory fragment's auth
MDS, opening existing file in write mode needs to ...
Yan, Zheng
04:01 AM Revision 00025462 (ceph): mds: fix race between send_dentry_link() and cache expire
MDentryLink message can race with cache expire. When it arrives at
the target MDS, it's possible there is no correspo...
Yan, Zheng
04:01 AM Revision a1485f95 (ceph): mds: compare sessionmap version before replaying imported sessions
Otherwise we may wrongly increase mds->sessionmap.version, which
will confuse future journal replays that involve s...
Yan, Zheng
04:01 AM Revision 48d8ae58 (ceph): mds: allow handle_client_readdir() fetching freezing dir.
At that point, the request already auth pins and locks some objects.
So CDir::fetch() should ignore the can_auth_pin ...
Yan, Zheng
04:01 AM Revision 0ab0744e (ceph): mds: properly mark dirfrag dirty
If predirty_journal_parents() does not propagate changes in dir's
fragstat into corresponding inode's dirstat, it sho...
Yan, Zheng
04:01 AM Revision b7e698a5 (ceph): mds: no bloom filter for replica dir
We should delete dir fragment's bloom filter after exporting the dir
fragment to other MDS. Otherwise the residual bl...
Yan, Zheng
04:01 AM Revision e6b8f0a6 (ceph): mds: set want_base_dir to false for MDCache::discover_ino()
When frozen inode is encountered, MDCache::handle_discover() sends
reply immediately if the reply message is not empt...
Yan, Zheng
04:01 AM Revision 69f9f024 (ceph): mds: fix error handling in MDCache::handle_discover_reply()
The error handling code in MDCache::handle_discover_reply() has two
main issues. MDCache::handle_discover_reply() doe...
Yan, Zheng
03:59 AM Revision d9673ca3 (ceph): Merge branch 'wip-create-layout'
Reviewed-by: Greg Farnum <greg@inktank.com>
The functional tests for the create operations should add and specify no...
Sage Weil
03:39 AM Revision d2f5890f (ceph): client, libcephfs: add method to get the pool name for an open file
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:39 AM Revision 8efcf54d (ceph): mds: *_pg_pool -> *_pool
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:39 AM Revision 697ed23c (ceph): client: remove set_default_*() methods
This is a poor interface. The hadoop stuff is shifting to specify this
information on file creation instead.
Signed...
Sage Weil
03:39 AM Revision 99d9e1da (ceph): mds: allow data pool to be specified on create
Reuse old preferred_pg field. Only use if the new CREATEPOOLID feature
is present, and the value is >= 0.
Verify th...
Sage Weil
03:39 AM Revision 3f458217 (ceph): mds: verify that the pool id is valid on SET[DIR]LAYOUT
Make sure the data pool exists and is part of the MDSMap data pools list.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
03:39 AM Revision 32ab274a (ceph): client: specify data pool on create operations
Fill in the data pool field if specified by the client, or set to -1.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil

12/23/2012

11:21 PM Revision 61d43af7 (ceph): osd: make MOSDFailure output more sensible
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:21 PM Revision 850d1d54 (ceph): osd: fix dup failure cancellations
If we had a pending failure report, and send a cancellation, take it
out of our pending list so that we don't keep re...
Sage Weil
11:11 PM Revision 9df522e9 (ceph): mon: make osd failure report log msgs sensible
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:42 PM Revision 1290671f (ceph): Merge branch 'wip-scrub' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Conflicts:
src/osd/PG.cc
Sage Weil
09:53 PM Revision 8362e640 (ceph): monclient: fix get_monmap_privately retry interval
Use mon_client_hunt_interval (default 3) instead of hardcoding 1 second.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:53 PM Revision d843a64a (ceph): Makefile: fix 'base' rule
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:12 PM CephFS Cleanup #3677 (Closed): libcephfs, mds: test creation/addition of data pools, create policy
the create data pool argument is tested only with the default pools. once a lib is in place for the unit/functional... Sage Weil
09:06 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
No worries. Let us know if you do come across behavior that looks like a bug! Sage Weil
08:59 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hi Sage,
i am very sorry for taking your time with this issue, I feel like an idiot :(
The buggy client is runnin...
Roman Hlynovskiy
07:19 PM Revision 00b89c3f (ceph): Merge branch 'next'
Sage Weil
07:18 PM Revision a09f5b1b (ceph): init-ceph,mkcephfs: default inode64 for mounting xfs
According to hch this is now the default for new kernels.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
03:22 PM Bug #3675 (Fix Under Review): osd: hang during initial peering
wip-3675 Sage Weil

12/22/2012

09:39 PM Bug #3675: osd: hang during initial peering
ok, this is actually also a race that can cause the register_pipe assert. the locking needs to be reworked here. pu... Sage Weil
09:28 PM Bug #3675: osd: hang during initial peering
... Sage Weil
08:54 PM Bug #3675: osd: hang during initial peering
it took about 1500 iterations of this job to reproduce the hang:... Sage Weil
08:53 PM Bug #3675: osd: hang during initial peering
ubuntu@teuthology:/a/sage-peer1/21827 Sage Weil
08:52 PM Bug #3675: osd: hang during initial peering
this is a messenger bug. if there is a socket error at the end of accept(), after the register_pipe(), we then fail ... Sage Weil
08:11 AM Bug #3675 (Resolved): osd: hang during initial peering
the initial wait for healthy blocked on 2 pgs. ms inject socket failures = 500. everything was up.
no logs, so it...
Sage Weil
07:10 PM Revision 5f25f9f8 (ceph): init-ceph: default osd_data path
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:40 PM Bug #3657 (In Progress): rbd: crash mapping image
I got a response from Ugis. The patches I supplied to him
did stop the crashes he was seeing. So we'll want to get...
Alex Elder
09:22 AM Bug #3676 (Can't reproduce): osd keeps crashing at ReplicatedPG::scan_range()
This specific osd (osd.17) keeps crashing at the same location, as I tried to bring it back. It would start peering a... Xiaopong Tran
07:04 AM Documentation #3674 (Resolved): Deployment documentation is confusing
As a new user who spent hours googling and reading source code to decipher what each tool does, I thought of giving s... Faidon Liambotis
06:29 AM devops Feature #3255: ceph-disk: allow prepare without activate (for spares)
Couldn't ceph-disk-prepare take a lock by e.g. writing a file (or even flock()ing it) in /var/lib/ceph/ before it sta... Faidon Liambotis
04:37 AM Revision ad9bcc70 (ceph): PG: don't use a self-transition for WaitRemoteRecoveryReserved
Previously, using the state on active worked, but now we might
go back through WaitRemoteRecoveryReserved without res...
Samuel Just
04:37 AM Revision f6b2ca8b (ceph): OSD: always do a deep scrub when repairing
Otherwise, errors turned up in a deep-scrub will be
swept under the rug without being repaired.
Signed-off-by: Samue...
Samuel Just
04:35 AM Revision 2e96bb18 (ceph): PG: Handle repair once in scrub_finish
We don't want to change missing sets during a chunky
scrub since it would cause !is_clean() and derail
the rest of th...
Samuel Just
02:47 AM devops Feature #3673 (Rejected): ceph-disk-prepare should provide an option for SSD alignment
ceph-disk-prepare takes an option to use an external disk as a journal. It is commonly suggested that the journal is ... Faidon Liambotis
01:12 AM Revision bdcf6647 (ceph): .gitignore: Add ar-lib to ignore list
Gary Lowell
01:03 AM Revision 4a558048 (ceph): librbd: move buf_is_zero() to new common/util.cc and include/util.h
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Dan Mick
01:03 AM Revision 410903fe (ceph): rbd: check for all-zero buf in export, seek output if so
Use buf_is_zero in common/util.cc
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durg...
Dan Mick
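The idea behind the sparse export in this commit (detect an all-zero buffer and seek past it instead of writing) depends on a zero check. A minimal sketch of such a check follows; `is_all_zero` is a hypothetical name chosen for illustration and is not claimed to match the helper in common/util.cc.

```cpp
#include <cstddef>

// Hypothetical helper: return true if every byte in buf[0..len) is zero.
// An empty buffer counts as all-zero.
bool is_all_zero(const char* buf, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    if (buf[i] != 0) {
      return false;  // first non-zero byte ends the scan early
    }
  }
  return true;
}
```

A caller exporting to a seekable file could skip the write for blocks where this returns true and advance the output offset instead, leaving a hole in the file.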
01:03 AM Revision 5905d7fa (ceph): rbd: harder-working sparse import from stdin
Try to accumulate image-sized blocks when importing from stdin, even if
each read is shorter than requested; if we ge...
Dan Mick
01:03 AM Revision 6325a480 (ceph): import_export.sh: sparse import export
Add tests for:
- sparse import makes expected sparse images
- sparse export makes expected sparse files
- sp...
Dan Mick
12:55 AM Revision 51a900cf (ceph): autogen.sh: Create m4 directory for leveldb
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:47 AM Revision 8f5de156 (ceph): osd: fix pg stat msgs vs timeout
We can get a pattern like so:
- new mon session
- after say 120 seconds, we decide to send a stats msg
- outstanding...
Sage Weil
12:19 AM Revision 74473bb6 (ceph): leveldb: Update submodule
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:14 AM Revision 2bf4f42b (ceph): doc: Added new journaler page to CephFS section. Needs descriptions.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:14 AM Revision 53afac1a (ceph): doc: Added Journaler Configuration to toc tree.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:09 AM Revision 757902d6 (ceph): doc: Added --mkfs options.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:08 AM Revision 46d03344 (ceph): doc: Added running multiple clusters. Per Tommi.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:07 AM Revision e3d07566 (ceph): doc: Updated the Configuration File section.
- Replaced ceph.conf with Ceph configuration to clarify
when running multiple clusters on the same hardware.
- Adde...
John Wilkins

12/21/2012

11:20 PM Revision 00ed6657 (ceph): PG::scrub_compare_maps increment scrubber.fixed for missing repairs
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
11:16 PM Revision c9e05174 (ceph): PG::_compare_scrubmaps: increment scrubber.errors on missing object
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
11:15 PM Revision b564fdb8 (ceph): release-notes: remove warning about osd caps
This was only an issue from 0.49-0.52 upgrading to 0.53+
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
11:15 PM Revision 3076e459 (ceph): release-notes: pgnum is required now
This should have been in the 0.55 release notes.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
11:15 PM Revision 048567e0 (ceph): release-notes: fix typos
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
11:15 PM Revision b39928df (ceph): release-notes: remove bug fix that does not affect argonaut
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
11:15 PM Revision 4a039393 (ceph): release-notes: add more user-visible changes
These are from looking through the shortlog from 0.48.2..next.
The description of the min_size defaults could probabl...
Josh Durgin
10:54 PM Revision 09d4f036 (ceph): doc: Added sudo to ceph health for when cephx is on.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:53 PM Revision 085992f6 (ceph): doc: minor fix to syntax.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:23 PM Revision 206ffcd8 (ceph): mkcephfs: error out if 'devs' defined but 'osd fs type' not defined
We can infer btrfs if they use btrfs devs, but if they use devs there is
no default fs.
Signed-off-by: Sage Weil <sa...
Sage Weil
10:04 PM Revision 4a40067d (ceph): doc: update ceph.conf examples about btrfs default
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:00 PM Revision 677a7a5a (ceph): rgw: add swift tasks
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
09:56 PM Revision 11fb3141 (ceph): Merge remote-tracking branch 'gh/wip-scrub' into next
Sage Weil
09:45 PM Revision 47145d80 (ceph): Merge remote-tracking branch 'gh/wip-3643' into next
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
09:44 PM Revision 999ba1b2 (ceph): monc: only warn about missing keyring if we fail to authenticate
This avoids the situation where a librados or other user with the default
of 'cephx,none' and no keyring is authentic...
Sage Weil
09:10 PM Revision 5d5a42bc (ceph): osd: clear CLEAN on exit from Clean state
This means we can drop the scrub repair state_clear() call. We probably
can drop others, but let's leave that for ano...
Sage Weil
08:19 PM Revision b3e62ad6 (ceph): auth: use none auth if keyring not found
If both cephx and none are accepted auth methods, and
cephx keyring cannot be found then resort to using
none, instea...
Yehuda Sadeh
07:37 PM Revision ae044e64 (ceph): osd: allow transition from Clean -> WaitLocalRecoveryReserved for repair
If we do a scrub repair, we need to go from clean to recovery again to
copy objects around.
This fixes a simple repa...
Sage Weil
07:36 PM Revision 7c56d8fa (ceph): PG::sched_scrub: return true if scrub newly kicked off
The previous return value wasn't really what OSD::sched_scrub
wanted to know.
Signed-off-by: Samuel Just <sam.just@i...
Samuel Just
07:36 PM Revision 4d661e0d (ceph): PG::sched_scrub: only set PG_STATE_DEEP_SCRUB once reserved
Otherwise we would have +DEEP before we have +SCRUB.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
07:29 PM Revision 19e44bff (ceph): osd: clear scrub state if queued scrub doesn't start
We set SCRUBBING when we queue a pg for scrub. If we dequeue and
call scrub() but abort for some reason (!active, de...
Sage Weil
07:29 PM Revision 670afc6c (ceph): PG: in sched_scrub() set PG_STATE_DEEP_SCRUB not scrubber.deep
scrubber.deep gets reset in scrub() to match
state_test(PG_STATE_DEEP_SCRUB).
Signed-off-by: Samuel Just <sam.just@i...
Samuel Just
06:20 PM Revision c02d34dc (ceph): task/swift: change upstream repository url
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
06:20 PM Revision 2f829870 (ceph): task/swift: change upstream repository url
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
06:15 PM Revision feb0aad2 (ceph): doc: Moved path to individual OSD entries.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
04:41 PM Bug #3661: mon: idle/empty osds marked down after 15 min
commit:8f5de156056de78c90f1dc7bf7c5a131c32c1bb8 Sage Weil
03:58 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
ubuntu@ceph3:/etc/ceph$ ceph -m ip:port -s
server name not found: ip (servname not supported for ai_socktype)
unabl...
Anonymous
11:10 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Pat, just a little triage before I dive into this full head on, could you please try the following for each monitor?
...
Joao Eduardo Luis
02:39 PM CephFS Documentation #3672 (Resolved): doc: how to mount ceph-fuse from fstab
There's a new mount helper in bobtail for this. It contains these comments:... Josh Durgin
02:36 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
anywhere i inserted a hash "#", this lovely program made them into numbered columns, so if you see a block with
x...
Anonymous
02:23 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
ok, looks like my conf settings got munged.
let's try this again
i was trying to get the mkcephfs to create a defau...
Anonymous
02:26 PM rgw Feature #3671 (Resolved): Request for x-amz-grant-full-control support
DH is requesting support for x-amz-grant-full-control:
"With Amazon S3, you can do specific grants like
x-amz-g...
JuanJose Galvez
02:23 PM rgw Feature #3670 (Resolved): Request for bucket-owner-read and bucket-owner-full-control grants
From DH, they'd like to see two types of requests which we currently ignore.
"Amazon has bucket-owner-read and buc...
JuanJose Galvez
01:59 PM Linux kernel client Bug #1492 (Can't reproduce): fsx failure on kclient
Sage Weil
01:55 PM rgw Feature #3669 (Resolved): rgw: support acl grants through http headers
support x-amz-grant-* http header fields. Yehuda Sadeh
01:48 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
commit:47145d800951db396785560df4e6d5d344af97dd Sage Weil
12:17 PM Bug #3643 (Fix Under Review): default authentication on the client does not work without a config...
Yehuda Sadeh
11:18 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
pretty sure this was caused by the log bug and 'log max new = 1', fixed by commit:50914e7a429acddb981bc3344f51a793280... Sage Weil
11:03 AM Bug #3657: rbd: crash mapping image
hmm. yeah, it probably means we should set the required features during negotiation to include MSG_AUTH instead of ... Sage Weil
10:56 AM Bug #3657: rbd: crash mapping image
There is another thing that came from the two crash logs Ugis
just supplied. They both contained lines like this:
...
Alex Elder
10:47 AM Bug #3657: rbd: crash mapping image
Ugis supplied two more images containing captured crash
stack traces. Both contained lines like this:
[ 32...
Alex Elder
10:27 AM rgw Feature #3668 (Resolved): rgw: support CORS
Yehuda Sadeh
10:21 AM rgw Feature #3667 (Resolved): rgw: support extra canned acl params
bucket-owner-read, bucket-owner-full-control Yehuda Sadeh
10:20 AM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
... Noah Watkins
10:18 AM Bug #3650 (In Progress): osd: crash in Reset state -> start_peering_interval -> on_change -> proc...
Sage Weil
09:36 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
Sage Weil
10:03 AM rbd Fix #3665 (Resolved): librbd: deadlock during flatten
Ran into this trying to reproduce #3631.
The test_librbd_fsx process is still running on plana34 for debugging.
...
Josh Durgin
08:36 AM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
I ran this test throughout the day yesterday and couldn't reproduce it, with message delays enabled. Marking as can'... Sam Lang
08:32 AM rbd Bug #3664 (Resolved): osdc/ObjectCacher.cc: 517: FAILED assert(!i->size())
... Sage Weil
07:52 AM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hi Roman-
The logging levels are right, but in both mds logs neither mds was ever active; both were in the up:stan...
Sage Weil
05:45 AM Revision e765dcb4 (ceph): osd: only dec_scrubs_active if we were active
This fixes a bug that puts scrubs_active negative.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:44 AM Revision ada3e27f (ceph): osd: reintroduce inc_scrubs_active helper
This mostly generates nice debug output. It also slightly simplifies
code and makes things symmetric.
Signed-off-by...
Sage Weil
01:43 AM Revision ae26432d (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:49 AM Revision bc4f74c7 (ceph): ceph.spec.in: Fedora builds debuginfo by default.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
12:24 AM Revision accce830 (ceph): Merge remote-tracking branch 'upstream/wip_notify' into next
Reviewed-by: Sage Weil <sage@inktank.com> Samuel Just

12/20/2012

11:51 PM Revision 129a49ad (ceph): cephtool: mention ceph osd ls, fix ceph osd tell N bench
Add ceph osd ls to help; make help for ceph osd tell N bench look
more like injectargs, which says <osd-id or *> to m...
Dan Mick
11:32 PM Revision a36d1db1 (ceph): rgw: remove noisy log message
No need for that log message.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
11:30 PM Revision 5b5a19ac (ceph): rgw: fix daemonize initialization
Just call the common daemonize function. Otherwise we end up
not initializing stdout / stderr correctly.
Signed-off-b...
Yehuda Sadeh
11:10 PM Revision 754fc200 (ceph): release notes: Mention new cephtool commands
ceph osd ls and ceph tell osd.N version are new. Mention their use
for verifying that all OSDs are upgraded in the n...
Dan Mick
10:19 PM CephFS Bug #3663: ceph kernel client is getting stuck on xstat* operations
Hello Sage,
added 4 logs:
screen output from console of the laggy client. it ends up on 'jroger@pr02:~/data$ cp...
Roman Hlynovskiy
09:07 PM CephFS Bug #3663 (Need More Info): ceph kernel client is getting stuck on xstat* operations
Hmm. It's actually just saying it's the oldest client; it's not actually too old (yet). The looping connect attempts... Sage Weil
08:48 PM CephFS Bug #3663 (Rejected): ceph kernel client is getting stuck on xstat* operations
There are 2 kernel clients happily working with ceph. As soon as I try mounting ceph from the third client, it's gett... Roman Hlynovskiy
09:54 PM Bug #3661 (In Progress): mon: idle/empty osds marked down after 15 min
Sage Weil
04:57 PM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
wip-mon Sage Weil
09:48 PM Revision 50914e7a (ceph): log: fix flush/signal race
We need to signal the cond in the same interval where we hold the lock
*and* modify the queue. Otherwise, we can hav...
Sage Weil
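
The rule stated in this commit message (signal the condition while holding the lock and modifying the queue) can be sketched in general terms as follows. This is only an illustration of the pattern, not Ceph's logging code: the class EntryQueue and its members are hypothetical names, and the standard library primitives stand in for Ceph's own Mutex and Cond types.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical queue used only to illustrate the locking pattern.
class EntryQueue {
 public:
  // Producer: the queue change and the signal share one critical section,
  // so a waiter can never observe the change without also being woken.
  void submit(std::string entry) {
    std::lock_guard<std::mutex> guard(lock_);
    entries_.push(std::move(entry));
    cond_.notify_one();  // signalled while the lock is still held
  }

  // Consumer: block until an entry is available, then remove and return it.
  std::string take() {
    std::unique_lock<std::mutex> guard(lock_);
    cond_.wait(guard, [this] { return !entries_.empty(); });
    assert(!entries_.empty());  // guaranteed by the wait predicate
    std::string entry = std::move(entries_.front());
    entries_.pop();
    return entry;
  }

  // Number of queued entries, read under the lock.
  std::size_t size() {
    std::lock_guard<std::mutex> guard(lock_);
    return entries_.size();
  }

 private:
  std::mutex lock_;
  std::condition_variable cond_;
  std::queue<std::string> entries_;
};
```

The predicate passed to wait() is re-checked under the lock, which also protects the consumer against spurious wakeups.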
09:29 PM Revision c0e23712 (ceph): ReplicatedPG::remove_notify : don't leak the notify object
Following remove_notify, there are no other references to
notif, delete it.
Signed-off-by: Samuel Just <sam.just@ink...
Samuel Just
09:27 PM Revision b5031a22 (ceph): OSD,ReplicatedPG: do not track notifies on the session
handle_notify_timeout and remove_notify currently do not clean up this
state leaving dangling Notification*. Further...
Samuel Just
08:59 PM Revision 719679ea (ceph): doc: Added package and repo links for Apache and FastCGI. Added SSL ena...
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:59 PM Revision 04eb1e73 (ceph): doc: Fixed restructuredText usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:42 PM Bug #3662: mkcephfs --mkfs is not inserting any default settings
The algorithm appears to be
1) if 'devs' is not defined, look for 'btrfs devs'; if that's defined, use those for ...
Dan Mick
05:00 PM Bug #3662 (Won't Fix): mkcephfs --mkfs is not inserting any default settings
It was my understanding that "sudo mkcephfs -a -c ceph.conf -k ceph.keyring --mkfs" would format a device with btrfs ... Anonymous
07:39 PM Revision ea9fc87d (ceph): doc: Removed foo. Apparently myimage was added and foo not removed.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:07 PM Revision 9f67c450 (ceph): Merge branch 'next'
Sage Weil
07:04 PM Revision 17c627b5 (ceph): Merge remote-tracking branch 'gh/wip-cephtool' into next
Sage Weil
06:58 PM Revision 0953ce53 (ceph): rados: add cephtool test
Sage Weil
06:49 PM Revision f38d8911 (ceph): Merge branch 'wip-build-fixes' into next
Sage Weil
06:46 PM Bug #3627 (Resolved): osd: segfault in ~MOSDSubOp during thrashing+rbd_fsx
accce830514c6b099eb0e00a8ae34396d14565a3 should fix it. Samuel Just
06:45 PM Bug #3659 (Resolved): complete_notify crash
accce830514c6b099eb0e00a8ae34396d14565a3 should take care of it. Samuel Just
12:24 PM Bug #3659: complete_notify crash
Saw on alexandria Samuel Just
12:23 PM Bug #3659 (Resolved): complete_notify crash
l=0).accept connect_seq 58 vs existing 57 state standby
2012-12-19 18:20:42.186013 7f3b1afe7700 0 <cls> cls/rgw/cls...
Samuel Just
06:13 PM Revision a803159b (ceph): rgw: configurable exit timeout
Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s...
Yehuda Sadeh
06:07 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
I just did an "scp" to burnupi40.front.sepia.ceph.com:/home/ubuntu/3647.vm.tgz Anonymous
04:32 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
I am in Sunnyvale and the VMs reside on my desktop. I have snapshotted and created a tar file of my 3 node cluster. ... Anonymous
07:32 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Pat, do you still have the VMs in this state? If so, can I take a look? Joao Eduardo Luis
05:45 PM Revision 92b59e90 (ceph): rgw: don't try to assign content type if not found
Fixes: #3648
Cannot assign a NULL pointer into stl string. This is only
relevant to swift, when uploading an object w...
Yehuda Sadeh
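
As a general illustration of the problem named in this commit message: constructing or assigning a std::string from a null const char* is undefined behavior, so the pointer has to be checked first. The sketch below assumes a lookup that returns a null pointer when nothing is found; lookup_content_type and assign_content_type are hypothetical helpers, not functions from rgw.

```cpp
#include <cassert>
#include <string>

// Hypothetical lookup: returns a null pointer when no content type is known.
const char* lookup_content_type(const std::string& ext) {
  if (ext == "txt") return "text/plain";
  if (ext == "html") return "text/html";
  return nullptr;
}

// Assign the content type only if the lookup found one.
// Returns false and leaves `out` untouched when nothing was found.
bool assign_content_type(const std::string& ext, std::string& out) {
  const char* found = lookup_content_type(ext);
  if (found == nullptr) {
    return false;  // never hand a null pointer to std::string
  }
  out = found;
  return true;
}
```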
04:53 PM Revision c02e9062 (ceph): Merge remote-tracking branch 'gh/wip-crushtool' into next
Reviewed-by: Caleb Miles <caleb.miles@inktank.com> Sage Weil
03:49 PM rbd Bug #3524: test_librbd_fsx: crash after flatten
Sam saw this come up again in: ubuntu@teutholog:/a/sam-ooo3/19022
It's a different cause of the same symptom. In t...
Josh Durgin
02:52 PM Bug #3660 (Resolved): osd: marking objects lost invalidates pg stats
If you lose an object, the pg stats become invalid, and the next scrub will report a problem.
We could mark the st...
Sage Weil
02:20 PM Linux kernel client Bug #3519: rbd map hang during system startup
We've learned a few things since my last update, but the main
thing is that Nick tried the latest thing I offered an...
Alex Elder
11:41 AM Bug #3496 (Resolved): doc: have old URL's redirect to new ones
John Wilkins
11:41 AM Documentation #3564 (Resolved): doc: many broken links since rearrangement
John Wilkins
11:40 AM rgw Documentation #2989 (Resolved): doc: write RGW troubleshooting
John Wilkins
11:40 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
Apparently foo was the image name, and myimage was added and foo not removed. John Wilkins
11:36 AM Bug #3656 (In Progress): docs: "foo" doesn't mean anything in rbd example
John Wilkins
05:29 AM Bug #3656: docs: "foo" doesn't mean anything in rbd example
Whoops, I forgot to assign it.
Alex Elder
05:28 AM Bug #3656 (Resolved): docs: "foo" doesn't mean anything in rbd example
Someone named "Ugis" on the mailing list was having trouble
with the rbd command. One of the things this person men...
Alex Elder
10:10 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
Fixed, commit:04e7a5ca1364166a6b93e6cd0fcf58faf629a01c Yehuda Sadeh
09:47 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
Fixed, commit:92b59e90590aee501ae090adebf58978912f9dd3. Yehuda Sadeh
09:42 AM Bug #3658: osd/mon: stops processing pg stat messages
see /a/sage-ooo2, /a/sam-ooo3 Sage Weil
09:42 AM Bug #3658 (Resolved): osd/mon: stops processing pg stat messages
... Sage Weil
09:37 AM Feature #3622 (Rejected): RADOS pools should support more than 65535 PGs
kernel limit only Sage Weil
07:40 AM Bug #3633: mon: clock drift errors not reported by ceph status
'HEALTH_OK' and 'HEALTH_WARN' are assessed in a way that makes it non-trivial to leverage the existing way of doing t... Joao Eduardo Luis
06:03 AM Revision 799c59ae (ceph): rgw: remove useless configurable, fix swift auth error handling
Fixes: #3649
No need to have an extra configurable to use keystone. Use keystone
whenever keystone url has been speci...
Yehuda Sadeh
06:03 AM Revision 08c64249 (ceph): rgw: don't initialize keystone if not set up
Fixes: #3653
No need to initialize keystone, including the keystone
revocation thread which was verbose if keystone ...
Yehuda Sadeh
05:56 AM Bug #3657 (Resolved): rbd: crash mapping image
I'm just creating this to track some activity from someone
on the mailing list reporting kernel crashes when attempt...
Alex Elder
01:07 AM Revision 3ed2d59e (ceph): rgw: fix error handling with swift
Fixes: #3649
verify_swift_token returns a bool and not an int.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
12:51 AM Revision 9a9778fb (ceph): Merge remote-tracking branch 'upstream/wip_pg_temp' into next
Reviewed-by: Sage Weil <sage@inktank.com>
Reviewed-by: Joao Luis <joao.luis@inktank.com>
Samuel Just

12/19/2012

11:19 PM CephFS Bug #3655 (Can't reproduce): client: hang in fsstress
fsstress stuck in _read_sync()
#0 pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_6...
Sam Lang
10:22 PM Revision 5497d228 (ceph): doc: Modified the demo configuration file for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:02 PM Revision 40fdd773 (ceph): doc: Added Gateway Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:02 PM Revision 5281ee24 (ceph): doc: Added Gateway Quick Start configuration file.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:01 PM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
Fixed, commit:799c59ae89c9a70f08d9bf2e7624d25e6641d41f. Yehuda Sadeh
05:13 PM rgw Bug #3649 (Fix Under Review): rgw: swift list buckets returns empty result
Yehuda Sadeh
05:02 PM rgw Bug #3649: rgw: swift list buckets returns empty result
A backport is required for the bad error handling that triggered the symptoms. Yehuda Sadeh
04:59 PM rgw Bug #3649: rgw: swift list buckets returns empty result
This was happening when trying to use keystone, but without specifying 'rgw swift use keystone'. Ended up shortcuttin... Yehuda Sadeh
07:39 AM rgw Bug #3649 (Resolved): rgw: swift list buckets returns empty result
Yehuda Sadeh
10:01 PM Revision 84fb371d (ceph): Updated Getting Started index to include Gateway Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:01 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
done, commit:08c64249eb8cd7922de5c398a9426538918db77c. Yehuda Sadeh
05:13 PM rgw Bug #3653 (Fix Under Review): In bobtail, turn off keystone errors in radosgw.log when not applic...
Yehuda Sadeh
01:30 PM rgw Bug #3653 (Resolved): In bobtail, turn off keystone errors in radosgw.log when not applicable
In bobtail, when radosgw is installed and configured on a cluster node, we see the following errors in radosgw.log, w... Tamilarasi muthamizhan
10:00 PM Revision 5e955103 (ceph): doc: Added REST Gateway link to 5-minute Quick Start.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:52 PM Revision c2b231e4 (ceph): doc: Updated the 5-minute Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:47 PM Revision f596cee7 (ceph): doc: Updated Block Device Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:46 PM Revision 60b2857d (ceph): doc: Updated CephFS Quick Start for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:45 PM Revision d17bd384 (ceph): doc: Added authentication and mkcephfs settings for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:36 PM Revision cd5c82db (ceph): doc: Added javascript code block tag.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
06:33 PM Revision 6122a9f6 (ceph): OSDMonitor: remove temp pg mappings with no up pgs
Otherwise, the pg won't be validly mapped until one of the temp
pgs comes back up.
Signed-off-by: Samuel Just <sam.j...
Samuel Just
06:32 PM Revision 2395af9f (ceph): OSDMap: make apply_incremental take a const argument
This requires us to copy bufferlists in two cases since bufferlist
does not have a const iterator at this time.
Sig...
Samuel Just
05:17 PM rgw Tasks #3152 (Resolved): rgw: document usage testing
Done, commit:2f73c07511dce200b5dd298c6f86e03fbb9b3dd1 Yehuda Sadeh
05:16 PM rgw Feature #3494 (Closed): ceph S3 upload slowly
Closing, need more info about the specific user problem. Yehuda Sadeh
05:15 PM rgw Bug #3620 (Fix Under Review): rgw:improve multiple user access keys scalability
Yehuda Sadeh
05:13 PM rgw Bug #3648 (Fix Under Review): rgw: swift put object with empty mime type crashes
Yehuda Sadeh
07:39 AM rgw Bug #3648: rgw: swift put object with empty mime type crashes
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1092137 Yehuda Sadeh
07:38 AM rgw Bug #3648 (Resolved): rgw: swift put object with empty mime type crashes
Yehuda Sadeh
04:38 PM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
Given xml like:... Josh Durgin
04:37 PM Revision 2e49d5c4 (ceph): cephtool: add qa workunit
A few basic sanity checks, including a tell on a down osd.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:03 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
Proposed fix in wip-3637. The client's max size request in MClientCaps gets dropped if the file lock is in a non-sta... Sam Lang
03:10 PM Linux kernel client Bug #3519: rbd map hang during system startup
I found a possible explanation of the problem, and have
created and pushed a fix on top of the code that I most
rec...
Alex Elder
09:06 AM Linux kernel client Bug #3519: rbd map hang during system startup
Nick provided more information:
https://gist.github.com/raw/4330223/2f131ee312ee43cb3d8c307a9bf2f454a7edfe57/rbd...
Alex Elder
03:00 PM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
Dave Chinner has confirmed my explanation. The bug no
longer exists (in its current form) in the latest code,
so w...
Alex Elder
10:46 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
I'm fairly sure this is an XFS problem, so as suggested by
Ian I'm marking this "Won't Fix" (again). If new evidenc...
Alex Elder
06:27 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
Dave Chinner responded to my note with a few questions
requesting more information. I spent some time this
morning...
Alex Elder
12:30 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
Pushed fixes to wip-3625 (ceph and ceph-client repos) that implement #3 (mds sends back the created flag in reply to ... Sam Lang
12:29 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
David and I have posted comments on github about the fix to allow multiple
clients opening the same file to get a va...
Sam Lang
12:11 PM Bug #3652 (Duplicate): split should not mess up stats
this will be replaced with bugs corresponding to a design of some kind Samuel Just
12:10 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
Samuel Just
11:47 AM rbd Bug #3611 (Resolved): rbd.py: segfault with many snapshots
This was caused by c3107009f66bc06b5e14c465142e14120f9a4412. Reverting it fixes the problem. There is a corrected imp... Josh Durgin
11:44 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
This still occurs with the wip-3611 branch, so it is a different problem. Josh Durgin
11:15 AM Bug #3633: mon: clock drift errors not reported by ceph status
Here's my config: http://pastie.org/5554031
I'm pretty sure there was no warning when I did 'ceph -w', because I w...
Corin Langosch
08:25 AM Bug #3633 (In Progress): mon: clock drift errors not reported by ceph status
I'm looking into an adequate way to make 'ceph -s' return a warning when the clocks have drifted.
However, 'ceph -...
Joao Eduardo Luis
10:29 AM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
Below works, but "ceph -s" does not
ubuntu@ceph1:~$ ceph health
2012-12-19 18:27:51.090414 mon <- [health]
2012-...
Anonymous
10:22 AM Bug #2784 (Resolved): osd hit suicide timeout
Not actually a bug in the renzhi case. Samuel Just
08:18 AM rgw Feature #3207: qa: swift functional tests in nightly
from James' last bug report:... Sage Weil
08:02 AM Bug #3650: osd: crash in Reset state -> start_peering_interval -> on_change -> process_event Reset
that line of code is... Sage Weil
07:52 AM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
... Sage Weil
05:00 AM Revision d9c2396b (ceph): ceph.spec.in: Improve finding location of jni.h for sles11.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
04:08 AM Revision b2eb8bd2 (ceph): osd: implement 'version' tell command
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:40 AM Revision 46344105 (ceph): ceph.spec.in: Add packages for libcephfs-jni and libcephfs-java
Signed-off-by: Gary Lowell <gary.lowell@inktank.com> Gary Lowell
03:21 AM Revision 85763f09 (ceph): ceph: report error string to stderr, not stdout
If we return an error, send the message to stderr. This makes things
more easily scriptable because error messages w...
Sage Weil
03:20 AM Revision 5f24e23b (ceph): ceph: fix error reporting when tell target is invalid or down
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:11 AM Revision b00eb6fd (ceph): mon: 'ceph osd ls'
List osd ids that exist.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
01:00 AM Revision 212f6b56 (ceph): OSDMap::dump: tag pg_temp mappings with pgid
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com>
Samuel Just

12/18/2012

10:12 PM Revision 04e7a5ca (ceph): rgw: configurable exit timeout
Fixes: #3638
rgw exit timeout secs : number of seconds to wait for process
to exit cleanly before forcing exit. If s...
Yehuda Sadeh
08:59 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
The hang occurs because a client requests a max size increase, but doesn't have write caps, so the mds puts it on the... Sam Lang
07:53 AM CephFS Bug #3637 (Resolved): client: not issuing caps for with clients doing shared writes
With 3 clients running ceph-fuse, running the ior command:
/tmp/cephtest/binary/usr/local/bin/ior -e -w -r -W -b 1...
Sam Lang
07:40 PM Bug #3646: pg_temp with two down/out osds
sounds exactly right Sage Weil
07:29 PM Bug #3646: pg_temp with two down/out osds
It actually does that already. OSDMonitor::remove_redundant_pg_temp(). I'll hook in around there for the fix, doing ... Samuel Just
06:42 PM Bug #3646: pg_temp with two down/out osds
Good point. We can also remove mappings that match the crush result. Although that is a more expensive scan by the m... Sage Weil
04:41 PM Bug #3646 (Resolved): pg_temp with two down/out osds
Encountered on MassEffect, osdmap is attached.
{ "pgid": "2.25",
"osds": [
30,...
Samuel Just
06:08 PM Bug #3647: forgot the auth options for Cephx and added them later: Get msg: 7ff9faaad700 monclie...
added output for dmesg on ceph1 Anonymous
06:04 PM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
Seeing errors when setting up ceph from scratch with the options in the ceph.conf file. I forgot the auth options f... Anonymous
04:23 PM CephFS Feature #3645 (Resolved): Requesting the ability to rename CephFS snapshots inside the ".snap"-di...
I believe the ability to rename CephFS snapshots can come in handy in many cases. For example, if one wants to imple... Oliver Daudey
02:26 PM Bug #3644 (Resolved): ObjectCacher: discard_set ignores waiters
IO in flight contained entirely in a discarded section will not be acked to the caller, since the waiters are removed... Josh Durgin
01:41 PM Bug #3643 (Resolved): default authentication on the client does not work without a config file or...
On a single node bobtail cluster, the ceph-auth setting is as mentioned below,
ubuntu@burnupi09:/etc/ceph$ sudo cat...
Tamilarasi muthamizhan
01:26 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
Instead of using an assert version op, establish the watch before reading the header. This hasn't actually caused any... Josh Durgin
12:01 PM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely

Moved to #3641
Sam Lang
10:56 AM CephFS Bug #3639 (Duplicate): kclient: hit EOF prematurely
Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
Sam Lang
12:00 PM CephFS Bug #3641 (Resolved): kclient: hit EOF prematurely

Failures seen when running IOR on the kernel client:
WARNING: Task 1 requested transfer of 1048576 bytes,
...
Sam Lang
11:57 AM CephFS Bug #3640 (Duplicate): kclient: hang and kernel panic

Creating a placeholder for the following issue reported by Eric Renfro on the mailing list:
http://thread.gmane....
Sam Lang
11:17 AM rgw Feature #2941 (Fix Under Review): rgw: improve streaming read performance
Yehuda Sadeh
10:36 AM rgw Bug #3638 (Resolved): rgw: configurable exit timeout
Currently the exit timeout is 5 seconds; we should make it configurable, and probably have a higher default. Yehuda Sadeh
09:46 AM Bug #2784: osd hit suicide timeout
This bug popped again on v0.55.1
renzhi on IRC stumbled upon it after upgrading from v0.48.2, and has been unable ...
Joao Eduardo Luis
08:54 AM rbd Bug #3611: rbd.py: segfault with many snapshots
This survived overnight testing (with the python librbd tests) with 56 passes. Josh Durgin
08:03 AM Linux kernel client Bug #3519: rbd map hang during system startup
I looked through the latest log message supplied by Nick
Bartos. I scanned through it to look only at the rbd
acti...
Alex Elder
07:39 AM Linux kernel client Bug #3519: rbd map hang during system startup
There has been quite a lot of activity on this bug but it's
all been recorded on the mailing list rather than here.
...
Alex Elder
06:40 AM Bug #3624 (In Progress): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last ...
Answer to my question, based on evidence in this bug:
The control (yaml) file contains this:
overrides:
...
Alex Elder
04:42 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
Note how this was on a cluster with *very* few OSDs (4 at the time!) as I originally mentioned and this may play a fa... Faidon Liambotis
01:12 AM Revision dbe6fb72 (ceph): crushtool: only dump usage on -h|--help
Instead, output a useful error message.
Fix error code to be a success.
Add test for the output usage.
Signed-off-...
Sage Weil
01:12 AM Revision 6c7ec2d4 (ceph): crushtool: nicer error message on extra args
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:51 AM Revision 0dd13025 (ceph): Merge remote-tracking branch 'gh/testing' into next
Sage Weil
12:38 AM Revision fd482a27 (ceph): ceph.spec.in: Update pre-reqs for ceph-fuse pacakge.
Gary Lowell
12:29 AM Revision 1b67a438 (ceph): Revert "objecter: don't use new tid when retrying notifies"
This reverts commit c3107009f66bc06b5e14c465142e14120f9a4412.
This appears to be causing problems in the objecter by...
Sage Weil
12:14 AM Feature #3288: docs: document the chooseleaf command in crush
Commit 9f0510 added docs for multiple crush hierarchies and the examples use chooseleaf, which is still undocumented. Faidon Liambotis

12/17/2012

10:59 PM rbd Bug #3611 (Fix Under Review): rbd.py: segfault with many snapshots
wip-3611 contains a respin of the bad commit. It's passing test_stress_watch with failure injection and the python te... Josh Durgin
11:09 AM rbd Bug #3611: rbd.py: segfault with many snapshots
also, ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16289 Tamilarasi muthamizhan
11:08 AM rbd Bug #3611: rbd.py: segfault with many snapshots
recent log: ubuntu@teuthology:/a/teuthology-2012-12-15_19:00:04-regression-next-testing-basic/16281 Tamilarasi muthamizhan
10:53 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Thanks for the logs. All the differences there are zeroes where actual data should be, but the librbd debug log shows... Josh Durgin
10:41 PM Revision bdc998ef (ceph): mon: OSDMonitor: add option 'mon_max_pool_pg_num' and limit 'pg_num' ac...
Instead of having a hardcoded default, use a configurable one. It is
limited to 65536 until future testing guarantees...
Joao Eduardo Luis
10:39 PM Revision 21c47c6a (ceph): osd: debug EMSGSIZE / OSD_WRITETOOBIG
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:39 PM Revision f81ca898 (ceph): doc/release-notes: don't use format 2 rbd images until after osds upgrade
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
07:14 PM Revision 3c246226 (ceph): crushtool: add --set-chooseleaf-descend-once to help
We forgot to update this in 88f218181a9e6d2292e2697fc93797d0f6d6e5dc.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
06:53 PM Revision 874b2732 (ceph): doc/release-notes: 'mon max pool pg num'
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:30 PM Feature #1655: gitbuilder aggregator page
... Sage Weil
04:25 PM Feature #1655: gitbuilder aggregator page
I'm not sure if anyone else has asked, but any chance of sharing the updated server side cgi script which now has aja... Jimmy Tang
04:10 PM Bug #3617 (In Progress): Ceph doesn't support > 65536 PGs(?) and fails silently
Looking closer, I have a feeling this was a large # of pgs making a different bug surface. Jim has been running his ... Sage Weil
04:03 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
Note how your commit changed the (default) limit from 65535 to 65536. Faidon Liambotis
04:01 PM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
The default is now 65536, and can be adjusted using the option 'mon max pool pg num' if higher values are desired. Joao Eduardo Luis
03:57 PM Revision e8b8531e (ceph): doc: fix typo in config file
The option is host, not hostname
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
02:35 PM Bug #3636 (Resolved): sub_op_modify assert(!missing.is_missing(soid));
Encountered in Alexandria, fixed in 047aecd90f1dbfb172f48f9d10b67e82b3a8ce15, may it rest in peace. Samuel Just
12:40 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
Trivial change. Biggest decision is which libc routine to use to spawn the command... Dan Mick
11:42 AM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
The rbd command line interface creates mappings by sending
output to the /sys/bus/rbd/add file system entry, and rem...
Alex Elder
11:14 AM rgw Feature #3634 (Resolved): rgw: improve teuthology radosgw-admin test
Yehuda Sadeh
11:10 AM rgw Bug #3620: rgw:improve multiple user access keys scalability
Possibly impacts interoperability Ian Colle
11:09 AM rgw Bug #3628: rgw: leak of object parts on partial upload
Appears to only be in Argonaut Ian Colle
09:31 AM rgw Bug #3628: rgw: leak of object parts on partial upload
Actually, per user, this affects older versions (argonaut), but does not happen in newer version. Looking at the code... Yehuda Sadeh
10:53 AM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
Tried to reproduce this behavior to no avail.
There are operations on the test that do hang for a long time, but a...
Joao Eduardo Luis
09:49 AM Bug #3624: BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last function: xfs_...
When XFS gets an I/O error, there is not a lot it can do.
If it happens to involve user data blocks it could continu...
Alex Elder
09:42 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
XFS bug Ian Colle
09:42 AM Bug #3599 (Resolved): mkcephfs should fail out when ceph.conf has an error
Sage Weil
09:38 AM Bug #3632: occasional testrados failure: process_8 exited with a signal
Possibly related to 3611 Ian Colle
09:14 AM Bug #3629: test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
I've gone through the logs again and again, as well as through the code. The logs only show the last couple hundred l... Joao Eduardo Luis
08:08 AM Linux kernel client Bug #2764: xfstest hang; osd socket closed messages
I have posted a fix for the "socket closed" messages, and it has
been reviewed and will fairly soon be pushed to the...
Alex Elder
07:36 AM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
Using argonaut 0.48.2. Today all ceph commands were randomly slow. So I checked all hosts, all monitors (3) and osds (... Corin Langosch

12/16/2012

08:49 PM Bug #3632: occasional testrados failure: process_8 exited with a signal
... Sage Weil
08:48 PM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
seen several of these in qa, e.g.... Sage Weil
08:29 PM Revision e9231fe6 (ceph): Makefiles: Two new packages needed in the debian build depdencies.
The ceph test programs that are now being built by default require the junit
and libboost-program-options packages. ...
Gary Lowell
08:29 PM Revision bc9d9d8a (ceph): Refactor rule file to separate arch/indep builds.
Prior to the ceph fs java bindings, all packages were
architecture dependent so the packaging rules file
worked OK;...
James Page
12:44 PM rbd Bug #2872 (Resolved): RBD resize command allows image size -1
Sage Weil
11:02 AM rbd Bug #2689: qemu iozone test hangs
let's retest this with all of the recent caching fixes? Sage Weil
10:46 AM Bug #3609: mon: track down the Monitor's memory consuption sources
Which in-memory maps? Nothing should grow without bound, except perhaps some of the internal monitor messages... Sage Weil
04:30 AM Bug #3609: mon: track down the Monitor's memory consuption sources
Attaching 3 heap profiles from the monitors.
The monitors were under load from the mon workload gen, as well as so...
Joao Eduardo Luis
10:12 AM Bug #3631 (Resolved): osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during ...
old symptom, presumably new bug.... Sage Weil
09:48 AM CephFS Fix #3630 (Resolved): mds: broken closed connection cleanup
Consider:
- client->mds REQUEST_CLOSE
- mds->client CLOSE
- client closes con
- mds see fault, goes to stan...
Sage Weil
07:25 AM Bug #3629 (Fix Under Review): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsi...
Pushed a fix to wip-3629.
After looking into what the OSD does in this case and going through the code, I realized th...
Joao Eduardo Luis
01:45 AM Revision 4bf90782 (ceph): osdc/Objecter: prevent pool dne check from invalidating scan_requests i...
We iterate over ops and, if the pool dne and other conditions are true,
we will immediately return ENOENT and cancel ...
Sage Weil

12/15/2012

09:00 PM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
... Sage Weil
08:33 PM rgw Bug #3628 (Resolved): rgw: leak of object parts on partial upload
Yehuda Sadeh
08:23 PM Bug #3613 (Resolved): Objecter::scan_requests crash
commit:4bf9078286d58c2cd4e85cb8b31411220a377092
passed 100 iterations of the test (previously failed after ~15).
Sage Weil
05:44 PM Bug #3613: Objecter::scan_requests crash
The pool dne check invalidated the iterator. Switching to map<> and incrementing the iterator at the top of the loop. Sage Weil
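
The approach described in this comment (a std::map, with the iterator advanced at the top of the loop before any erase) can be sketched as below. This is a minimal illustration under assumptions, not the Objecter code: Op and cancel_ops_for_missing_pool are hypothetical names, and cancelling is reduced to erasing the entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for an in-flight operation keyed by tid.
struct Op {
  int pool;
};

// Erase every op that targets `missing_pool` and return the erased tids.
std::vector<int> cancel_ops_for_missing_pool(std::map<int, Op>& ops,
                                             int missing_pool) {
  const std::size_t before = ops.size();
  std::vector<int> cancelled;
  auto it = ops.begin();
  while (it != ops.end()) {
    auto cur = it;
    ++it;  // advance first, so erasing `cur` cannot invalidate `it`
    if (cur->second.pool == missing_pool) {
      cancelled.push_back(cur->first);
      ops.erase(cur);  // std::map::erase invalidates only the erased element
    }
  }
  assert(ops.size() + cancelled.size() == before);
  return cancelled;
}
```

The sketch relies on the std::map guarantee that erasing one element leaves iterators to all other elements valid, which is why advancing before the erase is sufficient.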
03:54 PM Bug #3627 (Resolved): osd: segfault in ~MOSDSubOp during thrashing+rbd_fsx
... Sage Weil
09:59 AM Bug #3458: aio enabled but not used
I didn't compile it, it's the version from the ubuntu quantal repository. Is there any way to see which feature have ... Corin Langosch
09:46 AM Bug #3458 (Need More Info): aio enabled but not used
this probably means that libaio wasn't found when you compiled the code? Sage Weil
09:45 AM rbd Fix #3588 (In Progress): rbd.py's clone should take stripe parms, call rbd_clone2
Sage Weil
09:45 AM rbd Feature #2601 (Resolved): rbd: Show image size with an "ls"
Sage Weil
09:44 AM rbd Feature #2634 (Resolved): teuthology: add networking to qemu task
Sage Weil
09:43 AM rbd Bug #3619 (In Progress): librbd: read_iterate sparse behavior broken
Sage Weil
09:42 AM rbd Bug #2689: qemu iozone test hangs
Sage Weil
09:00 AM Bug #3616 (Resolved): osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
Sage Weil
09:00 AM Bug #3603 (Resolved): osd/msgr: mutex assert failure in try_get_pipe
Sage Weil
09:00 AM Bug #3221 (Resolved): disconnect_session_watchers missing pg
Sage Weil
09:00 AM Bug #2954 (Resolved): osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/2538528...
Sage Weil
01:08 AM Revision 601a6c93 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:56 AM Revision 6f978aa5 (ceph): doc: draft bobtail release notes
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil

12/14/2012

11:56 PM Revision 50614811 (ceph): doc: correct meaning of 'pool' in crush
This was recently made less confusing by renaming the default crush
'pool' type to 'root'. Use this terminology every...
Josh Durgin
11:31 PM Revision 286dcbeb (ceph): test: remove underscores from cephfs test names
Google Test documentation strongly suggests avoiding underscores in
unit test names to avoid accidental conflicts w...
Noah Watkins
11:28 PM Revision c9b81510 (ceph): add an fsync-tester workunit to the fuse and kclient suites
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
11:27 PM Revision 673b6820 (ceph): put fsx back in the kernel suite. Looks like this was lost accidentally?
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
11:24 PM Revision 1ec70aa0 (ceph): qa: add a workunit for fsync-tester
It turns out that our suites don't exercise fsync, at least not very much
(I couldn't find it in all the places I loo...
Greg Farnum
10:32 PM Revision 79db5a40 (ceph): Merge branch 'wip_watch' into next
Sage Weil
10:20 PM Revision 641b077f (ceph): mkcephfs: fix == -> =
Another bashism.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
10:17 PM Revision a7de975d (ceph): lockdep: Decrease lockdep backtrace skip by 1
Skipping the top 4 (it starts at 0) calls in the
backtrace actually skips the call that does the lock.
Skip 3 instead...
Sam Lang
09:58 PM Revision bf01b7b2 (ceph): map-unmap.sh: use udevadm settle for synchronization
This script was heuristically using short sleep commands in order to
give udev activity time to complete.
There's a ...
Alex Elder
09:51 PM Revision c728171b (ceph): Merge branch 'wip-upstart' into next
Reviewed-by: Greg Farnum <greg@inktank.com> Sage Weil
09:49 PM Revision e597482f (ceph): upstart: only start when 'upstart' file exists in daemon dir
We need to distinguish between daemons managed by upstart and sysvinit
(and, eventually, systemd). Only start daemon...
Sage Weil
09:49 PM Revision 96f40b14 (ceph): upstart: make starter jobs consistent
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:49 PM Revision 02aca683 (ceph): ceph-disk-activate: mark dir as upstart-managed
Mark the directory so that upstart will manage the daemon. Eventually,
this should be generalized to allow ceph-disk...
Sage Weil
09:38 PM Revision 6ab7db67 (ceph): ReplicatedPG: use default priority for Backfill messages
Backfill messages modify the stats on the replica and therefore
must be sent with the same priority as sub_op_modify ...
Samuel Just
09:38 PM Revision 7e133569 (ceph): ReplicatedPG: do not use priority from client op
There are internal ordering requirements which may be sensitive
to assigned priority. We don't want a mix of priorit...
Samuel Just
07:15 PM rbd Bug #3611: rbd.py: segfault with many snapshots
It looks like the op in the objecter has been corrupted, similar to #3613. In this case, op->objver ends up pointing ... Josh Durgin
07:00 PM Revision b63940ca (ceph): Merge branch 'wip-3610' into next
Sam Lang
06:31 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
I made some commits to wip-3625, which resolve the EEXIST, but now the test returns an EIO... Sam Lang
03:31 PM CephFS Bug #3625 (Resolved): client: EEXIST error on multiple clients to create
Discovered with IOR shared file test on ceph-fuse, if multiple clients attempt to create a file at the same time (do ... Sam Lang
05:53 PM Revision 8d73f3e9 (ceph): Fix comment in sample.ceph.conf
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
04:21 PM Revision 9f051024 (ceph): crush-map.rst: add info about multiple crush heirarchies
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
04:07 PM rbd Bug #3589 (Resolved): rbd.py should check for method existence before calling new methods
Josh Durgin
03:51 PM CephFS Feature #3626 (Resolved): mds: debug mode to generate traceless replies to clients
Sage Weil
03:50 PM Bug #3616: osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
Sage Weil
03:50 PM Bug #3221: disconnect_session_watchers missing pg
Sage Weil
03:50 PM Bug #2954: osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/25385282 bytes.
Sage Weil
11:08 AM Bug #2954 (In Progress): osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/2538...
recent log: ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13809... Tamilarasi muthamizhan
12:43 PM Bug #3602 (Resolved): ceph-fuse crashed when client tries to ceph-fuse mount without keyring
The fix was cherry-picked into wip-stable-fuse and merged to stable. Sam Lang
11:15 AM Bug #3623: libcephfs Caps.ReadZero lockdep weirdness
The fix for this has been merged to next, so we should stop seeing these in qa. Sam Lang
10:46 AM Bug #3623: libcephfs Caps.ReadZero lockdep weirdness
recent log: ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13713... Tamilarasi muthamizhan
11:13 AM CephFS Bug #3610 (Resolved): client: Possible lock cycle in client/objectcacher
Sam Lang
11:13 AM CephFS Bug #3610: client: Possible lock cycle in client/objectcacher
Merged wip-3610 to next. Sam Lang
10:56 AM Bug #3624 (Won't Fix): BUG: workqueue leaked lock or atomic: kworker/0:1/0x00000000/17554 last fu...
log: ubuntu@teuthology:/a/teuthology-2012-12-13_19:00:03-regression-next-testing-basic/13746... Tamilarasi muthamizhan
07:28 AM rbd Bug #2410: hung xfstest #68
This appears to be an XFS problem, where the file
system is having trouble getting space in its
journal. I inquire...
Alex Elder
07:13 AM rbd Bug #2608: rbd: hung xfstest 270
We should re-evaluate this with XFS found in newer kernels.
Maybe this should just be closed and re-opened (or open
...
Alex Elder
07:11 AM Linux kernel client Bug #2764: xfstest hang; osd socket closed messages
I'm pretty sure the "socket closed" messages are fairly
harmless, and the code that issues them should be changed
to...
Alex Elder
06:08 AM rbd Feature #3418: krbd: write path (layering)
I did a little research on this before starting on the write
path. This work will require the kernel rbd client, th...
Alex Elder
06:00 AM rbd Feature #3417: krbd: read path (layering)
Work on this really started in November 2012.
In October there were a number of cleanup tasks we agreed
should ge...
Alex Elder
03:49 AM Revision f16e5717 (ceph): client: Add config option to inject sleep for tick
Testing the tick delay with a fork/suspend is causing
corruption in the lockdep code. This approach uses
a config op...
Sam Lang
01:43 AM Revision 8cf367cb (ceph): rbd.py: check for new librbd methods before use
This way attempting to use format 2 images works when you upgrade the
python bindings before librbd, and attempting t...
Josh Durgin
12:27 AM Revision c9894ff0 (ceph): osd: up != acting okay on mkpg
This can happen when:
- mon sends create pg
- it gets created
- osd remaps the pg to a different osd
but osd...
Sage Weil

12/13/2012

11:48 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Attached files as requested.
Compare was stopped early to save on file size.
Matt Anderson
11:38 PM Revision e3ed28eb (ceph): mon: OSDMonitor: don't allow creation of pools with > 65535 pgs
There are some limitations to the number of possible pg's per pool, and
by allowing the 'osd pool create' command to ...
Joao Eduardo Luis
10:46 PM Revision 8103414a (ceph): rbd: handle images disappearing while in ls -l
rbd.list() returns a list of names, but nothing stops them from
going away before rbd.open(); check for ENOENT and ig...
Dan Mick
10:23 PM Bug #3623 (Duplicate): libcephfs Caps.ReadZero lockdep weirdness
nm, dup of #3610 Sage Weil
10:05 PM Bug #3623 (Duplicate): libcephfs Caps.ReadZero lockdep weirdness
... Sage Weil
10:05 PM Revision 24523913 (ceph): rgw_op: enforce minimum part size in multi-part uploads
Signed-off-by: caleb miles <caleb.miles@inktank.com> caleb miles
08:48 PM Revision aa2214c3 (ceph): mds: document EXCL -> (MIX or SYNC) transition decision
Previously (in 26f6a8e48ae575f17c850e28e969d55bceefbc0f), for reasons that
are somewhat obscured by passage of time,...
Sage Weil
07:54 PM Bug #2843 (Can't reproduce): filestore: replay failure on xfs
Sage Weil
07:53 PM Bug #2819 (Won't Fix): krbd: lockup on large writes, msgr fault injection
Sage Weil
07:49 PM CephFS Bug #3610 (Fix Under Review): client: Possible lock cycle in client/objectcacher
This appears to be related to the fork() in the caps.cc test messing up data structures in the lockdep code when its ... Sam Lang
06:52 PM Revision f2c083ef (ceph): OSD: disconnect_session_watches obc might not be valid after we relock
If disconnect_session_watches races with watch removal, the session
might no longer have a valid obc ref. In that ca...
Samuel Just
06:52 PM Revision 97cc55d5 (ceph): OSD: put connection in disconnect_session_watches as well as the session
obc->watchers now has a ref to the connection as well. This piece of
disconnect_session_watchers essentially paralle...
Samuel Just
06:39 PM Revision c17d628b (ceph): clarify/correct some of sample.ceph.conf
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
05:24 PM Bug #3615: Reproducible OSD crash when recovering the journal
Samuel was kind enough to clarify a bit on IRC: I should be looking for a pginfo/directory named 6.1bc7, since they'r... Faidon Liambotis
04:34 PM Bug #3615: Reproducible OSD crash when recovering the journal
Thanks, I figured as much when you mentioned the existence of that file.
I still think it's a bug though: I believ...
Faidon Liambotis
02:06 PM Bug #3615 (Rejected): Reproducible OSD crash when recovering the journal
Running with nobarrier could explain this error if the pginfo file link wasn't flushed when the sync finished. Our u... Samuel Just
11:43 AM Bug #3615: Reproducible OSD crash when recovering the journal
Thanks for the quick reply. I couldn't find a 6.7111 pg. Am I doing something wrong?
# find /var/lib/ceph/osd/ceph...
Faidon Liambotis
10:54 AM Bug #3615 (Need More Info): Reproducible OSD crash when recovering the journal
on pg 6.7111 the info appears corrupt somehow. can you attach the contents of the meta/.../pginfo file for 6.7111 on... Sage Weil
09:15 AM Bug #3615 (Rejected): Reproducible OSD crash when recovering the journal
Hi,
After an abrupt powercycle of one of the servers, one of the OSDs has trouble booting up. It seems to be getti...
Faidon Liambotis
05:03 PM rbd Bug #3611: rbd.py: segfault with many snapshots
Finally got a backtrace. It seems something is overwriting a Mutex::Locker on the stack:... Josh Durgin
04:41 PM Bug #3616: osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
It has happened quite a few times, but I'm afraid it happens at random when adding new OSDs, so I'd have to run with ... Faidon Liambotis
01:52 PM Bug #3616: osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
This is probably the crash which characterizes the handle_watch_timeout-push race which wip_watch should take care of... Samuel Just
01:20 PM Bug #3616 (Need More Info): osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
Is this something you can reproduce reliably? If so, that would be extremely helpful. For example:
ceph osd tel...
Sage Weil
09:27 AM Bug #3616 (Resolved): osd/ReplicatedPG.cc: 4534: FAILED assert(!missing.is_missing(soid))
Hi,
On a test cluster of ours with Ceph 0.55, we often see OSDs crash due to an assert() call when we add them and...
Faidon Liambotis
04:30 PM Revision 83ee85b8 (ceph): Merge remote branch 'origin/next'
Josh Durgin
04:29 PM rgw Documentation #2483 (Resolved): doc: radosgw api diffs to swift
Sage Weil
01:03 PM rgw Documentation #2483: doc: radosgw api diffs to swift
Done, at commit:6a8a58dc4b71df6d291d67ddad0b5667289d6d3b. Yehuda Sadeh
04:29 PM Revision e6dd0681 (ceph): qa: echo commands run by rbd map-unmap workunit
It's hard to figure out what failed without this.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
04:28 PM Feature #2727 (Resolved): filestore: add split
Sage Weil
04:28 PM Feature #2728 (Resolved): OSD: handle split
Sage Weil
04:27 PM Bug #3614 (Resolved): osd: mkpg 92.5 up [1] != acting [1,0]" in cluster log
Sage Weil
04:18 PM Bug #3614 (Fix Under Review): osd: mkpg 92.5 up [1] != acting [1,0]" in cluster log
Sage Weil
08:26 AM Bug #3614: osd: mkpg 92.5 up [1] != acting [1,0]" in cluster log
np Ian Colle
04:16 PM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
Joao wrote a monitor patch that prohibits counts greater than 65535, and it's merged in. I've created #3622 to deal w... Greg Farnum
11:51 AM Bug #3617 (In Progress): Ceph doesn't support > 65536 PGs(?) and fails silently
Joao Eduardo Luis
09:40 AM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
Hi,
While playing with a test cluster and trying to size it according to production needs & future growth, we deci...
Faidon Liambotis
04:16 PM Feature #3622 (Rejected): RADOS pools should support more than 65535 PGs
That's a pretty low limit for a single-pool cluster, since we find users need ~100 PGs/OSD to get good data distribut... Greg Farnum
03:53 PM CephFS Feature #3568 (Resolved): client: Allow hold_caps_until to be configured
Sage Weil
03:49 PM CephFS Feature #3621 (Closed): qa: add knfsd reexport tests to qa suite
Sage Weil
03:46 PM CephFS Cleanup #3423 (Resolved): Install java libraries into the correct directory
Sage Weil
03:41 PM rgw Bug #3620 (Resolved): rgw:improve multiple user access keys scalability
One solution would be: instead of keeping the complete user data in the key index, it should just point at the uid in... Yehuda Sadeh
03:28 PM Bug #3618: 0.55.1 ceph-mon crashes under heavy load
This is fixed on next, but the commit didn't reach 0.55.1. Joao Eduardo Luis
03:16 PM Bug #3618 (Duplicate): 0.55.1 ceph-mon crashes under heavy load
This bug is a duplicate of #3495. Joao Eduardo Luis
12:43 PM Bug #3618 (Duplicate): 0.55.1 ceph-mon crashes under heavy load
Here is the tail of the monitor log: https://pastee.org/e9347
It core dumped:
https://pastee.org/j4maf
Matthew Via
03:28 PM Bug #3495 (Resolved): ceph-mon crash
Sorry, jumped the gun here. Thought the fix was on 0.55.1 but it is not. It's on next though. Joao Eduardo Luis
03:18 PM Bug #3495 (In Progress): ceph-mon crash
This bug occurred again on 0.55.1, as described on bug #3618. Re-opening as it appears that there's still something e... Joao Eduardo Luis
03:07 PM Feature #3000 (Resolved): osd: balance recovery vs client io
Sage Weil
03:02 PM Bug #3592 (Resolved): Assert (oinfo.last_epoch_started == info.last_epoch_started)
Samuel Just
01:31 PM rbd Bug #3619 (Resolved): librbd: read_iterate sparse behavior broken
Instead of getting a NULL for a hole, we get a zeroed buffer.
Reported on the ML
Sage Weil
01:28 PM CephFS Bug #3559 (Resolved): mds: not issuing RDCACHE to exclusive client for some files
Thought more about it, and I think it's right. Committed something to master that describes the logic in a big comment. Sage Weil
10:11 AM CephFS Bug #3559 (In Progress): mds: not issuing RDCACHE to exclusive client for some files
After discussion this apparently needs a bit more thought. Greg Farnum
10:04 AM CephFS Bug #3559 (Resolved): mds: not issuing RDCACHE to exclusive client for some files
Sage Weil
01:26 PM Bug #3221 (In Progress): disconnect_session_watchers missing pg
Sage Weil
01:00 PM rgw Documentation #2991 (Resolved): doc: expand/complete RGW Swift API reference
Yehuda Sadeh
12:54 PM rgw Bug #3297 (Rejected): Rados Gateway does not handle Transfer-Encoding: chunked
Yehuda Sadeh
12:43 PM Bug #3613: Objecter::scan_requests crash
I don't see any evidence pointing to the recent objecter changes yet. The op->ops vector seems to be invalid though. Josh Durgin
11:15 AM Linux kernel client Cleanup #3583 (Rejected): Convert tabs to spaces
yeah this is a no-no for kernel. if you touch a line of code, fix whitespace, otherwise let it be. Sage Weil
11:14 AM Linux kernel client Bug #2429: ceph-client: verify_authrizer_reply con method never called
Sage Weil
10:13 AM Bug #2683 (Can't reproduce): ceph-fuse: crash during fsstress
We believe this was fixed in the ObjectCacher code changes over the last several months. Greg Farnum
10:04 AM Bug #2683 (Resolved): ceph-fuse: crash during fsstress
Sage Weil
06:01 AM Revision 975003bf (ceph): auth: guard decode_decrypt with try block
This will catch buffer decoding errors (maybe the block is empty) and
return an error string.
May fix (or possibly p...
Sage Weil
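The guard described in the commit message above (catch buffer decoding errors and return an error string) can be sketched roughly as follows. This is an illustration under stated assumptions, not the Ceph implementation: `decode_payload`, `guarded_decode`, and the length-prefixed wire format are hypothetical and were chosen only to give the decoder something that can fail on an empty or truncated buffer.

```cpp
#include <cstdint>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical decoder: expects a 4-byte little-endian length prefix followed
// by that many payload bytes, and throws if the buffer is too short.
std::string decode_payload(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < 4) {
        throw std::runtime_error("buffer too short for length prefix");
    }
    const std::uint32_t len = static_cast<std::uint32_t>(buf[0]) |
                              (static_cast<std::uint32_t>(buf[1]) << 8) |
                              (static_cast<std::uint32_t>(buf[2]) << 16) |
                              (static_cast<std::uint32_t>(buf[3]) << 24);
    if (buf.size() - 4 < len) {
        throw std::runtime_error("buffer shorter than declared length");
    }
    return std::string(buf.begin() + 4, buf.begin() + 4 + len);
}

// Guarded wrapper: on success returns true and fills `out`; on a decoding
// error returns false and fills `error` instead of letting the exception
// propagate to the caller.
bool guarded_decode(const std::vector<std::uint8_t>& buf, std::string& out,
                    std::string& error) {
    try {
        out = decode_payload(buf);
        return true;
    } catch (const std::exception& e) {
        error = std::string("decode failed: ") + e.what();
        return false;
    }
}
```

The point of the wrapper is that a malformed or empty input becomes an ordinary error result the caller can log or reject, rather than an uncaught exception that terminates the process.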
05:14 AM Revision ae100cfd (ceph): mount.fuse.ceph: add ceph-fuse mount helper
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:14 AM Revision 448db479 (ceph): mount.fuse.ceph: strip out noauto option
mount -a uses this, but also passes it to mount.fuse.ceph, and libceph
complains:
fuse: unknown option `noauto'
Sig...
Sage Weil
03:40 AM Revision ac92e4d6 (ceph): /etc/init.d/ceph: fs_type assignment syntax error
This handles the remainder of 3581; it's a lot like the problem in
mkcephfs, but it isn't mkcephfs.
Fixes: #3581
Sig...
Dan Mick
01:17 AM Revision 4605fddc (ceph): filestore: Don't keep checking for syncfs if found
Valgrind outputs a warning for unrecognized system calls,
and does so for the syscall(__SYS_syncfs,...) and
syscall(_...
Sam Lang
12:24 AM Revision 8e25c8d9 (ceph): v0.55.1
Gary Lowell

12/12/2012

11:30 PM Revision dba09607 (ceph): OSD: pg might be removed during disconnect_session_watches
We don't hold the osd_lock between the session->watches traversal
and the obc checks.
Signed-off-by: Samuel Just <sa...
Samuel Just
11:30 PM Revision 047aecd9 (ceph): PG,ReplicatedPG: handle_watch_timeout must not write during scrub/degraded
Currently, handle_watch_timeout will gladly write to an object while
that object is degraded or is being scrubbed. N...
Samuel Just
10:51 PM Revision 0dfe6c84 (ceph): ReplicatedPG:, remove_notify, put session after con
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
10:50 PM Revision 695bb3b0 (ceph): ReplicatedPG: only put if we cancel evt in unregister_unconnected_watcher
If we fail to cancel the callback, the callback will fire and
release those resources.
Signed-off-by: Samuel Just <s...
Samuel Just
10:50 PM Revision fdf66b6a (ceph): ReplicatedPG: watchers must grab Connection ref as well
Session refs are not really valid on their own, the
corresponding Connection must remain live for at least
as long as...
Samuel Just
10:38 PM Revision 5f55b388 (ceph): doc: Updated per comments in the mailing list.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:04 PM Bug #3459 (Resolved): osd crash in CephXAuthorizer::verify_reply
other bug is #3414, but it doesn't appear related.
going to merge this change in.
Sage Weil
10:02 PM Bug #3459: osd crash in CephXAuthorizer::verify_reply
these all appear to be unrelated. i had broken tests in my local teuthology repo, or they were other bugs.
except ...
Sage Weil
11:44 AM Bug #3459: osd crash in CephXAuthorizer::verify_reply
The patch looks fine on its face but several tests in the suite failed. I need to track down if they're familiar erro... Greg Farnum
10:03 PM Bug #3614 (Resolved): osd: mkpg 92.5 up [1] != acting [1,0]" in cluster log
/var/lib/teuthworker/archive/sage-2012-12-11_19:47:13-rados-wip-3459-testing-basic/11975
Sage Weil
09:49 PM Revision 9d714560 (ceph): docs: better documentation of new rgw feature
Document rgw_extended_http_attrs config option.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
09:45 PM Revision 87087245 (ceph): rgw: option to provide alternative s3 put obj success code
Fixes: #3529
Added a new option: rgw_s3_success_create_obj_status.
Expected values are 0, 200, 201, 204. A value of 0...
Yehuda Sadeh
09:45 PM Revision 3a95d976 (ceph): rgw: configurable list of object attributes
Fixes: #3535
New object attributes are now configurable. A list
can be specified via the 'rgw extended http attrs'
co...
Yehuda Sadeh
09:08 PM Revision bece012c (ceph): doc: document swift compatibility
Add a table that specifies swift features compatibility
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
09:08 PM Revision 88229a49 (ceph): docs: add rgw POST object as supported feature
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
09:00 PM CephFS Bug #2288: libcephfs: setxattr returns EEXIST following removexattr
Ah, I wasn't aware of this bug. The commit you mentioned is 323a52ee909621ed0169b86e158370394ba36f62. It makes remo... Sam Lang
06:40 PM CephFS Bug #2288: libcephfs: setxattr returns EEXIST following removexattr
SamL did some stuff involving projected xattrs; was this problem included in that set of changes or is it more compli... Greg Farnum
07:43 PM Bug #3581 (Resolved): init script errors after upgrade from 0.48 to 0.55
above fix in
commit:ac92e4d6bd453ffc77e88ab3ec2d2015b70ba854
Dan Mick
07:35 PM Bug #3581: init script errors after upgrade from 0.48 to 0.55
I think this still needs... Dan Mick
07:33 PM Bug #3581: init script errors after upgrade from 0.48 to 0.55
I was mistaken about the commit above; that fixes the issue for mkcephfs, but not for init-ceph.in. That bug still e... Dan Mick
06:38 PM CephFS Bug #3369 (Resolved): journaled two client session close events
Sage merged that workaround a long time ago, and I think Zheng's recent patches might have fixed some potential root ... Greg Farnum
05:35 PM Bug #3613 (Resolved): Objecter::scan_requests crash
One of the failures in #3459 was due to testrados_watch_notify crashing:... Greg Farnum
05:21 PM Bug #3221: disconnect_session_watchers missing pg
This has popped up again.
I think the easy fix is to make the map reference pg's and obc's by name, and do the loo...
Sage Weil
05:20 PM Bug #3612 (Duplicate): disconnect_session_watches assert(pg) failed
dup of #3221 Sage Weil
03:12 PM Bug #3612: disconnect_session_watches assert(pg) failed
The assert is bad, no guarantee that the pg is still around after we drop the watch_lock. remove_watchers_and_notifi... Samuel Just
03:09 PM Bug #3612 (Duplicate): disconnect_session_watches assert(pg) failed
... Greg Farnum
04:13 PM rbd Bug #3611: rbd.py: segfault with many snapshots
Downgrading priority since this isn't an actual bug. Josh Durgin
03:41 PM rbd Bug #3611: rbd.py: segfault with many snapshots
Without lockdep, I could not reproduce a crash.
Running with lockdep enabled results in this backtrace:...
Josh Durgin
11:52 AM rbd Bug #3611 (Resolved): rbd.py: segfault with many snapshots
From the nightly python api tests, test_many_snaps failed with a segfault in all runs.
Logs are in:...
Josh Durgin
03:35 PM Bug #3603: osd/msgr: mutex assert failure in try_get_pipe
Yep, this definitely looks to be that bug. Sam has a branch he's testing at wip_watch. Greg Farnum
02:21 PM Bug #3603 (In Progress): osd/msgr: mutex assert failure in try_get_pipe
Well, I dug into this a bit despite the lack of messenger logging, but Sam is pretty sure this is related to a ref-co... Greg Farnum
02:00 PM rgw Bug #3453 (Resolved): rgw: Resume download fails because of mismatched ETags that should match
Yehuda Sadeh
01:45 PM rgw Feature #3535 (Resolved): rgw: configurable list of http attributes
Yehuda Sadeh
01:39 PM rgw Feature #3529 (Resolved): rgw: configurable success status response for put obj
Yehuda Sadeh
12:23 PM Bug #3606: Current -next causes failed assert in ceph-osd
Took an alternative route to making the packages, and the commit does in fact resolve the issue. Sorry for the trouble. Matthew Via
09:29 AM Bug #3606: Current -next causes failed assert in ceph-osd
I've applied the most recent set of patches, and still get the same crash: https://pastee.org/n8bnq
I suppose I co...
Matthew Via
10:31 AM Bug #3609: mon: track down the Monitor's memory consuption sources
Are we failing to clean up forwarded messages? That would certainly be a pretty bad leak that we need to deal with as... Greg Farnum
10:04 AM Bug #3609: mon: track down the Monitor's memory consuption sources
Looks like it makes sense that mon.c has a bigger memory consumption, from what the profile indicates. Apparently, mos... Joao Eduardo Luis
09:13 AM Bug #3609 (Resolved): mon: track down the Monitor's memory consuption sources
Left a couple of monitors running overnight with heap profiling active.
Don't have the logs, as I wasn't expecting...
Joao Eduardo Luis
09:56 AM Bug #3608: librbd rbd_remove returns wrong errno when image does not exist
EBUSY is returned from rbd_remove when there are still watchers on the header (i.e. the image is still open, or a pre... Josh Durgin
02:04 AM Bug #3608 (Can't reproduce): librbd rbd_remove returns wrong errno when image does not exist
I'm currently working with librbd and noticed that rbd_remove returns an errno of -16 (EBUSY) when the specified imag... Corin Langosch
09:20 AM CephFS Bug #3610 (Resolved): client: Possible lock cycle in client/objectcacher
Teuthology reported this lock cycle while running test_libcephfs. It was triggered by the Caps.ReadZero test.
For...
Sam Lang
08:14 AM Bug #3607: FileStore::_write conditional code for HAVE_SYNC_FILE_RANGE seems wrong
the if needs to go in the #ifdef, yeah.. but the way it's split across the if () guts is messy. can you just clean i... Sage Weil
08:07 AM Bug #3607: FileStore::_write conditional code for HAVE_SYNC_FILE_RANGE seems wrong
Tracked this down to https://github.com/ceph/ceph/commit/1477ec73e354972664424b3d98d78a20c24a2ff4 but I still can't r... Gerben Meijer
06:27 AM Revision 64cefe2c (ceph): PG,ReplicatedPG: move write_blocked_by_scrub logic into a helper
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
04:46 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
That's a once-or-twice-in-a-lifetime thing. Shouldn't happen often. As Sage pointed out before, there are ways to de... Joao Eduardo Luis
01:49 AM Revision 54618afa (ceph): docs: fix spacing in radosgw config-ref
Needed to add an extra empty line between header and properties.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
01:18 AM Revision 8e6a5353 (ceph): qa: exclude some more xfstests
These worked on a newer kernel, but I forgot I had not updated it for the final image.
Signed-off-by: Josh Durgin <j...
Josh Durgin
01:16 AM Revision 84f90a09 (ceph): Merge branch 'next'
Sage Weil
01:15 AM Revision caea0cbf (ceph): os/JournalingObjectStore: un-break op quiescing during journal replay
Commit d9dce4e9273adb4279519d65a0d8bfdfecb5c516 broke journal replay
because the commit thread may try to do a commit...
Sage Weil
01:07 AM Revision 6a8a58dc (ceph): doc: document swift compatibility
Add a table that specifies swift features compatibility
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
01:07 AM Revision cf28e787 (ceph): docs: add rgw POST object as supported feature
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Yehuda Sadeh
12:47 AM Revision f0d85b73 (ceph): Merge branch 'next'
Josh Durgin
12:39 AM Revision 326dd347 (ceph): Merge remote branch 'origin/wip-double-notify' into next
Reviewed-by: Sage Weil <sage.weil@inktank.com> Josh Durgin

12/11/2012

11:44 PM Revision 39501822 (ceph): st_rados_watch: tolerate extra notifies
With retries, it's possible for notifies to be received more than once
when they are resent to different OSDs, since ...
Josh Durgin
11:11 PM Revision f2dbe5ed (ceph): CephManager: add ability to test split
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
11:08 PM Revision 29307d3b (ceph): mds: shutdown cleanly if can't authenticate
Fixes: #3590
This was triggered when trying to run mds with cephx enabled
against a mon without cephx support. We didn...
Yehuda Sadeh
11:07 PM Revision 6007088c (ceph): Merge remote-tracking branch 'gh/wip-conf' into next
Reviewed-by: Greg Farnum <greg@inktank.com> Sage Weil
10:07 PM Revision 0cd84b3a (ceph): New ssh task that adds keys for node -> node ssh.
This generates a new keypair, pushes it to all nodes
in the context and adds all hosts to all other hosts
.ssh/author...
Joe Buck
10:07 PM Revision 0890d48a (ceph): Adding a Hadoop task.
This task configures and starts a Hadoop cluster.
It does not run any jobs, that must be done after
this task runs.
C...
Joe Buck
10:07 PM Revision b916f679 (ceph): pexec.py: Parse out role ID from the back.
Also, do not assume that the command needs to run from a specific directory.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Joe Buck
08:38 PM Bug #3607 (Resolved): FileStore::_write conditional code for HAVE_SYNC_FILE_RANGE seems wrong
I won't claim to fully understand this, but it looks wrong:... Dan Mick
07:46 PM Bug #3606: Current -next causes failed assert in ceph-osd
I built the package with the provided specfile, but in such a way that it used a .55 source checkout and patched with... Matthew Via
07:35 PM Bug #3606 (Resolved): Current -next causes failed assert in ceph-osd
this was just fixed in next a couple hours ago, commit:caea0cbf9f63d74506d69a596dd3f78097d68da5
by the way, the du...
Sage Weil
06:49 PM Bug #3606 (Resolved): Current -next causes failed assert in ceph-osd
After making packages from 0.55 + the contents of the next branch, one (only one of 8) OSD will not start, and produc... Matthew Via
07:13 PM Revision c3107009 (ceph): objecter: don't use new tid when retrying notifies
Watches update the on-disk state in the OSD, and aren't idempotent,
so refreshing them must be treated as a separate ...
Josh Durgin
06:08 PM Revision dfd31036 (ceph): client: Fix for #3184 cfuse segv with no keyring
Fixes bug #3184 where the ceph-fuse client segfaults if authx is
enabled but no keyring file is present. This was du...
Sam Lang
06:08 PM Revision acebcce9 (ceph): mon: Monitor: resolve keyring option to a file before loading keyring
Otherwise our keyring default location, or any other similarly formatted
location, will be taken as the actual locati...
Joao Eduardo Luis
06:04 PM Bug #3459 (Fix Under Review): osd crash in CephXAuthorizer::verify_reply
the try/catch may be treating the symptom, but it's definitely correct, and the binary for the qa run is long gone so... Sage Weil
05:59 PM Bug #3459: osd crash in CephXAuthorizer::verify_reply
wth, i could have sworn i pushed something that added a try/catch block around the decode, but now i don't see it. p... Sage Weil
05:19 PM Bug #3507 (Resolved): rados api system tests failure
commit:c3107009f66bc06b5e14c465142e14120f9a4412 Josh Durgin
04:54 PM Bug #3507 (Fix Under Review): rados api system tests failure
wip-double-notify Josh Durgin
05:06 PM Revision 9a40ef01 (ceph): mds: fix journaling issue regarding rstat accounting
Rename operation can call predirty_journal_parents() several times.
So a directory fragment's rstat can also be modif...
Yan, Zheng
05:04 PM rbd Bug #3413: rbd bench-write fails with assert when rbd caching turned on
Josh Durgin
04:58 PM rbd Bug #3589 (Fix Under Review): rbd.py should check for method existence before calling new methods
wip-rbdpy-compat Josh Durgin
04:53 PM rbd Feature #2568 (Resolved): qa: run xfstests on qemu+rbd
Josh Durgin
04:50 PM rgw Bug #3557 (Resolved): rgw: error reading user info after creating subusers
Fixed, commit:0639cd9c479d69b077175f0385eb569ebb839349. Yehuda Sadeh
04:44 PM Revision b9d717cd (ceph): fix build of unittest_formatter
Add CRYPTO_CXXFLAGS to unittest_formatter_CXXFLAGS to find pk11pub.h to
be included in src/common/ceph_crypto.h.
Sig...
Danny Al-Gaaf
04:44 PM Revision be372765 (ceph): include/atomic.h: add stdlib.h for size_t
Include missing stdlib.h needed for size_t.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf
03:34 PM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
The hang on selfmanaged_snap_create seems like a monitor issue; bouncing the monitor, or hanging out in gdb long enou... Dan Mick
03:17 PM Bug #3590 (Resolved): mds segfault at MonClient::wait_auth_rotating during upgrade
Fixed, commit:29307d3b32e523da7f25a6c3bc904749790b1e18. Fix would just make it exit gracefully. Yehuda Sadeh
11:19 AM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
mon is running argonaut, mds is running 0.55+. The problem is that mds doesn't handle gracefully a case where mon rej... Yehuda Sadeh
02:10 PM CephFS Bug #3370 (Resolved): All nfsd hung trying to lock page(s) on export of kclient ceph
commit:2978257c56935878f8a756c6cb169b569e99bb91 David Zafman
11:34 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
Real world example of needing to change ip numbers....
john roman john.roman@dreamhost.com via hq.newdream.net...
Anonymous
09:28 AM CephFS Bug #3597 (Can't reproduce): ceph-fuse: denying root access
I don't see this behavior with fuse 2.9.0 and latest ceph. Does it happen only on some files? What are the permissi... Sam Lang
06:00 AM Revision bcf1461c (ceph): Merge remote-tracking branch 'upstream/wip_split2' into next
Reviewed-by: Greg Farnum <greg@inktank.com> Samuel Just
05:01 AM Subtask #3605 (Resolved): mon: print lookup path when reporting -ENOENT to user-space
See parent task. Joao Eduardo Luis
04:59 AM Feature #3604 (Resolved): print lookup path when reporting -ENOENT to user-space
With the advent of 0.55, some users mentioned that the error message is not helpful when the keyring file is not foun... Joao Eduardo Luis
03:03 AM Revision 1699b7dc (ceph): OSD: get_or_create_pg doesn't need an op passed in
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
02:07 AM Bug #3603 (Resolved): osd/msgr: mutex assert failure in try_get_pipe
This happened on the next branch with a client side objecter change I was testing with 1 osd using vstart and test_st... Josh Durgin
01:45 AM Revision 6a4fa89a (ceph): LFNIndex: fix move_subdir comments
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:40 AM Revision fdcdca7d (ceph): HashIndex: fix typo in reset_attr documentation
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:39 AM Revision 7eac9682 (ceph): HashIndex: init exists in col_split_level and reset_attr
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:31 AM Revision 12673c24 (ceph): PrioritizedQueue: increment ret when removing items from list
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:30 AM Revision 80cca214 (ceph): PrioritizedQueue: move if check out of loop in filter_list_pairs
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:08 AM Revision 331c2504 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
12:41 AM Revision a50c7d3b (ceph): config: do not always print config file missing errors
Do not generate errors each time we fail to open a config file; only
generate one at the end if a search path was spe...
Sage Weil

12/10/2012

10:44 PM Revision 6fb9a558 (ceph): config: always complain about config parse errors
Complain about config parsing errors even when it is the default
config file.
We may also want to fail instead of co...
Sage Weil
10:37 PM Revision a5b9939e (ceph): ceph.conf: default to smaller recovery chunk
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Samuel Just
10:34 PM Revision e4d0aeac (ceph): Merge remote-tracking branch 'gh/wip-filestore2' into next
Reviewed-by: Sam Just <sam.just@inktank.com> Sage Weil
10:14 PM Revision 2e7cba7b (ceph): doc: fixed indent in python example.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:53 PM Revision 788992bb (ceph): config_opts.h: adjust recovery defaults
osd max backfills: 5 was too low for a default, 10
seems to work better in testing. The message
priority system sh...
Samuel Just
08:55 PM Revision 45865285 (ceph): Merge remote-tracking branch 'gh/wip-3559' into next
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
08:12 PM Bug #3602: ceph-fuse crashed when client tries to ceph-fuse mount without keyring
I tried installing ceph-fuse from the next branch, and I don't see any segfault.
ubuntu@burnupi62:~$ sudo ceph-fuse -v
...
Tamilarasi muthamizhan
05:45 PM Bug #3602: ceph-fuse crashed when client tries to ceph-fuse mount without keyring
Did you check for similar bugs? Not certain but I believe this has been fixed in a dev release since then. (Maybe sho... Greg Farnum
05:43 PM Bug #3602 (Resolved): ceph-fuse crashed when client tries to ceph-fuse mount without keyring
Bug found by Ken:
upgraded argonaut cluster to bobtail [burnupi16, burnupi17, burnupi18] and the client[burnupi62]...
Tamilarasi muthamizhan
08:06 PM Revision f7b26958 (ceph): Merge branch 'next'
Josh Durgin
08:06 PM Revision 1bdd5c3b (ceph): Fix qemu options for xfstests
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
07:57 PM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
Hm #2. Both runs are deadlocked on the mutex IoCtxImpl::selfmanaged_snap_create::mylock. Will look for owner next. Dan Mick
07:23 PM rbd Bug #3600: rbd: assert in objectcacher destructor after flatten
Looking at the code, I don't really see a clean "stop caching" mechanism. While I look, hacked fsx to do only write/... Dan Mick
11:40 AM rbd Bug #3600 (Duplicate): rbd: assert in objectcacher destructor after flatten
From ubuntu@teuthology:/a/teuthology-2012-12-09_19:00:03-regression-master-testing-gcov/10977:... Josh Durgin
07:52 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
I was able to reproduce this issue on burnupi06[running mds], burnupi07[osds], burnupi08[mon] cluster with debugs on.... Tamilarasi muthamizhan
04:47 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
I am planning to reproduce it with debug on and show it to Greg Tamilarasi muthamizhan
04:44 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
Not from my end. I'm not working at Inktank, today. I'll be in tomorrow and I hope to make progress on it then.
Anonymous
04:12 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
Anything new here? Sage Weil
06:51 PM Revision f4be3c8d (ceph): doc: Added sudo to ceph -k command.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
06:24 PM Revision 37095195 (ceph): doc: Fixed typo.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
06:19 PM Revision 47c81a3b (ceph): Makefile.am: add missing flags to some tests targets
adding CRYPTO_CXXFLAGS to some targets. This is required when
building --with-nss.
Signed-off-by: Yehuda Sadeh <yehu...
Yehuda Sadeh
04:49 PM Bug #3593: MDS crash in MDCache.cc _recovered()
Both of these logs show a respawn because the MDS got removed from the map (generally, for not heartbeating). That ag... Greg Farnum
04:31 PM Bug #3593: MDS crash in MDCache.cc _recovered()
Here is a piece of log from an mds dying with ms=1 and mds=20: https://pastee.org/hf3cy
Here's another from another ...
Matthew Via
11:03 AM Bug #3593: MDS crash in MDCache.cc _recovered()
Oh, I see. It does indeed start with a bunch of broken pipe messages between the MDS and monitor. Can you post more o... Greg Farnum
10:59 AM Bug #3593: MDS crash in MDCache.cc _recovered()
At the moment I can't do any more to debug this (until this evening), but I have set the beacon grace to 120 seconds ... Matthew Via
10:55 AM Bug #3593: MDS crash in MDCache.cc _recovered()
This is nothing to do with the pipe changes. It's getting ESHUTDOWN back from the OSD, not out of the local pipe — we... Greg Farnum
04:46 PM Bug #2437 (Resolved): osd: very slow during recovery
Sage Weil
04:19 PM Bug #3221: disconnect_session_watchers missing pg
Sage Weil
04:18 PM Bug #3221: disconnect_session_watchers missing pg
I think the locking here is just broken. The obc always goes away at PG reset time when it is removed from the sessi... Sage Weil
11:37 AM Bug #3221 (In Progress): disconnect_session_watchers missing pg
recent log: ubuntu@teuthology:/a/teuthology-2012-12-09_19:00:03-regression-master-testing-gcov/10911
Tamilarasi muthamizhan
02:10 PM Bug #3591: auth: could not find secret_id=0
They were running 0.55. I don't think it's failure to rotate keys, as the secret_id was consistently 0. I'd expect a ... Yehuda Sadeh
01:56 PM Bug #3591: auth: could not find secret_id=0
That i can't remember.. but that message shouldn't come up on the server ever unless it is failing to rotate its keys... Sage Weil
01:44 PM Bug #3591: auth: could not find secret_id=0
Did you see it happening with anything other than the kernel clients? Yehuda Sadeh
01:35 PM Bug #3591: auth: could not find secret_id=0
Have seen this pop up in several places.
I bet we can find it with 'debug auth = 0/20' (so that it is all logged i...
Sage Weil
01:57 PM Bug #3599 (In Progress): mkcephfs should fail out when ceph.conf has an error
wip-conf will spam stderr unconditionally about parse errors. There might be other fallout, though, and the behavior... Sage Weil
11:37 AM Bug #3599: mkcephfs should fail out when ceph.conf has an error
I think the "right" fix here would be for things parsing the conf to generate something on stderr if they see the syn... Sage Weil
11:24 AM Bug #3599: mkcephfs should fail out when ceph.conf has an error
mkcephfs is a terrible, terrible thing. You're right that it should, but there are lots and lots of things that it sh... Greg Farnum
11:16 AM Bug #3599 (Resolved): mkcephfs should fail out when ceph.conf has an error
In the /etc/ceph/ceph.conf, there was a missing bracket after [mon.b , which did not get caught during mkcephfs. It ... Anonymous
01:14 PM CephFS Bug #3601 (Resolved): client: With multiple clients, file remove doesn't free up space
If two or more clients have ceph mounted and one client removes a file, the space for that file doesn't get freed on ... Sam Lang
01:02 PM CephFS Bug #3572 (Resolved): High CPU after equivalent to node very busy
commit:ed75ec2cd19b47efcd292b6e23f58e56f4c5bc34 David Zafman
12:48 PM Bug #3507: rados api system tests failure
recent log:
ubuntu@teuthology:/a/teuthology-2012-12-10_07:00:03-regression-testing-master-basic/11115
Tamilarasi muthamizhan
12:38 PM Bug #3578 (Resolved): auth: auth_client_required has not default
Sage Weil
12:35 PM devops Feature #2699 (In Progress): crowbar: change barclamp-glance to use rbd
Greg Farnum
12:34 PM devops Feature #2583 (Resolved): crowbar: change barclamp-nova to use rbd
This exists and has been posted on the ceph.com site (and shared with partners) for a while now! Issues with distribu... Greg Farnum
11:40 AM rbd Bug #3524 (Resolved): test_librbd_fsx: crash after flatten
That's a different bug. Created #3600 to track it. Josh Durgin
11:36 AM rbd Bug #3524 (In Progress): test_librbd_fsx: crash after flatten
recent log:
ubuntu@teuthology:/a/teuthology-2012-12-09_19:00:03-regression-master-testing-gcov/10977
Tamilarasi muthamizhan
11:14 AM Bug #3594 (Duplicate): MDS crash in auth code?
I believe this is a duplicate of #3593, just with a little less helpful output in the log. :) Greg Farnum
09:33 AM Bug #3594: MDS crash in auth code?
Not sure if it's in the auth code. The thread id for the previous message is different than the one that triggered th... Yehuda Sadeh
11:11 AM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Since the size isn't an issue, it'd be great if you could:

1) generate a log of qemu-img convert with 'rbd cache ...
Josh Durgin
11:05 AM Bug #3538 (Resolved): rbd fsx test causes osd attr value mismatch err
Sage Weil
11:03 AM Bug #3538: rbd fsx test causes osd attr value mismatch err
This may have been fixed by 58890cfad5f7bee933baa599a68e6c65993379d4. Watch operations weren't marked as writes, and... Samuel Just
11:05 AM Bug #2954 (Resolved): osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/2538528...
Sage Weil
11:02 AM Bug #2954: osd: scrub stat mismatch, got 18/19 objects, 14/15 clones, 22478527/25385282 bytes.
This may have been fixed by 5c8cbd28207195b094799a7bdbad0019669682a8. Samuel Just
11:04 AM CephFS Bug #3598 (Resolved): MDS should shut down cleanly on EBLACKLIST
From #3593, suicide() apparently doesn't turn the MDS off so it asserts and core dumps. Greg Farnum
09:32 AM CephFS Bug #3597: ceph-fuse: denying root access
"denying root access"? You mean root can't read the files, but other people can? Or nobody can?
Either way this is...
Greg Farnum
08:47 AM CephFS Bug #3597 (Resolved): ceph-fuse: denying root access
lxo: ceph-fuse also recently started denying root access to files that shouldn't be readable except for root superpo... Sam Lang
08:46 AM CephFS Bug #3596 (Can't reproduce): ceph-fuse: crash in mds rejoin
lxo: ceph-fuse crashes in the destructor of a non-empty xlist of a snap_realm upon mds rejoin. no quick reproducer ... Sam Lang
05:17 AM Bug #3595 (Won't Fix): ceph-osd and ceph-mds crash on Debian Squeeze
Using the official packages for 0.48 and 0.55 on Debian Squeeze always leads to a crash of the ceph-osd and ceph-mds ... Jörg Blank

12/09/2012

05:44 AM Revision 333b3f43 (ceph): mon: fix leak of pool op reply data
We pass a pointer because it is an optional argument, but we shouldn't
put the bufferlist on the heap or else we have...
Sage Weil

12/08/2012

05:32 PM Revision d9dce4e9 (ceph): filestore: simplify op quiescing
The delicate balancing with op_apply_start() and the fact that it can
block was making it very hard to determine how...
Sage Weil
05:32 PM Revision ad4158d1 (ceph): os/JournalingObjectStore: drop now-useless max_applying_seq
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:32 PM Revision a88b5849 (ceph): os/JournalingObjectStore: remove unused ops_submitting
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:32 PM Revision f66fe778 (ceph): os/JournalingObjectStore: simplify op_submitting sanity check
A list is overkill; just use a seq and make sure it increments to ensure
the op_submit_finish calls are in order.
Si...
Sage Weil
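The sequence-based ordering check described in revision f66fe778 can be sketched roughly as follows. This is a hypothetical Python illustration of the idea only, not the Ceph C++ code: the class name `SubmitOrderChecker` and its `start`/`finish` methods are assumptions made for this example.

```python
class SubmitOrderChecker:
    """Hypothetical sketch: verify that finish calls arrive in submission order."""

    def __init__(self):
        self.next_seq = 0        # sequence number handed out by the next start()
        self.last_finished = -1  # sequence number of the most recent finish()

    def start(self):
        """Hand out the next sequence number."""
        seq = self.next_seq
        self.next_seq += 1
        return seq

    def finish(self, seq):
        """Accept a finish only if it is exactly one past the previous one."""
        # A single counter stands in for a list of in-flight ops.
        expected = self.last_finished + 1
        if seq != expected:
            raise AssertionError(f"out-of-order finish: got {seq}, expected {expected}")
        self.last_finished = seq
```

A single counter is enough here because the check only needs to know which finish should come next, not which ops are outstanding.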
05:31 PM Revision d4c6a22d (ceph): rgw: document admin api web interface.
Signed-off-by: caleb miles <caleb.miles@inktank.com> caleb miles
05:24 PM Revision 25ea0696 (ceph): osd: make pool_stat_t encoding backward compatible with v0.41 and older
In particular, this is the encoding that is used in precise.
Fixes: #3212
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:18 PM Revision 81e567c9 (ceph): Merge remote-tracking branch 'gh/wip-ceph-test' into next
Sage Weil
05:17 PM Revision e227c709 (ceph): crush/CrushWrapper: do not crash if you move an item with no current home
This will let us take an existing orphan and place it somewhere.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:16 PM Revision 1acb6910 (ceph): mon: Elector: init elector before each election
Fixes: #3587
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
05:12 PM Revision 42d21937 (ceph): Merge branch 'testing' into next
Sage Weil
05:12 PM Revision f3029833 (ceph): init-ceph: =, not ==
Reported-by: v@alan.lt
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
11:29 AM Bug #3593: MDS crash in MDCache.cc _recovered()
This looks like the objecter is trying to send and getting the ESHUTDOWN error code, because the mds tries to reconne... Sam Lang
10:40 AM Bug #3593 (Can't reproduce): MDS crash in MDCache.cc _recovered()
While rsyncing to cephfs, the active mds frequently crashes. Attached is the tail of the logfile of one of them. Matthew Via
10:45 AM Bug #3594 (Duplicate): MDS crash in auth code?
In my three-mds setup, the active MDS frequently loses its active status (and sometimes crashes in the process). Att... Matthew Via
09:20 AM Bug #3587 (Resolved): mon: election doesn't finish during heavy mon thrashing
Sage Weil
09:16 AM Bug #3581: init script errors after upgrade from 0.48 to 0.55
Fixed (in next and testing branches), thanks! Sage Weil
06:32 AM Bug #3581: init script errors after upgrade from 0.48 to 0.55
Hi,
not all. After applying the patch I see the error "/etc/init.d/ceph: 303: [: btrfs: unexpected operator"
root@bacula-s...
Alan V
07:11 AM Revision 8816b39a (ceph): debian: add ceph.postinst to remove /etc/init/ceph.conf on update
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Dan Mick
06:36 AM Revision fc58299e (ceph): PG: remove last_epoch_started asserts in proc_primary_info
These asserts are valid for a uniform cluster, but they won't hold
for a replica running a version without the info.l...
Samuel Just
06:33 AM Revision 81fdea13 (ceph): auth: set default auth_client_required
Fixes: #3578
Set auth_client_required to default to "cephx, none".
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
06:33 AM Revision a3908a68 (ceph): auth: changed order of test for legacy and new authentication
Changed order of test for legacy and new configuration options
in several places.
Signed-off-by: Peter Reiher <reihe...
Peter Reiher
06:32 AM Revision 907da185 (ceph): auth: improve logging
Add some logging around failure cases.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
12:43 AM Revision f9d090ef (ceph): Merge branch 'next'
Merge of wip-rbd-export-progress
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Dan Mick
12:41 AM Revision 83557330 (ceph): rbd: use ExportContext for progress, not cerr
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
Dan Mick

12/07/2012

11:49 PM Revision 13e3e586 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
11:48 PM Revision e0761fbd (ceph): doc: Added sudo to the service start command.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:49 PM Revision 778bad12 (ceph): doc: Moved sudo to before ssh instead of before tee.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:34 PM Revision 413b5d0a (ceph): doc: inverted the steps per doc feedback.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:32 PM Revision f098cb91 (ceph): Merge branch 'next'
Merge of wip-rbd-create:
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Josh Durgin
08:51 PM Bug #3592: Assert (oinfo.last_epoch_started == info.last_epoch_started)
This was fixed in 0756052cff542ab02d653b40c37a645b395f31b3 Dan Mick
08:23 PM Bug #3592 (Resolved): Assert (oinfo.last_epoch_started == info.last_epoch_started)
Since upgrading one of my boxes to 0.55 I have been receiving this assert:
(oinfo.last_epoch_started == info.last_...
Jeff Mitchell
08:18 PM Revision 636048db (ceph): mds/locker: Add debugging for excl->mix trans
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
08:18 PM Revision 07b36992 (ceph): mds: move from EXCL to SYNC if nobody wants to write
We were moving to the MIX even if nobody wanted to write; that is not
useful, since if we only want to read SYNC will...
Sage Weil
08:18 PM Revision fa5a46c7 (ceph): test/libcephfs: Add a test for validating caps
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
08:18 PM Revision 10bf1509 (ceph): client: Add routine to get caps of file/fd
In order to properly validate the client capabilities,
we need to be able to access them from libcephfs.
Signed-off-...
Sam Lang
06:55 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
I ran ... Matt Anderson
12:44 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
It looks like qemu-img info is also reporting the size after using integer division and multiplication by 512, so it ... Josh Durgin
12:36 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
Just reproduced a bad size (504 bytes less) when using qemu-img 1.1.2 to convert a 1024119288 byte file. It seems to ... Josh Durgin
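The 504-byte shortfall reported above matches what rounding the file size down to a whole number of 512-byte sectors would give. The sketch below only reproduces that arithmetic; the function name `truncate_to_sectors` is hypothetical, and whether this rounding is the actual cause of the corruption is not established by the comments here.

```python
SECTOR_SIZE = 512  # bytes per sector, the unit mentioned in the comments above


def truncate_to_sectors(size_bytes, sector_size=SECTOR_SIZE):
    """Round a byte count down to a whole number of sectors."""
    return (size_bytes // sector_size) * sector_size


original = 1024119288                      # file size from the report
truncated = truncate_to_sectors(original)  # integer division, then multiplication
print(original - truncated)                # 504
```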
01:18 AM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
All of the imported images are showing the exact same size. TS7 is a qemu-img import and TS6 is a rbd import. Using q... Matt Anderson
01:05 AM rbd Bug #3585 (In Progress): Image import via QEMU-IMG results in a corrupt rbd
As a workaround you can use 'rbd import file pool/image' on a raw file. Does the corrupted image show the correct siz... Josh Durgin
12:21 AM rbd Bug #3585 (Closed): Image import via QEMU-IMG results in a corrupt rbd
This is a follow on from the mailing list topic VM Corruption on "0.54 when 'client cache = false'". After upgrading ... Matt Anderson
06:37 PM Revision c1bf2291 (ceph): librbd: bump version for new functions
copy2, clone2, and create3 are new.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
06:37 PM Revision 57d5c699 (ceph): librbd: clean up after errors in create
Split format 1 and 2 image creation into separate functions for better
readability. Format 2 requires more error hand...
Josh Durgin
06:37 PM Revision efc66148 (ceph): librbd: change internal order parameter to pass-by-value
It doesn't change in any of these places.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
05:25 PM rbd Bug #3589 (In Progress): rbd.py should check for method existence before calling new methods
Josh Durgin
04:02 PM rbd Bug #3589 (Resolved): rbd.py should check for method existence before calling new methods
If rbd.py is upgraded past librbd, it should not fail because e.g. rbd_create3 does not exist in the old version. Josh Durgin
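One way to picture the check requested in #3589 is a lookup before the call, falling back to the older entry point when the newer one is missing. This is a minimal sketch under stated assumptions: the two stub classes stand in for a loaded library, `create_image` is a hypothetical helper, and the argument lists are simplified and do not reflect the real librbd signatures.

```python
class OldLibrbdStub:
    """Hypothetical stand-in for an older librbd that lacks rbd_create3."""

    def rbd_create(self, name, size):
        return ("rbd_create", name, size)


class NewLibrbdStub(OldLibrbdStub):
    """Hypothetical stand-in for a newer librbd that also has rbd_create3."""

    def rbd_create3(self, name, size, features):
        return ("rbd_create3", name, size, features)


def create_image(lib, name, size, features=0):
    """Use the newer entry point only when the loaded library provides it."""
    if hasattr(lib, "rbd_create3"):
        return lib.rbd_create3(name, size, features)
    if features:
        # The old call cannot express the request, so fail clearly.
        raise NotImplementedError("installed librbd does not support features")
    return lib.rbd_create(name, size)
```

With this shape, a newer binding keeps working against an older library for requests that the old call can still express.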
05:24 PM Bug #3591 (Closed): auth: could not find secret_id=0
User reported that when using kernel rbd, the following periodically appears in the log:... Yehuda Sadeh
05:16 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
Can you reproduce this with debug ms = 10, debug auth = 10, debug mds = 10 on? Greg Farnum
05:14 PM Bug #3590: mds segfault at MonClient::wait_auth_rotating during upgrade
I restarted the running cluster after removing the entry "auth client required=cephx,none" in ceph.conf and it is sti... Tamilarasi muthamizhan
05:05 PM Bug #3590 (Resolved): mds segfault at MonClient::wait_auth_rotating during upgrade
Upgrade from ceph argonaut v0.48.2 to v0.55
Test steps:
1. Running a local cluster burnupi06[mds], burnupi07[os...
Tamilarasi muthamizhan
03:59 PM rbd Fix #3588: rbd.py's clone should take stripe parms, call rbd_clone2
In danger of forgetting about this unless I write it down somewhere:... Dan Mick
03:57 PM rbd Fix #3588 (Resolved): rbd.py's clone should take stripe parms, call rbd_clone2
Dan Mick
03:58 PM rbd Bug #2677 (Resolved): librbd: create does not clean up well
Josh Durgin
11:06 AM rbd Bug #2677 (Fix Under Review): librbd: create does not clean up well
Josh Durgin
12:27 PM CephFS Bug #3559: mds: not issuing RDCACHE to exclusive client for some files
I changed those calls to ceph_debug_get_* and updated the wip-3559 branch. Sam Lang
12:16 PM Revision bc6f7268 (ceph): mon: PGMonitor: erase entries from 'creating_pgs_by_osd' when set is empty
This patch avoids sending empty MOSDPGCreate's every tick.
Fixes: #3571
Signed-off-by: Joao Eduardo Luis <joao.luis...
Joao Eduardo Luis
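The idea behind revision bc6f7268, erasing a map entry once its set becomes empty so that nothing is sent for it, can be sketched as below. This is a hypothetical Python model using a dict of sets; only the name `creating_pgs_by_osd` comes from the commit title, and the helper functions are assumptions, not the monitor's C++ code.

```python
def remove_pgs_for_osd(creating_pgs_by_osd, osd, created):
    """Drop finished PGs and erase the OSD's entry once its set is empty."""
    pending = creating_pgs_by_osd.get(osd)
    if pending is None:
        return
    pending -= created  # in-place set difference
    if not pending:
        # Without this, the OSD would keep an empty entry and still be
        # visited on every tick.
        del creating_pgs_by_osd[osd]


def osds_to_notify(creating_pgs_by_osd):
    """Return the OSDs that still have PGs waiting to be created."""
    return sorted(creating_pgs_by_osd)
```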
12:10 PM Revision f81d7207 (ceph): doc/install/os-recommendations: fix syncfs notes
For argonaut, squeeze and wheezy lack syncfs.
For bobtail, only older kernels are problematic; we don't depend on gl...
Sage Weil
12:09 PM Revision 4d43c863 (ceph): doc: fix bobtail version in os-recommendations
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:04 PM Revision e1c27fe1 (ceph): mon: Monitor: rework 'paxos' to a list instead of a vector
After adding the gv patches, during Monitor::recovered_leader() we started
waking up contexts following the order of ...
Joao Eduardo Luis
12:00 PM Revision 58f6798f (ceph): Merge branch 'testing' into next
Sage Weil
11:25 AM Revision 533f847c (ceph): Merge remote-tracking branch 'gh/wip_doc'
Sage Weil
09:07 AM Bug #3569: Monitor & OSD failures when an OSD clock is wrong
Oh... scratch that. Only after a closer look after reproducing the bug did I notice that what I saw was from VIRT; RS... Joao Eduardo Luis
08:46 AM Bug #3569: Monitor & OSD failures when an OSD clock is wrong
Might not be related, but when I triggered bug #3587 earlier today I noticed that the monitor started consuming over ... Joao Eduardo Luis
08:38 AM Bug #3587 (Fix Under Review): mon: election doesn't finish during heavy mon thrashing
Haven't been able to reproduce the bug since commit e6c15e73543593fc55ba3846197fb7f83f949bb7 from wip-3587. Joao Eduardo Luis
05:18 AM Bug #3587: mon: election doesn't finish during heavy mon thrashing
This is being caused by the fact that, from the other monitors' point of view, mon.a never left the quorum, thus they ... Joao Eduardo Luis
04:19 AM Bug #3587 (Resolved): mon: election doesn't finish during heavy mon thrashing
While trying to trigger #3495 using... Joao Eduardo Luis
08:36 AM Bug #3495 (Resolved): ceph-mon crash
Joao Eduardo Luis
08:36 AM Bug #3495: ceph-mon crash
Good to know, Matthew. I haven't been able to reproduce it either since using a patched version.
And the patch has ...
Joao Eduardo Luis
07:46 AM Bug #3495: ceph-mon crash
I was encountering this particular bug, causing all my monitors to crash shortly after start. I patched my 0.55 pack... Matthew Via
04:21 AM Bug #3495: ceph-mon crash
Managed to trigger this bug using... Joao Eduardo Luis
06:53 AM Revision 15d89937 (ceph): PG: update info.last_update_started in split_into
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:53 AM Revision 9f169ac0 (ceph): OSD: account for split in project_pg_history
split causes a new interval.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
06:53 AM Revision 27071f3b (ceph): OSD: store current pg epoch in info and load at that epoch
Prior to split, this did not matter. With split, however, it's
crucial that a pg go through advance_pg() for the map...
Samuel Just
06:53 AM Revision fb738506 (ceph): PG: set child up/acting in split_into
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:53 AM Revision 338f3688 (ceph): OSDMonitor: require --allow-experimental-feature to increase pg_num
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:52 AM Revision 3f412e88 (ceph): OSD: do _remove_pg in add_newly_split_pg if pool is gone
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:51 AM Revision 5f8a3634 (ceph): PG: split ops for child objects into child
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:51 AM Revision 9835e190 (ceph): osd/: mark info.stats as invalid after split, fix in scrub
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:51 AM Revision 19e6861d (ceph): osd/: dirty info and log on child during split
Otherwise, the log may not get written out.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
06:51 AM Revision 9981bee5 (ceph): OSD: add initial split support
PGs are split after updating to the map on which they split.
OSD::activate_map populates the set of currently "splitt...
Samuel Just
04:16 AM Bug #3571 (Resolved): MOSDPGCreate is unconditionally generated at max frequency by monitors
Sage Weil
04:10 AM Bug #3582 (Resolved): Some wrong info on OS recommendations
commit:f81d7207663633d82ad591d438c5a7ddbee26ff3 Sage Weil
04:01 AM RADOS Feature #3586 (Resolved): CRUSH: separate library
Sage Weil
12:56 AM Revision 58890cfa (ceph): librados: watch() should set the WRITE flag on the op
This caused a bug where the watch operation bypassed the is_degraded()
check in the write path and the repop got sent...
Samuel Just
12:56 AM Revision f2914af5 (ceph): HashIndex: fix list_by_hash handling of next->is_max()
get_path_str() should not handle hobject_t::get_max(). get_path_str()
now asserts that the passed object is not max ...
Samuel Just

12/06/2012

11:58 PM Revision 0c010949 (ceph): rbd: remove block-by-block messages when exporting
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick
10:20 PM Revision ef24f531 (ceph): doc: Change per doc request.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:25 PM Revision ca1a4db4 (ceph): release: add note about 'ceph osd create' syntax
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
05:16 PM Bug #3584: Ranlib fails from 64-bit client on a file in 32-bit based Ceph cluster.
(two 3.6.7-4.fc17.i686 and three 3.5.3-1.fc17.i686, precisely)
2. mount it from a 64-bit Linux machine (3.6.7-4.fc...
Yasuhiro Ohara
05:05 PM Bug #3584 (Resolved): Ranlib fails from 64-bit client on a file in 32-bit based Ceph cluster.

I could set up a 32-bit Ceph cluster just fine, so I tried to mount it from a 64-bit machine and tried a compilation of ...
Yasuhiro Ohara
03:34 PM Bug #3581 (Resolved): init script errors after upgrade from 0.48 to 0.55
commit:0a137d76bd9ee924c43a42abc33f4c6c06a03d5e Dan Mick
02:23 PM Bug #3459 (In Progress): osd crash in CephXAuthorizer::verify_reply
Tamilarasi muthamizhan
02:21 PM Bug #3459: osd crash in CephXAuthorizer::verify_reply
A user reports this same crash today in IRC with 0.55:
https://pastee.org/f4dgd
Dan Mick
01:30 PM Revision 214c7a17 (ceph): client: Allow cap release timeout to be configured
The delay for releasing an inode's capability is
hardcoded to 5 seconds. This patch takes the timeout
value from a c...
Sam Lang
01:27 PM Revision 0a137d76 (ceph): mkcephfs: fix fs_type assignment typo
Reported-by: Matthew Via <via@matthewvia.info>
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
01:26 PM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
Verified just means we agree it's a bug. New->Verified->In Progress->Need Review->Testing->Resolved is the simple pa... Dan Mick
10:17 AM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
How was this bug verified? Would you provide the test and its description in this bug, please? Anonymous
01:26 PM Revision 4c31598e (ceph): upstart: fix radosgw upstart job
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
01:26 PM Revision 47266cda (ceph): upstart: rename ceph -> ceph-all
This avoids a conflict with the sysvinit job.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
11:28 AM rbd Bug #3562 (Can't reproduce): incorrect progress shown for rbd resize image
Neither Josh nor I have a good theory for how this even could have happened, and I can't reproduce it, so...reopen if... Dan Mick
11:12 AM Linux kernel client Cleanup #3583: Convert tabs to spaces
I think we should avoid doing changes that make it hard to see the history. Specifically, swiping spaces to tabs conv... Yehuda Sadeh
11:08 AM Linux kernel client Cleanup #3583 (Rejected): Convert tabs to spaces
It was recently suggested that there are places in the client where we use spaces for indentation instead of tabs as ... caleb miles
09:35 AM Bug #3571: MOSDPGCreate is unconditionally generated at max frequency by monitors
Joao Eduardo Luis
09:35 AM Bug #3571: MOSDPGCreate is unconditionally generated at max frequency by monitors
Proposed fix on wip-3571 commit 5d9decaa7118ec7a3ce152ecd29bd6120e1a9078 Joao Eduardo Luis
08:07 AM Bug #3582 (Resolved): Some wrong info on OS recommendations
http://ceph.com/docs/master/install/os-recommendations/
Neither Debian Squeeze nor wheezy should be listed as havi...
Mark Nelson
06:52 AM Bug #3495: ceph-mon crash
Sure. I'll try to reproduce this bug again before bobtail is released. From the looks of it, messing with the MDS eno... Joao Eduardo Luis
05:27 AM Bug #3495: ceph-mon crash
> If you have the chance, could you try commit 46c55a19e72ae1f16037807fda8abe59b671f7d3 on wip-3495? It should fix th... Artem Grinblat
06:47 AM CephFS Bug #3559: mds: not issuing RDCACHE to exclusive client for some files
libcephtools is wrapping the monitor interaction stuff; i think this belongs in libcephfs, we just need to advertise ... Sage Weil
06:43 AM rgw Feature #3443: radosgw - Add Log messages to indicate restart, attempt, success and failure
we could make the script print a warning if there are .'s in the host line? it's hard, though, because sometimes the... Sage Weil
06:41 AM CephFS Feature #3575: ceph-fuse: Add support for forget_multi
yay! i think this is the answer to #3289.
actually, i think the normal invalidate will work too, one inode at a t...
Sage Weil
05:47 AM CephFS Fix #2215: ceph-fuse does not invalidate page cache
wip-2215 looks good to me! Sage Weil
05:33 AM CephFS Bug #3572: High CPU after equivalent to node very busy
Ah, nice catch! We just need to cycle through the list once to requeue the requests for the MDS. If Yan's patch loo... Sage Weil
03:39 AM Revision 7ab00a79 (ceph): .gitignore: Add m4 macro directories to ignore list
Gary Lowell
02:19 AM Revision 0d2e8858 (ceph): Merge branch 'next'
Dan Mick
02:18 AM Revision 3e98d1af (ceph): Merge branch 'testing' into next
Dan Mick
02:17 AM Revision b7b72429 (ceph): rbd: update manpage for import/export
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick
01:39 AM Revision 7f906b5a (ceph): Merge branch 'next'
Pull in fixes for 3567 and 3524 Dan Mick
01:38 AM Revision 64ecc870 (ceph): Striper: use local variable inside if() that tested it
Signed-off-by: Dan Mick <dan.mick@inktank.com>
(cherry picked from commit 917a6f296323164f9d79df94916932722e66fc0a)
Dan Mick
01:38 AM Revision b2ccf11d (ceph): librbd: handle parent change while async I/Os are in flight
During a test_librbd_fsx run including flatten, ImageCtx->parent
was being dereferenced while null. Between the time...
Dan Mick
01:38 AM Revision e9653f27 (ceph): librbd: hold AioCompletion lock while modifying global state
C_AioRead::finish needs to add in each chunk of a partial read
request to the 'partial' map in the AioCompletion's st...
Dan Mick
01:05 AM Revision 917a6f29 (ceph): Striper: use local variable inside if() that tested it
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick
01:05 AM Revision 41e16a3b (ceph): librbd: handle parent change while async I/Os are in flight
During a test_librbd_fsx run including flatten, ImageCtx->parent
was being dereferenced while null. Between the time...
Dan Mick
01:05 AM Revision a55700cc (ceph): librbd: hold AioCompletion lock while modifying global state
C_AioRead::finish needs to add in each chunk of a partial read
request to the 'partial' map in the AioCompletion's st...
Dan Mick
12:17 AM Revision 90d81562 (ceph): qemu: set qemu cache mode based on rbd cache setting
If we don't do this, qemu assumes no caching is used and doesn't send flushes.
Signed-off-by: Josh Durgin <josh.durg...
Josh Durgin
12:08 AM Revision 4acc0789 (ceph): Merge branch 'next'
Josh Durgin
 
