Activity

From 01/01/2013 to 01/30/2013

01/30/2013

11:41 PM Revision ab778cb1 (ceph): doc: v0.56.2 release notes
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:21 PM Revision 4a950aa9 (ceph): Move read_log() function to prep for next commit
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
David Zafman
10:21 PM Revision 3c8d7d78 (ceph): osd: create tool to extract pg info and pg log from filestore
New application ceph-filestore-dump created that mounts filestore
and can dump info or log in JSON when an OSD is not ...
David Zafman
08:52 PM Revision a63fac32 (ceph): task: mon_clock_skew_check: use absolute value when comparing mon_skew
The monitors may report either positive or negative clock skews, and by
not using an absolute value we were constantl...
Joao Eduardo Luis
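The absolute-value comparison described in revision a63fac32 can be sketched roughly as follows. This is a minimal illustration only, not the task's actual code: the function name `skew_exceeds` and the parameter `max_skew` are hypothetical, and the excerpt above is truncated, so the exact check in the task may differ.

```python
def skew_exceeds(mon_skew: float, max_skew: float) -> bool:
    """Return True when the magnitude of a reported skew exceeds the limit.

    Hypothetical helper: monitors may report positive or negative skews,
    so the comparison uses abs() rather than the signed value.
    """
    return abs(mon_skew) > max_skew


# A signed comparison (mon_skew > max_skew) would miss the first case.
print(skew_exceeds(-0.2, 0.05))
print(skew_exceeds(0.01, 0.05))
```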
08:52 PM Revision 89e09fa9 (ceph): task: mon_clock_skew_check: mark as ran once if an expected skew was found
... even if we didn't get a clean/finished result from the monitors
This ought to significantly cut the waiting time...
Joao Eduardo Luis
07:50 PM Revision b571f8ee (ceph): PGMap: fix -Wsign-compare warning
Fix -Wsign-compare compiler warning:
mon/PGMap.cc: In member function 'void PGMap::apply_incremental
(CephContext*,...
Danny Al-Gaaf
07:32 PM Revision b0d4dd21 (ceph): test_libcephfs: fix xattr test
Ignore the ceph.*.layout xattrs.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:02 PM Bug #3970: cls_log should be declared with __attribute__(format..) so -Wformat validates the form...
Dan Mick
04:57 PM Bug #3970 (Resolved): cls_log should be declared with __attribute__(format..) so -Wformat validat...
It'll involve some changes to callers to fix all the harmless errors, but may find some significant
ones and avoid a...
Dan Mick
04:49 PM Bug #3938 (In Progress): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 b...
Have a cluster set-up and ready to start trying to reproduce this in the morning. Joao Eduardo Luis
02:13 PM Bug #3938: ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 bobtail)
No, didn't have it set up. I could probably reproduce if necessary. Samuel Just
04:32 PM devops Feature #3965: upstart: ulimit -n hardcoded; doesn't use 'max open files' config setting
I guess there are settings in the upstart config files, but they aren't derived from ceph.conf.
I imagine there are w...
Dan Mick
11:09 AM devops Feature #3965 (Rejected): upstart: ulimit -n hardcoded; doesn't use 'max open files' config setting
3900 tweaked the setting of ulimit -n "max open files" on all daemons in the cluster, but,
at present, we only have...
Dan Mick
03:37 PM CephFS Feature #3540 (Fix Under Review): mds: maintain per-file backpointers on first file object
Initial implementation in wip-bt. Needs review. Sam Lang
02:13 PM Bug #3883: osd: leaks memory (possibly triggered by scrubbing) on argonaut
The burnupi57 cluster (wip-f) does not appear to be leaking after all, the osds seem to have leveled off at around 35... Samuel Just
02:10 PM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
The patch is reviewed and ready to push to the testing
branch, and I will do that in a day or so.
I'm going to le...
Alex Elder
02:09 PM rgw Feature #3968 (Resolved): https should work for rest-bench
Trying to set the protocol to https by using the --protocol=https flag does not work. ... Kevin Horan
02:08 PM rbd Bug #3940: krbd: decrement obj request count when deleting
Reviewed and ready to push to master. Will do that in a day or so. Alex Elder
02:07 PM rbd Bug #3427: krbd: unmap does not remove block device properly
Reviewed and ready to push to the ceph-client "testing" branch.
I'm going to wait a day or two before pushing this...
Alex Elder
01:34 PM Linux kernel client Bug #3967 (Resolved): libceph: complete linger requests only once
Currently if a linger request gets resubmitted by the osd
client, its callback function (if provided) will get calle...
Alex Elder
01:05 PM Documentation #3960 (In Progress): [Document bug]MON and MDS do not need a ssd for data storage.
You are correct. The machines and processes would only boot a bit faster. The way to accelerate metadata servers is t... John Wilkins
12:12 PM Bug #3966 (Resolved): osdthrasher: does tell on osd just after restarting it
figured out where the thrasher errors are coming from:... Sage Weil
11:31 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
...and to answer your other question Alex, there's now a workunit test Sage just added
in c782d2ac531cbb7650968e62f0...
Dan Mick
11:00 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
Josh thinks 32-bitness probably doesn't matter, and remembers problems with snapshots that were fixed long ago; I gue... Dan Mick
10:55 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
I don't know if Sage tested 32-bit, or if it matters, and no, that script was just a reproduction scenario; as far as... Dan Mick
06:25 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
So is this then a request to port whatever it was that
fixed the problem back to 3.2?
If so, how do we prioritize...
Alex Elder
01:10 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
added test to suite, commit:c782d2ac531cbb7650968e62f0b24e6136a64359 Sage Weil
12:15 AM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
This works fine on current testing 3.6.0-00210-g8cc17ca Sage Weil
11:16 AM rbd Bug #3961 (Resolved): 32-bit cls_rbd tries cls_log with %d for 64-bit int, segfaults
commit:e253830abac76af03c63239302691f7fac1af381 on next
Dan Mick
09:37 AM rbd Subtask #3741: krbd: rework request tracking code
My testing on this code is nearly complete. However, I'm going
to hold off on pushing this (along with the changes ...
Alex Elder
06:34 AM rbd Subtask #3741: krbd: rework request tracking code
Alex Elder
09:30 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
I have finished my testing and have now updated the
ceph-client "testing" branch to be based on 3.8-rc5,
with the p...
Alex Elder
06:14 AM Linux kernel client Bug #3740: ceph-client: change to be based on 3.8-rc2
I discussed this with Sage yesterday. We're now up to
Linux 3.8-rc5. Merging our testing branch into v3.5-rc5
pro...
Alex Elder
09:08 AM Revision 0c872491 (ceph): rbd: add rbd_cli_misc with map-snapshot-io.sh
Sage Weil
09:06 AM Revision c782d2ac (ceph): qa: add test for rbd map and snapshots
This tests for the behavior reported in #3964. It passes on the current
code, but fails on 3.2 in squeeze (and 32-bi...
Sage Weil
09:05 AM Revision 6b493502 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
08:56 AM Linux kernel client Bug #3798 (In Progress): libceph/rbd: take reference to all bio's in list
It looks like the extra reference that the osd client requires
of the first bio on the list isn't necessary. Nor wo...
Alex Elder
08:41 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
Sage, I already implemented the fix, and it's pretty trivial,
and it's generally useful. By "won't fix" do you mean...
Alex Elder
07:54 AM Revision 586538e2 (ceph): v0.56.2
Gary Lowell
07:40 AM Linux kernel client Bug #3799 (In Progress): libceph/rbd: bio refs are messed up
Looking at the code here, the osd client isn't really doing
anything with the bio pointer. It is simply a middleman...
Alex Elder
07:34 AM Revision bcb8dfad (ceph): cls_rbd, cls_rgw: use PRI*64 when printing/logging 64-bit values
caused segfaults in 32-bit build
Fixes: #3961
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil ...
Dan Mick
07:32 AM Revision e253830a (ceph): cls_rbd, cls_rgw: use PRI*64 when printing/logging 64-bit values
caused segfaults in 32-bit build
Fixes: #3961
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Reviewed-by: Sage Weil ...
Dan Mick
06:47 AM rbd Bug #3927 (Closed): krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
It turns out this new behavior is a good thing, we're just
reporting errors now where we apparently did not previous...
Alex Elder
06:47 AM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
I no longer believe this is a problem. Although there is no
aggregate result value for a collection of osd requests...
Alex Elder
06:36 AM Linux kernel client Bug #3959 (Duplicate): krbd: decrement img_request->obj_request_count when deleting
Found it! http://tracker.ceph.com/issues/3940
already documents this.
Alex Elder
06:35 AM rbd Feature #3877: krbd: don't wait for notify ack to complete
Alex Elder
06:35 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
Alex Elder
06:35 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Alex Elder
03:48 AM Revision 77f57411 (ceph): mds: move lexical_cast and assert re-#include to the top
We should keep the re-#includes immediately following the offender, and
documented.
Signed-off-by: Sage Weil <sage@i...
Sage Weil
03:11 AM Bug #3948: problems from leveldb static linkage and leveldb downgrade
Hi Sage,
does it matter that the OSD is now down for around 1-2 days or will it just pickup any changes made to th...
Corin Langosch
03:00 AM Revision 35e5d74e (ceph): Don't install rbd-fuse binary
fixes packaging warnings
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick
02:43 AM Revision 23923ee9 (ceph): mds/Server.cc: fix warring assert.h's
New include boost/lexical_cast.hpp apparently drags in the system
assert.h on quantal and squeeze at least, breaking ...
Dan Mick
02:42 AM Revision 25e9a0be (ceph): mon: require name for 'auth add ...' command
Otherwise we interpret the empty string as 'unknown.'.
Fixes: #3956
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
02:19 AM Bug #3595: ceph-osd and ceph-mds crash on Debian Squeeze
root@cluster:~# ceph-osd
Segmentation fault
root@cluster:~# ceph-osd -h
Segmentation fault
root@cluster:~# ceph-...
Jörg Blank
01:07 AM Revision a731da99 (ceph): Merge remote-tracking branch 'origin/wip-fuse-create-fix'
Reviewed-by: Greg Farnum <greg@inktank.com> Greg Farnum
01:05 AM Revision e9a6694d (ceph): client: return errors to the user if fsync fails
To do so, we allow callers of _flush(Inode) to pass in a Context
as well. This Context is then given to the ObjectCac...
Greg Farnum
01:03 AM RADOS Feature #3807 (Fix Under Review): crush: simple commands to create common rules
see wip-osd-commands Sage Weil
12:49 AM Revision 5a7c5088 (ceph): init-ceph: make ulimit -n be part of daemon command
ulimit -n from 'max open files' was being set only on the machine
running /etc/init.d/ceph. It needs to be added to ...
Dan Mick
12:48 AM Revision 84a024b6 (ceph): init-ceph: make ulimit -n be part of daemon command
ulimit -n from 'max open files' was being set only on the machine
running /etc/init.d/ceph. It needs to be added to ...
Dan Mick
12:34 AM Revision c2e50e58 (ceph): Merge remote-tracking branch 'gh/wip-recovery-stats-b'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
12:26 AM Revision 1564c3a0 (ceph): Merge branch 'wip-vxattr'
Reviewed-by: Sam Lang <sam.lang@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Sage Weil
12:25 AM Revision ba32ea94 (ceph): client: list only aggregate xattr, but allow setting subfield xattrs
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:25 AM Revision 84751489 (ceph): client: note presence of dir layout in inode operator<<
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:25 AM Revision 09f28541 (ceph): mds: fix client view of dir layout when layout is removed
We weren't handling the case where the projected node has NULL for the
layout properly. Fixes the client's view when...
Sage Weil
12:25 AM Revision ebebf72f (ceph): mds: handle ceph.*.layout.* setxattr
Allow individual fields of file or dir layouts to be set via setxattr.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
12:25 AM Revision db31a1f9 (ceph): mds: allow dir layout/policy to be removed via removexattr on ceph.dir....
This lets a user remove a policy that was previously set on a dir.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
12:25 AM Revision 61fbe27a (ceph): qa: add layout_vxattrs.sh test script
Test virtual xattrs for file and directory layouts.
TODO: create a data pool, add it to the fs, and make sure we can...
Sage Weil
12:24 AM Revision e51299fb (ceph): mds: open mydir after replay
In certain cases, we may replay the journal and not end up with the
dirfrag for mydir open. This is fine--we just ne...
Sage Weil
12:24 AM Revision ad7ebad7 (ceph): client: allow ceph.* xattrs
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:24 AM Revision febb9650 (ceph): client: move xattr namespace enforcement into internal method
This captures libcephfs users now too.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
12:24 AM Revision 3f82912a (ceph): client: implement ceph.file.* and ceph.dir.* vxattrs
Display ceph.file.* vxattrs on any regular file, and ceph.dir.* vxattrs
on any directory that has a policy set.
Sign...
Sage Weil

01/29/2013

11:40 PM rbd Bug #3964: krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd image with sn...
... Dan Mick
11:28 PM rbd Bug #3964 (Won't Fix): krbd: 32-bit, kernel 3.2.0 system can't do O_DIRECT writes to mapped rbd i...
fghaas reported, I reproduced on a precise 32-bit system:
create an image, map, writes work fine, even with dd ofl...
Dan Mick
11:06 PM Bug #3963 (Won't Fix): cls_log should check should_gather before vsnprintf()
1) faster
2) would have allowed workaround for 3961
Dan Mick
11:04 PM Bug #2481 (Won't Fix): ceph tell has almost no error reporting
this should get cleaned up with whatever refactor we do with the api work, but not worth spending time on individuall... Sage Weil
11:01 PM Bug #3577 (Can't reproduce): osd missing reported by osd_recovery.test_incomplete_pgs workload
we fixed several things that could explain this. Sage Weil
11:01 PM Bug #3595 (Need More Info): ceph-osd and ceph-mds crash on Debian Squeeze
Is this still a problem with the bobtail packages? Sage Weil
10:58 PM Bug #2721 (Resolved): Ceph status does not work in 0.48 even if it is still documented
wrong monitor version was running Sage Weil
10:57 PM Bug #2647 (Can't reproduce): osd: old request, waiting for subops
Sage Weil
10:56 PM Bug #2500 (Resolved): osd: unprotected ::decodes in ReplicatedPG::do_osd_ops
cleaned up ages ago Sage Weil
10:55 PM Bug #1197 (Resolved): osd: make inconsistent state durable
this got fixed in commit:2475066c3247774a2ad048a2e32968e47da1b0f5 Sage Weil
10:54 PM Bug #3646 (Resolved): pg_temp with two down/out osds
commit:6122a9f62f9eeae1410d1703fecb8939a35fb03f Sage Weil
10:46 PM rbd Bug #3961 (Resolved): 32-bit cls_rbd tries cls_log with %d for 64-bit int, segfaults
32-bit system: rbd create i -s 1; rbd rm i causes death of osd in cls_log();
presumably this is because of cls_log(%...
Dan Mick
10:42 PM Revision 59ac4d35 (ceph): qa: add rbd/concurrent workunit
This defines a new workunit shell script that performs a bunch of
rbd operations concurrently in order to exercise co...
Alex Elder
10:35 PM Revision 3bc21143 (ceph): ObjectCacher: fix flush_set when no flushing is needed
C_GatherBuilder takes ownership of the Context we pass it. Deleting it
in flush_set after constructing the C_GatherBu...
Josh Durgin
10:10 PM RADOS Feature #3807: crush: simple commands to create common rules
ceph osd crush rule list
ceph osd crush rule create-simple <name> <root> <failure domain>
ceph osd crush rule create-...
Sage Weil
10:04 PM Revision 19f42731 (ceph): peer: fix filtering out of scrub from pg state
Sage Weil
09:59 PM Revision 95677fc5 (ceph): mon: OSDMonitor: only share osdmap with up OSDs
Try to share the map with a randomly picked OSD; if the picked monitor is
not 'up', then try to find the nearest 'up'...
Joao Eduardo Luis
09:59 PM Revision e4d76cb8 (ceph): utime: fix narrowing conversion compiler warning in sleep()
Fix compiler warning:
./include/utime.h: In member function 'void utime_t::sleep()':
./include/utime.h:139:50: warnin...
Danny Al-Gaaf
09:33 PM Documentation #3960 (Resolved): [Document bug]MON and MDS do not need a ssd for data storage.
From :http://ceph.com/docs/master/install/hardware-recommendations/#data-storage
it says:
Since the storage requi...
Xiaoxi Chen
09:17 PM Revision a8964107 (ceph): rgw: fix crash when missing content-type in POST object
Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
Yehuda Sadeh
08:38 PM Linux kernel client Bug #3959 (Duplicate): krbd: decrement img_request->obj_request_count when deleting
Each image request keeps a count of its object requests.
Adding a object request to or deleting one from an image
r...
Alex Elder
08:34 PM Feature #2472: osd: add opaque 'class <name> <foo>' cap that class can interpret/enforce
Sage Weil
08:34 PM CephFS Bug #1946 (Resolved): snapshot inherits timestamp/size/etc from modified trunk dir upon mds restart
commit:7842bb50c7814cc16c22589bf41df7db1f7492eb Sage Weil
08:33 PM Feature #3890 (Fix Under Review): osd: create tool to extract pg info and pg log from filestore
In final review to merge from wip-3890 branch. David Zafman
08:33 PM Bug #3126 (Can't reproduce): mds crashed bool CDir::check_rstats()
we'll see if this comes up with all of yan's fixes in now. Sage Weil
08:33 PM rbd Bug #3566 (Resolved): log max new = 1 can cause hang on process exit
fixed a few weeks ago, commit:813787af3dbb99e42f481af670c4bb0e254e4432 and a few prior commits Sage Weil
08:32 PM Bug #3125 (Resolved): Assertion Error in peer.py - failure from the nightly run
this is fixed up now, most recent commit was 3772d437dd4c562a6490f84124eb4757e22eca92 Sage Weil
08:26 PM rbd Bug #3958 (Resolved): rbd fsx fails with EBUSY
... Sage Weil
07:41 PM CephFS Bug #3553 (Won't Fix): MDS core dumped running 0.48.2argonaut
if/when see this on bobtail or later, we'll investigate. Sage Weil
07:32 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
it works. it just doesn't leave the pg in backfill_wait, as i was expecting. Sage Weil
07:30 PM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
in bobtail, commit:e6bceeedb0b77d23416560bd951326587470aacb Sage Weil
07:24 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Sage Weil wrote:
> Aaron Schulz wrote:
> > Ian Colle wrote:
> > > Aaron are you still seeing this?
> >
> > Sorr...
Aaron Schulz
12:31 PM rgw Bug #3365 (Can't reproduce): Broken metadata (duplicated as CSV)
Thanks for trying to reproduce this on Bobtail, Aaron. I'm moving it to Can't Reproduce. Ian Colle
12:26 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
I'm having a hard time reproducing this on bobtail. If I remove the metadata normalization code in the MediaWiki/Clou... Aaron Schulz
07:07 PM Bug #3938: ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons, 1 bobtail)
is there a core for this? Sage Weil
06:51 PM Revision 7cd4e50d (ceph): client: Wait for caps to flush when flushing metadata.
Embarrassingly, this conditional has been backwards since
I committed it in 818e7939. But we want to do the wait when...
Greg Farnum
06:44 PM Revision 11e1f3ac (ceph): ReplicatedPG: make_snap_collection when moving snap link in snap_trimmer
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry...
Samuel Just
06:43 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
Dan Mick
06:40 PM Bug #3957 (Resolved): new #include breaks assert.h (again)
#include <boost/lexical_cast.hpp> in mds/Server.cc apparently re-includes the system assert.h,
blowing up dout(). F...
Dan Mick
06:42 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
commit:25e9a0be63fdad9fd8f7909585c9270a3729dc44 Sage Weil
06:00 PM Bug #3956 (Resolved): ceph auth add/del entity name parameter check
It's currently (as of v0.56.1) possible to run "ceph auth add" without any further parameters. This results in the ad... Alex Moore
05:28 PM Revision 907c709c (ceph): mds: Send created ino in journaled_reply
The MDS avoids sending an early reply if a request
triggered inode allocation (no preallocated inodes yet).
For creat...
Sam Lang
05:27 PM Bug #3955 (Resolved): Configure should explicitly check for c++ compiler.
If no c++ compiler is installed, configure fails with a misleading message when checking for boost libraries. Anonymous
05:11 PM Bug #3747 (Closed): PGs stuck in active+remapped
I think this was probably related to the lagging pg peering workqueue.. is there anything to suggest that isn't the c... Sage Weil
05:09 PM Bug #3948 (Need More Info): problems from leveldb static linkage and leveldb downgrade
Corin-
Just restart the osd. And check dmesg for any kernel malfeasance... that is usually what triggers this. A...
Sage Weil
04:51 PM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
commit:84a024b647c0ac2ee5a91bacdd4b8c966e44175c in next, cherry-pick -x'ed to bobtail
Dan Mick
03:21 PM Bug #3900 (Fix Under Review): init-ceph should do ulimit -n's with do_root_cmd
Dan Mick
04:37 PM Subtask #3840 (Resolved): osd: ack push after apply+commit
as part of #3833 Sage Weil
04:36 PM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
commit:c2e50e580d18107162d2d101c5c243c665e56124 Sage Weil
04:33 PM CephFS Feature #3953 (Resolved): kclient: get/set layout via virtual xattrs
Sage Weil
04:32 PM CephFS Feature #1236 (Resolved): libceph: set layout via virtual xattrs (libceph/cfuse)
commit:1564c3a0a3efbde5a326001586238fde8f6648ad for userspace bits.
the kernel bits still need review.. opening se...
Sage Weil
04:18 PM Revision cf7c3f7d (ceph): client: Don't use geteuid/gid for fuse ll_create
Fixes a bug in ll_create where files that already exist at the MDS
don't get the created flag set on reply. This cau...
Sam Lang
03:11 PM rbd Bug #3952 (Resolved): krbd: no need for object header version
The header object watch operation had a sort of half implemented
use of the version of the object. It apparently is...
Alex Elder
03:08 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
Just an extra delete in a code path in flush_set that wasn't exercised before. Fixed by commit:3bc21143552b35698c9916... Josh Durgin
02:44 PM rbd Bug #3946: rbd fsx failing in nightly
Reproducing locally seems to confirm this, since there was a recent change to replace commit_set() with flush_set():
...
Josh Durgin
12:06 PM rbd Bug #3946: rbd fsx failing in nightly
I'm guessing these are related to recent objectcacher changes, since they didn't affect runs without caching. The cor... Josh Durgin
02:48 PM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
I just committed the test script to the ceph master branch.
The script is located here: qa/workunits/rbd/concurrent...
Alex Elder
09:16 AM rbd Feature #3949: krbd: create test script that exercises concurrent operations
Well the script is really nice. And I just got a new
crash while running it on a real machine (rather than
my UML ...
Alex Elder
08:22 AM rbd Feature #3949 (Resolved): krbd: create test script that exercises concurrent operations
I suggested doing this in http://tracker.ceph.com/issues/3427.
That issue is about a bug where an image unmapping ca...
Alex Elder
01:50 PM rgw Bug #3941 (Resolved): s3tests crash on bobtail
Crash fixed, commit:f41010c44b3a4489525d25cd35084a168dc5f537.
Also, pushed a change to s3-tests.git, setting a requi...
Yehuda Sadeh
01:27 PM Bug #3268: osd: localize reads handling is incorrect
Yes, the OSDs will serve replica reads as things stand. Greg Farnum
01:11 PM Bug #3268: osd: localize reads handling is incorrect
I'm starting on this bug now. Before fixing the flag handling described in the ticket, I want to make sure that the O... Noah Watkins
12:43 PM Bug #3810: btrfs corrupts file size on 3.7
I'm making an attempt. Mike Lowe
12:36 PM Bug #3810: btrfs corrupts file size on 3.7
Mike, Bill: are you able to test Josef's patch? Sage Weil
11:45 AM Revision e805b7d6 (ceph): admin_socket: don't bother remote executing if there is no test
Sage Weil
11:30 AM CephFS Bug #3951: ceph-fuse: permissions error on create
I've got a question in for Sam, but other than that this looks good to me! Greg Farnum
09:37 AM CephFS Bug #3951 (Resolved): ceph-fuse: permissions error on create
Reported by Greg Farnum:
gregf@kai:~/ceph/src [master]$ cd mnt/
gregf@kai:~/ceph/src/mnt$ sudo chown gregf.gregf ...
Sam Lang
11:12 AM Revision c9201d0e (ceph): ReplicatedPG: correctly handle new snap collections on replica
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(cherry...
Samuel Just
11:10 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
OK, I do have the osd request pointer now. It was available
in register R14. And with a little work I can determin...
Alex Elder
10:35 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
The object being operated on is the rbd header image, in
this case named "image.5X5ZNB.rbd". The object request typ...
Alex Elder
10:06 AM rbd Bug #3950: krbd: new assertion failure running concurrent rbd test
Weird. It looks to me like the object request that's
just completing is already done, meaning we got
a callback fr...
Alex Elder
09:19 AM rbd Bug #3950 (Can't reproduce): krbd: new assertion failure running concurrent rbd test
(I think this is a new issue, I haven't investigated it yet.)
I hit an assertion failure while running my new test...
Alex Elder
10:34 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I've opened a new issue that has symptoms similar to this
but not identical:
http://tracker.ceph.com/issues/395...
Alex Elder
09:41 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
Putting back to in-progress. The preferred solution is to replace the perl filter line with sed or python and remove... Anonymous
09:38 AM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: 0b66994c180b1ce5856a38518423d82fbebc8a2e
...
Anonymous
09:15 AM rbd Bug #3427: krbd: unmap does not remove block device properly
I have opened this to cover developing that test script
http://tracker.ceph.com/issues/3949
Alex Elder
07:53 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
...yes, yes it is. I've been working in FUSE so far. *sigh* Well, it needed the fix too. Greg Farnum
07:26 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
I don't see wip-2753-fsync-errors in the repo. Also, note that this problem was reported on the cephfs kernel client... Sam Lang
06:49 AM Revision 0b66994c (ceph): ceph.spec.in: package rbd udev rule
Package udev/50-rbd.rules per bug 3930.
Signed-off-by: Gary Lowell <gary.lowell@inktank.com>
Gary Lowell
04:22 AM Revision 1c311949 (ceph): osd_recovery: inject a recovery delay
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
04:22 AM Revision e33b425d (ceph): osd_recovery: use --no-cleanup for rados bench
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
03:53 AM Revision 3b27c9ec (ceph): osd_backfill: --no-cleanup for rados bench
Sage Weil
03:46 AM Revision a7d15afb (ceph): mon: smooth pg stat rates over last N pgmaps
This smooths the recovery and throughput stats over the last N pgmaps,
defaulting to 2.
Signed-off-by: Sage Weil <sa...
Sage Weil
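One way to picture smoothing a rate over the last N samples, as revision a7d15afb describes for pgmaps, is the sketch below. It is an assumption-laden illustration, not the monitor's implementation: the class `SmoothedRate` and its methods are hypothetical, and only the window default of 2 comes from the commit message.

```python
from collections import deque


class SmoothedRate:
    """Average a rate over the last n deltas (hypothetical helper)."""

    def __init__(self, n: int = 2) -> None:
        # Each entry is (delta_count, delta_seconds) between two
        # consecutive samples; the oldest entry drops off once n is exceeded.
        self.deltas = deque(maxlen=n)

    def add(self, delta_count: float, delta_seconds: float) -> None:
        self.deltas.append((delta_count, delta_seconds))

    def rate(self) -> float:
        total_seconds = sum(seconds for _, seconds in self.deltas)
        if total_seconds <= 0:
            return 0.0
        return sum(count for count, _ in self.deltas) / total_seconds
```

Summing the deltas before dividing weights each interval by its duration, so a short, bursty interval does not dominate the reported rate.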
03:17 AM Revision 0f7a9e56 (ceph): Merge remote-tracking branch 'yan/wip-mds'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
03:03 AM Revision ecda1208 (ceph): doc: fix overly-big fixed-width text in Firefox
Changed font size for ... Ross Turk
03:01 AM Revision d5008602 (ceph): btrfs.yaml: increase osd op thread timeout
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
02:50 AM Revision 4aea19ee (ceph): osd_types: add recovery counts to object_sum_stats_t
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:50 AM Revision a2495f65 (ceph): osd: track recovery ops in stats
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:50 AM Revision 76e9fe5f (ceph): mon/PGMap: include timestamp
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:50 AM Revision 208b02a7 (ceph): mon/PGMap: report recovery rates
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:50 AM Revision 3f6837e0 (ceph): mon/PGMap: report IO rates
This does not appear to be very accurate; probably the stat values we're
displaying are not being calculated correctl...
Sage Weil
02:49 AM Revision 193dbedb (ceph): rbd-fuse: fix warning
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:44 AM Revision 1e24ce22 (ceph): doc: Removed indep, and clarified explanation.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
02:17 AM Revision 0e9c8124 (ceph): mds: add projected rename's subtree bounds to ESubtreeMap
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:17 AM Revision e69e7e5d (ceph): mds: fix 'discover' handling in the rejoin stage
If the MDS is in the resolve stage, current MDCache::handle_discover() only handles
'discover' from MDS that it has alre...
Yan, Zheng
02:17 AM Revision abc4c785 (ceph): mds: allow handling slave request in the clientreplay stage
replaying a client request may need to create slave request and the slave
MDS can be also in the clientreplay stage.
...
Yan, Zheng
02:17 AM Revision 58841776 (ceph): mds: mark export bounds for cross authority directory rename
this guarantees that the importing MDS gets directory fragment's
up-to-date fragstat/rstat.
Signed-off-by: Yan, Zhen...
Yan, Zheng
02:17 AM Revision 829aeba6 (ceph): mds: clear inode dirty when slave rename finishes.
The inode is linked to a non-auth directory, so remove it from LogSegment's
dirty inode list.
Signed-off-by: Yan, Zh...
Yan, Zheng
02:17 AM Revision c93cf2d2 (ceph): mds: fix for MDCache::disambiguate_imports
In the resolve stage, if no MDS claims other MDS's disambiguous subtree
import, the subtree's dir_auth is undefined.
...
Yan, Zheng
02:17 AM Revision 0cf5e4e5 (ceph): mds: journal inode's projected parent when doing link rollback
Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.
Signed-off-by: Ya...
Yan, Zheng
02:17 AM Revision 9a0cfcc5 (ceph): mds: don't journal opened non-auth inode
If we journal opened non-auth inode, during journal replay, the corresponding
entry will add non-auth objects to the ...
Yan, Zheng
02:17 AM Revision 4fc68a48 (ceph): mds: properly clear CDir::STATE_COMPLETE when replaying EImportStart
when replaying EImportStart, we should set/clear directory's COMPLETE
flag according with the flag in the journal ent...
Yan, Zheng
02:17 AM Revision 710bba3a (ceph): mds: move variables special to rename into MDRequest::more
My previous patches add two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. They are both used...
Yan, Zheng
02:17 AM Revision f4abf00a (ceph): mds: rejoin remote wrlocks and frozen auth pin
Includes remote wrlocks and frozen authpin in cache rejoin strong message
Signed-off-by: Yan, Zheng <zheng.z.yan@int...
Yan, Zheng
02:17 AM Revision 77946dcd (ceph): mds: fetch missing inodes from disk
The problem of fetching missing inodes from replicas is that replicated inodes
does not have up-to-date rstat and fra...
Yan, Zheng
02:17 AM Revision 9944d9fb (ceph): mds: don't journal non-auth rename source directory
After replaying a slave rename, non-auth directory that we rename out of will
be trimmed. So there is no need to jour...
Yan, Zheng
02:17 AM Revision 1a6626f0 (ceph): mds: preserve non-auth/unlinked objects until slave commit
The MDS should not trim objects in non-auth subtree immediately after
replaying a slave rename. Because the slave ren...
Yan, Zheng
02:17 AM Revision 844cd46c (ceph): mds: fix slave rename rollback
The main issue of old slave rename rollback code is that it assumes
all affected objects are in the cache. The assump...
Yan, Zheng
02:17 AM Revision a42a9187 (ceph): mds: split resolve into two sub-stages
The resolve stage serves to disambiguate the fate of uncommitted slave
updates and resolve subtrees authority. The MD...
Yan, Zheng
02:17 AM Revision 3a66656b (ceph): mds: send resolve messages after all MDS reach resolve stage
Current code sends resolve messages when resolving MDS set changes.
There is no need to send resolve messages when so...
Yan, Zheng
02:17 AM Revision 85294a59 (ceph): mds: always use {push,pop}_projected_linkage to change linkage
Current code skips using {push,pop}_projected_linkage to modify replica
dentry's linkage. This confuses EMetaBlob::ad...
Yan, Zheng
02:17 AM Revision e0aa64d0 (ceph): mds: don't replace existing slave request
The MDS may receive a client request, but find there is an existing
slave request. It means other MDS is handling the...
Yan, Zheng
02:17 AM Revision baa6bd6b (ceph): mds: fix for MDCache::adjust_bounded_subtree_auth
After swallowing extra subtrees, subtree bounds may change, so it
should re-check.
Signed-off-by: Yan, Zheng <zheng....
Yan, Zheng
02:17 AM Revision c9ff21a9 (ceph): mds: fix "had dentry linked to wrong inode" warning
The reason of "had dentry linked to wrong inode" warning is that
Server::_rename_prepare() adds the destdir to the EM...
Yan, Zheng
02:17 AM Revision ce431eb5 (ceph): mds: split rename force journal check into separate function
The function will be used by a later patch that fixes rename rollback
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Yan, Zheng
02:17 AM Revision fb497135 (ceph): mds: force journal straydn for rename if necessary
rename may overwrite an empty directory inode and move it into stray
directory. MDS who has auth subtree beneath the ...
Yan, Zheng
02:17 AM Revision cd8d9107 (ceph): mds: don't set xlocks on dentries done when early reply rename
_rename_finish() does not send dentry link/unlink message to replicas.
We should prevent dentries that are modified b...
Yan, Zheng
02:15 AM Revision 87d85fa2 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:51 AM Revision e58fe519 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
01:50 AM Revision b429a3a3 (ceph): doc: Updated to add indep and first n to chooseleaf. Num only used with...
fixes: #3711
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
01:31 AM Revision f41010c4 (ceph): rgw: fix crash when missing content-type in POST object
Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
Yehuda Sadeh
01:22 AM Revision 26988038 (ceph): Merge branch 'wip-osd-down-out'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
01:14 AM Revision 09522e5a (ceph): rgw: fix crash when missing content-type in POST object
Fixes: #3941
This fixes a crash when handling S3 POST request and content type
is not provided.
Signed-off-by: Yehud...
Yehuda Sadeh
01:13 AM Revision 75f6ba56 (ceph): crush: implement get_children(), get_immediate_parent_id()
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
01:13 AM Revision 2b8ba7ca (ceph): osdmap: implement subtree_is_down() and containing_subtree_is_down()
Implement two methods to see if an entire subtree is down, and if the
containing parent node of type T of a given node...
Sage Weil
01:13 AM Revision b955a599 (ceph): mon: set limit so that we do not mark an entire down subtree out
Add new configurable 'mon osd down out subtree limit' so that you can
prevent marking out an entire subtree. If for ...
Sage Weil
01:12 AM Revision 2efdfb41 (ceph): mon: Elector: reset the acked leader when the election finishes and we ...
Failure to do so will mean that we will always ack the same leader during
an election started by another monitor. Th...
Joao Eduardo Luis
01:12 AM Revision 428ddb7d (ceph): Merge remote-tracking branch 'gh/wip-timecheck'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
12:58 AM Revision 81ed1bc7 (ceph): rados: add pool_ops workunit to cephtool test
Josh Durgin
12:53 AM Revision c79f7c6c (ceph): Merge branch 'wip-pool-delete'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
12:52 AM Revision 97b78924 (ceph): doc: update ceph man page link
It's not the wiki anymore, and the man page needed to be regenerated.
Signed-off-by: Josh Durgin <josh.durgin@inktan...
Josh Durgin
12:52 AM Revision 91a0bc89 (ceph): ceph, rados: update pool delete docs and usage
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin

01/28/2013

11:25 PM Revision 1a6197a7 (ceph): qa: fix mon pool_ops workunit
Use ! for clarity when commands are supposed to fail.
Check a few other cases that should fail, and correct deleting
...
Josh Durgin
10:54 PM Revision 826e5860 (ceph): cram: fix for runs with coverage enabled
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
10:50 PM Bug #3948 (Resolved): problems from leveldb static linkage and leveldb downgrade
Two days ago I upgraded one of my osds to 0.48.3 (see http://tracker.ceph.com/issues/3797) and everything worked fine... Corin Langosch
09:56 PM Revision 014fc6d6 (ceph): utime: fix narrowing conversion compiler warning in sleep()
Fix compiler warning:
./include/utime.h: In member function 'void utime_t::sleep()':
./include/utime.h:139:50: warnin...
Danny Al-Gaaf
09:56 PM Revision fb85c7f6 (ceph): rbd: don't ignore return value of system()
Check for the return value of system() and handle the error if needed
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bi...
Danny Al-Gaaf
09:56 PM Revision f74265b0 (ceph): configure: fix check for fuse_getgroups()
Check for fuse_getgroups() only in case we have found libfuse already.
Moved the check to the check for --with-fuse.
...
Danny Al-Gaaf
09:56 PM Revision 21673e8b (ceph): rbd-fuse: fix usage of conn->want
Fix usage of conn->want and FUSE_CAP_BIG_WRITES. Both need libfuse
version >= 2.8. Encapsulate the related code line ...
Danny Al-Gaaf
09:56 PM Revision 818e9a2c (ceph): rbd-fuse: fix printf format for off_t and size_t
Fix printf format for off_t and size_t to print the same on 32 and 64bit
systems. Use PRI* macros from inttypes.h.
S...
Danny Al-Gaaf
09:51 PM Bug #3930 (In Progress): ceph.spec: udev rule for rbd not in rpms
Anonymous
09:50 PM Bug #3945 (In Progress): osd: dynamically link to leveldb
Anonymous
04:56 PM Bug #3945 (Resolved): osd: dynamically link to leveldb
We hit a problem with quantal that underscored the danger of linking statically to libleveldb. After some discussion... Sage Weil
09:21 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
wip-2753-fsync-errors has a patch which makes fsync return an error if the client gets back an error from the Objecte... Greg Farnum
05:32 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
Looked at this briefly; I see that the way we do fsyncs is attached to a "FIXME: this could starve" comment, and I be... Greg Farnum
09:18 PM rbd Bug #3947 (Resolved): krbd: read zeroing freed bio?
This happened to me once before but I wasn't sure what
I did. Now I think I do know. This is with the new
request...
Alex Elder
08:52 PM Revision 4edef483 (ceph): Merge branch 'wip-java-api'
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sage Weil <...
Noah Watkins
08:45 PM Feature #3833: osd: improve recovery throttling
commit:d6db239ce5134a9c410554fb292c54981375c628 Sage Weil
08:20 PM Feature #3833: osd: improve recovery throttling
Commit? Ian Colle
07:32 PM Feature #3833 (Resolved): osd: improve recovery throttling
Sage Weil
07:27 PM Revision 0ded0fdf (ceph): mon: Monitor: rework timecheck code to clarify logic boundaries
The initial timecheck implementation relied on a cleanup function to
clean the state each time we changed epochs (or ...
Joao Eduardo Luis
06:13 PM Revision 3a089420 (ceph): doc: fix rbd create syntax
--dest-pool does not apply to create. Also remove extraneous
whitespace.
Signed-off-by: Josh Durgin <josh.durgin@ink...
Josh Durgin
06:08 PM RADOS Documentation #3830: crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggregates' i...
Can we get something moving on this bug, or give it to John to research? (and btw, firstn|indep has
been addressed u...
Dan Mick
05:20 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
This isn't something that's worth dealing with on the monitor side right now. Sage Weil
05:19 PM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
see #3376 Sage Weil
04:43 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
The conclusion:
- quantal had a newer libleveldb than we built statically into our debs
- downgrading made compac...
Sage Weil
05:02 PM rbd Bug #3946 (Resolved): rbd fsx failing in nightly
... Sage Weil
04:49 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
This appears to be something that was triggered and exacerbated by now-fixed issues. Until we can trigger it, I'm in... Sage Weil
08:04 AM Bug #3905: incomplete & stale (lost?) PGs
Due to some other issues and after a chat with Sage, I restarted all of my osds and this disappeared since. So I'm af... Faidon Liambotis
04:28 PM Bug #3944 (Resolved): ceph tool should prevent --admin-socket
Misremembering, I tried several 'ceph --admin-socket' commands rather than 'ceph --admin-daemon'; the result was that... Dan Mick
04:12 PM Bug #3810: btrfs corrupts file size on 3.7
sent a report to linux-btrfs Sage Weil
03:41 PM Bug #3810: btrfs corrupts file size on 3.7
Ok, this looks like a btrfs bug to me. On osd.3, the write extends the file size to 4194304, but the later stat sees... Sage Weil
10:59 AM Bug #3810: btrfs corrupts file size on 3.7
I ran part of the workload and found an inconsistent pg. I've uploaded ceph.log and logs from the primary and second... Mike Lowe
03:50 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Things I mentioned in comment 4 are still present; I'd like to either change them or update here why we're not. Dan Mick
02:11 PM rbd Bug #3427 (Fix Under Review): krbd: unmap does not remove block device properly
I have posted two patches for review, the second of which
should fix this problem. I have not actually reproduced
...
Alex Elder
12:50 PM devops Feature #3479: ceph-deploy: uninstall
commit:93082e82df56b01c524d0195e20068f6a6c8ca26 Sage Weil
12:49 PM devops Feature #3910: ceph-deploy: uninstall purge
ceph-deploy commit:93082e82df56b01c524d0195e20068f6a6c8ca26
Sage Weil
12:48 PM devops Feature #3341: ceph-disk-activate: Make --mount the default
I made it autodetect whether to mount or not based on whether you pass a directory or block device in. Simpler all a... Sage Weil
12:48 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Aaron Schulz wrote:
> Ian Colle wrote:
> > Aaron are you still seeing this?
>
> Sorry I need to get the time to ...
Sage Weil
10:24 AM Feature #3890 (In Progress): osd: create tool to extract pg info and pg log from filestore
Ian Colle
10:10 AM rgw Cleanup #3777 (In Progress): rgw: audit code for reading NULL env variables
reopening, see #3941. Yehuda Sadeh
10:09 AM rgw Bug #3941: s3tests crash on bobtail
Yeah, similar to that other issue (#3777)... Yehuda Sadeh
09:21 AM CephFS Feature #3540 (In Progress): mds: maintain per-file backpointers on first file object
Sam Lang
02:18 AM Revision 6bd676ea (ceph): mds: fix end check in Server::handle_client_readdir()
commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the ...
Yan, Zheng
02:18 AM Revision 5176cb71 (ceph): mds: check deleted directory in Server::rdlock_path_xlock_dentry
Commit b03eab22e4 (mds: forbid creating file in deleted directory)
is not complete, mknod, mkdir and symlink are miss...
Yan, Zheng
02:18 AM Revision 919df3bf (ceph): mds: lock remote inode's primary dentry during rename
commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry)
makes Server::handle_client_rename() xlocks r...
Yan, Zheng
02:18 AM Revision 67144973 (ceph): mds: allow journaling multiple root inodes in EMetaBlob
In some cases (rename, rmdir, subtree map), we may need to journal multiple
root inodes (/, mdsdir) in one EMetaBlob. Th...
Yan, Zheng
02:18 AM Revision 6daec530 (ceph): mds: introduce XSYN to SYNC lock state transition
If the lock is in XSYN state, Locker::simple_sync() first tries changing the
lock state to EXCL. If it fails to change lock st...
Yan, Zheng
02:18 AM Revision 659d1a39 (ceph): mds: properly set error_dentry for discover reply
If MDCache::handle_discover() receives a 'discover path' request but
cannot find the base inode, it should properly...
Yan, Zheng

01/27/2013

06:12 PM Revision c5478161 (ceph): mon: Elector: reset the acked leader when the election finishes and we ...
Failure to do so will mean that we will always ack the same leader during
an election started by another monitor. Th...
Joao Eduardo Luis
03:59 PM Bug #3810: btrfs corrupts file size on 3.7
I can do that, it will take somewhere between 12 and 24 hours to run. Mike Lowe
03:34 PM Bug #3810: btrfs corrupts file size on 3.7
Mike, would it be possible to reproduce this with debug file store = 20? That will tell us if what Ceph thinks it di... Sage Weil
02:10 PM Bug #3810: btrfs corrupts file size on 3.7
I deleted the rbd's with inconsistent pg's, recreated the rbd's, ran rsync with the same data set, made sure no btrfs... Mike Lowe
02:15 PM Revision d74b31b2 (ceph): mon: Monitor: force timecheck cleanup on finish_election()
Fixes: #3854
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
12:58 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Ok, everything still looks good :). Last question: should I upgrade my whole cluster to this version or will a new ar... Corin Langosch
12:01 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Ok, after around 10 minutes of runtime everything seems normal. Thanks for the fast and great help! :-)
ceph versi...
Corin Langosch
12:00 PM Bug #3797 (Fix Under Review): osd takes 100% cpu after upgrading from 0.48.2argonaut to the lates...
Sage Weil
11:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
that fixed it it seems. we could
- update argonaut and bobtail to newer leveldb :/
- link dynamically for quant...
Sage Weil
11:08 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
looks like leveldb spinning on background compaction.
his .2 package is quantals, which is leveldb 1.5.. newer than...
Sage Weil
10:56 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Output of gdb /usr/bin/ceph-osd $pid, then 'thread apply all bt' Corin Langosch
10:25 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Sage, here we go. Is it enough data or do you need more? I didn't disable the logging yet... Corin Langosch
10:00 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Corin-
Can you enable 'debug osd = 20' for a bit and attach that log? I think this is related to commit:830b8f...
Sage Weil
08:31 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Just another small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of ... Corin Langosch
05:50 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Just a small update - nothing changed so far. The cluster is still healthy, but the osd is still using 100% of one co... Corin Langosch
05:06 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Here's a nice graph to see the difference before/ after upgrade of disk activity....
The cluster is clean, no reco...
Corin Langosch
05:03 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Hi Sage,
sorry for the delay. I just shutdown the osd, upgraded it and started it again. It's again using almost 1...
Corin Langosch
10:29 AM rgw Bug #3941 (Resolved): s3tests crash on bobtail
... Sage Weil
09:28 AM Revision f666c617 (ceph): Revert "librbd: ensure header is up to date after initial read"
Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th...
Josh Durgin
09:28 AM Revision 10053b14 (ceph): librbd: establish watch before reading header
This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou...
Josh Durgin
09:28 AM Revision 76f93751 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-...
Dan Mick
09:28 AM Revision a16c6f3d (ceph): rbd: fix bench-write infinite loop
I/O was continuously submitted as long as there were few enough ops in
flight. If the number of 'threads' was high, or...
Josh Durgin
08:58 AM Revision 575a5866 (ceph): os/FileStore: only adjust up op queue for btrfs
We only need to adjust up the op queue limits during commit for btrfs,
because the snapshot initiation (async create)...
Sage Weil
08:47 AM Revision c9eb1b0a (ceph): common/HeartbeatMap: fix uninitialized variable
Introduced by me in 132045ce085e8584a3e177af552ee7a5205b13d8. Thank you,
valgrind!
Signed-off-by: Sage Weil <sage@i...
Sage Weil
06:35 AM Revision fa421cf5 (ceph): configure: remove -m4_include(m4/acx_pthread.m4)
Since we already use AC_CONFIG_MACRO_DIR, there is no need to include
m4/acx_pthread.m4 separately.
Signed-off-by: Danny Al-Gaaf <...
Danny Al-Gaaf
06:34 AM Revision 32276e9a (ceph): configure: fix RPM_RELEASE
Use git to get RPM_RELEASE only if this is a git repo
clone and if the git command is available on the system.
Signe...
Danny Al-Gaaf
04:49 AM Revision 341e6760 (ceph): osdmaptool: fix clitests
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:33 AM Revision 54c392e0 (ceph): osd: dump/display pool min_size
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
01:24 AM devops Feature #3479 (Resolved): ceph-deploy: uninstall
Sage Weil
01:24 AM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
Sage Weil

01/26/2013

09:46 PM Revision 1ba4c80b (ceph): qa/workunits/rbd/copy.sh: use non-deprecated --image-format option
--format is deprecated.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:45 PM Revision bbb86ec7 (ceph): mon: safety interlock for pool deletion
Require that the pool name be passed twice along with a force option
before we irreversibly delete an entire pool of...
Sage Weil
09:26 PM Revision 700bcede (ceph): Revert "mon: implement safety interlock for deleting pools"
This reverts commit c993ac9b1fa4037f4cc2674455728ee38a7c978b.
This is too hard to test. Requiring the pool name twi...
Sage Weil
09:18 PM Revision 6c407943 (ceph): Added libexpat dependency
Cesar Mello
09:13 PM Revision b5f81636 (ceph): osdthrasher: inject pause on a live (on in) osd
Sage Weil
08:58 PM devops Feature #3917 (Fix Under Review): ceph-dir-prepare command
Sage Weil
08:58 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
init system is a property of the host, not the disk.. doesn't belong in ceph-disk-prepare. Sage Weil
08:57 PM devops Feature #3911 (Fix Under Review): sysvinit: allow daemon enumeration via dirs
Sage Weil
08:57 PM devops Feature #3914 (Fix Under Review): ceph-disk-activate: support sysvinit
Sage Weil
08:54 PM devops Feature #3341 (Rejected): ceph-disk-activate: Make --mount the default
Sage Weil
08:53 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
ceph-deploy commit:8067dd0afa19ff7b7ca75f984dedc4213d3a4be8 Sage Weil
05:21 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Ian Colle wrote:
> Aaron are you still seeing this?
Sorry I need to get the time to try and reproduce this (and o...
Aaron Schulz
12:44 PM rbd Bug #3937 (Fix Under Review): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
A patch resolving this has been posted for review.
[PATCH 4/4] rbd: don't drop watch requests on completion
Alex Elder
12:43 PM rbd Bug #3940 (Fix Under Review): krbd: decrement obj request count when deleting
A patch resolving this has been posted for review. Alex Elder
08:05 AM rbd Bug #3940 (Resolved): krbd: decrement obj request count when deleting
The obj_request_count value keeps track of how many object requests
are associated with an image request. It is inc...
Alex Elder
07:57 AM rbd Bug #3939 (Duplicate): krbd: circular locking report in sysfs code
I intended to write this up before but don't think I did.
I'm getting a "possible circular locking dependency detect...
Alex Elder
05:27 AM Revision 7daf3724 (ceph): rbd-fuse: Original code from Andreas Bluemle
Signed-off-by: Andreas Bluemle <andreas.bluemle@itxperts.de> Andreas Bluemle
05:27 AM Revision 2a6dcabf (ceph): rbd-fuse: add simple RBD FUSE client
Currently written in C on FUSE hi-level interfaces, so error reporting
could be better. No serious work done for per...
Dan Mick
05:25 AM Revision aec2a474 (ceph): s3/php: update to 1.5? version of API
Something like v1.5 of the Amazon PHP library requires the AmazonS3
constructor to be given an array of parameters ra...
Dan Mick
02:07 AM Revision b2a473be (ceph): workunit for iogen
Signed-off-by: tamil <tamil.muthamizhan@inktank.com> Tamilarasi muthamizhan
01:59 AM Revision b98da75a (ceph): Merge branch 'wip-osd-msgr'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
01:58 AM Revision 17cd549a (ceph): mon: Monitor: timecheck: only output report to dout once
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
01:56 AM Revision 13fb1726 (ceph): mon: Monitor: track timecheck round state and report on health
Fixes: #3854
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
01:56 AM Revision aa85d914 (ceph): task: mon_clock_skew_check: increase timeout and kick it off only on stop
We were kicking-off the timeout as soon as we started; it's better however
to kick it off only when we are told to st...
Joao Eduardo Luis
01:56 AM Revision 673101c7 (ceph): task: mon_clock_skew_check: distinguish between on-going and finished c...
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
01:24 AM Revision e6bceeed (ceph): sharedptr_registry: remove extraneous Mutex::Locker declaration
For some reason, the lookup() retry loop (for when happened to
race with a removal and grab an invalid WeakPtr) locke...
Samuel Just
01:24 AM Revision 60888caf (ceph): FileStore: ping TPHandle after each operation in _do_transactions
Each completed operation in the transaction proves thread
liveness, a stuck thread should still trigger the timeouts....
Samuel Just
01:24 AM Revision 6b8a673f (ceph): OSD: use TPHandle in peering_wq
Implement _process overload with TPHandle argument and use
that to ping the hb map between pgs and between map epochs...
Samuel Just
01:24 AM Revision aa6d20aa (ceph): WorkQueue: add TPHandle to allow _process to ping the hb map
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 4f653d23999b24fc8c65a5...
Samuel Just
01:23 AM Revision e66a7505 (ceph): ReplicatedPG: handle omap > max_recovery_chunk
span_of fails if len == 0.
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from c...
Samuel Just
01:23 AM Revision 44f0407a (ceph): ReplicatedPG: correctly handle omap key larger than max chunk
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit c3dec3e30a85ecad0090c7...
Samuel Just
01:23 AM Revision 50fd6ac9 (ceph): ReplicatedPG: start scanning omap at omap_recovered_to
Previously, we started scanning omap after omap_recovered_to.
This is a problem since the break in the loop implies t...
Samuel Just
01:23 AM Revision 4b32eecb (ceph): ReplicatedPG: don't finish_recovery_op until the transaction completes
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 62a4b96831c1726043699db86a664dc6a0af8637)
Samuel Just
01:23 AM Revision da34c77b (ceph): ReplicatedPG: ack push only after transaction has completed
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 20278c4f77b890d5b2b95d2ccbeb4fbe106667ac)
Samuel Just
01:23 AM Revision f9381c74 (ceph): ObjectStore: add queue_transactions with oncomplete
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 4d6ba06309b80fb21de7bb5d12d5482e71de5f16)
Samuel Just
01:22 AM Revision e2560554 (ceph): common/HeartbeatMap: inject unhealthy heartbeat for N seconds
This lets us test code that is triggered by an unhealthy heartbeat in a
generic way.
Signed-off-by: Sage Weil <sage@...
Sage Weil
01:22 AM Revision cbe8b5bc (ceph): os/FileStore: add stall injection into filestore op queue
Allow admin to artificially induce a stall in the op queue. Forces the
thread(s) to sleep for N seconds. We pause f...
Sage Weil
01:22 AM Revision beb6ca44 (ceph): osd: do not join cluster if not healthy
If our internal heartbeats are failing, do not send a boot message and try
to join the cluster.
Signed-off-by: Sage ...
Sage Weil
01:22 AM Revision 1ecdfca3 (ceph): osd: hold lock while calling start_boot on startup
This probably doesn't strictly matter because start_boot doesn't need the
lock (currently) and few other threads shou...
Sage Weil
01:22 AM Bug #3938 (Can't reproduce): ceph-mon crashed on mixed bobtail-argonaut cluster (2 argonaut mons,...
7:09:03.310220 7f652087e700 1 mon.a@1(peon).osd e72 e72: 20 osds: 20 up, 20 in ... Samuel Just
01:21 AM Revision e120bf20 (ceph): osd: do not reply to ping if internal heartbeat is not healthy
If we find that our internal threads are stalled, do not reply to ping
requests. If we do this long enough, peers wi...
Sage Weil
01:21 AM Revision 5f396e2b (ceph): osd: reduce op thread heartbeat default 30 -> 15 seconds
If the thread stalls for 15 seconds, let our internal heartbeat fail.
This will let us internally respond more quickl...
Sage Weil
01:17 AM Revision fca288b7 (ceph): osd: improve sub_op flag points
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 73a969366c8bbd105579611320c43e2334907fef)
Sage Weil
01:17 AM Revision f13ddc8a (ceph): osd: refactor ReplicatedPG::do_sub_op
PULL is the only case where we don't wait for active.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked fro...
Sage Weil
01:17 AM Revision d5e00f96 (ceph): osd: make last state for slow requests more informative
Report on the last event string, and pass in important context for the
op event list, including:
- which peers were...
Sage Weil
01:17 AM Revision ab3a110c (ceph): osd: dump op priority queue state via admin socket
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 24d0d7eb0165c8b8f923f2d8896b156bfb5e0e60)
Sage Weil
01:17 AM Revision 43a65d04 (ceph): osd: simplify asok to single callback
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 33efe32151e04beaafd9435d7f86dc2eb046214d)
Sage Weil
01:16 AM Revision d0407986 (ceph): common/PrioritizedQueue: dump state to Formatter
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 514af15e95604bd241d2a98a97b938889c6876db)
Sage Weil
01:16 AM Revision 691fd505 (ceph): common/PrioritizedQueue: add min cost, max tokens per bucket
Two problems.
First, we need to cap the tokens per bucket. Otherwise, a stream of
items at one priority over time w...
Sage Weil
01:16 AM Revision a2b03fe0 (ceph): common/PrioritizedQueue: buckets -> tokens
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c549a0cf6fae78c8418a3b4b0702fd8a1e4ce482)
Sage Weil
01:16 AM Revision 612d75cd (ceph): note puller's max chunk in pull requests
this lets us calculate a cost value
(cherry picked from commit 128fcfcac7d3fb66ca2c799df521591a98b82e05)
Sage Weil
01:16 AM Revision 2224e413 (ceph): osd: add OpRequest flag point when commit is sent
With writeahead journaling in particular, we can get requests that
stay in the queue for a long time even after the c...
Sage Weil
01:16 AM Revision 5b5ca592 (ceph): osd: set PULL subop cost to size of requested data
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit a1bf8220e545f29b83d965f07b1abfbea06238b3)
Sage Weil
01:16 AM Revision 10651e4f (ceph): osd: use Message::get_cost() function for queueing
The data payload is a decent proxy for cost in most cases, but not all.
Signed-off-by: Sage Weil <sage@inktank.com>
...
Sage Weil
01:16 AM Revision 9735c6b1 (ceph): osd: debug msg prio, cost, latency
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit bec96a234c160bebd9fd295df5b431dc70a2cfb3)
Sage Weil
01:15 AM Revision c48279da (ceph): filestore: filestore_queue_max_ops 500 -> 50
Having a deep queue limits the effectiveness of the priority queues
above by adding additional latency.
Signed-off-b...
Sage Weil
01:15 AM Revision f47b2e8b (ceph): osd: target transaction size 300 -> 30
Small transactions make pg removal nicer to the op queue. It also slows
down PG deletion a bit, which may exacerbate...
Sage Weil
01:15 AM Revision 4947f0ef (ceph): os/FileStore: allow filestore_queue_max_{ops,bytes} to be adjusted at r...
The 'committing' ones too.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit cfe4b8519363f92f84...
Sage Weil
01:14 AM Revision ad6e6c91 (ceph): osd: make osd_max_backfills dynamically adjustable
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 101955a6b8bfdf91f4229f4ecb5d5b3da096e160)
Sage Weil
01:14 AM Revision 939b1855 (ceph): osd: make OSD a config observer
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 9230c863b3dc2bdda12c23202682a84c48f070a1)
Con...
Sage Weil
12:16 AM Revision b49440bc (ceph): doc: Added new, more comprehensive OSD/PG monitoring doc.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:15 AM Revision 5f210505 (ceph): doc: Trimmed some detail and added a x-ref to detailed osd/pg monitorin...
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:14 AM Revision 95cfdd46 (ceph): doc: Added osd/pg monitoring section to the index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:14 AM Revision d36a208c (ceph): doc: Added x-ref links.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins

01/25/2013

10:25 PM Revision 89386856 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
10:24 PM Revision 1af3578e (ceph): doc: fixed description for pg in control section.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:48 PM Revision 248835d4 (ceph): doc: wider sidebar, larger font, cleaned tip CSS
The sidebar is now about a hundred pixels wider and the fonts
are larger throughout. This works a lot better when yo...
Ross Turk
08:16 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
Just to close this out...
The fix (not repeating on ERANGE) has been committed:
commit c04306471ad93f1daf60771a...
Alex Elder
06:27 AM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
Josh rejected this. But since he said that the
change I proposed--to not do the loop--was OK
I suggest this bug sh...
Alex Elder
07:41 PM Revision 037900dc (ceph): sharedptr_registry: remove extraneous Mutex::Locker declaration
For some reason, the lookup() retry loop (for when we happened to
race with a removal and grab an invalid WeakPtr) locke...
Samuel Just
06:54 PM Revision 8bd306b9 (ceph): doc: Added Subdomain section.
fixes: #3778
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
05:40 PM Revision 8fef6fa3 (ceph): osd/PG: include map epoch in query results
Currently you can only infer it from the info.history.* fields.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:38 PM Revision e359a862 (ceph): osd: kill unused addr-based send_map()
Not used, old API, bad.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:38 PM Revision 5e2fab54 (ceph): osd: share incoming maps via Connection*, not addrs
Kill a set of parallel methods that are using the old addr/inst-based
msgr APIs, and instead use Connection handles. ...
Sage Weil
05:38 PM Revision 1bc419a7 (ceph): osd: pass new maps to dead osds via existing Connection
Previously we were sending these maps to dead osds via their old addrs
using a new outgoing connection and setting th...
Sage Weil
05:38 PM Revision 76705ace (ceph): osd: requeue osdmaps on heartbeat connections for cluster connection
If we receive an OSDMap on the cluster connection, requeue it for the
cluster messenger, and process it there where w...
Sage Weil
05:38 PM Revision a7059eb3 (ceph): msgr: add get_loopback_connection() method
Return the Connection* for ourselves, so we can queue messages for
ourselves.
Signed-off-by: Sage Weil <sage@inktank...
Sage Weil
05:38 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
I will be able to reproduce after Feb 8. Will do so if nobody reproduces it before then. Ivan Kudryavtsev
04:33 PM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
please set 'debug mds = 10' and upload mds log. To minimize mds log size, please truncate the mds log before executin... Zheng Yan
09:54 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
I made a mistake in the initial post: the number of files in the directory is 3.5K, not 35K. It's my netflow for last years, ... Ivan Kudryavtsev
12:11 AM CephFS Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
At #3936 I'm providing some benchmarks to show that IOPS/speed is OK for my installation and my hands are not perform... Ivan Kudryavtsev
04:25 PM Documentation #3222 (Resolved): DOC: Get an Object from a Primary OSD
Added a full exercise toward the end here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ John Wilkins
08:50 AM Documentation #3222 (In Progress): DOC: Get an Object from a Primary OSD
John Wilkins
04:24 PM Documentation #3333 (Resolved): doc: Explain "degraded" more
More extensive discussion here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ John Wilkins
04:24 PM Documentation #3331 (Resolved): doc: Where is my data placed?
Provided an entire exercise toward the end of this document: http://ceph.com/docs/master/rados/operations/monitoring-... John Wilkins
04:22 PM Documentation #3320 (Resolved): doc: What persistency does Ceph guarantee
Added more extensive discussions.
Here: http://ceph.com/docs/master/rados/operations/monitoring-osd-pg/ and
Her...
John Wilkins
03:25 PM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
OK, with Josh's help I finally managed to reproduce the
problem intentionally to check my fix.
I'm building it no...
Alex Elder
11:11 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I have confirmed that every time a request registered to linger
is re-submitted the osd client will call the callbac...
Alex Elder
08:07 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
I've decoded the osd request that's been provided to
rbd_osd_req_callback(). Its contents look completely
legitima...
Alex Elder
06:54 AM rbd Bug #3937: krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
Adding two things:
- this occurred during test 190 of the third consecutive pass
of xfstests with this in the teuth...
Alex Elder
05:04 AM rbd Bug #3937 (Resolved): krbd: crash in rbd_assert(osd_req == obj_request->osd_req)
Looking at a crash this morning in the new request code due
to this failed assertion in rbd_osd_req_callback():
...
Alex Elder
03:14 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
Ian Colle
01:51 PM Subtask #3840: osd: ack push after apply+commit
Ian Colle
01:50 PM Feature #3833: osd: improve recovery throttling
Ian Colle
11:48 AM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
pushed to master, still need to backport Samuel Just
11:40 AM Bug #3836 (Fix Under Review): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
D'oh. sharedptr_registry.hpp has an extraneous Mutex::Locker l(lock) declaration in the retry loop. It only actually... Samuel Just
11:41 AM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
John Wilkins
11:40 AM Documentation #3390 (Resolved): doc: add detail on different bucket algorithms
John Wilkins
11:12 AM rgw Feature #3669 (In Progress): rgw: support acl grants through http headers
Ian Colle
11:09 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
Merged into master, commit: b3a2e7e955547a863d29566aab62bcc480e27a65 caleb miles
11:07 AM rgw Feature #3667 (In Progress): rgw: support extra canned acl params
Ian Colle
10:55 AM Bug #3928 (Resolved): osd: peering workqueue tries to advance through *all* past osdmaps in one...
The timeout should be fixed by e0511f4f4773766d04e845af2d079f82f3177cb6. Samuel Just
10:55 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
Added info for subdomain call. John Wilkins
10:33 AM rgw Bug #3778 (In Progress): document procedure for enabling subdomain S3 api calls
John Wilkins
09:54 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
It's pretty likely that this is a server-side behavior rather than a client-side one. Keep that in mind when reproduc... Greg Farnum
12:00 AM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
rados -p rbd bench 120 write -t 16
shows about 90-110 MB/sec.
Ivan Kudryavtsev
09:52 AM rbd Bug #3654 (Resolved): libvirt: colons in ipv6 monitor addresses are not escaped when sent to qemu
Upstream commit c1509ab47edf61e9f20d11922526b9fca518d238 Josh Durgin
09:34 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
Yes, the ENXIO is expected. Assuming it's being propagated out to dd, and the test passes (outputs OK at the end of k... Josh Durgin
05:55 AM rbd Bug #3427: krbd: unmap does not remove block device properly
We had some discussion about whether an atomic bit
operation for this was sufficient, or whether a memory
barri...
Alex Elder
05:48 AM Revision a6ed62e3 (ceph): common: fix cli tests on usage
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:06 AM Revision 38871e27 (ceph): os/FileStore: only adjust up op queue for btrfs
We only need to adjust up the op queue limits during commit for btrfs,
because the snapshot initiation (async create)...
Sage Weil
05:06 AM Revision 5f9ab930 (ceph): Revert "filestore: disable extra committing queue allowance"
This reverts commit 44dca5c8c5058acf9bc391303dc77893793ce0be.
The allowance is not only added for btrfs as of commit...
Sage Weil
05:00 AM Revision d95b4313 (ceph): adminops.rst: revert changes for as-yet-unimplemented features
See wip-admin-api for the new specification
Fixes: #3724
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick
04:40 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
Heh. Funny markup. The numbered list came out of #s used for comments.
Anyway, I've just verified that the issue...
Alexandre Oliva
04:34 AM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
I've just verified that the problem is still present in 3.7.3, and I have a much simpler reproducer too.
mount -t ...
Alexandre Oliva
03:43 AM Revision bb860e49 (ceph): rados: remove unused "check_stdio" parameter
Signed-off-by: Dan Mick <dan.mick@inktank.com> DanTest MickTest
02:05 AM Bug #3810: btrfs corrupts file size on 3.7
Kernel was 3.7.1
Ran btrfsck on the partitions when the error first occurred with nothing found.
Tried your fix o...
Bill Kenworthy
01:54 AM Revision 234becd3 (ceph): rados: obey op_size for 'get'
Otherwise we try to read the whole object in one go, which doesn't bode
well for large objects (either non-optimal or...
Sage Weil
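The chunked read described in the entry above can be sketched as follows. This is an illustrative Python sketch, not the rados tool's code: `chunked_get`, `read_at`, and `make_reader` are hypothetical names, and `op_size` simply stands in for the option named in the commit title.

```python
def chunked_get(read_at, op_size):
    """Yield an object's data in pieces of at most op_size bytes.

    read_at(offset, length) is a hypothetical callback that returns up to
    `length` bytes starting at `offset`, and b"" once the end is reached.
    """
    if op_size <= 0:
        raise ValueError("op_size must be positive")
    offset = 0
    while True:
        chunk = read_at(offset, op_size)
        if not chunk:
            break  # end of object
        yield chunk
        offset += len(chunk)


def make_reader(data):
    """Build a read_at callback over an in-memory bytes value (demo only)."""
    def read_at(offset, length):
        return data[offset:offset + length]
    return read_at
```

Reading piece by piece keeps memory use bounded by `op_size` regardless of object size, at the cost of one request per chunk.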
01:31 AM Revision 3a5c70b8 (ceph): ceph_manager: turn long stall injection off by default
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:25 AM Revision 4f653d23 (ceph): WorkQueue: add TPHandle to allow _process to ping the hb map
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
01:25 AM Revision e0511f4f (ceph): OSD: use TPHandle in peering_wq
Implement _process overload with TPHandle argument and use
that to ping the hb map between pgs and between map epochs...
Samuel Just
01:25 AM Revision 0c1cc687 (ceph): FileStore: ping TPHandle after each operation in _do_transactions
Each completed operation in the transaction proves thread
liveness; a stuck thread should still trigger the timeouts....
Samuel Just
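The idea behind the three TPHandle entries above (ping a heartbeat after each unit of work so that long but progressing jobs are not mistaken for stuck threads) can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not Ceph's implementation: `Handle`, `do_transactions`, and the injected `clock` are hypothetical names.

```python
class Handle:
    """Hypothetical stand-in for a thread-pool heartbeat handle."""

    def __init__(self, timeout, clock):
        self.timeout = timeout
        self.clock = clock  # callable returning the current time in seconds
        self.deadline = clock() + timeout

    def ping(self):
        # The worker just proved it is alive, so push the deadline forward.
        self.deadline = self.clock() + self.timeout

    def is_healthy(self):
        return self.clock() <= self.deadline


def do_transactions(ops, handle):
    """Run each operation and ping the handle after it completes."""
    for op in ops:
        op()
        handle.ping()
```

With this shape, a batch whose total runtime exceeds the timeout stays healthy as long as each single operation finishes in time, while a thread stuck inside one operation never pings and still trips the timeout.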
12:24 AM Revision 006e7065 (ceph): osd_recovery: fix up incomplete test
- stop rados bench from cleaning up
- flush pg stats
- fix sleep call
One or more of these helped fix this test, don...
Sage Weil
12:23 AM Revision 20af01f2 (ceph): ceph_manager: fix get_num_active_recovered()
The states now have 'backfill' *or* 'recover' in them. Sage Weil
12:20 AM Revision 79d599cf (ceph): java: remove extra whitespace
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins

01/24/2013

11:59 PM rbd Bug #3936: rbd: Strange dd speed behaviour (server side issue?)
I also tried to do:
dd if=/dev/rbd/rbd/test of=/dev/null bs=4M - the same situation.
Ivan Kudryavtsev
11:57 PM rbd Bug #3936 (Rejected): rbd: Strange dd speed behaviour (server side issue?)
I have an installation with 3 nodes and 15 OSDs (5 on each), each on a separate drive (with SSD cache), journal in RAMFS. XFS as ba... Ivan Kudryavtsev
11:46 PM CephFS Bug #3935 (Can't reproduce): kclient: Big directory access bugs (multiple), mixed 32- and 64-bit ...
I have the following directory structure in ceph fs:... Ivan Kudryavtsev
11:21 PM Revision b150e8e3 (ceph): workunit: pass java path as env variable
The libcephfs-java test needs this. Sage Weil
11:13 PM Revision 6f0e1137 (ceph): libcephfs-java test: use provided environment
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:41 PM Bug #3810: btrfs corrupts file size on 3.7
Bill Kenworthy wrote:
> Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after ...
Sage Weil
09:36 PM Bug #3810: btrfs corrupts file size on 3.7
Version was 55.1 when created and the error occurred, now updated to 56.1 (on gentoo) after error
It's organised as 5...
Bill Kenworthy
08:55 PM Bug #3810: btrfs corrupts file size on 3.7
Bill Kenworthy wrote:
> I have been hit by the same thing ... is there any information you need before I try and fix...
Sage Weil
06:18 PM Bug #3810: btrfs corrupts file size on 3.7
I have been hit by the same thing ... is there any information you need before I try and fix it further?
I've tried...
Bill Kenworthy
01:35 PM Bug #3810: btrfs corrupts file size on 3.7
How about this object instead:
2013-01-23 18:41:31.336722 osd.7 149.165.228.11:6800/28046 159 : [ERR] 2.202 osd.0: s...
Mike Lowe
01:16 PM Bug #3810: btrfs corrupts file size on 3.7
the going theory is that this is triggered by btrfs scrub. can we confirm this somehow? Sage Weil
11:03 AM Bug #3810: btrfs corrupts file size on 3.7
Samuel Just wrote:
> I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and...
Mike Lowe
10:17 AM Bug #3810: btrfs corrupts file size on 3.7
Additional info: btrfs scrubs were done while the OSDs were active, which may or may not have had a negative effect. ... Mike Lowe
09:31 PM Revision 40ae8cea (ceph): common: only show -d, -f options for daemons
Fixes: #3073
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
09:13 PM Revision 7e7130da (ceph): doc: Syntax fixes.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:58 PM Revision b51bfdf0 (ceph): doc: Updated usage for Bobtail.
fixes: #3831
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
08:57 PM Revision 1d71d052 (ceph): doc: Updated usage for Bobtail.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:55 PM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
commit d95b4313de1614fd85265879e6d7ddadd5268af2
Dan Mick
08:45 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
Since the docs are in wip-admin-api, this amounts to rolling doc/radosgw/admin/adminops.rst back to its state as of 0... Dan Mick
01:41 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
Sage Weil
01:38 PM rgw Bug #3724: docs refer to non-implemented features of the radosgw-admin rest api
John - any update? Ian Colle
08:54 PM Revision b0a5fe94 (ceph): java: support ceph_get_file_pool_name
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
08:50 PM Revision 6a859bcd (ceph): ceph_manager: use 80/70 as pause_long, pause_check_after defaults
OSD::op_tp suicides after 150.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
08:47 PM Revision 6b272e0f (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
08:46 PM Revision 42d92b73 (ceph): doc: Added example of ext4 user_xattr mount option.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:43 PM Bug #3885: osd: osd-recovery-incomplete qa test failing
(the above commit is in the teuthology code) Dan Mick
04:28 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
fixed, mostly by commit:20af01f23ba932cb97cb40bba89bff546e10c461, which may fix up some of the other spurious failure... Sage Weil
11:13 AM Bug #3885 (In Progress): osd: osd-recovery-incomplete qa test failing
Sage Weil
08:30 PM Revision b3a2e7e9 (ceph): rgw_rest: Make fallback uri configurable.
Some HTTP servers, notably lighttpd, do not set SCRIPT_URI, so make the fallback
string configurable.
Signed-off-by: ca...
caleb miles
08:29 PM Revision b0f27a8f (ceph): librbd: Allow get_lock_info to fail
If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need...
Dan Mick
07:33 PM Revision 0c6d5a9d (ceph): java: support fchmod
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:33 PM Revision 9cefa969 (ceph): java: add missing chmod unmounted test
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:33 PM Revision 487bacdb (ceph): java: fix exception name typo
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:33 PM Revision 352652b6 (ceph): libcephfs: document ERANGE rv for get_file_pool_name
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:27 PM Revision 4b3bcb92 (ceph): java: support stat()
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
06:52 PM Revision 00cfe1d3 (ceph): common/HeartbeatMap: fix uninitialized variable
Introduced by me in 132045ce085e8584a3e177af552ee7a5205b13d8. Thank you,
valgrind!
Signed-off-by: Sage Weil <sage@i...
Sage Weil
06:41 PM Revision b9f58baa (ceph): libcephfs-java test: jar files are in /usr/local/share/java, it seems
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:35 PM Revision f9f31aae (ceph): wireshark: fix indention
Fix indention.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
Danny Al-Gaaf
06:35 PM Revision 3e9cc0d4 (ceph): wireshark: fix guint64 print format handling
Use G_GUINT64_FORMAT to handle print format of guint64 correctly.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect...
Danny Al-Gaaf
06:08 PM Revision 0f24dca2 (ceph): ceph_manager: use do_rados for rmpool
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
04:54 PM devops Bug #3934: ceph-deploy new should require at least one host name
If no hosts are specified on the command line, a ceph.conf file is created without any monitors listed. No errors or... Anonymous
04:51 PM devops Bug #3934 (Resolved): ceph-deploy new should require at least one host name
Anonymous
04:14 PM devops Bug #3933: ceph-deploy gatherkeys silently fails if no host is specified
If no host is specified and ceph.conf exists gatherkeys will fail, but not report any error. Anonymous
04:12 PM devops Bug #3933 (Resolved): ceph-deploy gatherkeys silently fails if no host is specified
Anonymous
02:52 PM Bug #3930 (Resolved): ceph.spec: udev rule for rbd not in rpms
The udev rule for kernel rbd (udev/50-rbd.rules in ceph.git) should be packaged. It's already in the debs: debian/lib... Josh Durgin
01:41 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
Sage Weil
01:39 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
Any update? Ian Colle
01:41 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
Sage Weil
01:33 PM rgw Bug #3450: WRITE permission only doesn't allow proper multi-part upload
Needs to be part of larger overall discussion about the intent of subusers. Ian Colle
01:41 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
Sage Weil
01:38 PM rgw Bug #3706: rgw functional test testSlashInName failed in nightly
Need to see if happens again and then find reproducer. Ian Colle
01:41 PM rgw Feature #2804: rgw: disallow running multiple gateways on the same fastcgi socket
Sage Weil
01:41 PM rgw Feature #3074: radosgw needs --help support
Sage Weil
01:41 PM rgw Bug #2366: rgw: bucket index update rely on pg state
Sage Weil
01:41 PM rgw Bug #2650: rgw: swift key creation overrides subuser access mask
Sage Weil
01:41 PM rgw Bug #1777: rgw: user info modification is not atomic
Sage Weil
01:41 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
Sage Weil
01:14 PM rgw Bug #1779: rgw: swift auth returns wrong error code when unexisting user is given
Work in course with other swift changes, but not a driver. Ian Colle
01:40 PM rgw Feature #3366: rgw: dr: define management api
Caleb to get out updated document for review. Ian Colle
01:37 PM rgw Bug #3620: rgw:improve multiple user access keys scalability
Caleb to review. Ian Colle
01:36 PM rgw Bug #3682 (Resolved): valgrind errors seen when running rgw tests in nightlies
Increased time in tests and has not occurred. Ian Colle
01:35 PM rgw Bug #3628 (Resolved): rgw: leak of object parts on partial upload
Fixed in bobtail Ian Colle
01:34 PM rgw Bug #3485 (In Progress): rgw: unique user emails not enforced
Ian Colle
01:34 PM Bug #3906: ceph-mon leaks memory during peering
the logs indicate this may be related to failed auth connection attempts spamming the monitor. Sage Weil
11:43 AM Bug #3906: ceph-mon leaks memory during peering
we need to reproduce this on a large internal cluster, with many osds and even more pgs. Sage Weil
09:38 AM Bug #3906: ceph-mon leaks memory during peering
I believe this to be related to #3609 Joao Eduardo Luis
01:32 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
commit:40ae8ceab58b4c05e01dc9f7809728a592cc4f0d actually Sage Weil
01:30 PM rgw Bug #3073 (Resolved): radosgw-admin: is not a daemon, should not have -d/-f options
commit:b878b2c6e9ee41de25faf4dfdd7285dcb01b36e8 Sage Weil
01:26 PM rgw Bug #3073: radosgw-admin: is not a daemon, should not have -d/-f options
Change common init Ian Colle
01:30 PM rgw Bug #3365: Broken metadata (duplicated as CSV)
Aaron, are you still seeing this? Ian Colle
01:29 PM rgw Bug #3365 (Need More Info): Broken metadata (duplicated as CSV)
Ian Colle
01:21 PM rgw Feature #2490: rgw-admin: only register watch when needed
Performance improvement. Ian Colle
01:21 PM CephFS Bug #1878: ceph.ko doesn't setattr (lchown, utimes) on symlinks
This is still present in 3.6.11 (I'll know about 3.7.* soon). I suspect this may have to do with failing to mark met... Alexandre Oliva
01:18 PM rgw Bug #2482 (Rejected): rgw: duplicate content-length results in 400
Apache issue. Ian Colle
01:14 PM rgw Bug #1906 (Can't reproduce): rgw: total_time isn't logged consistently
Ian Colle
01:13 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
John Wilkins
01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
Ian Colle
01:10 PM rgw Bug #1673: rgw: mod_fastcgi needs to be backward compatible
Canonical cannot take our changes upstream until we solve this issue. Ian Colle
11:16 AM rgw Cleanup #3929 (New): s3-tests: refactor all test_post_* tests
These tests mostly do the same thing and can be cleaned up; there is no need to duplicate the same code across all of them. Yehuda Sadeh
10:58 AM CephFS Feature #3821 (In Progress): qa: run backuppc as part of qa suite
Ekapol Rojpiboonphun wrote:
> Just to make sure that I will be on this along the line of what you might already have...
Sage Weil
10:52 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
Just to make sure that I will be on this along the line of what you might already have in mind. (More details please ... Anonymous
09:56 AM CephFS Feature #3821: qa: run backuppc as part of qa suite
Download/install backuppc and get it into suite. Ian Colle
10:32 AM Bug #3928 (In Progress): osd: peering workqueue tries to advance through *all* past osdmaps in ...
Samuel Just
10:02 AM Bug #3928 (Resolved): osd: peering workqueue tries to advance through *all* past osdmaps in one...
Sage Weil
10:10 AM Bug #3905: incomplete & stale (lost?) PGs
Sounds like a combination of crush map and rules that aren't behaving well together — "incomplete" means the PG doesn... Greg Farnum
09:42 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
The olog stuff is fixed in bobtail, and won't be backported to argonaut.
I'm not sure what the root cause of the h...
Sage Weil
08:42 AM Bug #3854: mon: clock skew tests failing on master
Happened again on QA, reopening while testing a new patch. Joao Eduardo Luis
08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
Hey! I just looked at the test, and here's how it ends:
# remove snapshot and detect error from mapped snapshot
...
Alex Elder
08:15 AM rbd Bug #3927: krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
This is the relevant portion of the yaml file:
- workunit:
clients:
all:
- rbd/map-unmap.sh
...
Alex Elder
08:09 AM rbd Bug #3927 (Closed): krbd: I/O errors (ENXIO) during rbd/kernel.sh workunit
I'm seeing ENXIO errors at what I believe to be the "rbd/kernel.sh
teuthology workunit while testing the new request co...
Alex Elder
05:49 AM rbd Feature #3926 (Resolved): krbd: use slab allocation for common data structures
There are some common data structures--like image and object
requests--that are very frequently allocated and would ...
Alex Elder
05:29 AM rbd Bug #3925 (Resolved): krbd: sysfs write lockdep warnings
... Alex Elder
03:42 AM Revision 2f192eaf (ceph): TestRados expects rollback, not snap_rollback
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
02:50 AM Revision 67c77577 (ceph): PendingReleaseNotes: pool removal cli changes
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:49 AM Revision 87fe35f6 (ceph): Merge remote-tracking branch 'gh/wip-rm-pool'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
02:47 AM Revision 64b9dd08 (ceph): Merge remote-tracking branch 'gh/wip-3832-oc-flushrange'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
02:43 AM Revision 9b56f367 (ceph): Merge remote-tracking branch 'gh/wip_heartbeat'
Sage Weil
02:40 AM Revision 62579eef (ceph): Merge branch 'wip-osd-hb'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
01:44 AM Revision ec5a1455 (ceph): ceph_manager: default chance_down to 0.4
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:40 AM Revision 566ae533 (ceph): ceph_manager: add filestore and heartbeat stalls
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
01:22 AM Revision 5d66c9ab (ceph): Use ceph git repo instead of github.
This code change is so that instead of pulling the tarball from github,
which can be unreliable at times, it instead uses...
Sandon Van Ness
12:55 AM Revision d6db239c (ceph): Merge remote-tracking branch 'upstream/wip_push_after_complete'
Reviewed-by: Sage Weil <sage@inktank.com> Samuel Just

01/23/2013

10:00 PM devops Feature #3229 (Resolved): Support clean ceph-fuse fstab automounting
implemented this already; /sbin/mount.fuse.ceph is in bobtail. Sage Weil
09:59 PM devops Feature #3924 (Resolved): ceph-deploy: package it
Sage Weil
09:57 PM devops Feature #3923 (Resolved): ceph-deploy: discover HOST
somewhat similar to new, except we pull the ceph.conf from a remote host. Sage Weil
09:57 PM devops Feature #3922 (Resolved): ceph-deploy: version command
Sage Weil
09:57 PM devops Feature #3921 (Resolved): ceph-deploy: support RPM-based distros
Sage Weil
09:57 PM devops Feature #3920 (Resolved): ceph-deploy: support other deb-based distros
Sage Weil
09:56 PM devops Feature #3919 (Resolved): ceph-deploy: remove upstart dependency
eliminate whatever remaining upstart dependencies are in ceph-deploy, so that upstart and sysvinit are both viable. Sage Weil
09:55 PM devops Feature #3918 (Resolved): ceph-deploy: osd create HOST:DIR[:JOURNAL]
trigger ceph-dir-prepare instead of ceph-disk-prepare. Sage Weil
09:54 PM devops Feature #3917 (Resolved): ceph-dir-prepare command
ceph-dir-prepare <dir> [journal] or similar
somewhat similar to ceph-disk-prepare, but simpler.
- allocate osd ...
Sage Weil
09:54 PM devops Feature #3916 (Resolved): ceph-disk-activate: non-upstart trigger (udev?)
Sage Weil
09:53 PM devops Feature #3915 (Rejected): ceph-disk-prepare: support sysvinit or upstart
Sage Weil
09:53 PM devops Feature #3914 (Resolved): ceph-disk-activate: support sysvinit
Sage Weil
09:52 PM devops Feature #3913 (Resolved): ceph-deploy: break mon into create/destroy
Sage Weil
09:52 PM devops Feature #3912 (Resolved): ceph-deploy: break osd into create/destroy
Actually, we want
ceph-deploy osd prepare HOST:DEV[:JOURNAL]
ceph-deploy osd activate HOST:DEVORDIR
and perh...
Sage Weil
09:52 PM devops Feature #3911 (Resolved): sysvinit: allow daemon enumeration via dirs
Sage Weil
09:52 PM devops Feature #3910 (Resolved): ceph-deploy: uninstall purge
Sage Weil
09:52 PM devops Feature #3909 (Resolved): ceph-deploy: update install for bobtail/argonaut urls
Sage Weil
09:51 PM devops Feature #3907 (Resolved): ceph-deploy: be verbose about what is run and what is done (with -q)
Sage Weil
08:49 PM Revision 8a97eef1 (ceph): ReplicatedPG: handle omap > max_recovery_chunk
span_of fails if len == 0.
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
08:35 PM Revision c3dec3e3 (ceph): ReplicatedPG: correctly handle omap key larger than max chunk
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
08:15 PM Revision 09c71f2f (ceph): ReplicatedPG: start scanning omap at omap_recovered_to
Previously, we started scanning omap after omap_recovered_to.
This is a problem since the break in the loop implies t...
Samuel Just
08:10 PM Bug #3904: FAILED assert(want_acting.empty())
I have a theory:
reset
started
primary
getinfo
got infos
getlog
calc_acting succeeds, choose_acting fails,...
Sage Weil
02:48 PM Bug #3904 (Resolved): FAILED assert(want_acting.empty())
Ceph 0.56.1 on Ubuntu 12.04, standard ceph.com packages. Multiple OSDs started getting marked down/crashing out, this... Faidon Liambotis
07:50 PM Revision 20278c4f (ceph): ReplicatedPG: ack push only after transaction has completed
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
07:50 PM Revision 62a4b968 (ceph): ReplicatedPG: don't finish_recovery_op until the transaction completes
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
07:50 PM Revision 4d6ba063 (ceph): ObjectStore: add queue_transactions with oncomplete
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
06:48 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
commit:64b9dd088d8f20019d6c1042895676b2ec57077e Sage Weil
06:42 PM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
Sage Weil
06:42 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
commit:62579eefba057eea200d8a9a3f6b3d8bca29b8b4 Sage Weil
06:31 PM Bug #3906 (Won't Fix): ceph-mon leaks memory during peering
I've done multiple OSD swaps with both 0.55 & 0.56/0.56.1 on a cluster with > 16k PGs. In those, I've noticed multipl... Faidon Liambotis
06:27 PM Bug #3905 (Can't reproduce): incomplete & stale (lost?) PGs
I added a bunch of new OSDs into my Ceph cluster (0.56.1 on Ubuntu 12.04 LTS) about 72h ago. Simultaneously, I marked... Faidon Liambotis
05:14 PM Revision a972fd40 (ceph): mds: fix end check in Server::handle_client_readdir()
commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the ...
Yan, Zheng
04:49 PM Revision c061e841 (ceph): rados: safety interlock on 'rmpool' command
This is a very easy way for a user to do a lot of damage with no way back.
Make sure they mean it.
Signed-off-by: Sa...
Sage Weil
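A safety interlock of the kind described in the entry above might look like the following. This is a generic Python sketch of the pattern only, not the rados CLI's actual syntax: the function name, the requirement to repeat the pool name, and the `--really-delete` flag are all hypothetical choices made for illustration.

```python
CONFIRM_FLAG = "--really-delete"  # hypothetical flag, not the real CLI option


def removal_allowed(pool, repeated_pool, flags):
    """Return True only if the caller clearly means to delete `pool`.

    The caller must type the pool name a second time and also pass an
    explicit confirmation flag; either one alone is not enough.
    """
    if not pool:
        return False
    if pool != repeated_pool:
        return False  # the second copy of the name guards against typos
    return CONFIRM_FLAG in flags
```

Requiring two independent confirmations makes it hard to destroy a pool through a single mistyped or half-remembered command.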
04:40 PM Revision c993ac9b (ceph): mon: implement safety interlock for deleting pools
This is a very easy way for users to accidentally do a *lot* of damage.
Make it an annoying manual process to actuall...
Sage Weil
02:43 PM Bug #3903 (Resolved): OSDMap::raw_pg_to_pps causes pools to have similar mappings
The pool should be added in a way to ensure that different pools have independent mappings. Samuel Just
02:27 PM Revision 022a5254 (ceph): osd: drop newlines from event descriptions
These produce extra newlines in the log.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.j...
Sage Weil
02:22 PM Revision ebc93a87 (ceph): OSD: do deep_scrub for repair
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(cherry picked...
Samuel Just
02:22 PM Revision 32527fa3 (ceph): ReplicatedPG: ignore snap link info in scrub if nlinks==0
nlinks==0 implies that the replica did not send snap link information.
Signed-off-by: Samuel Just <sam.just@inktank.c...
Samuel Just
02:22 PM Revision 13e42265 (ceph): osd/PG: fix osd id in error message on snap collection errors
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 381e25870f26fad144ecc2fb99710498e3a7a1d4)
Sage Weil
02:22 PM Revision e3b6191f (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 665577a88b98390b9db0f9991836d10ebdd8f4cf)
Sage Weil
02:21 PM Revision 353b7341 (ceph): ReplicatedPG: compare nlinks to snapcolls
nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ...
Samuel Just
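The invariant stated in the entry above, nlinks should be 1 + snapcolls.size(), can be written as a small check. This Python sketch is illustrative only: `check_nlinks` is a hypothetical name, and treating nlinks == 0 as "no information, skip the check" follows the separate "ignore snap link info in scrub if nlinks==0" entry in this log rather than anything confirmed about the real code.

```python
def check_nlinks(nlinks, snapcolls):
    """Compare an object's hardlink count against its snap collections.

    Returns None when nlinks is 0 (assumed to mean the replica sent no
    link information), otherwise True if nlinks == 1 + len(snapcolls).
    """
    if nlinks == 0:
        return None  # nothing to compare against
    return nlinks == 1 + len(snapcolls)
```

For example, an object present in two snap collections is expected to have three hardlinks: one for the head and one per snap collection.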
02:21 PM Revision 33d5cfc8 (ceph): ReplicatedPG/PG: check snap collections during _scan_list
During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc...
Samuel Just
02:21 PM Revision bea783bd (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit b85687475fa2ec74e5429d92ee64eda2051a256c)
Samuel Just
02:21 PM Revision 0c48407b (ceph): PG: move auth replica selection to helper in scrub
Signed-off-by: Samuel Just <sam.just@inktank.com>
(cherry picked from commit 39bc65492af1bf1da481a8ea0a70fe7d0b4b17a3)
Samuel Just
02:21 PM Revision c3433ce6 (ceph): mon: note scrub errors in health summary
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 8e33a8b9e1fef757bbd901d55893e9b84ce6f3fc)
Sage Weil
02:21 PM Revision 90c6edd0 (ceph): osd: fix rescrub after repair
We were rescrubbing if INCONSISTENT is set, but that is now persistent.
Add a new scrub_after_recovery flag that is r...
Sage Weil
02:21 PM Revision 0696cf57 (ceph): osd: note must_scrub* flags in PG operator<<
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d56af797f996ac92bf4e0886d416fd358a2aa08e)
Sage Weil
02:21 PM Revision 1541ffe4 (ceph): osd: base INCONSISTENT pg state on persistent scrub errors
This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we r...
Sage Weil
02:21 PM Revision 60910125 (ceph): osd: fix scrub scheduling for 0.0
The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
Signed-...
Sage Weil
02:21 PM Revision 0961a3a8 (ceph): osd: note last_clean_scrub_stamp, last_scrub_errors
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 389bed5d338cf32ab14c9fc2abbc7bcc386b8a28)
Sage Weil
02:21 PM Revision 8d823045 (ceph): osd: add num_scrub_errors to object_stat_t
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 2475066c3247774a2ad048a2e32968e47da1b0f5)
Sage Weil
02:20 PM Revision 3a1cd6e0 (ceph): osd: add last_clean_scrub_stamp to pg_stat_t, pg_history_t
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit d738328488de831bf090f23e3fa6d25f6fa819df)
Sage Weil
02:20 PM Revision 7e5a899b (ceph): osd: fix object_stat_sum_t dump signedness
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 6f6a41937f1bd05260a8d70b4c4a58ecadb34a2f)
Sage Weil
02:20 PM Revision e252a313 (ceph): osd: change scrub min/max thresholds
The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' wou...
Sage Weil
02:20 PM Revision 33aa64ee (ceph): osd/PG: remove useless osd_scrub_min_interval check
This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we ...
Sage Weil
02:20 PM Revision fdd0c1ec (ceph): osd: move scrub schedule random backoff to separate helper
Separate this from the load check, which will soon vary depending on the
PG.
Signed-off-by: Sage Weil <sage@inktank.c...
Sage Weil
02:20 PM Revision 9ffbe268 (ceph): osd/PG: trigger scrub via scrub schedule, must_ flags
When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. ...
Sage Weil
02:19 PM Revision cffb1b22 (ceph): osd/PG: introduce flags to indicate explicitly requested scrubs
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 1441095d6babfacd781929e8a54ed2f8a4444467)
Sage Weil
02:19 PM Revision 438e3dfc (ceph): osd/PG: move scrub schedule registration into a helper
Simplifies callers, and will let us easily modify the decision of when
to schedule the PG for scrub.
Signed-off-by: ...
Sage Weil
01:40 PM Revision acb47e4d (ceph): os/FileStore: only flush inline if write is sufficiently large
Honor filestore_flush_min in the inline flush case.
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Re...
Sage Weil
01:40 PM Revision 15a1ced8 (ceph): os/FileStore: fix compile when sync_file_range is missing
If sync_file_range is not present, we always close inline, and flush
via fdatasync(2).
Fixes compile on ancient plat...
Sage Weil
01:39 PM Revision 9dddb9d8 (ceph): osd: set pg removal transactions based on configurable
Use the osd_target_transaction_size knob, and gracefully tolerate bogus
values (e.g., <= 0).
Signed-off-by: Sage Wei...
Sage Weil
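One way to "gracefully tolerate bogus values" when sizing removal transactions is sketched below. Falling back to a default is an assumption here; the commit text is truncated and does not say what the real code does with a value <= 0. The helper names are hypothetical, and the default of 30 is borrowed from the "target transaction size 300 -> 30" entry elsewhere in this log.

```python
DEFAULT_TARGET = 30  # assumed fallback, taken from the 300 -> 30 entry


def effective_target(configured, default=DEFAULT_TARGET):
    # Treat zero or negative settings as bogus and use the default instead.
    return configured if configured > 0 else default


def removal_batches(objects, configured):
    """Split the objects of a PG into transaction-sized batches."""
    size = effective_target(configured)
    return [objects[i:i + size] for i in range(0, len(objects), size)]


objects = [f"obj{i}" for i in range(70)]
normal = removal_batches(objects, 25)   # batches of 25, 25, 20
bogus = removal_batches(objects, -5)    # falls back to 30: 30, 30, 10
```

Without the guard, a non-positive batch size would either loop forever or raise, depending on how the loop is written.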
01:38 PM Revision c30d231e (ceph): osd: make pg removal thread more friendly
For a large PG these are saturating the filestore and journal queues. Do
them synchronously to make them more friend...
Sage Weil
01:38 PM Revision b2bc4b95 (ceph): os: move apply_transactions() sync wrapper into ObjectStore
This has nothing to do with the backend implementation.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked f...
Sage Weil
01:38 PM Revision 6d161b57 (ceph): os: add apply_transaction() variant that takes a sequencer
Also, move the convenience wrappers into the interface and funnel through
a single implementation.
Signed-off-by: Sa...
Sage Weil
12:31 PM Support #3902 (Closed): S3-tests need to cleanup after themselves
On Congress, DHO has hit the max number of users due to s3-tests not cleaning up after execution. Could we have the s... JuanJose Galvez
11:27 AM rbd Tasks #2853 (In Progress): krbd: read path
With my patches for the basic new request code now
out for initial review, I've started working on this
feature. I...
Alex Elder
11:20 AM rbd Subtask #2852 (In Progress): krbd: open parent on open
The many patches have now been posted for review.
Included in that is a small, temporary patch that enables
this ...
Alex Elder
05:21 AM rbd Fix #3665: librbd: deadlock during flatten
possibly here: ... Sage Weil
05:20 AM Revision 657df852 (ceph): os/FileStore: add stall injection into filestore op queue
Allow admin to artificially induce a stall in the op queue. Forces the
thread(s) to sleep for N seconds. We pause f...
Sage Weil
05:20 AM Revision 132045ce (ceph): common/HeartbeatMap: inject unhealthy heartbeat for N seconds
This lets us test code that is triggered by an unhealthy heartbeat in a
generic way.
Signed-off-by: Sage Weil <sage@...
Sage Weil
02:03 AM Revision a4e78652 (ceph): osd: do not join cluster if not healthy
If our internal heartbeats are failing, do not send a boot message and try
to join the cluster.
Signed-off-by: Sage ...
Sage Weil
02:01 AM Revision c406476c (ceph): osd: hold lock while calling start_boot on startup
This probably doesn't strictly matter because start_boot doesn't need the
lock (currently) and few other threads shou...
Sage Weil
01:56 AM Revision ad6b2311 (ceph): osd: do not reply to ping if internal heartbeat is not healthy
If we find that our internal threads are stalled, do not reply to ping
requests. If we do this long enough, peers wi...
Sage Weil
01:53 AM Revision 61eafffc (ceph): osd: reduce op thread heartbeat default 30 -> 15 seconds
If the thread stalls for 15 seconds, let our internal heartbeat fail.
This will let us internally respond more quickl...
Sage Weil
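The heartbeat entries above describe a simple pattern: worker threads record when they last made progress, and the OSD stops answering peer pings once any worker has been silent longer than a grace period. A minimal sketch of that idea follows; the class and function names are illustrative rather than Ceph's actual interfaces, and time is passed in explicitly to keep the example deterministic.

```python
class HeartbeatMap:
    """Tracks the last time each internal worker reported progress."""

    def __init__(self, grace):
        self.grace = grace          # seconds a worker may stay silent
        self.last_touch = {}

    def touch(self, worker, now):
        self.last_touch[worker] = now

    def is_healthy(self, now):
        return all(now - t <= self.grace for t in self.last_touch.values())


def handle_ping(hb, now):
    # Stay silent when unhealthy so that peers eventually report us down.
    return "ping_reply" if hb.is_healthy(now) else None


hb = HeartbeatMap(grace=15)         # 15 s, matching the new op thread default
hb.touch("op_thread", now=0)
reply_at_10 = handle_ping(hb, now=10)
reply_at_16 = handle_ping(hb, now=16)
```

Here `reply_at_10` is a reply and `reply_at_16` is `None`, because the op thread has not been touched for more than 15 seconds.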
12:54 AM Revision 371e6fbe (ceph): Merge pull request #35 from cholcombe973/master
Making the usage details a little better. Yehuda Sadeh
12:23 AM Bug #3900: init-ceph should do ulimit -n's with do_root_cmd
I think he's right, except it should be do_root_cmd, and I'm not certain if that echoes the result of the command cor... Dan Mick
12:11 AM Bug #3900 (Resolved): init-ceph should do ulimit -n's with do_root_cmd
Chen Xiaoxi points out on ceph-devel:
Here is part of /etc/init.d/ceph script:
case "$command" in
s...
Dan Mick
12:19 AM Revision 0d172b95 (ceph): packaging: add smalliobenchrbd
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
12:13 AM Revision 8eee815f (ceph): Merge remote-tracking branch 'gh/wip-3833-b'
Conflicts:
src/osd/OSD.cc
src/osd/OSD.h
Reviewed-by: Samuel Just <sam.just@inktank.com>
Sage Weil
12:07 AM Revision 9388f941 (ceph): Update src/rgw/rgw_admin.cc
Improved the usage message. Chris Holcombe

01/22/2013

11:58 PM Revision eaf20fa9 (ceph): Merge branch 'wip-3651'
David Zafman
11:57 PM Revision 509a93e8 (ceph): osd: Add digest of omap for deep-scrub
Add ScrubMap encode/decode v4 message with omap digest
Compute digest of header and key/value. Use bufferlist
to ref...
David Zafman
11:57 PM Revision db48caf6 (ceph): osd: debug support for omap deep-scrub
Deep-scrub test support through admin socket
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sam...
David Zafman
11:57 PM Revision cfb1aa80 (ceph): osd: Add missing unregister_command() in OSD::shutdown()
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
David Zafman
11:48 PM Revision e714c778 (ceph): osd: Testing of deep-scrub omap changes
Fix scrub_test.py and add omap corruption test
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: S...
David Zafman
11:23 PM Revision e328fa6c (ceph): test/bench: add rbd backend to smalliobench
Only supports format 1 images to start, and does not issue flushes, so
it's best used with caching off.
Signed-off-b...
Josh Durgin
11:10 PM Revision 0ee5ec7e (ceph): common/Throttle: fix modeline, whitespace
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:10 PM Revision c3266ad1 (ceph): config: helper to identify internal fields we should be quiet about
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:01 PM Revision 89072fbb (ceph): test/bench: don't alias bl from above
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
11:01 PM Revision c50f5f52 (ceph): test/bench: use uint64_t for uniform distribution
int is too small for rbd image sizes
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
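The overflow behind this change can be shown with a quick calculation. The sketch assumes a 32-bit signed int and uses a 10 GiB image as a sample size; `ctypes` is used only to mimic C integer truncation.

```python
import ctypes

image_size = 10 * 1024**3  # 10 GiB in bytes, a sample rbd image size

# What a 32-bit signed int would hold after truncation.
as_int32 = ctypes.c_int32(image_size).value

# A 64-bit unsigned value keeps the size intact.
as_uint64 = ctypes.c_uint64(image_size).value

int32_max = 2**31 - 1      # just under 2 GiB
```

Since `int32_max` is below 2 GiB, any offset distribution bounded by an int cannot cover an image of this size, and the truncated value here even comes out negative.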
10:55 PM Revision 451cc00a (ceph): doc: Modified usage for upgrade.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
10:47 PM Revision 73a96936 (ceph): osd: improve sub_op flag points
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision 33efe321 (ceph): osd: simplify asok to single callback
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision 24d0d7eb (ceph): osd: dump op priority queue state via admin socket
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision a1137eb3 (ceph): osd: make last state for slow requests more informative
Report on the last event string, and pass in important context for the
op event list, including:
- which peers were...
Sage Weil
10:47 PM Revision 23c02bce (ceph): osd: refactor ReplicatedPG::do_sub_op
PULL is the only case where we don't wait for active.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
10:47 PM Revision c549a0cf (ceph): common/PrioritizedQueue: buckets -> tokens
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision 6e3363b2 (ceph): common/PrioritizedQueue: add min cost, max tokens per bucket
Two problems.
First, we need to cap the tokens per bucket. Otherwise, a stream of
items at one priority over time w...
Sage Weil
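The first problem named above, unbounded token accumulation, can be illustrated with a capped token bucket. This is a generic sketch, not the PrioritizedQueue code. The minimum-cost handling is a guess based on the commit title alone, since the description of the second problem is cut off, and all names are hypothetical.

```python
class Bucket:
    """Token bucket with a cap on saved-up tokens and a floor on item cost."""

    def __init__(self, max_tokens, min_cost):
        self.max_tokens = max_tokens
        self.min_cost = min_cost
        self.tokens = 0

    def add_tokens(self, n):
        # Cap the balance so an idle period cannot bank unlimited credit.
        self.tokens = min(self.tokens + n, self.max_tokens)

    def try_take(self, cost):
        # Charge at least min_cost so tiny items are not effectively free.
        cost = max(cost, self.min_cost)
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True


bucket = Bucket(max_tokens=100, min_cost=10)
for _ in range(1000):
    bucket.add_tokens(50)       # long stretch of refills with no dequeues
banked = bucket.tokens          # stays at the cap
took_small = bucket.try_take(1)  # charged min_cost, not 1
left = bucket.tokens
```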
10:47 PM Revision 514af15e (ceph): common/PrioritizedQueue: dump state to Formatter
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision bec96a23 (ceph): osd: debug msg prio, cost, latency
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision e8e0da1a (ceph): osd: use Message::get_cost() function for queueing
The data payload is a decent proxy for cost in most cases, but not all.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
10:47 PM Revision a1bf8220 (ceph): osd: set PULL subop cost to size of requested data
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision b685f727 (ceph): osd: add OpRequest flag point when commit is sent
With writeahead journaling in particular, we can get requests that
stay in the queue for a long time even after the c...
Sage Weil
10:47 PM Revision 128fcfca (ceph): note puller's max chunk in pull requests
this lets us calculate a cost value Sage Weil
10:47 PM Revision cfe4b851 (ceph): os/FileStore: allow filestore_queue_max_{ops,bytes} to be adjusted at r...
The 'committing' ones too.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
10:47 PM Revision 44dca5c8 (ceph): filestore: disable extra committing queue allowance
The motivation here is if there is a problem draining the op queue
during a sync. For XFS and ext4, this isn't gener...
Sage Weil
10:47 PM Revision 1233e861 (ceph): osd: target transaction size 300 -> 30
Small transactions make pg removal nicer to the op queue. It also slows
down PG deletion a bit, which may exacerbate...
Sage Weil
10:47 PM Revision 40654d6d (ceph): filestore: filestore_queue_max_ops 500 -> 50
Having a deep queue limits the effectiveness of the priority queues
above by adding additional latency.
Signed-off-b...
Sage Weil
10:47 PM Revision 9230c863 (ceph): osd: make OSD a config observer
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:47 PM Revision 101955a6 (ceph): osd: make osd_max_backfills dynamically adjustable
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:23 PM Feature #3888 (Fix Under Review): osd: stop heartbeating peers when internal heartbeat fails
wip-osd-hb Sage Weil
03:09 PM Feature #3888: osd: stop heartbeating peers when internal heartbeat fails
backport to bobtail! Sage Weil
08:12 AM Feature #3888 (Resolved): osd: stop heartbeating peers when internal heartbeat fails
if our internal thread heartbeats fail, stop replying to pings from peers. Sage Weil
09:09 PM Revision b6e3edc6 (ceph): test: create /tmp/cephtest/mnt.{id}
The workunit task assumes that a mount exists
at /tmp/cephtest/mnt.{id}
This patch creates the path if it doesn't
exi...
Joe Buck
09:05 PM Revision 6401abf8 (ceph): qa/workunit: Add iozone test script for sync
The iozone-sync.sh script runs iozone testing
various sync flags, O_SYNC, O_DSYNC, O_RSYNC.
Signed-off-by: Sam Lang ...
Sam Lang
09:05 PM Revision 72147fd3 (ceph): objectcacher: Remove commit_set, use flush_set
commit_set() and flush_set() are identical in functionality,
so use flush_set everywhere and remove commit_set from
t...
Sam Lang
08:43 PM Revision 00b11869 (ceph): testing: add workunit to run hadoop internal tests.
This workunit runs the internal tests for our local branch of hadoop-common.
Requires ant be installed on the host ru...
Joe Buck
07:37 PM Bug #3899 (Won't Fix): osd: failed to decode object_info_t
This happened after moving a journal from a file to an ssd, and changing filestore xattr use omap from true to false,... Josh Durgin
07:36 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
ubuntu@teuthology:/a/teuthology-2013-01-22_07:00:04-regression-bobtail-master-basic/3235... Tamilarasi muthamizhan
07:19 PM devops Bug #3898 (Resolved): ceph-deploy: problems with >1 mon
If you try "ceph-deploy new ceph1 ceph2" then it correctly creates the ceph.conf and then spits out "Cluster config e... Greg Farnum
06:25 PM Revision 4a871b55 (ceph): Merge branch 'wip-config'
Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Sage Weil
06:24 PM Revision 359d0e98 (ceph): config: report on log level changes
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:24 PM Revision c5e09517 (ceph): config: clean up output
Report a simple list of key='value', without extra verbosity.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:37 PM CephFS Bug #3404: oops in strlen() from set_request_path_attr()
I found the same bug in Bobtail release with NFS kernel server and 3.7.3 kernel
[70205.985665] BUG: unable to ha...
Ivan Kudryavtsev
05:35 PM Bug #3513 (Resolved): rgw log show error
Dan Mick
05:35 PM Bug #3513: rgw log show error
Nope, I had it wrong; the required params are: object *or* all three of date, bucket, and bucket-id.
Message change ...
Dan Mick
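The rule stated in this comment, and the De Morgan confusion mentioned in the earlier comment on the same bug, can be written out as a small check. The function names and keyword arguments are illustrative; this is not the radosgw-admin source.

```python
def log_show_args_ok(obj=None, date=None, bucket=None, bucket_id=None):
    # Valid: an object name, OR all three of date, bucket and bucket-id.
    return bool(obj) or (bool(date) and bool(bucket) and bool(bucket_id))


def log_show_args_missing(obj=None, date=None, bucket=None, bucket_id=None):
    # De Morgan's form of "not ok": no object AND at least one of the three absent.
    return (not obj) and ((not date) or (not bucket) or (not bucket_id))


only_object = log_show_args_ok(obj="2013-01-22-chris")
all_three = log_show_args_ok(date="2013-01-22", bucket="chris", bucket_id="4112.1")
two_of_three = log_show_args_ok(date="2013-01-22", bucket="chris")
```

The sample values are made up. Swapping `and` and `or` in only one of the two forms is the kind of mistake that makes the error fire for valid input.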
02:37 PM Bug #3513: rgw log show error
Actually I guess the && should be || and the || should be && (the old DeMorgan's rule) Dan Mick
02:30 PM Bug #3513: rgw log show error
I experienced this also on ubuntu 12.10 0.56.1-1
root@dlcephgw01:~# radosgw-admin log show --bucket=chris --date...
Chris Holcombe
05:04 PM rgw Bug #3896 (Resolved): rest-bench common/WorkQueue.cc: 54: FAILED assert(_threads.empty())
It seems rest-bench doesn't like to exit cleanly while cleaning up after itself.... I did test at low concurrency bu... Bill Reid
04:31 PM Bug #3895 (Resolved): librados test hang during mon thrashing
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov/2929
...
Sage Weil
04:05 PM Feature #3651 (Resolved): osd: deep scrub should hash omap
David Zafman
02:58 PM Bug #3894 (Closed): monclient: --keyring failed despite presence of file
While going over install basics with Gary, we got "ERROR: missing keyring, cannot use cephx for authentication" when ... Greg Farnum
02:40 PM rbd Feature #3877 (Fix Under Review): krbd: don't wait for notify ack to complete
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Subtask #3741 (Fix Under Review): krbd: rework request tracking code
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Tasks #3755 (Fix Under Review): krbd: use new request tracking code for sync object operations
I've posted this code for review. I continue to do testing. Alex Elder
02:39 PM rbd Feature #3754 (Fix Under Review): krbd: use new request tracking code for notify ack
I've posted this code for review. I continue to do testing. Alex Elder
02:19 PM rbd Feature #3893 (Rejected): krbd: document the new request code
There are bits and pieces of the new request code
documented for the kernel rbd client--in the comments
and in the ...
Alex Elder
02:09 PM CephFS Bug #3832: client: does not observe O_SYNC
Fixed a bug in objectcacher::flush_set. Branch wip-3832-oc-flushrange has been updated, and passes the accompanying ... Sam Lang
01:09 PM Subtask #2659: mon: Single-Paxos: ceph tool -w subscriptions not being updated
Can't recall if this was fixed at some point, or if the root cause was even related.
This must be tested again onc...
Joao Eduardo Luis
01:06 PM Subtask #2622 (Resolved): mon: Single-Paxos: convert existing, old MonitorStore to a brand new Mo...
This was implemented both as an offline tool as well as integrated in ceph-mon. The ceph-mon will attempt to open the... Joao Eduardo Luis
01:02 PM Subtask #3069: mon: Single-Paxos: messaging: log MMonSync messages for offline matching
If we really want to do offline matching, this can be done using just the logs. This could be interesting however fo... Joao Eduardo Luis
12:54 PM Subtask #3843 (Rejected): osd: move purged_snaps out of info
Sage Weil
12:54 PM Subtask #3844 (Rejected): osd: move info and log into leveldb
Sage Weil
12:54 PM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
Sage Weil
12:54 PM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
broke out subtasks and top level features Sage Weil
12:53 PM Feature #3892 (Resolved): osd: move pg info into leveldb
Sage Weil
12:53 PM Feature #3891 (Resolved): osd: move purged_snaps out of info
Sage Weil
12:53 PM Feature #3890 (Resolved): osd: create tool to extract pg info and pg log from filestore
Sage Weil
10:38 AM Feature #2580 (Resolved): perf: investigate poor performance at 10 osds per node
This was probably unique to the burnupi cluster and/or older ceph. Performance is fine on the SC847a now with lots o... Mark Nelson
10:27 AM rbd Bug #3889 (Won't Fix): krbd: handle zero-length requests
I'm pretty sure there are some special zero-length
requests (like flush) that can come down from the
block layer. ...
Alex Elder
07:07 AM Linux kernel client Bug #3887 (Closed): kernel client: small object memory leak
In testing my new request code for rbd (issue 3741 and related)
I tried paying special attention to Linux slab usage...
Alex Elder
05:10 AM Revision 98cc1b83 (ceph): task: mon_clock_skew_check: add option to run at least one timecheck
at-least-once Runs at least once, even if we are told to stop.
(default: True)
at...
Joao Eduardo Luis
04:11 AM Linux kernel client Bug #3886: Further testing result for the issue "ceph: avoid 32-bit page index overflow"
https://SizableSend.com/0g9dwn/ceph_mds.a.log Mohamed Pakkeer
04:06 AM Linux kernel client Bug #3886 (New): Further testing result for the issue "ceph: avoid 32-bit page index overflow"
We raised an issue in the following ticket and the ticket has been resolved
http://tracker.newdrea...
Mohamed Pakkeer

01/21/2013

11:09 PM Revision b7cb1b11 (ceph): rados/thrash: 3 monitors, so that we can thrash them
Sage Weil
10:20 PM Feature #3848: osd: gracefully handle cluster network heartbeat failure
One option: do not mark ourselves back up (after being wrongly marked down) unless we are able to successfully ping a... Sage Weil
10:12 PM Bug #3885 (Resolved): osd: osd-recovery-incomplete qa test failing
ubuntu@teuthology:/a/teuthology-2013-01-21_19:00:03-regression-master-testing-gcov$ teuthology-ls --archive-dir . | g... Sage Weil
10:08 PM Feature #3833 (In Progress): osd: improve recovery throttling
Sage Weil
09:59 PM Bug #2655: scrub slows writes more than it should
This ticket predates the chunky scrub work that went into ~0.54 or thereabouts. Sage Weil
09:15 PM Bug #2655 (Resolved): scrub slows writes more than it should
Sage Weil
09:12 PM Bug #2357 (Can't reproduce): mds takes down ceph
Sage Weil
09:11 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
pushed to master Sage Weil
04:45 PM Revision d7d81922 (ceph): config: don't make noise about 'internal_safe_to_start_threads'
This is set on start, and subsequently gets into the changed set.
Once any other config value is injected, it is the ...
Sage Weil
04:22 PM Revision 3399860d (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
04:21 PM Revision 2e39dd5e (ceph): mds: fix default_file_layout constructor
Signed-off-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Greg Farnum
04:21 PM Revision e461f096 (ceph): mds: fix byte_range_t ctor
I do not think we saw any bugs from this, but anything that involved
capability issues on restart or migrate might ha...
Greg Farnum
01:20 PM Fix #3884 (Resolved): osd: resurrect partially deleted PGs
If a PG is in the process of getting removed and we repeer and discover we want to keep it, we currently block waitin... Sage Weil
12:30 PM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Corin-
Have you tried 0.48.3 again since then? I'd like to get to the bottom of this, if possible... :)
Sage Weil
09:35 AM rbd Bug #3737: Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Hi Josh,
according to our conversation I did some testing.
I started the dd if=/dev... of=/tmp/doof.dat bs=4k cou...
Oliver Francke
12:11 AM Revision c5fe0965 (ceph): osd: calculate initial PG mapping from PG's osdmap
The initial values of up/acting need to be based on the PG's osdmap, not
the OSD's latest. This can cause various co...
Sage Weil
12:11 AM Revision 17160843 (ceph): osd: calculate initial PG mapping from PG's osdmap
The initial values of up/acting need to be based on the PG's osdmap, not
the OSD's latest. This can cause various co...
Sage Weil

01/20/2013

11:01 PM CephFS Feature #1236 (Fix Under Review): libceph: set layout via virtual xattrs (libceph/cfuse)
wip-vxattr (ceph.git) and wip-vxattrs (ceph-client.git). There's a test script that passes on both fuse and kclient.... Sage Weil
10:58 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Greg Farnum wrote:
> How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm no...
Sage Weil
03:12 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
How large would a simple "layout" xattr actually be in comparison to the shipped inodes? I'm not sure the size is so ... Greg Farnum
08:26 PM rbd Feature #3877: krbd: don't wait for notify ack to complete
I have implemented this in the new request code.
It will be posted for review along with the rest
of that new code ...
Alex Elder
08:14 PM rbd Feature #3877 (In Progress): krbd: don't wait for notify ack to complete
Ian points out that "I've already implemented this change"
suggests that the status of this issue should at least
b...
Alex Elder
08:26 PM rbd Subtask #3741 (In Progress): krbd: rework request tracking code
Considering this "is actually work that's mostly complete"
I'm (finally) marking it "In Progress."
This code is f...
Alex Elder
08:22 PM rbd Feature #3754 (In Progress): krbd: use new request tracking code for notify ack
I have completed implementing sending synchronous acknowledgement
in response to a watch request notification. It i...
Alex Elder
08:19 PM rbd Tasks #3755 (In Progress): krbd: use new request tracking code for sync object operations
I have completed implementing all of these in the new request
code:
- synchronous object read (for v1 header object...
Alex Elder
04:12 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
thanks! commit:17160843d0c523359d8fa934418ff2c1f7bffb25 Sage Weil
03:51 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Looks good to me. Samuel Just
09:58 AM Bug #3879 (Fix Under Review): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
wip-3879 Sage Weil
09:06 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Output from the following attached:
ceph osd getmap 554 -o 554
Jens Kristian Søgaard
08:46 AM Bug #3879 (In Progress): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Jens Kristian Søgaard wrote:
> Output from the following attached:
>
> ceph osd getmap 555 -o 555
> ceph osd get...
Sage Weil
12:49 AM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
Output from the following attached:
ceph osd getmap 555 -o 555
ceph osd getmap 556 -o 556
Jens Kristian Søgaard
11:15 AM Bug #3883 (Won't Fix): osd: leaks memory (possibly triggered by scrubbing) on argonaut
100MB/day reported by multiple users, both on 0.48 and 0.56.1.
Some correlation with scrubbing. Possibly specific...
Sage Weil
09:55 AM CephFS Feature #3882: Hide snapshot directory name in mount/mtab
It seems like a better (or perhaps just "more important") fix is to restrict access to .snap in the first place.
FWI...
Sage Weil
07:14 AM CephFS Feature #3882 (Rejected): Hide snapshot directory name in mount/mtab
The idea is to prevent users from seeing which snapshot directory name was chosen during mount.
This is useful if we want to...
Ivan Kudryavtsev
09:51 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
Ivan Kudryavtsev wrote:
> Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined pu...
Sage Weil
09:44 AM CephFS Bug #3881: Wrong ip network to exchange data between kernel ceph and MDS
Hm. It seems that I'm wrong about the way it works. It connects to OSDs via OSD-defined public network. It seems that... Ivan Kudryavtsev
07:03 AM CephFS Bug #3881 (Rejected): Wrong ip network to exchange data between kernel ceph and MDS
I'm using a ceph installation with three networks:
1st is Infiniband network for OSD exchange and replication
2nd i...
Ivan Kudryavtsev

01/19/2013

02:24 PM Bug #3879: ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
full log at http://bit.ly/11Hn7BN
Sage Weil
02:04 PM Bug #3879 (Resolved): ./osd/OSDMap.h: 367: FAILED assert(exists(osd))
... Sage Weil
12:58 PM Bug #3878 (Rejected): osd: nobackfill flag doesn't work
on current master, bobtail Sage Weil
11:43 AM Feature #3833: osd: improve recovery throttling
see wip-3833 for push Sage Weil
08:40 AM rbd Feature #3877 (Closed): krbd: don't wait for notify ack to complete
When we receive notification of a change to an rbd image's header
object we need to refresh our information about th...
Alex Elder
06:36 AM Revision 2491f976 (ceph): workunits/cephtool: add tests for ceph osd pool set/get
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick
04:57 AM Revision ea9628fb (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
03:26 AM Revision 48308954 (ceph): Clarify journal size based on filestore max sync
The docs had the recommended journal size based on the option
"filestore min sync interval" when it should have been
...
Travis Rhoden
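For context, the sizing rule this doc fix concerns can be worked through numerically. The sketch assumes the commonly documented rule of thumb, journal size = 2 * (expected throughput * filestore max sync interval); the commit text above is truncated and does not spell the formula out, and the sample throughput and interval are illustrative values only.

```python
def journal_size_mb(throughput_mb_s, max_sync_interval_s):
    """Rule-of-thumb journal size: twice the data written in one max sync interval."""
    return 2 * throughput_mb_s * max_sync_interval_s


# Sample values: a disk sustaining 100 MB/s and a 5 second max sync interval.
size_with_max = journal_size_mb(100, 5)

# Using a much smaller min sync interval instead would undersize the journal.
size_with_min = journal_size_mb(100, 0.01)
```

With these sample numbers the max-interval calculation gives 1000 MB, while basing it on a 0.01 s minimum interval gives only 2 MB, which shows why the choice of option matters.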
02:32 AM Revision aea898db (ceph): ceph: reject negative weights at ceph osd <n> reweight
Check the integer (fixed-point) value to avoid any worries
about floating-point rounding. Add tests for reweight < 0...
Dan Mick
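Checking the fixed-point value rather than the float can be sketched as follows. The scale of 0x10000 for a weight of 1.0 is an assumption made for this example, as are the function name and the error message.

```python
WEIGHT_ONE = 0x10000  # assumed fixed-point scale: weight 1.0 == 0x10000


def parse_reweight(text):
    """Convert a user-supplied weight to fixed point, rejecting negatives."""
    fixed = int(float(text) * WEIGHT_ONE)
    # Test the integer that will actually be stored, not the float.
    if fixed < 0:
        raise ValueError(f"weight {text!r} is negative")
    return fixed


half = parse_reweight("0.5")
try:
    parse_reweight("-0.5")
    rejected = False
except ValueError:
    rejected = True
```

Comparing the converted integer sidesteps questions about how a tiny negative float rounds, because the decision is made on the exact value that would be stored.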
02:32 AM Revision 7d9d7651 (ceph): workunit/cephtool: Use '! cmd' when expecting failure
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick
12:55 AM Revision ee4a9f25 (ceph): marginal/mds_thrasher: Add tests for mds thrasher
Adds a basic set of roles for testing the mds thrasher
with 1 active and 1 standby, and a few basic tests that
stress...
Sam Lang
12:40 AM Revision 6008b1d8 (ceph): osdmap: make replica separate in default crush map configurable
Add 'osd crush chooseleaf type' option to control what the default
CRUSH rule separates replicas across. Default to ...
Sage Weil
12:17 AM Revision 8c0d702e (ceph): msg/Pipe: use state_closed atomic_t for _lookup_pipe
We shouldn't look at Pipe::state in SimpleMessenger::_lookup_pipe() without
holding pipe_lock. Instead, use an atomi...
Sage Weil
12:17 AM Revision 5fb77bf1 (ceph): ceph: adjust crush tunables via 'ceph osd crush tunables <profile>'
Make it easy to adjust crush tunables. Create profiles:
legacy: the legacy values
argonaut: the argonaut defaults...
Sage Weil
12:17 AM Revision 373f1671 (ceph): msgr: atomically queue first message with connect_rank
Atomically queue the first message on the new pipe, without dropping
and retaking pipe_lock.
Signed-off-by: Sage Wei...
Sage Weil
12:17 AM Revision ae1882e7 (ceph): msgr: don't queue message on closed pipe
If we have a con that refs a pipe but it is closed, don't use it. If
the ref is still there, it is only because we a...
Sage Weil
12:17 AM Revision 34e2d402 (ceph): msgr: fix race on Pipe removal from hash
When a pipe is faulting and shutting down, we have to drop pipe_lock to
take msgr lock and then remove the entry. Th...
Sage Weil
12:17 AM Revision 8e0359c3 (ceph): msgr: inject delays at inconvenient times
Exercise some rare races by injecting delays before taking locks
via the 'ms inject internal delays' option.
Signed-...
Sage Weil
12:01 AM Revision 0cb760f3 (ceph): OSD: do deep_scrub for repair
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
Samuel Just

01/18/2013

11:45 PM Revision 684a8f8f (ceph): Merge branch 'wip-pg-removal'
Reviewed-by: Samuel Just <sam.just@inktank.com> Sage Weil
11:44 PM Revision f6c69c3f (ceph): os: add apply_transaction() variant that takes a sequencer
Also, move the convenience wrappers into the interface and funnel through
a single implementation.
Signed-off-by: Sa...
Sage Weil
11:44 PM Revision bc994045 (ceph): os: move apply_transactions() sync wrapper into ObjectStore
This has nothing to do with the backend implementation.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
11:44 PM Revision 4712e984 (ceph): osd: make pg removal thread more friendly
For a large PG these are saturating the filestore and journal queues. Do
them synchronously to make them more friend...
Sage Weil
11:44 PM Revision 5e00af40 (ceph): osd: set pg removal transactions based on configurable
Use the osd_target_transaction_size knob, and gracefully tolerate bogus
values (e.g., <= 0).
Signed-off-by: Sage Wei...
Sage Weil
11:33 PM Revision 82f22b38 (ceph): config_opts.h: default osd_recovery_delay_start to 0
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd...
Samuel Just
11:21 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Dan Mick
11:20 PM Documentation #3711: crush-map.rst: choose firstn talks about "N", but does not clearly define wh...
Sorry, I think this is still wrong; the descriptions of {num} only apply if firstn is supplied, correct? Otherwise {... Dan Mick
11:12 PM Bug #3869: ceph osd pool get doesn't support everything set does
Added tests with commit:2491f976e4cd6eca5c30f7c184038364e4fe1873
Dan Mick
01:22 PM Bug #3869: ceph osd pool get doesn't support everything set does
How about a quick bash test script that gets and sets some of these values? Sage Weil
12:49 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
commit:1f911fd0616c3fb45d5d36de7947a1914190017b
Dan Mick
12:27 PM Bug #3869 (Fix Under Review): ceph osd pool get doesn't support everything set does
Dan Mick
12:15 PM Bug #3869: ceph osd pool get doesn't support everything set does
This was noted on #ceph overnight. Dan Mick
12:14 PM Bug #3869 (Resolved): ceph osd pool get doesn't support everything set does
...for no apparent good reason. Adding the missing info is easy. Dan Mick
11:11 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
commit:aea898db2b56878b50f09dcbbf52347f4cc5c754
Dan Mick
05:39 PM RADOS Bug #3872: You can put negative weights on OSDs
Dan Mick
04:01 PM RADOS Bug #3872 (Fix Under Review): You can put negative weights on OSDs
Dan Mick
02:32 PM RADOS Bug #3872 (Resolved): You can put negative weights on OSDs
DHO reports that negative weights can be assigned to an OSD. Tested on Alexandria running 0.56-20-g9aecacd-1precise.
...
JuanJose Galvez
09:48 PM Revision 53f22d94 (ceph): task/mds_thrasher: New task for thrashing the mds
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
09:43 PM Revision 4bdcfbff (ceph): client: Respect O_SYNC, O_DSYNC, and O_RSYNC
If the file is opened with O_SYNC, O_DSYNC, or O_RSYNC, we need to
flush cached data (and metadata for O_SYNC) on a w...
Sam Lang
09:31 PM Revision b4e0f7ca (ceph): Merge remote-tracking branch 'gh/wip-client-pool-api'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
09:16 PM Linux kernel client Bug #3875 (Resolved): osd_client: don't use r_num_pages for bio requests
There is an osd request field "r_num_pages" that's used
to record the number of pages supplied with the request.
Fo...
Alex Elder
09:02 PM Revision 609442da (ceph): Merge remote-tracking branch 'gh/wip-scrub-argonaut' into argonaut
Sage Weil
08:42 PM Revision 1f911fd0 (ceph): ceph: allow osd pool get to get everything you can set
osd pool get was missing size, min_size, crash_replay_interval,
and crush_ruleset; they're all easily added.
Fixes: ...
Dan Mick
08:21 PM Revision 045af959 (ceph): qa: remove xfstest 068 from qemu testing
This tests fsfreeze, which sometimes hangs in xfs in linux 3.2
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
08:14 PM Revision 49726dcf (ceph): os/FileStore: only flush inline if write is sufficiently large
Honor filestore_flush_min in the inline flush case.
Backport: bobtail
Signed-off-by: Sage Weil <sage@inktank.com>
Re...
Sage Weil
08:14 PM Revision 8ddb55d3 (ceph): os/FileStore: fix compile when sync_file_range is missing;
If sync_file_range is not present, we always close inline, and flush
via fdatasync(2).
Fixes compile on ancient plat...
Sage Weil
07:05 PM Revision b8d5e286 (ceph): doc/rados/operations/crush: need kernel v3.6 for first round of tunables
Reported-by: rl219 in #ceph on irc.oftc.net
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
06:47 PM Revision dbc38eff (ceph): rbd.py: update scratch and test image sizes
Test 167 was failing due to running out of space on the scratch
file system. The test reserves 21MB in a file, and r...
Alex Elder
06:45 PM Revision 7e8e6491 (ceph): os/: Add CollectionIndex::prep_delete
If an unlink is interrupted between removing the file
and updating the subdir attribute, the attribute will
overestima...
Samuel Just
06:35 PM Revision 736966f3 (ceph): java: support get pool id/replication interface
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
06:33 PM Revision 40415d1c (ceph): libcephfs: add pool id/size lookup interface
Adds new interfaces ceph_get_pool_id() and ceph_get_pool_replication()
to libcephfs.
Signed-off-by: Noah Watkins <no...
Noah Watkins
05:35 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Translating any ceph.* setxattrs into a sync setxattr and handling it on the MDS seems like an easy win. I can't thi... Sage Weil
01:34 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
We're still thinking through the implications of the best way to implement this. Nonetheless there are people using h... Greg Farnum
05:01 PM CephFS Bug #3832: client: does not observe O_SYNC
Current status: the iozone-sync.sh test script is causing a segfault (sometimes a hang). Needs more testing! Segf... Sam Lang
04:46 PM Documentation #3808 (In Progress): Block device quick start page need update
John Wilkins
03:58 PM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
It had sounded to me like the trend was towards eliminating the Perl usage rather than adding it as a dependency. Did... Greg Farnum
03:56 PM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
Sage Weil
03:49 PM Bug #3870 (Resolved): osd: make pg removal more friendly
commit:684a8f8f84312d4d9c6cdeb8d6d9fad792bd5a6d Sage Weil
01:44 PM Bug #3870 (Resolved): osd: make pg removal more friendly
wip-pg-removal needs cleanup and merge Sage Weil
03:49 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
Thanks. I was trying to figure out where the conflict could come from, and actually it does make sense: The single-os... Greg Farnum
03:45 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Sure, it's attached... Ben Poliakoff
03:40 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
@Josh: Even with the new CRUSH tunables it's still a matter of probability, so if you give it a particularly challeng... Greg Farnum
03:31 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
OK, it looks like I may have simply given CRUSH a challenging assignment, given the resources of the cluster.
I ...
Ben Poliakoff
02:58 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
Ian Colle
02:54 PM Bug #3873 (Duplicate): Ceph cli tool allows setting negative weights
Setting OSD weights to negative values:... Kyle Bader
02:46 PM Bug #1807 (Can't reproduce): CentOS compile error in perfglue/heap_profiler.cc
Anonymous
02:01 PM CephFS Feature #3570 (Resolved): teuthology: mds thrasher
Sage Weil
02:01 PM rbd Bug #3871 (Resolved): krbd: initial header read may be out of date
Currently krbd uses the version parameter of a watch operation to try to prevent this, but that was never implemented... Josh Durgin
01:55 PM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
Josh Durgin
01:54 PM Linux kernel client Bug #3860: rbd: problems if watch setup returns ERANGE
ERANGE is never actually returned - it was never implemented (#2592). The real fix for the race it was intended to pr... Josh Durgin
08:08 AM Linux kernel client Bug #3860 (Rejected): rbd: problems if watch setup returns ERANGE
When rbd sets up the watch request for a newly-mapped rbd image
it loops and tries again if the request returns ERAN...
Alex Elder
12:49 PM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
#3541. Whoops! Greg Farnum
11:02 AM CephFS Feature #3865 (Duplicate): mds: implement lookup-by-ino based on inode backtraces
Following #3862 and #3863, implement the lookup-by-ino algorithm described in http://www.spinics.net/lists/ceph-devel... Greg Farnum
12:49 PM CephFS Feature #3541: mds: robust ino lookup using file backpointers
We have a design now! Greg Farnum
12:48 PM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
#3540. Whoops! Greg Farnum
10:26 AM CephFS Feature #3862 (Duplicate): mds: add file backtraces to data objects
Add backtraces to each file object, as described at http://www.spinics.net/lists/ceph-devel/msg11872.html. This ticke... Greg Farnum
12:48 PM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
We have a design now! Greg Farnum
11:09 AM CephFS Feature #3727: mds: refactor EMetablob encoding paths
What is this bug about? Greg Farnum
11:08 AM CephFS Feature #3867 (Resolved): optionally do not use an anchor table
Following #3865 and #3866, we should introduce a config option that, when set, does not make use of the Anchor table ... Greg Farnum
11:07 AM CephFS Feature #3866 (New): mds: Add lazily-updated backtraces to hard links
As described in http://www.spinics.net/lists/ceph-devel/msg11872.html, we want hard links to contain lazily-updated b... Greg Farnum
10:55 AM CephFS Feature #3863: implement a tool to lookup inode numbers without holding their path
+1 for just adding the libcephfs function, and a test in test_libcephfs. Sam Lang
10:41 AM CephFS Feature #3863 (Resolved): implement a tool to lookup inode numbers without holding their path
This should just be a small wrapper around Client.cc*, but we need to be able to generate inode lookups without knowi... Greg Farnum
10:41 AM Feature #3769: osd: scrub should verify snap collection existence, membership
In master, sha-1 7b6fe03208c507b55517abe45cdff5c96d91904a
Needs backport when we are happy with the testing (if it's...
Samuel Just
10:15 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
The sync header read operation was another one that was needed.
That's basically done too.
All of this will be re...
Alex Elder
10:09 AM rbd Tasks #3755: krbd: use new request tracking code for sync object operations
I have been looking in detail at how the watch requests are
implemented and in the process identified a few potentia...
Alex Elder
10:14 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
... Alex Elder
10:11 AM Bug #3854 (Fix Under Review): mon: clock skew tests failing on master
Joao Eduardo Luis
10:07 AM Bug #3854: mon: clock skew tests failing on master
teuthology's wip-3854 commit:1d8640860441dc27e8342788c1ae17f5c1b3ccc0 fixes this issue. Joao Eduardo Luis
09:00 AM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
commit:98a763123240803741ac9f67846b8f405f1b005b
When the osd does a "mark myself back up" it takes care to rebind ...
Sage Weil
08:58 AM rbd Feature #3861 (Resolved): rbd: consider splitting rbd_osd_req_op_create()
When it was out for review, Josh suggested that it might
be better to have separate (type-checking) functions for
b...
Alex Elder
08:25 AM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
+1 clearing it for cosmetic reasons. Sam Lang
08:25 AM Revision 76e715ba (ceph): doc: Added link to rotation section.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:25 AM Revision e1741ba6 (ceph): doc: Added hyperlink to log rotation section.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:24 AM Revision 612717af (ceph): doc: Added section on log rotation.
fixes: #3776
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
08:07 AM rbd Bug #3859 (Resolved): osd_client: define ceph_osdc_clear_request_linger()
There is a ceph_osdc_set_request_linger() function that
sets a flag on a request and takes an additional reference.
...
Alex Elder
08:04 AM rbd Bug #3858 (Resolved): osd_client: ceph_osdc_wait_request() seems wrong
The only error wait_for_completion_interruptible() will
return is ERESTARTSYS. So if that gets returned inside
cep...
Alex Elder
07:33 AM Revision 48f41468 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
07:32 AM Revision 83326588 (ceph): doc: Modified index to include mon-osd-interaction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:31 AM Revision d6fc92df (ceph): doc: Added a section describing mon/osd interaction.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:14 AM rbd Feature #1491: qemu: make qemu-img convert fast
This was rejected because the feature is no longer relevant. At the time I was looking at it, there was some obvio... Yehuda Sadeh
06:43 AM Revision bebdc70b (ceph): build: Add perl installation dependency to rpm and debian packages.
There was already a dependency on python in the debian control file;
a similar dependency was added to the rpm spec f...
Gary Lowell
06:13 AM Revision ff7c971f (ceph): doc: Added an admonishment for SSD write latency.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
06:00 AM Revision 6f28faf9 (ceph): mds: open mydir after replay
In certain cases, we may replay the journal and not end up with the
dirfrag for mydir open. This is fine--we just ne...
Sage Weil
05:51 AM Revision dd7caf5f (ceph): mds: gracefully exit if newer gid replaces us by name
If 'mds enforce unique name' is set, and another MDS with the same name
kicks us out of the MDSMap, gracefully exit i...
Sage Weil
05:45 AM Revision 2e112333 (ceph): mon: enforce unique name in mdsmap
Add 'mds enforce unique name' option, defaulting to true.
If set, when an MDS boots, it will kick any previous mds w...
Sage Weil
05:27 AM Revision ca2d9ac9 (ceph): doc: Updated OSD configuration reference with backfill config options.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
05:25 AM Revision e330b7ec (ceph): mon: create fail_mds_gid() helper; make 'ceph mds rm ...' more generic
Take a gid or a rank or a name. Use a nicer helper.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
05:05 AM Revision 5a384f48 (ceph): Merge branch 'wip-mds'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
05:00 AM Revision f41b5421 (ceph): add mon_thrash task to kernel and rados thrashers collections
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
04:57 AM Revision 626f6104 (ceph): Add a test for the truncate/osd-commit-reply race
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
04:54 AM Revision cc7bf1bd (ceph): rados: add osd reply delay injection
Sage Weil
01:54 AM Revision d81ac841 (ceph): rbd: fix bench-write infinite loop
I/O was continuously submitted as long as there were few enough ops in
flight. If the number of 'threads' was high, or...
Josh Durgin
01:01 AM Revision 233d034d (ceph): Merge branch 'wip-cephx'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
12:43 AM devops Feature #2885 (Resolved): doc: mon initial members requirements, functioning, admin steps to take
This was done some time ago. Step 9 here: http://ceph.com/docs/master/rados/deployment/chef/#configure-your-ceph-envi... John Wilkins
12:35 AM Documentation #3062 (Resolved): doc: osd tuning config options
This was completed some time ago. John Wilkins
12:28 AM Documentation #3329 (Resolved): doc: What metrics should be used to set node weight
Discussion was primarily starting with 1TB as a weight of 1.00 with additional consideration for throughput. If this ... John Wilkins
12:27 AM Tasks #3779 (In Progress): update osd config ref as appropriate
John Wilkins
12:26 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
John Wilkins
12:09 AM Revision e776b63d (ceph): crushtool: consolidate_whitespace() should eat everything except \n
CRUSH map source with \r (like a DOS text file) failed to compile
with the usual nonuseful message; turns out that ea...
Dan Mick
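The rule in this commit title (collapse every whitespace character except "\n") can be sketched as follows. This is an illustrative Python rendering under that one assumption, not the crushtool implementation; the function body and its handling of leading and trailing runs are guesses.

```python
def consolidate_whitespace(text: str) -> str:
    """Collapse runs of whitespace other than newline into a single space."""
    out = []
    pending_space = False
    for ch in text:
        if ch.isspace() and ch != "\n":
            # Covers ' ', '\t' and '\r', so DOS line endings lose their '\r'.
            pending_space = True
            continue
        # Emit one separating space, but never at the start or end of a line.
        if pending_space and ch != "\n" and out and out[-1] != "\n":
            out.append(" ")
        pending_space = False
        out.append(ch)
    return "".join(out)
```

Keeping "\n" out of the consumed set matters because the parser presumably still needs line boundaries, which is the concern raised in the discussion of bug #3834.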
12:09 AM Revision 60db6e3e (ceph): crushtool: warn usefully about missing output spec
When running with --test, you must request output to CSV files or
specific types of output to --show-X; make the erro...
Dan Mick

01/17/2013

11:41 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
John Wilkins
11:41 PM Documentation #3389 (Resolved): doc: crush docs could use a full example crushmap
John Wilkins
11:40 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
John Wilkins
11:40 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
John Wilkins
11:28 PM Feature #3505 (Resolved): default to libnss
This was done for RPMs with the commit listed below. Debians already had the --with-nss flag in the rules file.
...
Anonymous
11:21 PM Bug #2176 (In Progress): dependencies not checked by autoconf
All these are listed as build requirements for the rpm and debian packages. I'll add the missing ones to configure.ac. Anonymous
11:16 PM devops Tasks #3512 (In Progress): Publish our fastcgi packages
The approach is to pick up the latest debian and rpm packages for mod_fastcgi, apply the ceph patch, and build manual... Anonymous
11:13 PM Bug #3736: kernel build: failures starting in 3.8-rc1
The immediate kernel build problems have been solved by recreating the patch that is applied to the debian package bu... Anonymous
11:09 PM Bug #3736: kernel build: failures starting in 3.8-rc1
Branch: refs/heads/master
Home: https://github.com/ceph/autobuild-ceph
Commit: 0ff4f9a9ce82b37288b3bbcc5b5d65b5...
Anonymous
11:12 PM Revision efa595f5 (ceph): doc/rados/operations/authentication: update for cephx sig requirement o...
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:12 PM Revision 50db10dc (ceph): msg/Pipe: require MSG_AUTH feature on server if option is enabled
If we
negotiate cephx AND
are a server AND
cephx require signatures = true
then require the MSG_AUTH feature ...
Sage Weil
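The three-way condition in this commit message can be written as a small predicate. The sketch below is hypothetical: the function name, the string used for the protocol, and the bit value of FEATURE_MSG_AUTH are placeholders, not Ceph's actual definitions.

```python
# Placeholder bit for illustration only; not Ceph's real feature value.
FEATURE_MSG_AUTH = 1 << 0


def required_features(auth_protocol: str, is_server: bool,
                      cephx_require_signatures: bool, base: int = 0) -> int:
    """Return the feature mask a peer must support for this connection."""
    # All three conditions from the commit message must hold at once.
    if auth_protocol == "cephx" and is_server and cephx_require_signatures:
        return base | FEATURE_MSG_AUTH
    return base
```

A client-side connection, or a server with the option disabled, leaves the base mask untouched, so peers without signature support can still connect in those cases.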
11:12 PM Revision 91a573a4 (ceph): mon: enforce 'cephx require signatures' during negotiation
If we are negotiating which auth protocol to use, and the client does not
support the MSG_AUTH feature, and the serve...
Sage Weil
11:11 PM Revision 4a49a09d (ceph): cephx: control signatures for service vs cluster
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
11:01 PM Revision c236a51a (ceph): osdmap: make replica separate in default crush map configurable
Add 'osd crush chooseleaf type' option to control what the default
CRUSH rule separates replicas across. Default to ...
Sage Weil
10:54 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
Branch: refs/heads/master
Home: https://github.com/ceph/ceph
Commit: bebdc70b4254a78d9fe86af9c645e828fd11e2b2
...
Anonymous
10:16 PM Documentation #3831 (In Progress): ceph osd crush set command needs correction in the doc
John Wilkins
10:14 PM CephFS Feature #1236: libceph: set layout via virtual xattrs (libceph/cfuse)
Sage Weil
10:02 PM CephFS Feature #3857: mds: enforce unique mds names in mdsmap
see wip-mds-names Sage Weil
09:36 PM CephFS Feature #3857 (Resolved): mds: enforce unique mds names in mdsmap
Currently mds's are uniquely identified by their addr (i.e., a unique instance of the process). The name is useful on... Sage Weil
08:27 PM Revision cd09be6a (ceph): ceph: pass ceph.conf to osdmaptool
This ensures it sees the chooseleaf option and generates the proper
CRUSH rules.
Sage Weil
06:37 PM rbd Bug #3413 (Resolved): rbd bench-write fails with assert when rbd caching turned on
commit:d81ac8418f9e6bbc9adcc69b2e7cb98dd4db6abb Josh Durgin
01:39 PM rbd Bug #3413 (Fix Under Review): rbd bench-write fails with assert when rbd caching turned on
branch wip-rbd-bench-write Josh Durgin
06:11 PM Revision c6f8010b (ceph): mon: Monitor: drop messages from old timecheck epochs
We were asserting when the message's timecheck epoch (which is mapped to
the election epoch) was older than the curre...
Joao Eduardo Luis
06:08 PM Revision 81e8bb55 (ceph): osdmaptool: more fix cli test
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit b0162fab3d927544885f2b9609b9ab3dc4aaff74)
Sage Weil
06:08 PM Revision 2b5b2657 (ceph): osdmaptool: fix cli test
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 5bd8765c918174aea606069124e43c480c809943)
Sage Weil
06:08 PM Revision f739d123 (ceph): osdmaptool: allow user to specify pool for test-map-object
Fixes: #3820
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@in...
Samuel Just
06:07 PM Revision 00759ee0 (ceph): rados.cc: fix rmomapkey usage: val not needed
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <samuel.just@inktank.com>
(cherry pic...
David Zafman
06:07 PM Revision 06b3270f (ceph): librados.hpp: fix omap_get_vals and omap_get_keys comments
We list keys greater than start_after.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <...
Samuel Just
06:07 PM Revision 75072965 (ceph): rados.cc: use omap_get_vals_by_keys in getomapval
Fixes: #3811
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(...
Samuel Just
06:07 PM Revision a3c2980f (ceph): rados.cc: fix listomapvals usage: key,val are not needed
Fixes: #3812
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
(...
Samuel Just
06:00 PM rgw Feature #3856 (Resolved): rgw: list buckets S3 api should be paginated
The S3 api (unlike swift) does not define marker, max when listing buckets (probably due to the fact that max buckets... Yehuda Sadeh
05:25 PM Bug #3836: osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
... Sage Weil
08:52 AM Bug #3836 (Resolved): osd: common/Mutex.cc: 94: FAILED assert(r == 0) in PG::start_flush()
... Sage Weil
04:55 PM Bug #3279: mon/caps: cap comparison in get-or-create is based on a string literal
This affects the chef mon recipe. I am able to correct this error by joining lines 96-99.
[Thu, 17 Jan 2013 16:5...
Kraig Amador
04:44 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
'pg dump' and 'osd dump' both have 'json' support since argonaut, but argonaut does not support outputting json on 'o... Joao Eduardo Luis
03:20 PM Feature #3850: Add json output for ceph pg dump and ceph osd tree
It already exists for pg dump and osd dump too. osd tree was recent though, maybe it's not in the version he's using? Josh Durgin
03:02 PM Feature #3850 (Closed): Add json output for ceph pg dump and ceph osd tree
Kyle Bader has requested json output for the following commands:
ceph pg dump
ceph osd tree
Sage Comment:
th...
Ian Colle
04:32 PM Feature #3855 (Resolved): Making Scrubs Nicer
As requested from DHO:
Currently scrubs are not very nice, Sage referred to these issues and it would be nice if t...
JuanJose Galvez
04:26 PM Bug #3854 (Resolved): mon: clock skew tests failing on master
... Sage Weil
04:21 PM Feature #3853 (Resolved): qa: include iogen in qa suite
Sage Weil
04:10 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
commit:60db6e3e394df1e4110eefa5951657b648b02006
Dan Mick
04:10 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
commit:e776b63dd5c540a6f49b03b67e72a1f4636a74fd Dan Mick
11:06 AM RADOS Bug #3834: crushtool really really hates \r
Well isspace() would catch newline too, which I think we don't want, so it'd be iswhite(c) && c != '\n', which I'm no... Dan Mick
04:06 PM devops Bug #3852 (Resolved): chef recipes don't try to start OSDs
I wasn't aware the chef recipes were this incomplete, but it appears as though, unless
you're running Crowbar, osd.r...
Dan Mick
04:05 PM devops Bug #3851 (Resolved): chef recipes don't enable upstart
Since upstart management of daemons now explicitly looks for an upstart tag file, Chef
doesn't start the monitors co...
Dan Mick
03:17 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I presume we're planning to backport this to bobtail after it passes some nights of testing? Maybe we should leave th... Greg Farnum
03:03 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
commit:f358cb1d2b0a3a78bf59c4fd085906fcb5541bbe Sage Weil
02:58 PM Feature #3849 (Resolved): Track slow PGs and times OSDs marked down
Kyle Bader:
"Over the weekend of 01/02/13 we encountered an issue that we had not yet
encountered. One of our cephs...
Ian Colle
02:54 PM Feature #3848 (Resolved): osd: gracefully handle cluster network heartbeat failure
From Kyle Bader
"Back in October we had a switch failure on our cluster (backend) network.
This was not noticed b...
Ian Colle
02:24 PM rbd Bug #3847 (Resolved): rbd: figure out correct byte order for watch version
In the process of refactoring rbd code that builds up osd
operations I noticed that for NOTIFY_ACK and WATCH operati...
Alex Elder
01:40 PM Documentation #3846 (Resolved): Debian install has incorrect gitbuilder URL
From http://ceph.com/docs/master/install/debian/ :...
Anonymous
12:32 PM rbd Feature #1491 (Rejected): qemu: make qemu-img convert fast
Yehuda Sadeh
12:28 PM CephFS Bug #3832 (Fix Under Review): client: does not observe O_SYNC
Implemented in wip-3832. Needs review. Sam Lang
12:17 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
I don't think it matters. It's a fixed lifecycle from standby -> active -> dead, so the leftover standby_ just te... Sage Weil
12:13 PM CephFS Bug #3845: mds: standby_for_rank not getting cleared on takeover
This is a monitor thing; the MDS is only involved in relaying the config setting over on boot-up. Greg Farnum
11:38 AM CephFS Bug #3845 (Closed): mds: standby_for_rank not getting cleared on takeover
This is the mdsmap after mds.a was active and given rank 0, then killed, and another mds (mds.b-s-r0) that had standb... Sam Lang
11:34 AM CephFS Feature #3730: Support replication factor in Hadoop
Sage Weil wrote:
> If there are more such cases, that is a separate bug!
It was a bug I had introduced in wip-cli...
Noah Watkins
09:51 AM CephFS Feature #3730: Support replication factor in Hadoop
Noah Watkins wrote:
> In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken..
...
Sage Weil
08:55 AM CephFS Feature #3730: Support replication factor in Hadoop
In Client, osdmap is protected by client_lock? If so, new version of branch isn't broken.. Noah Watkins
10:45 AM Subtask #3844 (Rejected): osd: move info and log into leveldb
Samuel Just
10:45 AM Subtask #3843 (Rejected): osd: move purged_snaps out of info
The purged_snaps set is really a property of the local pg instance rather than a global property and does not get upd... Samuel Just
10:42 AM Subtask #3842 (Rejected): osd: create tool to extract pg info and pg log from filestore
Once these are moved into leveldb, it will be much more difficult to manually extract these structures. Samuel Just
10:41 AM Feature #3841 (Rejected): osd: avoid seeks for log and info writes on client writes
Probable approach is to move log and info into leveldb. Samuel Just
10:38 AM Subtask #3840 (Resolved): osd: ack push after apply+commit
This will prevent the primary from shoving another push before the first has completed. Alternately, make the number... Samuel Just
10:28 AM Documentation #3839 (Resolved): SSD crushmap example will not compile
The SSD CRUSH map example (http://ceph.com/docs/master/rados/operations/crush-map/#placing-different-pools-on-differe... Alexandre Marangone
10:24 AM CephFS Bug #1435: mds: loss of layout policies upon mds restart
wip-mds-layout2
needs to be rebased, reviewed, and tested!
Sage Weil
10:13 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
pushed to master, commit:c6f8010b1c8e4d54f9fb24b2e4e25ff8a2bde778 Joao Eduardo Luis
09:34 AM Bug #3835 (Fix Under Review): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) whe...
Ian Colle
08:51 AM Bug #3835: mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monitors are seve...
This issue is fixed on wip-3835, commit:785a2bc3e9271607b1ddf25390056e9dd9c72b21 Joao Eduardo Luis
07:47 AM Bug #3835 (Resolved): mon: timecheck: hits FAILED assert(m->epoch == timecheck_epoch) when monito...
The leader schedules a new 'ping' to the monitors in the quorum as soon as the pings are all sent.
This allows for...
Joao Eduardo Luis
10:04 AM Bug #3820: osdmaptool - user cannot specify pool
85eb8e382a26dfc53df36ae1a473185608b282aa Samuel Just
09:58 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
Sage Weil
09:50 AM rbd Feature #3838 (New): krbd: use common functions for striping calculations
With the STRIPINGV2 feature bit, format 2 striping has the same parameters as cephfs striping. Re-work the rbd object... Josh Durgin
09:29 AM Linux kernel client Feature #3837 (Resolved): krbd: support format 2 striping
Format 2 images with the STRIPINGV2 feature bit set (created with rbd create --stripe-count X --stripe-unit Y --order... Josh Durgin
09:12 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Yay! Sage Weil
04:52 AM rbd Feature #3754: krbd: use new request tracking code for notify ack
Yeehah! All tests passed, including the previously-failing
blogbench.sh, fsstress, and two passes through xfstests.
Alex Elder
09:11 AM Bug #2843: filestore: replay failure on xfs
The post-v0.50 version of this bug was just fixed, commit:66eb93b83648b4561b77ee6aab5b484e6dba4771, which is backport... Sage Weil
02:38 AM Bug #2843: filestore: replay failure on xfs
Hi,
We have exactly the same problem on one of our OSDs (bobtail 0.56.1).
https://gist.github.com/4555135
Wha...
Guilhem Lettron
09:08 AM CephFS Bug #3261 (Rejected): mds crashes in EMetaBlob::replay
Understood. I'm sorry we weren't able to dig in when it happened. When you do get around to retesting we should be ... Sage Weil
02:09 AM CephFS Bug #3261: mds crashes in EMetaBlob::replay
Should I test the same btrfs volume with a new ceph? If so, I might get to it in the next month. Please close with ins... Tobias Florek
05:19 AM Revision b0162fab (ceph): osdmaptool: more fix cli test
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:10 AM Revision 5bd8765c (ceph): osdmaptool: fix cli test
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:01 AM Revision 98a76312 (ceph): osd: leave osd_lock locked in shutdown()
No callers expect the lock to be dropped.
Fixes: #3816
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:48 AM Revision 72db1a59 (ceph): When running teuthology with targets provisioned on OpenStack and kvm,...
Signed-off-by: Loic Dachary <loic@dachary.org> Loïc Dachary
02:04 AM Revision faa62fa8 (ceph): radosgw: increase nofile ulimit in upstart
The default ulimit for open file descriptors per process is 1024,
far too few for radosgw if you have lots of OSDs an...
Kyle Bader
12:59 AM Revision df399da1 (ceph): rgw: copy object should not copy source acls
Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t...
Yehuda Sadeh
12:25 AM Revision 19ee2311 (ceph): ceph: adjust crush tunables via 'ceph osd crush tunables <profile>'
Make it easy to adjust crush tunables. Create profiles:
legacy: the legacy values
argonaut: the argonaut defaults...
Sage Weil
12:19 AM Revision 85eb8e38 (ceph): osdmaptool: allow user to specify pool for test-map-object
Fixes: #3820
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Gregory Farnum <greg@in...
Samuel Just

01/16/2013

11:52 PM Revision 7b6fe032 (ceph): Merge branch 'wip_snap_scrub'
Reviewed-by: Sage Weil <sage@inktank.com> Samuel Just
11:40 PM Revision 0946a78c (ceph): fix mon clock queue test syntax
Sage Weil
11:30 PM Revision 20b27a1c (ceph): rgw: copy object should not copy source acls
Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t...
Yehuda Sadeh
11:22 PM Revision 37dbf7d9 (ceph): rgw: copy object should not copy source acls
Fixes: #3802
Backport: argonaut, bobtail
When using the S3 api and x-amz-metadata-directive is
set to COPY we used t...
Yehuda Sadeh
10:42 PM Revision b8568747 (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
10:42 PM Revision 57352351 (ceph): ReplicatedPG/PG: check snap collections during _scan_list
During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc...
Samuel Just
10:42 PM Revision e65ea70e (ceph): ReplicatedPG: compare nlinks to snapcolls
nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ...
Samuel Just
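The link-count invariant stated in revision e65ea70e above (nlinks should be 1 + snapcolls.size()) can be illustrated with a minimal sketch. This is not the Ceph implementation: the function name check_snap_links and the plain-list representation of snapcolls are hypothetical. The nlinks == 0 case follows the later revision 70c35120, which treats zero as "no link information sent".

```python
from typing import List, Optional


def check_snap_links(nlinks: int, snapcolls: List[int]) -> Optional[str]:
    """Return an error string if the hardlink count is inconsistent, else None.

    Hypothetical helper for illustration only; names are not from Ceph.
    """
    if nlinks == 0:
        # Zero is taken to mean the replica sent no snap link information,
        # so there is nothing to compare against.
        return None
    expected = 1 + len(snapcolls)  # one link for the object plus one per snap collection
    if nlinks != expected:
        return f"nlinks {nlinks} != expected {expected}"
    return None
```

For example, an object in two snap collections is expected to report three links; any other non-zero count would be flagged as an inconsistency.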
10:42 PM Revision 665577a8 (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:42 PM Revision 381e2587 (ceph): osd/PG: fix osd id in error message on snap collection errors
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
10:42 PM Revision 70c35120 (ceph): ReplicatedPG: ignore snap link info in scrub if nlinks==0
nlinks==0 implies that the replica did not send snap link information.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Samuel Just
10:42 PM Revision 39bc6549 (ceph): PG: move auth replica selection to helper in scrub
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
10:42 PM Revision 9e44fca1 (ceph): ReplicatedPG: correctly handle new snap collections on replica
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
10:35 PM Revision 88956e31 (ceph): ReplicatedPG: make_snap_collection when moving snap link in snap_trimmer
Backport: bobtail
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
10:33 PM Revision 3f0ad497 (ceph): librados.hpp: fix omap_get_vals and omap_get_keys comments
We list keys greater than start_after.
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <...
Samuel Just
10:33 PM Revision 625c3cb9 (ceph): rados.cc: fix rmomapkey usage: val not needed
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Samuel Just <samuel.just@inktank.com>
David Zafman
10:33 PM Revision 44c45e52 (ceph): rados.cc: fix listomapvals usage: key,val are not needed
Fixes: #3812
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
Samuel Just
10:33 PM Revision cb5e2be4 (ceph): rados.cc: use omap_get_vals_by_keys in getomapval
Fixes: #3811
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
Samuel Just
09:57 PM Revision 3c67ee36 (ceph): rbd: add test for formatted output from rbd cli
Josh Durgin
09:41 PM RADOS Bug #3834: crushtool really really hates \r
Ha! Sorry about that. Maybe iswhite() (or whatever that helper is) would be best here? Sage Weil
09:36 PM RADOS Bug #3834 (Resolved): crushtool really really hates \r
Spent a long time trying to figure out why a crush map wouldn't compile; finally got it to no differences at all, eve... Dan Mick
09:29 PM Revision 333cc0d5 (ceph): Merge branch 'wip-rbd-formatted-output'
Reviewed-by: Dan Mick <dan.mick@inktank.com>
Conflicts:
src/rbd.cc
src/test/cli/rbd/help.t
Josh Durgin
09:23 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
OK, that quick fix wasn't enough.
I had a spinlock protecting the check for something being
complete. But that w...
Alex Elder
08:13 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
Well that's unfortunate. I hit the same problem. I'll
need to take a closer look I guess.
Alex Elder
07:39 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
Seems to be working better. It may end up being an
atomic rather than protecting with a spinlock, but
either way, ...
Alex Elder
03:15 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
I've pretty much implemented this feature but having done
this I'm looking at a crash that happened with this code
...
Alex Elder
09:17 PM Revision b59c27dd (ceph): Merge branch 'master' into wip-scrub
Sage Weil
09:15 PM Revision fb4bb5d7 (ceph): osd: better error message for request on pool that dne
If the request is sent when the pool didn't even exist, say so. This
would have made #3734 a bit easier to track dow...
Sage Weil
09:14 PM Revision 9a1f5742 (ceph): osd: drop newlines from event descriptions
These produce extra newlines in the log.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Samuel Just <sam.j...
Sage Weil
09:14 PM Revision 6934ac3f (ceph): rbd: move Formatter construction to main
Each method that uses a formatter is doing the same thing.
Simplify by constructing and handling errors only once.
Al...
Josh Durgin
09:14 PM Revision 8fea6dee (ceph): rbd: add --pretty-format option
This is the same option the rados and radosgw-admin tool use for more
human-readable json/xml.
Signed-off-by: Josh D...
Josh Durgin
09:14 PM Revision 4e5a07bc (ceph): XMLFormatter: fix pretty printing
It used the wrong indentation level and did not add a newline after
closing a section. dump_stream() did not indent a...
Josh Durgin
09:14 PM Revision d7cdcc0e (ceph): rbd: regenerate man page and cli test
Signed-off-by: Josh Durgin <josh.durgin@inktank.com> Josh Durgin
09:14 PM Revision f6dabc83 (ceph): rbd: always output result for formatted output
When there's nothing, return an empty array.
This way scripts don't have to special case this.
Signed-off-by: Josh D...
Josh Durgin
09:14 PM Revision 0efb9c51 (ceph): test: add cram integration test for formatted output
This can be used with the new teuthology cram task.
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Josh Durgin
09:14 PM Revision 84c5d857 (ceph): rbd: support plain/json/xml output formatting
This patch renames the --format option to --image-format, for
specifying the RBD image format, and uses --format to s...
Stratos Psomadakis
09:14 PM Revision 98487b56 (ceph): rbd: fix long lines
Several >80 characters have crept in recently.
The older ones generally don't have very useful history,
so I'm not wo...
Josh Durgin
07:21 PM Revision a586966a (ceph): osd: fix rescrub after repair
We were rescrubbing if INCONSISTENT is set, but that is now persistent.
Add a new scrub_after_recovery flag that is r...
Sage Weil
07:21 PM Revision 8e33a8b9 (ceph): mon: note scrub errors in health summary
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
07:17 PM Revision 476eb24b (ceph): Merge branch 'wip-rpm-update'
Merges work around for odd AS_IF behaviour in configure.ac. Gary Lowell
06:37 PM rgw Bug #3813: radosgw doesn't have a logrotate script
Let's go with /var/log/radosgw and a separate logrotate script. Simpler! Sage Weil
09:06 AM rgw Bug #3813: radosgw doesn't have a logrotate script
Given that radosgw gets installed without ceph, it seems like the viable options are putting the logrotate config in ... Sage Weil
04:14 AM rgw Bug #3813: radosgw doesn't have a logrotate script
Note that the official docs suggest to put "log file = /var/log/ceph/radosgw.log" too. If "ceph" isn't installed, thi... Faidon Liambotis
04:02 AM rgw Bug #3813 (Resolved): radosgw doesn't have a logrotate script
Currently there's no logrotate configuration for radosgw at all. Even if one sets "log file" to /var/log/ceph/somethi... Faidon Liambotis
06:35 PM Feature #3833 (Resolved): osd: improve recovery throttling
Sage Weil
06:24 PM Bug #3810 (Need More Info): btrfs corrupts file size on 3.7
I need a dump of the xattrs on the d0c18e1d/605.00000000/head//1 object in pg 1.1d on osd 7 and osd 0 Samuel Just
05:59 PM CephFS Bug #3832 (Resolved): client: does not observe O_SYNC
if the file was opened with O_SYNC we need to flush the io on every write call. Sage Weil
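The O_SYNC behavior requested in Bug #3832 above can be sketched with a toy model. This is only an illustration under stated assumptions, not the Ceph client code: the class ToyFile and its in-memory buffered and flushed lists are hypothetical stand-ins for the client's buffered I/O. The only real API used is the os.O_SYNC flag constant, which is available on Unix platforms.

```python
import os


class ToyFile:
    """Hypothetical in-memory file handle that honors O_SYNC on write."""

    def __init__(self, flags: int) -> None:
        self.flags = flags
        self.buffered = []  # data written but not yet flushed
        self.flushed = []   # data that has reached "stable storage"

    def write(self, data: bytes) -> None:
        self.buffered.append(data)
        # With O_SYNC set, every write call must flush before returning.
        if (self.flags & os.O_SYNC) == os.O_SYNC:
            self.flush()

    def flush(self) -> None:
        self.flushed.extend(self.buffered)
        self.buffered.clear()
```

A handle opened with os.O_WRONLY | os.O_SYNC leaves nothing in the buffer after a write, while a handle opened without O_SYNC keeps data buffered until flush() is called.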
05:49 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
Sage Weil
05:44 PM rgw Feature #3207 (Resolved): qa: swift functional tests in nightly
Yehuda Sadeh
05:41 PM Revision c1a86ab1 (ceph): configure.ac: fix problem with --enable-cephfs-java
The AS_IF used to cover java related checks via --enable-cephfs-java
didn't work correctly. Use a plain 'if/fi' inste...
Danny Al-Gaaf
05:34 PM CephFS Feature #3730: Support replication factor in Hadoop
Oh right, libcephfs is not built on top of librados. Never mind, that's a whole different discussion we start occasio... Greg Farnum
05:15 PM CephFS Feature #3730: Support replication factor in Hadoop
I don't think libcephfs will give up an instance of the rados client, if that's what you mean by grant access to rado... Noah Watkins
04:33 PM CephFS Feature #3730: Support replication factor in Hadoop
Sorry to back this up a little, but I can't recall — does using libcephfs automatically grant a user access to the RA... Greg Farnum
04:30 PM CephFS Feature #3730: Support replication factor in Hadoop
This interface update is up for review in wip-client-pool-api Noah Watkins
09:52 AM CephFS Feature #3730: Support replication factor in Hadoop
From stand-up, stick with int64_t for userspace, and enforce 32-bit range. Noah Watkins
09:43 AM CephFS Feature #3730: Support replication factor in Hadoop
The move from int32 -> int64 was misguided, and incomplete. At this point it's not really worth the effort to move a... Sage Weil
07:31 AM CephFS Feature #3730: Support replication factor in Hadoop
It looks like in OSDMap there is some mixed usage of int64 and int for pool id, too. In Client::_create pool id is e... Noah Watkins
06:40 AM CephFS Feature #3730: Support replication factor in Hadoop
Can we change the type in libcephfs to uint64? We're the only ones calling ceph_get_file_pool() right now as far as ... Sam Lang
05:33 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
Samuel Just
02:24 PM Bug #3820 (Resolved): osdmaptool - user cannot specify pool
Samuel Just
05:23 PM Documentation #3831 (Resolved): ceph osd crush set command needs correction in the doc
ceph osd crush set command has different parameters in different places.
http://ceph.com/docs/master/rados/operat...
Tamilarasi muthamizhan
05:21 PM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
Fixed, commit:ccfefe3097a51b49885f2ed5d9334e85b497d963. Fix was pushed to both argonaut and bobtail branches. Yehuda Sadeh
11:17 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
ok, affects both argonaut and bobtail. Actual bug is when copying object, if x-amz-metadata-directive is set to COPY ... Yehuda Sadeh
10:01 AM rgw Bug #3802: x-amz-acl header ignored on copy operation
On what version? Yehuda Sadeh
05:16 PM RADOS Documentation #3830 (Closed): crush-map.rst: chooseleaf doesn't include 'firstn|indep', and 'aggr...
1) I think chooseleaf should also include [firstn|indep] like choose does.
2) I'm not certain I understand just wh...
Dan Mick
05:15 PM Bug #3829 (Can't reproduce): new osd added to the cluster is not receiving data
ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
1. Initially, had a cluster [burnupi21,burnupi22,b...
Tamilarasi muthamizhan
04:12 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
This is showing up on your MDS, about 15 minutes after a client completes accesses, right? This is associated with th... Greg Farnum
04:01 PM CephFS Bug #3828 (Rejected): seeing error: fault, server, going to standby whenever I run a ceph-syn loa...
While validating bug 520, I saw an interesting error. It may be a red herring, as I am seeing no problem with the wr... Anonymous
03:47 PM CephFS Bug #520 (Closed): mds: change ifile state mix->sync on (many) lookups?
3 Node Cluster:
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
# cat /etc/ceph/ceph.conf
[global]...
Anonymous
02:51 PM CephFS Bug #520: mds: change ifile state mix->sync on (many) lookups?
csyn is now called ceph-syn
and --debug-ms 1 to see those messages go by!
Sage Weil
03:43 PM Revision 1d50affc (ceph): mds: fix usage typo for ceph-mds
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:26 PM CephFS Bug #3261: mds crashes in EMetaBlob::replay
This looks like a problem with what's in the journal, but so much MDS code has changed since then that I don't think... Sage Weil
03:24 PM CephFS Bug #1760 (Resolved): multiple_rsync workunit cannot remove non-empty directory intermittently
this also looks like the tmap problem, commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and preceding 4 commits. Sage Weil
03:23 PM CephFS Bug #2380 (Rejected): kclient: aufs over a cephfs mount fails with Stale NFS file handle
this is a generic problem with lookup by ino, see #3541 and other features Sage Weil
03:23 PM CephFS Bug #2092 (Can't reproduce): BUG at fs/ceph/caps.c:999
commit:561cf283173360c39db19dc735da4a319be68ff6 fixes the multi-mds case. we haven't seen this again for single-mds..... Sage Weil
03:21 PM Bug #3827 (Resolved): crushtool --test: claims to want -o, really wants --output-csv or --show-*
The error message is wrong, apparently, for crushtool's test mode; it looks like it wants either
--output-csv (in wh...
Dan Mick
03:11 PM CephFS Feature #3826 (Resolved): uclient: Be more aggressive about checking for pools we can't write to
Right now the client will happily buffer up writes to a pool that it can't actually write to. #2753 is going to make ... Greg Farnum
03:06 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
Run against bad code. Greg Farnum
03:03 PM CephFS Bug #2444 (Can't reproduce): null pointer dereference in ceph_d_prune inside kvm
Sage Weil
03:00 PM CephFS Bug #2071 (Can't reproduce): kclient: pjd mkfifo failures
Sage Weil
02:59 PM CephFS Bug #1770 (Can't reproduce): directory nonexistent on kernel_untar_build.sh
Sage Weil
02:58 PM CephFS Bug #1749 (Can't reproduce): nonexistent directory in kclient_workunit_kernel_untar_build
Sage Weil
02:55 PM CephFS Bug #1318 (Resolved): directories disappear across multiple rsyncs
commit:e52ebacb73747ef642aabdb3cc3cb2a328687a4c and 4 preceding patches fix up the TMAP bug that is the likely cause... Sage Weil
02:55 PM CephFS Bug #1511: fsstress failure with 3 active mds
Sam thinks this works now! Adding to QA suite. Greg Farnum
02:50 PM CephFS Bug #3625 (Resolved): client: EEXIST error on multiple clients to create
commit:b4d3bd06d4083d780755f6ef506df1643932fa2f Sage Weil
02:49 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
Maybe you already handled this? Greg Farnum
02:11 PM CephFS Bug #3625 (Fix Under Review): client: EEXIST error on multiple clients to create
Sam Lang
06:16 AM CephFS Bug #3625: client: EEXIST error on multiple clients to create
The kernel side has been reviewed and tested, but needs to be merged. The fuse side has been tested, but I think it ... Sam Lang
02:48 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
we should return an error code on fsync().. that is the quick fix.
a more polite feature will be opened to return ...
Sage Weil
09:19 AM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
This is clearly a bug, bureaucracy or not. It should not be a feature. We can do new development to fix a bug. If you... Ian Colle
02:47 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
Samuel Just
02:47 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
Samuel Just
02:46 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
I think this can be closed. There is a bunch of autoconf changes for Java that have or will be merged. Noah Watkins
02:41 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
I just ran ./configure using CPPFLAGS to indicate where the jni headers were, and that worked just fine. Using C... Anonymous
02:45 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
Multi-mds, currently low priority. Greg Farnum
02:44 PM CephFS Bug #3637 (In Progress): client: not issuing caps for with clients doing shared writes
Sage Weil
02:43 PM CephFS Bug #3637 (Fix Under Review): client: not issuing caps for with clients doing shared writes
Sage Weil
02:40 PM CephFS Bug #3498 (Resolved): mds: mds assert failure during untar_kernel
this was a msgr bug, long since fixed. commit:36c0fd220ef02b1ffd7a3ae0d98e0fdec6b55a5b or thereabouts Sage Weil
02:39 PM CephFS Bug #1666: hadoop: time-related meta-data problems
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg10334.html
Also wip-mtime-incr in the ceph repo.
Sam Lang
02:38 PM CephFS Bug #2218: CephFS "mismatch between child accounted_rstats and my rstats!"
Greg Farnum
02:32 PM CephFS Feature #3821 (New): qa: run backuppc as part of qa suite
Sage Weil
02:32 PM CephFS Bug #2494 (Can't reproduce): mds: Cannot remove directory despite it being empty.
The dupe inode suggests this is the problem fixed by Yan's tmap fixes. Greg Farnum
02:29 PM CephFS Bug #2019 (Can't reproduce): mds: CInode::filelock stuck in sync->mix
Presumably we'll see this again, but it hasn't turned up in our testing lately and we need more info to debug it. Greg Farnum
02:27 PM CephFS Bug #1811 (Duplicate): 2 pjd chown tests failed on cfuse
Ian Colle
02:22 PM CephFS Bug #1537 (Resolved): cmds 100% when copying lots of files, mds_cache_size and mds_bal_frag
This is an optimization issue, which we'll get to! Sage Weil
02:22 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
Interesting, but where did this actually come from?
And why didn't it get triggered when I started the OSDs again? ...
Wido den Hollander
01:08 PM Bug #3816 (Fix Under Review): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
-5678> 2013-01-15 17:18:24.509093 7f5a10cec700 1 accepter.accepter.rebind avoid 6812
-5677> 2013-01-15 17:18:24.5...
Sage Weil
12:43 PM Bug #3816: osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
As requested on the mailing list, I'm attaching the logfiles from osd.0 to osd.3
There is indeed an osd_map logline...
Wido den Hollander
09:59 AM Bug #3816 (Resolved): osd/OSD.cc: 3318: FAILED assert(osd_lock.is_locked())
... Sage Weil
02:21 PM CephFS Feature #3819 (Resolved): mds: re-add snaptests to qa suite
Sage Weil
02:02 PM CephFS Bug #3818 (Duplicate): kclient: fsx fails in mapread

With the fix in #3681, fsx fails in mapread with bad data. It looks like this is unrelated to the fix, and is a se...
Sam Lang
01:56 PM Bug #3786: osd: scrub is deferred indefinitely if load is high
Fixed by https://github.com/ceph/ceph/commit/299548024acbf8123a4e488424c06e16365fba5a Ian Colle
01:38 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
Sage Weil
01:38 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
11:38 AM rbd Feature #3817 (Resolved): librbd: make cache write-through until a flush is encountered
Writeback caching is unsafe if higher layers don't send flushes. qemu can be accidentally misconfigured to not send f... Josh Durgin
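The idea in Feature #3817 above (stay write-through until the first flush proves the upper layer sends flushes) can be sketched as a toy model. This is not librbd code: the class ToyCache, its dict-based backend, and its method names are hypothetical and chosen only for illustration.

```python
class ToyCache:
    """Hypothetical cache: write-through until the first flush, writeback after."""

    def __init__(self, backend: dict) -> None:
        self.backend = backend    # stands in for durable storage
        self.dirty = {}           # writes held back in writeback mode
        self.flush_seen = False   # becomes True once the caller has flushed

    def write(self, key: str, value: int) -> None:
        if self.flush_seen:
            # The caller is known to send flushes, so deferring is safe.
            self.dirty[key] = value
        else:
            # No flush seen yet: persist immediately (write-through).
            self.backend[key] = value

    def flush(self) -> None:
        self.backend.update(self.dirty)
        self.dirty.clear()
        self.flush_seen = True
```

With this shape, a caller that never flushes loses nothing on a crash because every write went straight to the backend, while a caller that does flush gets writeback behavior from its first flush onward.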
11:09 AM CephFS Feature #3543 (In Progress): mds: new encoding
Oh, this has been in progress all week. Greg Farnum
10:35 AM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
I have been trying to reproduce this but have not hit it yet.
will reopen the bug, when needed.
Tamilarasi muthamizhan
10:34 AM Bug #3801 (New): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(...
Ian Colle
10:28 AM Bug #3801: Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED assert(0 == "...
Sage Weil wrote:
> The osd.40 error means the fs returned EIO on a read operation. Check your kern.log.. there is p...
Justin Lott
09:39 AM Bug #3801 (Need More Info): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAI...
The osd.40 error means the fs returned EIO on a read operation. Check your kern.log.. there is probably a bad disk, ... Sage Weil
09:41 AM Feature #3815 (Duplicate): osd: move pg_info_t back into the xattr; avoid writing pginfo file whe...
see wip-pginfo for a hacky prototype.
did some testing, and it looks good:...
Sage Weil
09:39 AM Linux kernel client Bug #3800 (Won't Fix): libceph: check compatibility between ceph modules
Sage Weil
07:03 AM Feature #3805: log: detect dup messages
The one that comes to mind is "no heartbeat from osd.foo since timestamp bar" messages. We could try to identify the... Sam Lang
06:43 AM Revision 2dc2b480 (ceph): mds: use #defines for bits per cap
Hard-coding 0xff in SimpleLock.h is too far away from where we add new cap
bits.
Signed-off-by: Sage Weil <sage@inkt...
Sage Weil
06:04 AM CephFS Bug #3601: client: With multiple clients, file remove doesn't free up space
Yeah, it's that the LRU doesn't have a timeout.
The mds could send an "enable timeout" message to clients once it se...
Sam Lang
03:27 AM Revision 63e33c8a (ceph): osd: send forced scrub/repair through scrub scheduling
This marks a PG for immediate scrub or repair. Adjust the sched_scrub()
code so that we handle these PGs even when s...
Sage Weil
03:26 AM Revision 27ad74b9 (ceph): osd: use helpers to queue a PG in the scrub LRU
Move the duplicated reach into info.history.last_scrub_stamp into a helper
so we can control when we queue the PG for...
Sage Weil
03:25 AM Revision f8a649c0 (ceph): osd/ReplicatedPG: validate ino when scrubbing snap collections
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:25 AM Revision 8fb04813 (ceph): ReplicatedPG: compare nlinks to snapcolls
nlinks gives us the number of hardlinks to the object.
nlinks should be 1 + snapcolls.size(). This will allow
us to ...
Samuel Just
03:24 AM Revision 4affecee (ceph): ReplicatedPG/PG: check snap collections during _scan_list
During _scan_list check the snapcollections corresponding to the
object_info attr on the object. Report inconsistenc...
Samuel Just
03:21 AM Revision 40e0f2db (ceph): byteorder: fix gcc 4.7 warnings
./include/encoding.h: In function 'void encode(int64_t, ceph::bufferlist&, uint64_t)':
./include/encoding.h:101:1: wa...
Sage Weil
03:21 AM Revision dde83262 (ceph): osd_types: add nlink and snapcolls fields to ScrubMap::object
Signed-off-by: Samuel Just <sam.just@inktank.com> Samuel Just
03:21 AM Revision f969f6b3 (ceph): osd_types: bring ScrubMap::object up to the 0.56.1 encoding
We need to introduce some new fields here, so to maintain compatibility
we'll need to first bring the 48.* series up ...
Samuel Just
03:21 AM Revision b6561a2f (ceph): osd: make missing head non-fatal during scrub
If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca...
Sage Weil
02:00 AM Revision d882d053 (ceph): ReplicatedPG: fix snapdir trimming
The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
Samuel Just
02:00 AM Revision 015a454a (ceph): osdmap: spread replicas across hosts with default crush map
This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi...
Sage Weil
02:00 AM Revision 55b7dd32 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
(che...
Joao Eduardo Luis
02:00 AM Revision 898a4b19 (ceph): Revert "osdmap: spread replicas across hosts with default crush map"
This reverts commit 503917f0049d297218b1247dc0793980c39195b3.
This breaks vstart and teuthology configs. A better f...
Sage Weil
02:00 AM Revision 3293b31b (ceph): OSD: only trim up to the oldest map still in use by a pg
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin...
Samuel Just

01/15/2013

10:07 PM Revision c8a9a9a8 (ceph): Add cram task
This runs cram tests, which are an easy way to test output
stays consistent. We already use cram for basic cli tests ...
Josh Durgin
09:39 PM Bug #3811 (Fix Under Review): rados.cc getomapval implementation is broken, should use omap_get_v...
Samuel Just
09:21 PM Bug #3811 (Resolved): rados.cc getomapval implementation is broken, should use omap_get_vals_by_keys
Samuel Just
09:38 PM Bug #3812 (Fix Under Review): rados.cc listomapvals usage is wrong, <key> <val> are ignored and n...
Samuel Just
09:22 PM Bug #3812 (Resolved): rados.cc listomapvals usage is wrong, <key> <val> are ignored and not needed
Samuel Just
08:53 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
08:41 PM Revision cf149c8c (ceph): Merge branch 'wip-rpm-update'
Clean-up the handling of ceph java bindings in the rpm specfile and
configure.ac.
Gary Lowell
08:38 PM CephFS Feature #3730: Support replication factor in Hadoop
pool ids are currently exposed via libcephfs from ceph_file_layout, which uses a 32bit integer for pool id. However, ... Noah Watkins
08:34 PM CephFS Feature #3730: Support replication factor in Hadoop
Someone could toss a 'ceph osd pool set size' Hadoop's way, so a static mapping between pg pool size and pool name co... Noah Watkins
07:51 PM rbd Feature #3754: krbd: use new request tracking code for notify ack
I'm not sure yet whether the problem has to do with this
or whether it's in the existing "new request" code. But
I...
Alex Elder
06:23 PM Documentation #3808: Block device quick start page need update
Fixed description formatting. Also, 3784 is in master now (e94b06a19218decaf7d2d7b009bd862040f20285) Dan Mick
04:46 PM Documentation #3808: Block device quick start page need update
The current writeup also assumes that the mount is local to the cluster so it hides (for the beginner) important deta... Ken Franklin
03:38 PM Documentation #3808: Block device quick start page need update
-c and --secret aren't needed if you're using the default ceph.conf and your keyring can be found based on your ceph.... Josh Durgin
03:30 PM Documentation #3808 (Resolved): Block device quick start page need update
The instructions don't match well with the bobtail release.
- should include a note that ceph-common needs to be ins...
Ken Franklin
06:21 PM Feature #3805: log: detect dup messages
I tend to think there aren't very many dups we could usefully compress. It's pretty easy to add a one-string buffer ... Dan Mick
02:25 PM Feature #3805: log: detect dup messages
What kind of dups are we trying to detect?
This sounds to me like a wishlist item that requires much more work to...
Greg Farnum
02:17 PM Feature #3805 (New): log: detect dup messages
If a log message comes through and is a dup of the previous, increment a counter or something and only log it once wi... Sage Weil
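The duplicate suppression proposed in Feature #3805 above can be sketched with a one-message buffer, in the spirit of the "one-string buffer" mentioned in the later comment on the same issue. This is a minimal sketch, not Ceph's logging code: the class DedupLog, its sink callable, and the wording of the summary line are all assumptions made for illustration.

```python
from typing import Callable, Optional


class DedupLog:
    """Hypothetical logger that collapses consecutive duplicate messages."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self.sink = sink                  # receives each emitted line
        self.last: Optional[str] = None   # most recently emitted message
        self.repeats = 0                  # suppressed duplicates of self.last

    def log(self, msg: str) -> None:
        if msg == self.last:
            # Same as the previous message: count it instead of emitting it.
            self.repeats += 1
            return
        self._emit_repeats()
        self.sink(msg)
        self.last = msg

    def _emit_repeats(self) -> None:
        if self.repeats:
            self.sink(f"last message repeated {self.repeats} times")
            self.repeats = 0

    def close(self) -> None:
        # Report any duplicates still pending when logging stops.
        self._emit_repeats()
```

Only consecutive duplicates are collapsed here; messages that differ in an embedded timestamp, like the heartbeat example raised in the discussion, would need normalization before the comparison.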
05:35 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
No. So far I've focused on stabilizing basic fs functionality for multiple-MDS setups, completely ignoring snapshots. Zheng Yan
03:28 PM CephFS Bug #3254: mds: Replica inode's parent snaprealms are not open
Hmm, did this get fixed by some of Zheng's later patches? I remember things about snaprealms and migration... Greg Farnum
05:33 PM Bug #3810 (Resolved): btrfs corrupts file size on 3.7
After creating a new ceph cluster, PGs become inconsistent after using the qemu client. Logs indicate that the prima... Mike Lowe
04:54 PM Bug #3809 (Won't Fix): crush compiler errors are not helpful
Small, or large, errors in the CRUSH input are apparently all treated the same by crushtool -c:
error: parse error a...
Dan Mick
04:44 PM CephFS Feature #3289: ceph-fuse: somehow exert pressure on the VFS to remove dentries from the cache
#3575 should be kept in mind while doing this/instead of this — there's a forget_multi as well. Greg Farnum
04:44 PM CephFS Bug #3601 (New): client: With multiple clients, file remove doesn't free up space
Whoops, didn't mean to change that status. Greg Farnum
04:43 PM CephFS Bug #3601 (Duplicate): client: With multiple clients, file remove doesn't free up space
The LRU actually already exists; check out Client::lru. (Unless I'm misunderstanding something?) So we might want to ... Greg Farnum
04:37 PM CephFS Bug #925: mds: update replica snaprealm on rename
De-prioritizing multi-MDS issues... Greg Farnum
04:34 PM CephFS Bug #1117: mds: rename rollback broken on slaves during replay
De-prioritizing multi-mds issues for now. Greg Farnum
04:27 PM CephFS Bug #1435: mds: loss of layout policies upon mds restart
I'm guessing we want to move this up the queue; will discuss in bug scrub tomorrow! Greg Farnum
04:23 PM CephFS Bug #1511: fsstress failure with 3 active mds
De-prioritizing multi-mds failures at this time. Greg Farnum
04:23 PM CephFS Bug #1535: concurrent creating and removing directories crashes cmds
De-prioritizing multi-MDS bugs at this time. Greg Farnum
03:51 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
Fair enough, but if I can just make a suggestion, perhaps you might want to explain these procedures somewhere in the... Florian Haas
03:45 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
I agree it's a bug, but given the procedures we have now (ack! changing procedures coming alert!) I don't think we wa... Greg Farnum
03:43 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
No, please. A write pretending to succeed while actually not writing data _is_ a bug. The filesystem _not lying to it... Florian Haas
03:33 PM CephFS Bug #2753: Writes to mounted Ceph FS fail silently if client has no write capability on data pool
This is a great suggestion but falls into feature rather than bug-fix category. My initial thought is keeping a list ... Greg Farnum
03:42 PM CephFS Bug #1675 (Can't reproduce): mds: failed rstat assert
The logs are long gone. This will presumably pop up again; it's a pretty common failure mode, but there's nothing in ... Greg Farnum
03:38 PM CephFS Bug #1938: mds: snaptest-2 doesn't pass with 3 MDS system
De-prioritizing all multi-MDS bugs for now. Greg Farnum
03:27 PM CephFS Bug #3267: Multiple active MDSes stall when listing freshly created files
Currently de-prioritizing multi-MDS bugs. Greg Farnum
03:23 PM Bug #3537: Logs can run root out of space and crash ceph cluster (need more aggressive log rotation)
Not an FS bug, and #3775 has a lot more conversation on this subject. Greg Farnum
03:22 PM Bug #3552: After ceph-deploy installation a reboot breaks OSDs
Whoops, not an FS bug!
I've put this in the main Ceph project for now, but it might also belong in devops. We need...
Greg Farnum
03:18 PM CephFS Bug #3625: client: EEXIST error on multiple clients to create
I know you guys did a couple rounds on this one, what's the status? Greg Farnum
02:39 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Yes, the question is why they're 'getting unlucky'. Josh Durgin
02:22 PM Bug #3806: OSDs stuck in active+degraded after changing replication from 2 to 3
Haven't looked into this, but my guess is a couple PGs are getting unlucky with their replica selection. I assume you... Greg Farnum
02:17 PM Bug #3806 (Won't Fix): OSDs stuck in active+degraded after changing replication from 2 to 3
Small 3 node cluster running 0.56.1-1~bpo60+1 on Debian/Squeeze, with "tuneables" enabled
I recently changed the r...
Ben Poliakoff
02:27 PM RADOS Feature #3807 (Resolved): crush: simple commands to create common rules
These should be in CrushWrapper or similar, and available via crushtool and via some 'ceph osd crush ...' commands.
...
Sage Weil
02:16 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
I agree. If there are lots of log messages at the default levels, that is the problem. I don't think there is much ... Sage Weil
01:59 PM Feature #3775 (Need More Info): log: stop logging in statfs reports usage above some threshold
So I suggest we split this into two issues:
1) the documentation examples show an awfully-high logging value for s...
Dan Mick
12:03 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
so, a couple ideas of what can be done.
if we do set size and frequency (or inform the user how to), then it could...
Anonymous
11:39 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
So a couple of thoughts:
1) changing size in logrotate.conf doesn't help unless we also change frequency
2) with ...
Dan Mick
02:15 PM Documentation #3804 (Resolved): Logging section recommends fairly high levels, doesn't stress how...
3775 introduced the observation that logs can fill very quickly and bury a small root disk.
Our documentation could ...
Dan Mick
02:03 PM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
commit:15bb00cafc31305cacf3c4684a429c2c9ee6f804 in master
Dan Mick
02:03 PM rbd Feature #3635 (Resolved): rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
02:02 PM rbd Feature #3784: rbd: issue modprobe when rbd map is called
commit:e94b06a19218decaf7d2d7b009bd862040f20285 in master
Dan Mick
02:01 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
Dan Mick
01:47 PM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
nevermind.. this is fixed in v0.48.3argonaut too. Sage Weil
01:45 PM Bug #3803: rados parsing error with hostnames in mon_host
Responded to the upstream bug. This is fixed in master and bobtail, but not backported to argonaut. Should we? Sage Weil
08:37 AM Bug #3803 (Resolved): rados parsing error with hostnames in mon_host
In /etc/ceph/ceph.conf, if I set hostnames in the mon_host variable and separate them with spaces, the parsing algori... Ian Colle
01:25 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
Sage has a different proposed fix than what's in the branch. Still needs to be tested. Sam Lang
12:50 PM CephFS Bug #3637: client: not issuing caps for with clients doing shared writes
I don't remember where this ended up. Was the proposed fix problematic, or did it never get looked at? Greg Farnum
01:16 PM Bug #3770: OSD crashes on boot
Yeah, I just pushed a work-around branch (which I haven't tested much, so ideally you would try it on a node you can ... Samuel Just
12:08 PM rbd Subtask #3741: krbd: rework request tracking code
I found the source of my trouble, and in the process understood
a little more about some subtlety in bio reference c...
Alex Elder
11:39 AM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
This apparently is only a problem under re-export, which I believe we are not focusing on right now. Greg Farnum
11:35 AM CephFS Bug #3553: MDS core dumped running 0.48.2argonaut
Given what we know so far (the Op got sent to the wrong OSD) this is a bug in the Objecter, not the MDS. Or possibly ... Greg Farnum
11:17 AM Bug #3771: ceph does not have startup scripts in Centos
Not an FS bug! :) Greg Farnum
10:17 AM Bug #3771 (In Progress): ceph does not have startup scripts in Centos
Anonymous
11:16 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
Whoops, this was never an FS bug. :) Greg Farnum
10:15 AM Bug #3768 (In Progress): perl is required for logrotate, we need to include Perl as a dependency
Anonymous
10:54 AM Bug #3747: PGs stuck in active+remapped
No I didn't, just the CRUSH rule. Faidon Liambotis
10:46 AM Bug #3747 (Need More Info): PGs stuck in active+remapped
Faidon: did you also change the replication level of pool 3 (.rgw.buckets) ? Samuel Just
10:18 AM Feature #3505 (In Progress): default to libnss
This may already have been done. Will double check. Anonymous
10:16 AM Feature #3733 (In Progress): osd: update leveldb submodule
Anonymous
10:10 AM Bug #3797 (Need More Info): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest ...
Ian Colle
07:09 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Can you try reupgrading one of the nodes and start it with debug file store = 20? That will tell us what it is writing. Sage Weil
02:49 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just downgraded to 0.48.2argonaut and everything seems to be running normally again now:
Before downgrade:
ii ...
Corin Langosch
02:28 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
Here's the output of dstat http://pastie.org/5687470.text
I'm not sure why it is writing so much now, before the ...
Corin Langosch
02:17 AM Bug #3797: osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48.3argonaut
I just noticed the second osd is now consuming 100% cpu too. Before it was properly running for around 15 minutes. Gu... Corin Langosch
02:14 AM Bug #3797 (Duplicate): osd takes 100% cpu after upgrading from 0.48.2argonaut to the latest 0.48....
I just upgraded one of my production servers (2 osds) from 0.48.2argonaut to the latest 0.48.3argonaut and now of the... Corin Langosch
08:33 AM rgw Bug #3802 (Resolved): x-amz-acl header ignored on copy operation
When copying an object the x-amz-acl header is ignored. To replicate; copy a private object and send the 'x-amz-acl' ... JuanJose Galvez
07:43 AM Bug #3801 (Won't Fix): Cascading OSD failures beginning with common/HeartbeatMap.cc: 78: FAILED a...
0.48.2argonaut
Relevant logs are attached. Core dumps are available if needed....
Justin Lott
07:25 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
You're right, as long as you are using matching
code it's fine.
If it occurred, it's a serious problem. It just
...
Alex Elder
07:17 AM Linux kernel client Bug #3800: libceph: check compatibility between ceph modules
Is this really a problem? It seems like this could only bite someone building mixed versions out of tree. Sage Weil
06:57 AM Linux kernel client Bug #3800 (Resolved): libceph: check compatibility between ceph modules
It's possible for semantic changes to occur in one of the
ceph modules (fs/ceph, net/libceph, or block/rbd) that is
...
Alex Elder
06:58 AM Linux kernel client Bug #3799: libceph/rbd: bio refs are messed up
Because this suggests a semantically-incompatible change
between modules, this should probably be completed first:
...
Alex Elder
06:56 AM Linux kernel client Bug #3799 (Resolved): libceph/rbd: bio refs are messed up
There is an ugly reference counting dance that occurs with bio
pointers in the kernel osd I/O path, and it needs to ...
Alex Elder
06:57 AM Linux kernel client Bug #3798: libceph/rbd: take reference to all bio's in list
The other bug related to this is:
http://tracker.newdream.net/issues/3799
Alex Elder
06:56 AM Linux kernel client Bug #3798 (Resolved): libceph/rbd: take reference to all bio's in list
In a separate bug ("libceph/rbd: bio refs are messed up") I
describe how reference counting of bio's interact betwee...
Alex Elder
03:20 AM Revision d56af797 (ceph): osd: note must_scrub* flags in PG operator<<
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:20 AM Revision 26a63df9 (ceph): osd: fix scrub scheduling for 0.0
The initial value for pair<utime_t,pg_t> can match pg 0.0, preventing it
from being manually scrubbed. Fix!
Signed-...
Sage Weil
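The collision described in revision 26a63df9 can be sketched as follows. This is a minimal illustration, not the OSD code: the tuple layout, `DEFAULT_ENTRY`, and both helper names are hypothetical, and it assumes a manually requested scrub is stamped with a zero time. A default-constructed (time, pg) pair then compares equal to a real entry for pg 0.0, so a check meant to detect "nothing set" misfires.

```python
from typing import Optional, Tuple

# Hypothetical stand-in for pair<utime_t, pg_t>: (timestamp, (pool, seed)).
Entry = Tuple[float, Tuple[int, int]]

# What a default-constructed pair would look like in this sketch.
DEFAULT_ENTRY: Entry = (0.0, (0, 0))


def is_unset_buggy(entry: Entry) -> bool:
    """Treats the default value as 'no entry', which collides with pg 0.0."""
    return entry == DEFAULT_ENTRY


def is_unset_fixed(entry: Optional[Entry]) -> bool:
    """Uses an explicit None sentinel, so every real entry stays distinct."""
    return entry is None


# Assumed shape of a manually requested scrub of pg 0.0 (zero timestamp).
manual_scrub_pg_0_0: Entry = (0.0, (0, 0))
```

With the explicit sentinel, `is_unset_fixed(manual_scrub_pg_0_0)` is false, whereas the default-value comparison cannot tell the two cases apart.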
03:20 AM Revision 2baf1253 (ceph): osd: based INCONSISTENT pg state on persistent scrub errors
This makes the state persistent across PG peering and OSD restarts.
This has the side-effect that, on recovery, we r...
Sage Weil
02:24 AM Revision 16d67c79 (ceph): osd/PG: remove useless osd_scrub_min_interval check
This was already a no-op: we don't call PG::scrub_sched() unless it has
been osd_scrub_max_interval seconds since we ...
Sage Weil
02:24 AM Revision 29954802 (ceph): osd: change scrub min/max thresholds
The previous 'osd scrub min interval' was mostly meaningless and useless.
Meanwhile, the 'osd scrub max interval' wou...
Sage Weil
02:24 AM Revision 6f6a4193 (ceph): osd: fix object_stat_sum_t dump signedness
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision d7383284 (ceph): osd: add last_clean_scrub_stamp to pg_stat_t, pg_history_t
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 2475066c (ceph): osd: add num_scrub_errors to object_stat_t
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 389bed5d (ceph): osd: note last_clean_scrub_stamp, last_scrub_errors
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 796907e2 (ceph): osd/PG: move scrub schedule registration into a helper
Simplifies callers, and will let us easily modify the decision of when
to schedule the PG for scrub.
Signed-off-by: ...
Sage Weil
02:24 AM Revision 1441095d (ceph): osd/PG: introduce flags to indicate explicitly requested scrubs
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
02:24 AM Revision 62ee6e09 (ceph): osd/PG: trigger scrub via scrub schedule, must_ flags
When a scrub is requested, flag it and move it to the front of the
scrub schedule instead of immediately queuing it. ...
Sage Weil
02:24 AM Revision a1481207 (ceph): osd: move scrub schedule random backoff to separate helper
Separate this from the load check, which will soon vary depending on the
PG.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
12:25 AM Revision 123a2dc4 (ceph): rados: adjust socket injection rate down
See #3795. Sage Weil
12:14 AM Revision 71097b7b (ceph): Revert "task/kclient: chmod root to 1777."
This reverts commit f17847e537802671c6f90bd1a0cdaa0e9d1e6f7a. It had
a typo and we hopefully don't need it.
Signed-o...
Greg Farnum

01/14/2013

10:11 PM Revision be0c4b34 (ceph): ac_prog_javah.m4: Use AC_CANONICAL_TARGET instead of AC_CANONICAL_SYSTEM.
Gary Lowell
10:07 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
oh *fine*. :) Dan Mick
10:04 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
Funny you should mention it: that is step #1 (or maybe 2 or 3) for the management API work, IMHO. :) Sage Weil
09:41 PM Bug #3748: ceph osd dump --format=json includes non-JSON line
I sorta think we ought to clean up how the various output channels are used in this code in general. This fixes the ... Dan Mick
09:23 PM Revision e182c1fd (ceph): Merge branch 'wip-java-sync'
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Reviewed-by: Joe Buck <jbbuck@gmail.com>
Noah Watkins
09:11 PM Revision fb8a488e (ceph): java: remove create/release synchronization
The constructor calls create, and finalize() calls release. Since each
of these can only happen once (enforced by Jav...
Noah Watkins
09:11 PM Revision 2b9da45d (ceph): java: remove unnecessary synchronization
The body of ceph_unmount is a call to a synchronized method.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
09:11 PM Revision 85c10357 (ceph): java: remove all intrinsic locks
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
09:11 PM Revision 13cb196e (ceph): java: add fine grained synchronization
Adds r/w lock to protect against some races.
1. Mutual exclusion for mount/unmount prevents races between the two in...
Noah Watkins
08:02 PM rbd Subtask #3741: krbd: rework request tracking code
OK, I ran a test and got a crash. The bio built for
an object request gets handed off to an osd request.
I need to...
Alex Elder
07:32 PM rbd Subtask #3741: krbd: rework request tracking code
I spent the day trying to find the memory leak and finally
found it. The structure being leaked was a bio. It was
...
Alex Elder
06:48 AM rbd Subtask #3741: krbd: rework request tracking code
For some reason my tests started hanging on Friday when
I added memory debug code for catching leaks and reuses.
I ...
Alex Elder
07:49 PM CephFS Bug #3544: ./configure checks CFLAGS for jni.h if --with-hadoop is specified but also needs to ch...
Is this still an issue? Noah Watkins
04:54 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Josh just pinged me that there was a typo in the chmod patch, and nobody's noticed so apparently it still hasn't been... Greg Farnum
04:24 PM Bug #3795: loadgen task gets into msgr loop
I looked a bit more and I see some failures before that, and also some passes after, e.g. teuthology-2013-01-11_07:00... Sage Weil
11:35 AM Bug #3795: loadgen task gets into msgr loop
taking a look again at the nightly runs, looks like this issue has been happening on next branch from 01-01-2013 whic... Tamilarasi muthamizhan
08:13 AM Bug #3795: loadgen task gets into msgr loop
going to see if the recent msgr changes are to blame.. bisecting! Sage Weil
08:04 AM Bug #3795: loadgen task gets into msgr loop
This appears to be a simple cycle:
- objecter has lots of requests outstanding
- there is a fault (msgr failure i...
Sage Weil
03:37 PM Revision 017b6d63 (ceph): Revert "osdmap: spread replicas across hosts with default crush map"
This reverts commit 7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6.
This breaks teuthology and vstart both in its current ...
Sage Weil
03:04 PM CephFS Documentation #3796 (Resolved): FUSE mount documentation needs some corrections for v0.56.x
The FUSE instructions need to be updated for v0.56 and later
currently:
> http://ceph.com/docs/master/cephfs/fuse...
Anonymous
01:35 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
I also don't seem to be able to reproduce on bobtail, marking can't reproduce. Samuel Just
12:58 PM Bug #3772 (New): osd: osd_disk_threads = 5 seems to hang recovery
I don't seem to be able to reproduce this on master. Samuel Just
10:37 AM Bug #3772: osd: osd_disk_threads = 5 seems to hang recovery
didn't reproduce with simple test, trying something more complicated. (roles/8882.yaml + osd disk threads : 10, teste... Samuel Just
01:28 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
Noah Watkins
12:57 PM Feature #3769 (Fix Under Review): osd: scrub should verify snap collection existence, membership
wip_snap_scrub Samuel Just
11:55 AM rbd Bug #2871 (Resolved): rbd export command hangs when trying to export an image of size 0 to a loca...
Not certain which recent fix resolved this, but it works now.
Dan Mick
11:32 AM rbd Bug #3585 (Closed): Image import via QEMU-IMG results in a corrupt rbd
Great, glad to hear it's fixed. Josh Durgin
11:09 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Patch posted for review. I'm not sure I'll be able to test
the scenario very well but hopefully it can be seen by
...
Alex Elder
09:56 AM rbd Bug #3427: krbd: unmap does not remove block device properly
Implementing the change I described now. Alex Elder
11:01 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
for reference, ubuntu@teuthology:/a/teuthology-2013-01-10_07:00:03-regression-argonaut-master-basic/38145 Tamilarasi muthamizhan
10:50 AM Bug #2691: osd/ReplicatedPG.cc: 5888: FAILED assert(latest->is_update())
This has shown up once in argonaut, probably not worth backporting unless it becomes more of a problem? Samuel Just
09:42 AM Bug #3629 (Resolved): test_mon_workloadgen.cc: 766: FAILED assert(m->fsid == monc.get_fsid())
commit:3610e72e4f9117af712f34a2e12c5e9537a5746f Joao Eduardo Luis
07:00 AM CephFS Bug #2187: pjd chown/00.t failed test 97
Happened again on Friday. Time to add the delay injection to the nightlies?
2013-01-11T07:32:37.489 INFO:teutholo...
Sam Lang
06:52 AM Revision 92a9d9c2 (ceph): ceph.conf: separate replicas across osds
ceph.git master now separates across crush hosts without this setting.
For teuthology clusters, we don't want that (u...
Sage Weil
05:43 AM Bug #3770: OSD crashes on boot
So, my (very basic) understanding of this suggests that the fix is that the trim wouldn't happen in the first place.
...
Faidon Liambotis

01/13/2013

10:11 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
Nope.. which leads me to realize that that setting needs to go in teuthology's ceph.conf. Doing that now, and then I... Sage Weil
10:01 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
*sigh*
This also looks good to me, and I like it better (should have suggested this the first time around). But no...
Greg Farnum
10:05 PM Bug #3774 (Fix Under Review): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
wip-scrub Sage Weil
10:05 PM Bug #3786 (Fix Under Review): osd: scrub is deferred indefinitely if load is high
wip-scrub Sage Weil
07:04 AM Revision 410906e0 (ceph): mon: OSDMonitor: don't output to stdout in plain text if json is specified
Fixes: #3748
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
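A consumer of `--format=json` output can guard against the kind of stray plain-text line that #3748 describes. The sketch below is an assumption-laden illustration: `is_pure_json` is a hypothetical helper, and the sample strings are invented rather than real `ceph osd dump` output.

```python
import json


def is_pure_json(output: str) -> bool:
    """Return True only if the whole string parses as one JSON document."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        # Any leading or trailing non-JSON text makes the parse fail.
        return False
    return True


# Invented samples: a clean document, and one preceded by a status line.
clean = '{"epoch": 12}'
polluted = 'some status line\n{"epoch": 12}'
```

A check like this fits well in a test suite, because a regression that prints to stdout again shows up as a parse failure rather than as silently ignored text.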

01/12/2013

11:05 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
commit:410906e04936c935903526f26fb7db16c412a711 Sage Weil
11:03 PM Bug #3795 (Resolved): loadgen task gets into msgr loop
... Sage Weil
11:01 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
der, broke vstart. can you review wip-3785? Sage Weil
08:01 AM CephFS Feature #3749: Remove forced synchronization from Java bindings
In libcephfs mount/unmount race against each other, and the rest of the API (e.g. unmount racing against write). In C... Noah Watkins
01:10 AM Revision 7ea5d84f (ceph): osdmap: spread replicas across hosts with default crush map
This is more often the case than not, and we don't have a good way to
magically know what size of cluster the user wi...
Sage Weil
01:09 AM Revision 3610e72e (ceph): mon: OSDMonitor: only share osdmap with up OSDs
Try to share the map with a randomly picked OSD; if the picked monitor is
not 'up', then try to find the nearest 'up'...
Joao Eduardo Luis
12:25 AM Revision 1f721804 (ceph): rbd: Fix tabs
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick

01/11/2013

11:56 PM Revision 34138993 (ceph): doc: Updates to CRUSH paper.
fixes: 3329, 3707, 3711, 3389
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
10:28 PM Revision 15bb00ca (ceph): rbd: call udevadm settle on map/unmap
When we map/unmap devices, udev gets called to manage device nodes;
this will allow the command to wait for those man...
Dan Mick
10:28 PM Revision e94b06a1 (ceph): rbd: make 'add' modprobe rbd so it has a chance of success
Check for existence of /sys/bus/rbd first to avoid unnecessary calls
Fixes: #3784
Signed-off-by: Dan Mick <dan.mick@...
Dan Mick
08:17 PM Revision 66eb93b8 (ceph): OSD: only trim up to the oldest map still in use by a pg
map_cache.cached_lb() provides us with a lower bound across
all pgs for in-use osdmaps. We cannot trim past this sin...
Samuel Just
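The rule in revision 66eb93b8 amounts to clamping the trim point to the oldest epoch any PG still needs. A minimal sketch, assuming osdmaps are keyed by integer epoch; `trim_maps` and its parameters are hypothetical names, and `oldest_in_use_epoch` stands in for the lower bound that the commit message attributes to `map_cache.cached_lb()`.

```python
from typing import Dict


def trim_maps(stored: Dict[int, str],
              requested_epoch: int,
              oldest_in_use_epoch: int) -> Dict[int, str]:
    """Drop maps older than the trim point, never past the in-use bound.

    requested_epoch: first epoch the caller would like to keep.
    oldest_in_use_epoch: oldest epoch still referenced by some PG.
    """
    # Never trim beyond the oldest map that is still in use.
    bound = min(requested_epoch, oldest_in_use_epoch)
    return {epoch: m for epoch, m in stored.items() if epoch >= bound}


# Illustrative data: epochs 1..10, caller asks to keep 8 onward,
# but a PG still references epoch 5.
stored_maps = {e: f"osdmap-{e}" for e in range(1, 11)}
kept = trim_maps(stored_maps, requested_epoch=8, oldest_in_use_epoch=5)
```

Here epochs 5 through 10 survive even though the caller requested trimming up to epoch 8.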
08:15 PM Revision 8cf79f25 (ceph): OSD: check for empty command in do_command
Fixes: #3787
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: David Zafman <david.zafman@inktank.com>
Samuel Just
08:09 PM Revision 3e147295 (ceph): Merge pull request #32 from imjustmatthew/imjustmatthew_docs
Correct typo in mon docs 'ceph.com' to 'ceph.conf' John Wilkins
07:59 PM Revision 0f161f1e (ceph): Correct typo in mon docs 'ceph.com' to 'ceph.conf'
Matthew Roy
06:49 PM Revision aeb02061 (ceph): qa/run_xfstests.sh: use cloned xfstests repository
Use our own copy of the xfstests repository rather than hitting
the upstream one repeatedly.
Signed-off-by: Alex Eld...
Alex Elder
06:15 PM Revision 8d0fa15e (ceph): mon: Monitor: only schedule a timecheck after election if we are not alone
Fixes: #3790
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
05:51 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
Merged to master in commit:7ea5d84fa3d0ed3db61eea7eb9fa8dbee53244b6 and cherry-picked to bobtail in commit:503917f004... Greg Farnum
05:45 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
good question. let's start with bobtail. Sage Weil
05:39 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
Looks good to me. What branches do we want to cherry-pick it on? Greg Farnum
05:24 PM Bug #3785 (Fix Under Review): ceph: default crush rule does not suit multi-OSD deployments
wip-3785 Sage Weil
01:59 PM Bug #3785 (New): ceph: default crush rule does not suit multi-OSD deployments
dang! wrong bug. opening this one back up.
sorry all!
Anonymous
12:34 PM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I think maybe Deb's comments and closure were meant for another bug (perhaps 3789?) Dan Mick
11:34 AM Bug #3785 (Won't Fix): ceph: default crush rule does not suit multi-OSD deployments
This comment should have been in bug 3789
caused by a lack of resources on the system.
have increased the memory fro...
Anonymous
11:32 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
This comment should have been in bug 3789
upping the memory on these VMs from 512M to 2G
since it appears it was a...
Anonymous
10:55 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
I agree with Ian, I have seen *very bad things* happen when CRUSH chooses two OSDs on one host, rather than distribute... Anonymous
10:11 AM Bug #3785: ceph: default crush rule does not suit multi-OSD deployments
The issue here is that CRUSH maps which behave well on multi-host deployments behave quite poorly on one or two host ... Greg Farnum
05:46 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Yes, Greg. The test passed in the recent runs. Tamilarasi muthamizhan
05:34 PM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
This appears to be passing now, right Tamil?
Since I'm not seeing anything else breaking I'm inclined to leave the...
Greg Farnum
04:25 PM Bug #3772 (In Progress): osd: osd_disk_threads = 5 seems to hang recovery
Samuel Just
03:53 PM Documentation #3330 (In Progress): doc: How to troubleshoot unbalanced CRUSH
John Wilkins
03:51 PM Documentation #3329 (In Progress): doc: What metrics should be used to set node weight
John Wilkins
02:45 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
That makes this sound like a simple fix... we need to swap the frsize and bsize fields. Except that right now we ar... Sage Weil
02:39 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
I spent a bit of time with gregaf trying to find authoritative sources for what the different values denote. While `... David McBride
01:40 PM CephFS Bug #3793: wrong size reported in some distributions/toolchains
This coreutils commit may have useful data:
http://git.savannah.gnu.org/cgit/coreutils.git/commit/src?id=0863f018f0f...
Greg Farnum
01:38 PM CephFS Bug #3793 (Resolved): wrong size reported in some distributions/toolchains
In ceph_statfs we set f_bsize to be 1MB in order to report very large available spaces. However, nowadays it is appar... Greg Farnum
02:38 PM CephFS Feature #3749: Remove forced synchronization from Java bindings
This needs more thought than just removing synchronization. We'd like to be segfault free in Java, even though you co... Noah Watkins
02:26 PM Bug #3789: OSD core dump and down OSD on CentOS cluster
There is 'ceph health', and a nagios plugin that runs it. A similarly trivial plugin can probably be written for oth... Sage Weil
02:01 PM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
dmesg shows it was a lack of resources.
upping the memory on these VMs from 512M to 2G
since it appears it ...
Anonymous
10:28 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
Deb Barba wrote:
> all core files have similar backtrace.
> again, Sage, looks like you are right, low resources
>...
Anonymous
10:27 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
all core files have similar backtrace.
again, Sage, looks like you are right, low resources
dmesg:
hrtimer: inte...
Anonymous
10:23 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
looks from dmesg, you are right Sage, low on resources
centos1 core# gdb /usr/bin/ceph-osd core.0.26177
Core wa...
Anonymous
10:16 AM Bug #3789: OSD core dump and down OSD on CentOS cluster
backtrace of core.0.14401 from centos3:
Core was generated by `/usr/bin/ceph-osd -i 8 --pid-file /var/run/ceph/osd....
Anonymous
09:37 AM Bug #3789 (Need More Info): OSD core dump and down OSD on CentOS cluster
check dmesg, or VM responsiveness. this triggers when a call to sync(2) takes more than... 2 minutes? i forget how l... Sage Weil
09:13 AM Bug #3789 (Won't Fix): OSD core dump and down OSD on CentOS cluster
Running a CentOS VM cluster. Running v0.56.1
I had written a bit of data, and stopped writing about 4pm yesterday...
Anonymous
02:17 PM rbd Subtask #3741: krbd: rework request tracking code
Unfortunately my system crashed after an hour or so. The
crash was in the network driver, and a little analysis
su...
Alex Elder
10:45 AM rbd Subtask #3741: krbd: rework request tracking code
My full test run isn't complete but I seem to have resolved
whatever problem I was hitting yesterday. I have not ye...
Alex Elder
01:39 PM CephFS Bug #3794 (Resolved): uclient: reports sizes wrong in some cases
This is the counterpart to kernel bug #3793. See Client::statfs, in which we set f_bsize to 1MB but f_frsize to 4KB. ... Greg Farnum
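The effect of mixing up the two fields can be shown with plain arithmetic. POSIX specifies `f_blocks`, `f_bfree`, and `f_bavail` in units of `f_frsize`, so a tool that multiplies by `f_bsize` gets a different answer whenever the two differ. The sketch below uses the 1MB and 4KB values mentioned above; the block count and the helper names are illustrative assumptions, not taken from the client code.

```python
import os


def size_from_frsize(f_blocks: int, f_frsize: int) -> int:
    """Total bytes as POSIX defines it: f_blocks counted in f_frsize units."""
    return f_blocks * f_frsize


def size_from_bsize(f_blocks: int, f_bsize: int) -> int:
    """Total bytes as computed by a tool that multiplies by f_bsize."""
    return f_blocks * f_bsize


def fs_total_bytes(path: str) -> int:
    """The same calculation against a real mount, via os.statvfs (Unix only)."""
    st = os.statvfs(path)
    return st.f_blocks * st.f_frsize


# Illustrative values: 1 MB f_bsize, 4 KB f_frsize, 1000 blocks.
F_BSIZE = 1 << 20
F_FRSIZE = 4 << 10
F_BLOCKS = 1000

by_frsize = size_from_frsize(F_BLOCKS, F_FRSIZE)
by_bsize = size_from_bsize(F_BLOCKS, F_BSIZE)
ratio = by_bsize // by_frsize
```

The two results differ by a factor of 256 for these values. Which one matches the real capacity depends on the unit the filesystem used when it filled in `f_blocks`.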
12:22 PM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
8cf79f252a1bcea5713065390180a36f31d66dfd Samuel Just
11:12 AM Bug #3787 (Fix Under Review): Ceph OSD crashes on ceph tell osd.x
wip_3787 Samuel Just
09:33 AM Bug #3787: Ceph OSD crashes on ceph tell osd.x
verified this happens on master. should be an easy fix. thanks for the report! Sage Weil
12:17 AM Bug #3787 (Resolved): Ceph OSD crashes on ceph tell osd.x
I recently set up a small test cluster with 2 nodes to test the 0.48.3 -> 0.56.1 upgrade. After Upgrading one of the ... Seb Mel
12:22 PM Bug #3770 (Resolved): OSD crashes on boot
66eb93b83648b4561b77ee6aab5b484e6dba4771 Samuel Just
11:16 AM Bug #3770 (Fix Under Review): OSD crashes on boot
wip_3770 Samuel Just
11:03 AM Bug #3770: OSD crashes on boot
The fault is in OSD::handle_osd_map where we trim old maps. Prior to 0.50, the pgs would have processed up to the cu... Samuel Just
09:59 AM Bug #3770: OSD crashes on boot
I'm seeing this same assert failure when trying to startup 3 of my OSDs. Happy to provide feedback for the debugging ... Mike Dawson
09:43 AM Bug #3770: OSD crashes on boot
sjust said that we're done collecting information and that I could rm the pg directory/log/info, which I did. Unfortu... Faidon Liambotis
09:41 AM Bug #3770: OSD crashes on boot
Ian Colle
12:04 PM Bug #3788: debian source packages are missing
Gary Lowell wrote:
> It looks like the Sources file has been zero length in past releases as well. Still investigat...
Loïc Dachary
12:03 PM Bug #3788: debian source packages are missing
My favorite use case when source packages are available would be... Loïc Dachary
11:33 AM Bug #3788: debian source packages are missing
I think we should build source packages too (in addition to tarballs, etc.). Sage Weil
10:47 AM Bug #3788: debian source packages are missing
We are not currently building debian or rpm source packages. We do put out a source tarball corresponding to the rel... Anonymous
09:56 AM Bug #3788 (In Progress): debian source packages are missing
It looks like the Sources file has been zero length in past releases as well. Still investigating. Anonymous
02:20 AM Bug #3788: debian source packages are missing
Proposed fix at https://github.com/ceph/ceph-build/pull/1 Loïc Dachary
01:44 AM Bug #3788: debian source packages are missing
http://ceph.com/debian/conf/distributions is created from https://github.com/ceph/ceph-build/blob/master/gen_reprepro... Loïc Dachary
01:35 AM Bug #3788 (Resolved): debian source packages are missing
Following the instructions at http://ceph.com/docs/master/install/debian/ to add the ... Loïc Dachary
10:52 AM CephFS Bug #3773: mds crashed at LogEvent::decode
Sure Sage. I was running bonnie from client during upgrade.
I had debug ms=1 set, i will try to reproduce this with...
Tamilarasi muthamizhan
09:41 AM CephFS Bug #3773 (Need More Info): mds crashed at LogEvent::decode
Tamil, I wonder if you can try to reproduce this with mds logging turned up from the start (debug mds = 20, debug ms ... Sage Weil
10:34 AM Messengers Bug #2569: msgr: connect_rank crash
yes, you are right, Greg. I just wanted to put a note of this somewhere, so chose to update the bug itself :) Tamilarasi muthamizhan
10:23 AM Bug #3748 (Fix Under Review): ceph osd dump --format=json includes non-JSON line
wip-3748 has a fix, commit:0edb53f02231fb83f33d3bc5f58b37b14cd5df82 Joao Eduardo Luis
10:20 AM Bug #3695 (Resolved): monitor crashed after an upgrade in Monitor::timecheck
Ian Colle
10:16 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
looks good, merged into master. commit:8d0fa15e6aa3847e89de5d5adfca0a863e8da976 Sage Weil
10:06 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
Had a redundant check on the previous commit; fixed and rebased it and the new commit can be found on wip-3790 commit... Joao Eduardo Luis
10:02 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
This patch fixes it. Joao Eduardo Luis
09:31 AM Bug #3790 (In Progress): Mon crash after update to ceph version 0.56-209-g310112f
My fault. Forgot a check on win_election().
Any chance you can test 6104629d95207f3dfd3a744d81b011b6a714070e on wi...
Joao Eduardo Luis
09:18 AM Bug #3790: Mon crash after update to ceph version 0.56-209-g310112f
Previous installed version was .56-193. Ken Franklin
09:14 AM Bug #3790 (Resolved): Mon crash after update to ceph version 0.56-209-g310112f
I have a single node cluster on burnupi60 updated each morning to the latest Master branch. After the update this mo... Ken Franklin
09:16 AM Bug #3774 (In Progress): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
09:16 AM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
wip-scrub-sched for the argonaut version. should look very similar for master/bobtail. Sage Weil
02:05 AM Revision 310112f7 (ceph): Merge remote-tracking branch 'gh/wip-3633'
Reviewed-by: Sage Weil <sage@inktank.com> Sage Weil
02:04 AM Revision 9e4a3f03 (ceph): Merge remote-tracking branch 'gh/wip-3633'
Sage Weil
02:03 AM Revision 305cb54a (ceph): suites: rados: multimon: add mon clock skews task yaml files
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
12:58 AM Revision 2fa5d23b (ceph): test: Hadoop cluster and task config.
Add a 3-node cluster specification and a
task for running wordcount with Hadoop on Ceph.
Signed-off-by: Joe Buck <jb...
Joe Buck
12:44 AM Revision aa40de90 (ceph): messages: add MTimeCheck
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Joao Eduardo Luis
12:44 AM Revision 684d4ba2 (ceph): mon: Monitor: add timecheck infrastructure to detect clock skews
Fixes: #3633
Fixes: #3695
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Reviewed-by: Sage Weil <sage@inkt...
Joao Eduardo Luis
12:44 AM Revision ff1c254b (ceph): mon: Monitor: reduce indentation level; make code more readable
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:44 AM Revision 7a7fff57 (ceph): mon: Monitor: move a couple of if's together on handle_command()
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:44 AM Revision bc57c7a9 (ceph): mon: Monitor: use 'else if' on handle_command instead of bunches of 'if'
... when the options are mutually exclusive.
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
Joao Eduardo Luis
12:44 AM Revision 58e03ecb (ceph): mon: Monitor: unify 'ceph health' and 'ceph status'; add json output
Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Joao Eduardo Luis
12:03 AM Revision e6f284e9 (ceph): doc: Added -a option. Should work without from server, as described.
fixes: #3750
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins

01/10/2013

11:59 PM Revision de6633f9 (ceph): doc: Normalized to term "drive" rather than disk. Changed "(Manual)" en...
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:06 PM Revision 7a8ec194 (ceph): Merge branch 'next'
Samuel Just
09:54 PM Revision 988f3597 (ceph): rados: add truncate support
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Greg Farnum <greg@inktank.com>
Samuel Just
09:04 PM Bug #3786 (Resolved): osd: scrub is deferred indefinitely if load is high
If the load is above the threshold, we will never scrub. For some environments, this is normal (e.g., mixed OSD and ... Sage Weil
08:23 PM rbd Bug #3585: Image import via QEMU-IMG results in a corrupt rbd
This seems to be fixed in QEMU 1.3.0 and Ceph 0.56.1
I've tried QED -> Raw -> Ceph -> Raw then QED -> Ceph -> Raw an...
Matt Anderson
07:56 PM Bug #3785 (Resolved): ceph: default crush rule does not suit multi-OSD deployments
Version: 0.48.2-0ubuntu2~cloud0
Our Ceph deployments typically involve multiple OSDs per host with no disk redunda...
Ian Colle
07:10 PM rbd Feature #3635 (In Progress): rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
07:10 PM Revision 44625d44 (ceph): config_opts.h: default osd_recovery_delay_start to 0
This setting was intended to prevent recovery from overwhelming peering traffic
by delaying the recovery_wq until osd...
Samuel Just
07:09 PM rbd Feature #3784 (In Progress): rbd: issue modprobe when rbd map is called
Dan Mick
06:04 PM rbd Feature #3784 (Resolved): rbd: issue modprobe when rbd map is called
rbd map will not work unless the rbd kernel module is loaded, and this must be done manually. Add code to rbd to cau... Dan Mick
07:02 PM Revision 830b8ffa (ceph): ReplicatedPG: fix snapdir trimming
The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
Samuel Just
06:34 PM Revision 0f42c373 (ceph): ReplicatedPG: fix snapdir trimming
The previous logic was both complicated and not correct. Consequently,
we have been tending to drop snapcollection l...
Samuel Just
06:24 PM Bug #3774: osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
Sage Weil
06:14 PM Revision 035caac5 (ceph): Revert "rgw: fix handler leak in handle_request"
This reverts commit eba314a811cd98a79f483dc7a9128fe76c722c78.
Signed-off-by: Yehuda Sadeh <yehuda@inktank.com>
Yehuda Sadeh
06:11 PM rgw Feature #3402 (Fix Under Review): rgw: improve tests for multipart upload
caleb miles
06:10 PM rgw Feature #3634 (Fix Under Review): rgw: improve teuthology radosgw-admin test
caleb miles
06:09 PM Bug #3633 (Resolved): mon: clock drift errors not reported by ceph status
commit:310112f702d14294e6ba48f8af41a306288cba65 Sage Weil
06:09 PM Revision eb997e25 (ceph): Merge pull request #31 from chrisglass/expose_cluster_stats_to_python
Added python wrapper to rados_cluster_stat Greg Farnum
05:59 PM rbd Bug #3518 (Can't reproduce): rbd import file --format 2 creates an image named '--format'
Dan Mick
05:59 PM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
It seems that this no longer happens as of e6f284e945f45e39c57921149d4551d9e78557a5,
so closing non-reproducible.
Dan Mick
05:06 PM CephFS Bug #3773: mds crashed at LogEvent::decode
Okay, I gathered up a core file, a high-debug MDS log, and the log with the bad event (and the bad event itself) in t... Greg Farnum
02:05 PM CephFS Bug #3773: mds crashed at LogEvent::decode
I'll at least start this off. Greg Farnum
04:54 PM Revision c8f3fd6e (ceph): marginal: Remove broken symlinks
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
04:47 PM Messengers Bug #2569: msgr: connect_rank crash
I believe this was caused by some issues which we decided not to backport the fixes for due to their size; Sage can c... Greg Farnum
04:43 PM Messengers Bug #2569: msgr: connect_rank crash
hit this on a mixed cluster running argonaut v0.48.3 and v0.56 [ ceph version 0.56-193-g00898c1]
monitors,mds,osds...
Tamilarasi muthamizhan
04:37 PM rbd Bug #3688 (Won't Fix): rbd allows image of size 0 to be created
I claim that zero-sized images are legal, if not particularly useful in that size...but one might well want to create... Dan Mick
04:15 PM Bug #3770: OSD crashes on boot
root@ms-be1003:/var/lib/ceph/osd/ceph-27# find current/meta/ | tee ~/ceph-osd.27.meta | wc -l
42992
Attached.
Faidon Liambotis
04:02 PM Bug #3770: OSD crashes on boot
root@ms-be1003:/var/lib/ceph/osd/ceph-27/current/4.f9_head# attr -lq $PWD | while read attr; do echo $attr; attr -q -... Faidon Liambotis
02:27 PM Bug #3770 (Need More Info): OSD crashes on boot
From the backtrace:
pgid = {m_pool = 4, m_seed = 249, m_preferred = -1}
Based on the info attr, we try to...
Samuel Just
04:04 PM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
Documentation described making the call from the server console, which should work as described. Added -a so that it ... John Wilkins
03:52 PM Bug #3780 (Won't Fix): pg_num inappropriately low on new pools
Version: 0.48.2-0ubuntu2~cloud0
On a Ceph cluster with 18 OSDs, new object pools are being created with a pg_num o...
Ian Colle
03:08 PM rgw Bug #3778: document procedure for enabling subdomain S3 api calls
The documentation should note that the
@rgw dns name = {hostname}@
option must be set in the
@[client.radosgw.g...
caleb miles
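For readers unfamiliar with subdomain-style S3 calls: the bucket name travels as a subdomain of the gateway's hostname, which is why the gateway has to be told its own DNS name via the `rgw dns name` option mentioned above. The Python sketch below only illustrates that mapping; it is not radosgw's implementation, and `bucket_from_host` and the example hostnames are hypothetical.

```python
from typing import Optional


def bucket_from_host(host_header: str, rgw_dns_name: str) -> Optional[str]:
    """Return the bucket encoded as a subdomain of rgw_dns_name, or None."""
    # Drop an optional ":port" suffix and normalise case and trailing dots.
    host = host_header.split(":", 1)[0].lower().rstrip(".")
    base = rgw_dns_name.lower().rstrip(".")
    if host == base:
        return None  # path-style request: the bucket is in the URL path
    suffix = "." + base
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None  # Host header does not belong to this gateway
```

With a hypothetical gateway name `gateway.example.com`, a request for host `photos.gateway.example.com` would resolve to the bucket `photos`, while a request for the bare gateway name would fall back to path-style handling.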
11:13 AM rgw Bug #3778 (Resolved): document procedure for enabling subdomain S3 api calls
The process for setting up a server that handles subdomain API requests is not documented. If possible we should add ... caleb miles
03:07 PM Documentation #3711 (In Progress): crush-map.rst: choose firstn talks about "N", but does not cle...
John Wilkins
03:05 PM devops Documentation #2886 (In Progress): doc: crush location tricks, ceph.conf, automatic host=
John Wilkins
02:23 PM rbd Subtask #3741: krbd: rework request tracking code
I am leaving shortly for a few hours. In reviewing this
new code I find a few things that make it a little hard
ma...
Alex Elder
01:00 PM rbd Subtask #3741: krbd: rework request tracking code
I did some testing yesterday and found that I got I/O errors
while running xfstests. This was unexpected; I thought...
Alex Elder
01:43 PM Revision 797b3db3 (ceph): Added python wrapper to rados_cluster_stat
The new get_cluster_stats() method on the rados.Rados object calls
the rados_cluster_stat() function in the librados ...
Chris Glass
12:51 PM Bug #2533 (Duplicate): osd: watchers tracked by entity_name_t, not by cookie
Ian Colle
12:48 PM Feature #3769: osd: scrub should verify snap collection existence, membership
Written, just needs to be ported to Bobtail Ian Colle
09:40 AM Feature #3769 (In Progress): osd: scrub should verify snap collection existence, membership
Sage Weil
12:47 PM Bug #3736 (In Progress): kernel build: failures starting in 3.8-rc1
Ian Colle
12:02 PM Bug #3736: kernel build: failures starting in 3.8-rc1
The remaining issue is that the patch we apply to scripts/package/builddeb to build the perf tools is out of date. I... Anonymous
12:45 PM Bug #3702 (New): OSD SIGABRT during startup
Ian Colle
12:40 PM Bug #3617 (Resolved): Ceph doesn't support > 65536 PGs(?) and fails silently
Ian Colle
09:35 AM Bug #3617: Ceph doesn't support > 65536 PGs(?) and fails silently
How's the testing coming along, Sage? Greg Farnum
12:39 PM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
Believed fixed by patch to 3633
684d4ba242b26828bd7927860226bfc8a0cfcc2b
Ian Colle
12:35 PM Bug #3650 (Can't reproduce): osd: crash in Reset state -> start_peering_interval -> on_change -> ...
Looked into the core dump, can't see how this happened. Samuel Just
12:30 PM Bug #3591 (Closed): auth: could not find secret_id=0
Ian Colle
12:30 PM Bug #3591 (Resolved): auth: could not find secret_id=0
Resolved by Sage's fix above. Ian Colle
12:29 PM Bug #3563 (Closed): osd crashed with error "auth: could not find secret_id=2"
Ian Colle
12:29 PM Bug #3563 (Resolved): osd crashed with error "auth: could not find secret_id=2"
Resolved by fix to 3591 Ian Colle
12:20 PM Bug #3467 (Closed): osd: bad state machine event in start_recoverY_ops
Ian Colle
12:20 PM Bug #3467 (Won't Fix): osd: bad state machine event in start_recoverY_ops
If encountered, restart OSD. Ian Colle
12:13 PM Bug #3300: ceph::buffer::end_of_buffer isn't caught
Josh - Is this just a case where the documentation needs to be updated? Ian Colle
11:46 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
The same issue exists with the debian packages. We have an explicit dependency on python, but not on perl. I don't ... Anonymous
10:55 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
Can we check to ensure perl is not used elsewhere?
Are there guidelines that are provided to the developers that spe...
Anonymous
10:06 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
I hate to see a dependency like perl get added for a one-liner perl regex. Is this the only place perl is used? Can ... Sam Lang
09:43 AM Bug #3768: perl is required for logrotate, we need to include Perl as a dependency
backport to bobtail Ian Colle
11:26 AM Tasks #3779 (Resolved): update osd config ref as appropriate
I'm not sure what our update policies on the docs are, but the defaults named in http://ceph.com/docs/master/rados/co... Greg Farnum
11:11 AM rgw Cleanup #3777 (Resolved): rgw: audit code for reading NULL env variables
Similar to the issue that triggered #3735 Yehuda Sadeh
10:25 AM Bug #3647 (Can't reproduce): forgot the auth options for Cephx and added them later: Get msg: 7f...
Sage Weil
10:19 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Ian Colle
10:19 AM rgw Bug #3735 (Resolved): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Ian Colle
10:00 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
commit:e1da85f286838cdd3a6329840cec748c6a11fd26 Sage Weil
09:57 AM Bug #3747: PGs stuck in active+remapped
Sage Weil wrote:
> commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854
oops, wrong bug!
Sage Weil
09:45 AM Bug #3747 (Resolved): PGs stuck in active+remapped
commit:f83fcf63a928fdb8ab4d604bdce596c0c4afd854 Sage Weil
09:55 AM CephFS Feature #3621 (Closed): qa: add knfsd reexport tests to qa suite
Ian Colle
09:52 AM CephFS Feature #3621: qa: add knfsd reexport tests to qa suite
commit:aaa03bbcd2549a38f962a61fc63be16cca3a6d90 in teuthology.git Sage Weil
09:34 AM Bug #3776 (Resolved): Need doc describing how to alter our log rotation
If a user has a small to moderate size of root disk, they will probably have to modify the log rotation process for c... Anonymous
09:32 AM Bug #3661 (Resolved): mon: idle/empty osds marked down after 15 min
Sage Weil
08:34 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
Sam,
That is a cool idea. I will open a doc bug for that. Providing instructions for those with smaller root dri...
Anonymous
06:32 AM Feature #3775: log: stop logging in statfs reports usage above some threshold
The easiest solution for this might be to adjust the default logrotate script (src/logrotate.conf) to use the size pa... Sam Lang
03:52 AM Revision 59aad347 (ceph): configure.ac: check for org.junit.rules.ExternalResource
Check for org.junit.rules.ExternalResource if built with
--enable-cephfs-java and --with-debug. Checking for junit4
i...
Danny Al-Gaaf
01:13 AM Revision 12af11a1 (ceph): src/java/Makefile.am: fix default java dir
Fix default javadir in src/java/Makefile.am to $(datadir)/java
since this is the common data dir for java files.
Sig...
Danny Al-Gaaf
01:13 AM Revision 9b167b46 (ceph): ceph.spec.in: fix handling of java files
Fix handling of JAVA (jar) files. Don't move the files around in the install
section since the related Makefile is fi...
Danny Al-Gaaf
01:13 AM Revision f027d025 (ceph): ceph.spec.in: rename libcephfs-java package to cephfs-java
Rename the libcephfs-java package to cephfs-java since the package
contains no (classic) library and RPMLINT complain...
Danny Al-Gaaf
01:13 AM Revision d8c4fc5e (ceph): ceph.spec.in: fix libcephfs-jni package name
Rename libcephfs-jni to libcephfs_jni1 to reflect the SO name/version of
the library and to prevent RPMLINT to compla...
Danny Al-Gaaf
01:13 AM Revision aedbb97f (ceph): configure.ac: remove AC_PROG_RANLIB
Remove already commented-out AC_PROG_RANLIB to get rid of warning:
libtoolize: `AC_PROG_RANLIB' is rendered obsolete b...
Danny Al-Gaaf
01:13 AM Revision 61437ee2 (ceph): configure.ac: change junit4 handling
Change handling of --with-debug and junit4. Add a new conditional HAVE_JUNIT4
to be able to build ceph-test package a...
Danny Al-Gaaf
12:11 AM Revision 00898c18 (ceph): rbd: allow copy of zero-length images. Includes simple test.
Fixes: #3765
Signed-off-by: Dan Mick <dan.mick@inktank.com>
Dan Mick
12:10 AM Revision 1c3d6840 (ceph): doc/install/debian.rst: fix typo in link ref; broke doc build
Signed-off-by: Dan Mick <dan.mick@inktank.com> Dan Mick

01/09/2013

11:11 PM Revision 133e4e34 (ceph): Merge branch 'next'
Want to get various rbd-related fixes together for upgrade testing Dan Mick
10:40 PM Revision 48f13946 (ceph): ReplicatedPG: increment scrubber.errors rather than errors
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
05:37 PM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
commit:5b12b514b047a8a46cc5549bd94b398289b9b5f6 Sage Weil
05:08 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
I'm calling this fixed, then. Dan Mick
04:54 PM rbd Bug #3766: rbd resize command fails on a mixed node cluster when it is a copied rbd image and whe...
This works fine on the master branch that has a fix for it :
ceph version 0.56-193-g00898c1 (00898c1860e8ae95b52192...
Tamilarasi muthamizhan
01:44 PM rbd Bug #3766 (Need More Info): rbd resize command fails on a mixed node cluster when it is a copied ...
I think this might be e1776809031c6dad441cfb2b9fac9612720b9083, which is still in next. Can you try an rbd client fr... Dan Mick
04:35 PM Feature #3775: log: stop logging in statfs reports usage above some threshold
Deb Barba <deb.barba@inktank.com>
3:13 PM (1 hour ago)
to Dan
so, as I explained in chat.
i am again seeing ...
Anonymous
04:34 PM Feature #3775 (New): log: stop logging in statfs reports usage above some threshold
Add a 'log stop on utilization = .95' option that will make the log code print one last line like
--- suspending l...
Anonymous
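The proposed option amounts to a utilization check against a threshold. A minimal Python sketch of that check follows, assuming utilization is used bytes divided by total bytes and that logging stops at or above the threshold. The function names are hypothetical, and `shutil.disk_usage` stands in for the statfs call the feature title refers to.

```python
import shutil


def utilization(used_bytes: int, total_bytes: int) -> float:
    """Fraction of the filesystem in use (0.0 for an empty/unknown total)."""
    return used_bytes / total_bytes if total_bytes else 0.0


def should_suspend_logging(used_bytes: int, total_bytes: int,
                           threshold: float = 0.95) -> bool:
    """True once utilization reaches the configured threshold."""
    return utilization(used_bytes, total_bytes) >= threshold


def log_dir_is_full(path: str = "/", threshold: float = 0.95) -> bool:
    """Check the filesystem holding `path` (hypothetical helper)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return should_suspend_logging(usage.used, usage.total, threshold)
```

Separating the pure comparison from the filesystem query keeps the threshold logic easy to test without touching a real disk.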
04:31 PM Bug #3774 (Resolved): osd: 'ceph osd scrub' and 'ceph pg scrub' are poorly scheduled
These should get put at the top of the scrub queue in a way that still honors all the scheduling.
The problem is t...
Sage Weil
04:27 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
Dan Mick
04:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
Fixed, test added, in master:
commit:00898c1860e8ae95b5219257d1635b15ccdce5c1
Dan Mick
11:44 AM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
Dan Mick
02:58 PM CephFS Bug #3773 (Can't reproduce): mds crashed at LogEvent::decode
ceph version: 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
I had a cluster [burnupi06, burnupi07, burnupi08] ...
Tamilarasi muthamizhan
02:32 PM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
I believe this to have been fixed by the fix for #3744. Dan Mick
01:47 PM rbd Bug #3753: rbd copy command reports error even though copy is successful on a mixed node cluster
Tamil, does this still happen with the fix in wip-no-cls-lock (and now in next) for 3744? Dan Mick
02:14 PM Bug #3772 (Can't reproduce): osd: osd_disk_threads = 5 seems to hang recovery
reported on IRC, should be easy to reproduce.
we may want to change the default to 2 in order to avoid hiding thes...
Samuel Just
01:51 PM rbd Bug #3697 (Can't reproduce): rbd copy.sh test failing in nightly
unable to reproduce so far Dan Mick
12:05 PM CephFS Feature #3570 (In Progress): teuthology: mds thrasher
Sam Lang
11:47 AM rbd Feature #2256: rbd: parallelize deletions
Dan Mick
11:46 AM rbd Feature #2297: ObjectCacher: mark buffers mergeable for ksm
Dan Mick
11:46 AM rbd Bug #3518: rbd import file --format 2 creates an image named '--format'
Dan Mick
11:46 AM rbd Feature #3635: rbd cli: call "udevadm settle" after use of add/remove kernel interface
Dan Mick
11:42 AM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
commit:4483285c9fb16f09986e2e48b855cd3db869e33c in next Dan Mick
11:28 AM Bug #3771: ceph does not have startup scripts in Centos
Gary found that the installation script was commented out 2011-10-17
> commit 9baf5ef4f35c38d7fbaa70bde8f2c9383b2f...
Anonymous
11:13 AM Bug #3771 (Resolved): ceph does not have startup scripts in Centos
I did a basic Ceph v0.56 installation on CentOS 6.3
I have rebooted my nodes, and find that ceph is not starting up a...
Anonymous
10:58 AM CephFS Bug #3681: kclient fsx fails nightly
Proposed fix to set i_size before the setattr request:
This will resolve the above issue, because the cap flush on...
Sam Lang
09:59 AM Bug #3683 (Can't reproduce): mon: leak of MMonPaxos
Joao Eduardo Luis
09:58 AM Bug #3683: mon: leak of MMonPaxos
I can't for the life of me reproduce this leak. In the meantime, Sage submitted a patch to msg/Pipe.cc [1] tha... Joao Eduardo Luis
07:17 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
I've been unable to reproduce this bug, but the cause was pretty obvious, so I pushed a fix that should deal with thi... Joao Eduardo Luis
03:39 AM Revision 62e721a9 (ceph): librados: add aio stat tests
Implement simple write-stat test, and a write-stat-remove-stat test cycle.
Signed-off-by: Filippos Giannakos <philip...
Filippos Giannakos
03:38 AM Revision 879578c1 (ceph): librados: implement aio_stat
Implement aio stat and also export this functionality to the C API.
Signed-off-by: Filippos Giannakos <philipgian@gr...
Filippos Giannakos
02:32 AM Revision 5b12b514 (ceph): osd: make missing head non-fatal during scrub
If we encounter a scrub without a preceding head, warn instead of
crashing. Note that this is still something we ca...
Sage Weil
02:29 AM Revision e1da85f2 (ceph): rgw: Fix crash when FastCGI frontend doesn't set SCRIPT_URI
Fixes: #3735
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Sylvain Munaut
02:28 AM Revision eba314a8 (ceph): rgw: fix handler leak in handle_request
Fixes: #3682
Signed-off-by: caleb miles <caleb.miles@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
caleb miles
02:25 AM Revision 4483285c (ceph): librbd: Allow get_lock_info to fail
If the lock class isn't present, EOPNOTSUPP is returned for lock calls
on newer OSDs, but sadly EIO on older; we need...
Dan Mick
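The commit message above says the same condition (lock class absent) surfaces as two different error codes depending on OSD age. A hedged Python sketch of tolerating both is shown below. It is an illustration only, not the librbd C++ change, and `lock_info_or_none` is a hypothetical helper that takes any callable raising `OSError`.

```python
import errno

# Per the commit message: EOPNOTSUPP on newer OSDs, EIO on older ones.
LOCK_CLASS_MISSING = {errno.EOPNOTSUPP, errno.EIO}


def lock_info_or_none(get_lock_info):
    """Call get_lock_info(); return None if the lock class looks unavailable."""
    try:
        return get_lock_info()
    except OSError as exc:
        if exc.errno in LOCK_CLASS_MISSING:
            return None  # treat as "no lock information", not a hard failure
        raise  # anything else is still an error
```

One trade-off worth noting: accepting EIO this way can also hide a genuine I/O error, which is presumably why the commit message calls the older behaviour unfortunate.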
02:21 AM Revision 77ddf276 (ceph): doc/release-notes: v0.48.3argonaut
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
12:23 AM Bug #3770 (Resolved): OSD crashes on boot
One of my 0.56.1 OSDs crashed and couldn't boot: it was reaching tp_op heartbeats, and even after increasing that I w... Faidon Liambotis

01/08/2013

10:21 PM Feature #3769 (Resolved): osd: scrub should verify snap collection existence, membership
and, hopefully, backport this to argonaut Sage Weil
09:39 PM Feature #3651 (In Progress): osd: deep scrub should hash omap
David Zafman
07:57 PM Revision 573f5315 (ceph): marginal/multiclient: Matching tests for kclient
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
07:54 PM Revision 14385a66 (ceph): marginal/multiclient: Add three client cluster
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
07:51 PM Revision a4df5238 (ceph): marginal/multiclient: Adding ior test to marginal
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:36 PM Revision 1e03fe18 (ceph): marginal/multiclient: Add a test for fsx-mpi
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:23 PM Revision c07a4cb6 (ceph): marginal/multiclient: New task to run mdtest
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
06:11 PM Revision f17847e5 (ceph): task/kclient: chmod root to 1777.
Signed-off-by: Greg Farnum <greg@inktank.com> Greg Farnum
05:27 PM rbd Bug #3765: rbd cp of a zero sized image succeeds with error
I looked into this; it happens because clip_io() (called from read_iterate()) tries to validate
that writing at offs...
Dan Mick
03:23 PM rbd Bug #3765 (Resolved): rbd cp of a zero sized image succeeds with error
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
while I am still surprised about why we ne...
Tamilarasi muthamizhan
04:45 PM Bug #3768 (Resolved): perl is required for logrotate, we need to include Perl as a dependency
logrotate for ceph (/etc/logrotate.d/ceph) uses perl commands
if perl is not installed, logrotate fails
if logrotat...
Anonymous
04:29 PM CephFS Bug #3597: ceph-fuse: denying root access
Is root actually a member of the fuse group? If not that would be correct behavior. Greg Farnum
04:07 PM Revision f8958463 (ceph): task/mpi: Allow working directory to be specified
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:46 PM rbd Bug #3766 (Resolved): rbd resize command fails on a mixed node cluster when it is a copied rbd im...
ubuntu@burnupi24:/var/log/ceph$ ceph -v
ceph version 0.56-131-gd283abd (d283abdf50b1e4429b775680bfae1bb20c75306b)
...
Tamilarasi muthamizhan
03:42 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
I think so.
But first let's verify it passes.
Sage Weil
12:43 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
Should we revert that teuthology commit, then? Greg Farnum
12:31 PM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
There was a bug in the kernel for O_CREAT permission checking for non-root users. It's fixed in the testing branch. ... Sage Weil
10:49 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
This is weird. Tamil says this one has never passed, but we can both run it locally fine and it passes in the ceph-fu... Greg Farnum
09:39 AM Bug #3752: fsync-tester script need to be fixed to run in the nightlies
I made a change to the cfuse task to chmod 1777 the ceph root dir after it's mounted. I think we should do the same f... Sam Lang
09:21 AM Bug #3752 (Resolved): fsync-tester script need to be fixed to run in the nightlies
log: ubuntu@teuthology:/a/teuthology-2013-01-05_22:28:52-regression-next-testing-basic/35949
35949: (190s) collect...
Tamilarasi muthamizhan
03:34 PM Revision 16248121 (ceph): task: A task to setup mpi
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:33 PM Revision e88c0fc8 (ceph): task/ceph-fuse: chmod root to 1777
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:32 PM Revision 4ed20ae8 (ceph): task/pexec: Add barrier capability
This patch adds the ability to barrier between
parallel exec tasks so that all tasks will perform
the following step ...
Sam Lang
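The barrier idea described in this commit can be sketched in Python with `threading.Barrier`: no worker starts its second step until every worker has finished its first. This is a generic illustration, not the teuthology pexec code; `run_with_barrier` and the step names are hypothetical.

```python
import threading


def run_with_barrier(n_workers: int):
    """Run two steps per worker, with a barrier between the steps."""
    barrier = threading.Barrier(n_workers)
    events = []
    events_lock = threading.Lock()

    def worker(idx: int) -> None:
        with events_lock:
            events.append(("step1", idx))
        # Block here until all n_workers threads have finished step 1.
        barrier.wait()
        with events_lock:
            events.append(("step2", idx))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

In the returned list, every `step1` record precedes every `step2` record, regardless of how the threads are scheduled.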
03:31 PM Revision 35320083 (ceph): task/pexec: More fixes for all case, exec on hosts
We don't want to do an exec per role, but per-host. We
were already doing an exec per host, but the names were confu...
Sam Lang
03:29 PM Revision 081a80f8 (ceph): task/pexec: Fix when 'all' is used
Signed-off-by: Sam Lang <sam.lang@inktank.com> Sam Lang
03:25 PM Revision d44fb147 (ceph): radosgw-admin.py: Increase test coverage to current admin feature set.
Signed-off-by: caleb miles <caleb.miles@inktank.com> caleb miles
12:58 PM Feature #3760: osd: maintain checksum on collection contents
It wasn't clear to me from the description, but we are of course talking about maintaining in the HashIndex a checksu... Greg Farnum
12:13 PM Feature #3760 (Rejected): osd: maintain checksum on collection contents
Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ... Samuel Just
12:33 PM RADOS Feature #3764 (New): osd: async replicas
The following is more a topic for conversation than a feature:
Currently, latency on any operation is limited by t...
Samuel Just
12:23 PM rbd Feature #3763 (Resolved): krbd: handle flattening of mapped image
An rbd client receives notice if the snapshot context for
a mapped rbd image has changed. It is possible for the
s...
Alex Elder
12:19 PM Linux kernel client Bug #3762 (Duplicate): kernel osd client: verify support for multiple ops per request
In order to support layered rbd images, the osd client needs
to support multiple ops in a single osd request.
Loo...
Alex Elder
12:15 PM rbd Feature #3761 (Resolved): kernel messenger: need to support multiple ops per request
The kernel messenger currently gets message data from either
a bio list or a page vector. That is one or the other,...
Alex Elder
12:13 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
Samuel Just
12:11 PM Bug #3759 (Duplicate): osd: maintain checksum on collection contents
Currently, there is no way for an OSD to detect erroneously missing objects in a pg collection. A scrub, therefore, ... Samuel Just
12:08 PM rbd Tasks #2853: krbd: read path
This task depends on the completion of the following others
before it can be completed:
3741 krbd: rework request ...
Alex Elder
12:07 PM Feature #3758 (Rejected): osd: incremental object checksumming
Currently, scrub can only compare the checksums between replicas. If an inconsistency is found between two replicas,... Samuel Just
12:07 PM rbd Subtask #2854: krbd: write path
Work on this won't really begin until the read path work
has completed (http://tracker.newdream.net/issues/2853).
Alex Elder
12:06 PM rbd Subtask #2854: krbd: write path
OK, I'm going to interpret this as:
Any write operation on a layered image will be preceded
by an existence c...
Alex Elder
12:04 PM CephFS Feature #626 (Closed): qa: add IOR, rompio, or other parallel workloads suite
Added tests to the _marginal_ qa suite that run IOR, mdtest, and fsx-mpi. Sam Lang
11:48 AM Feature #3756 (Duplicate): Watch/Notify cleanup
Samuel Just
11:41 AM Feature #3756 (Duplicate): Watch/Notify cleanup
The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this...
Samuel Just
11:47 AM Feature #3757 (Resolved): osd: Watch/Notify cleanup
The current design is rather fragile particularly with respect to the locking and ref counting.
The result of this...
Samuel Just
11:24 AM Bug #3744: librbd: need to handle older OSDs that don't have cls_lock
Actually, rados lock list should continue to fail. Dan Mick
11:10 AM Documentation #3322: doc: Explain multi-tenant CephFS
Where is this located? I wasn't able to find it. Greg Farnum
11:00 AM rbd Tasks #3755 (Resolved): krbd: use new request tracking code for sync object operations
The last request type still using the old request tracking code
is for handling synchronous operations. There are t...
Alex Elder
10:58 AM rbd Feature #3754 (Closed): krbd: use new request tracking code for notify ack
Two request types remain that still use the old request
tracking mechanism. One of them is sending acknowledgements...
Alex Elder
09:54 AM rbd Bug #3753 (Resolved): rbd copy command reports error even though copy is successful on a mixed no...
ceph version 0.56.1 (e4a541624df62ef353e754391cbbb707f54b16f7)
On a mixed node cluster running argonaut[burnupi21,...
Tamilarasi muthamizhan
09:39 AM CephFS Feature #3543: mds: new encoding
I'm going to get started on this (mostly just figuring out current state, probably) today. Greg Farnum
09:28 AM Bug #3695: monitor crashed after an upgrade in Monitor::timecheck
Joao Eduardo Luis
06:54 AM Bug #3695 (In Progress): monitor crashed after an upgrade in Monitor::timecheck
Joao Eduardo Luis
08:47 AM Linux kernel client Bug #3751: krbd: fix type of snap_id local variable
I have a fix for this and I'll post it for review later
today....
Alex Elder
08:47 AM Linux kernel client Bug #3751 (Resolved): krbd: fix type of snap_id local variable
The type of the snap_id local variable in rbd_dev_v2_snap_info()
is defined with the wrong byte order.
Alex Elder
06:43 AM Bug #3748: ceph osd dump --format=json includes non-JSON line
One other option would be to provide "standard" fields for status output when using json, regardless of any other exp... Joao Eduardo Luis
05:08 AM Revision 920f82e8 (ceph): v0.48.3argonaut
Gary Lowell
04:51 AM Bug #3750 (Resolved): Possible Ceph 5-minute quick start guide typo
I believe that the Ceph quick start guide should specify
@sudo service ceph -a start@
instead of the current
@...
caleb miles
04:51 AM Revision f07921be (ceph): doc/install: new URLs for argonaut vs bobtail
Also restructure the document a bit to make the choice of packages more
clear.
Signed-off-by: Sage Weil <sage@inktan...
Sage Weil
04:46 AM Revision 72674ad4 (ceph): doc/release-notes: v0.56.1
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
03:40 AM Bug #3747: PGs stuck in active+remapped
I did a "ceph osd out 0; sleep 30; ceph osd in 0" and out of those 61 active+remapped pgs, 5 went into active+remappe... Faidon Liambotis
12:14 AM Revision 1b194b25 (ceph): Merge branch 'wip-stripe-gran'
Reviewed-by: Greg Farnum <greg@inktank.com> Noah Watkins

01/07/2013

11:50 PM Revision 26e8438a (ceph): test: enforce -ENOTCONN contract in libcephfs
Tests all relevant calls for -ENOTCONN when used with an unmounted
ceph_mount_info param.
Signed-off-by: Noah Watkin...
Noah Watkins
11:49 PM Revision 5c58aa96 (ceph): libcephfs: return -ENOTCONN when call unmounted
Adds -ENOTCONN return value for stat, fchmod, fchown, lchown.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
11:16 PM Revision f83fcf63 (ceph): PG: set DEGRADED in Active AdvMap handler based on pool size
Otherwise, if the acting set does not change, the pg might
not show up as degraded if the pool size now exceeds the
a...
Samuel Just
11:04 PM Revision c4121093 (ceph): libcephfs: clarify interface return value
Document that ceph_get_stripe_unit_granularity may return an error code
(e.g. -ENOTCONN). The interface requires a mo...
Noah Watkins
09:33 PM Revision e4a54162 (ceph): v0.56.1
Gary Lowell
09:12 PM Revision c8f8c7e6 (ceph): Merge branch 'next'
Sage Weil
09:08 PM Revision 9aecacda (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim...
Sage Weil
09:08 PM Revision 299dbad4 (ceph): msgr: update Message envelope in encode, not write_message
Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi...
Sage Weil
09:08 PM Revision 35d2f583 (ceph): msg/Pipe: encode message inside pipe_lock
This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen...
Sage Weil
09:08 PM Revision 9b23f195 (ceph): msg/Pipe: associate sending msgs to con inside lock
Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes...
Sage Weil
09:08 PM Revision 6229b5a0 (ceph): msg/Pipe: fix msg leak in requeue_sent()
The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from comm...
Sage Weil
09:04 PM Revision 1b39b316 (ceph): Merge branch 'wip-3678-b' into next
Reviewed-by: Greg Farnum <greg@inktank.com>
Reviewed-by: Samuel Just <sam.just@inktank.com>
Sage Weil
09:02 PM Revision 40706afc (ceph): msgr: update Message envelope in encode, not write_message
Fill out the Message header, footer, and calculate CRCs during
encoding, not write_message(). This removes most modi...
Sage Weil
09:02 PM Revision d16ad926 (ceph): msg/Pipe: prepare Message data for wire under pipe_lock
We cannot trust the Message bufferlists or other structures to be
stable without pipe_lock, as another Pipe may claim...
Sage Weil
09:01 PM Revision 6a00ce0d (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc...
Sage Weil
08:58 PM Revision 62586884 (ceph): osdc/Objecter: fix linger_ops iterator invalidation on pool deletion
The call to check_linger_pool_dne() may unregister the linger request,
invalidating the iterator. To avoid this, inc...
Sage Weil
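The hazard named in the two Objecter revisions above is a generic one: an entry is removed from a container while that container is being iterated. The sketch below reproduces the pattern in Python rather than Ceph's C++. The names `linger_ops`, `check_pool_dne`, and the pool names are hypothetical stand-ins, and iterating over a snapshot of the keys is one common remedy, not necessarily the one the commit uses.

```python
def check_pool_dne(linger_ops, op_id, existing_pools):
    """Hypothetical stand-in: unregister the op if its pool no longer exists."""
    if linger_ops[op_id] not in existing_pools:
        del linger_ops[op_id]


def scan_linger_ops(linger_ops, existing_pools):
    """Check every registered op, tolerating removals during the scan."""
    # Iterate over a snapshot of the keys so that deleting entries
    # inside the loop cannot invalidate the iteration.
    for op_id in list(linger_ops):
        check_pool_dne(linger_ops, op_id, existing_pools)
    return linger_ops
```

Iterating over `linger_ops` directly would raise a `RuntimeError` in Python as soon as an entry is deleted; the C++ equivalent is undefined behavior rather than an exception.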
06:39 PM Revision 213e3559 (ceph): osd: fix race in do_recovery()
Verify that the PG is still RECOVERING or BACKFILL when we take the pg
lock in the recovery thread. This prevents a ...
Sage Weil
06:38 PM Revision e410d1a0 (ceph): ReplicatedPG: requeue waiting_for_ondisk in apply_and_flush_repops
Fixes: #3722
Signed-off-by: Samuel Just <sam.just@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Samuel Just
06:34 PM Revision 4c9f4c3c (ceph): ceph-fuse: rename ceph_ll_* to fuse_ll_*
To not conflict with future linuxbox pull for nfs-ganesha.
Signed-off-by: David Zafman <david.zafman@inktank.com>
Re...
David Zafman
04:04 PM CephFS Feature #3749 (Resolved): Remove forced synchronization from Java bindings
Remove "synchronized" keyword from native interface. This was originally added when we were seeing some pthread mutex... Noah Watkins
03:58 PM Bug #3748 (Resolved): ceph osd dump --format=json includes non-JSON line
ceph osd dump --format=json includes the non-JSON "dumped osdmap epoch N" at the top of the output, which of course b... Dan Mick
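Until output like this is fixed at the source, a consumer can defensively skip leading status lines before parsing. The following is a minimal sketch of that idea, not part of Ceph: the function name `parse_osd_dump` is hypothetical, and it assumes the JSON document starts on the first line beginning with `{` or `[`.

```python
import json


def parse_osd_dump(output):
    """Parse JSON from command output that may start with non-JSON status lines."""
    lines = output.splitlines()
    for i, line in enumerate(lines):
        # Assumption: the JSON document begins on the first line that
        # starts with an opening brace or bracket.
        if line.lstrip().startswith(("{", "[")):
            return json.loads("\n".join(lines[i:]))
    raise ValueError("no JSON document found in output")
```

For example, passing the text `dumped osdmap epoch 12` followed by a JSON object on the next line yields just the parsed object. The field names in any such sample are illustrative, not the real osdmap schema.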
03:42 PM Bug #3747 (Closed): PGs stuck in active+remapped
About a week ago I doubled the number of OSDs in my cluster from 24 to 48 and, in the same day, adjusted CRUSH's defa... Faidon Liambotis
03:35 PM rbd Subtask #2854: krbd: write path
rbd write path.. 'guard' in the sense that the write has a check to verify the object already exists. Sage Weil
03:22 PM rbd Subtask #2854: krbd: write path
Pretty sure this is about the rbd locking and fencing. Greg Farnum
03:11 PM rbd Subtask #2854: krbd: write path
I'm about to mark bug 3418 as a duplicate of this one.
I'm adding the following from that bug here first.
I did...
Alex Elder
03:11 PM rbd Subtask #2854: krbd: write path
I'm not sure what "guard writes" is supposed to mean.
But I'm going to interpret it as simply implementing the
writ...
Alex Elder
03:26 PM CephFS Bug #3746 (Rejected): kclient mmap doesn't zero past EOF
Error coming from fsx:
INFO:teuthology.orchestra.run.out:Mapped Write: non-zero data past EOF (0xb826) page offset...
Sam Lang
03:14 PM rbd Feature #3419 (Duplicate): krbd: copy-up on write to clone
This is a duplicate of http://tracker.newdream.net/issues/2855. Alex Elder
03:14 PM rbd Subtask #2855: krbd: copy-up on write to clone
I don't know how to change the one-line bug description or I
would.
I need some clarification about the intended ...
Alex Elder
03:12 PM rbd Feature #3418 (Duplicate): krbd: write path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2854. Alex Elder
03:07 PM rbd Feature #3417 (Duplicate): krbd: read path (layering)
This is a duplicate of http://tracker.newdream.net/issues/2853. Alex Elder
03:06 PM rbd Tasks #2853: krbd: read path
I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work o...
Alex Elder
03:05 PM rbd Feature #3416 (Duplicate): krbd: open parent on open
Marking this as a duplicate of http://tracker.newdream.net/issues/2852. Alex Elder
02:51 PM rbd Bug #3743: krbd: errors on submitted requests are ignored
If I could figure out how, I'd change the title of this
to say "krbd" rather than "rbd" to help make it clear
which...
Alex Elder
02:27 PM rbd Bug #3743 (Won't Fix): krbd: errors on submitted requests are ignored
When a Linux request comes down to the rbd driver via rbd_rq_fn(),
rbd_dev_do_request() is called after validating t...
Alex Elder
02:50 PM rbd Bug #3745 (Rejected): krbd: individual response errors are ignored
A Linux I/O request on an rbd image is broken into one or
more rbd requests, one request directed to each osd object...
Alex Elder
02:41 PM Bug #3744 (Resolved): librbd: need to handle older OSDs that don't have cls_lock
Older OSDs didn't have libcls_lock, and will fail lock operations; this means
virtually all rbd operations and rados...
Dan Mick
01:22 PM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
commit:e410d1a066b906cad3103a5bbfa5b4509be9ac37 Sage Weil
01:22 PM Bug #3736: kernel build: failures starting in 3.8-rc1
Sure enough, this is the commit that causes the problem:
af3df2c perf tools: Try to build Documentation when insta...
Alex Elder
11:48 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Looks like commit 6ca2a9c is the first one in that branch
that fails. It has a parent ce37f40 that succeeds.
I'v...
Alex Elder
10:24 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Heard back from Neil as well as Vlad Yasevich about my
proposed fix and they both ack'd it. Linus was in on
the di...
Alex Elder
09:07 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Despite a working build of the *kernel*, the package build
overall is still failing. It has something to do with bu...
Alex Elder
08:52 AM Bug #3736: kernel build: failures starting in 3.8-rc1
Neil Horman sent a response to my message and suggested
three possible alternatives to fix the underlying problem,
...
Alex Elder
05:42 AM Bug #3736: kernel build: failures starting in 3.8-rc1
I changed our config file, found in the git repository
autobuild-ceph in the file "kernel-config" in the way
descri...
Alex Elder
05:40 AM Bug #3736: kernel build: failures starting in 3.8-rc1
I'm retroactively updating this so a bit about what's been
done gets documented.
The problem was in the Kconfig f...
Alex Elder
05:35 AM Bug #3736 (Resolved): kernel build: failures starting in 3.8-rc1
Kernels as of version 3.8-rc1 are not properly building in
autobuilder. The initial symptom was that the config pha...
Alex Elder
01:16 PM Bug #3678 (Resolved): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
Sage Weil
01:16 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
commit:1b39b31678aea8c5bbdb38811b3919525228d10f Sage Weil
01:01 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
Sage Weil
12:19 PM CephFS Cleanup #3742 (Resolved): Remove old Hadoop wrappers and configuration options
I think it's likely that the current Hadoop shim is at least at feature parity with the old wrappers. Noah Watkins
12:16 PM Bug #3702: OSD SIGABRT during startup
Dan Mick wrote:
> Is this related to rbd, or should it be in category 'ceph'?
Ah, yes, it should. Thank you for c...
Justin Lott
11:31 AM Bug #3702: OSD SIGABRT during startup
Is this related to rbd, or should it be in category 'ceph'? Dan Mick
12:07 PM rbd Subtask #3741: krbd: rework request tracking code
... Alex Elder
11:54 AM rbd Subtask #3741 (Resolved): krbd: rework request tracking code
This is actually work that's mostly complete, but it never
got a bug assigned to it.
In order to handle layering ...
Alex Elder
11:26 AM Bug #3632 (Resolved): occasional testrados failure: process_8 exited with a signal
this is probably #3734, now fixed. Sage Weil
11:09 AM rbd Subtask #2852: krbd: open parent on open
This work is essentially done, and has been since
October 2012 (or even earlier). However I held off
posting it fo...
Alex Elder
11:00 AM Linux kernel client Bug #3740 (Resolved): ceph-client: change to be based on 3.8-rc2
Our current ceph-client tree is based on Linux 3.6.
That is fairly old code (late September, 2012). We
should upda...
Alex Elder
10:12 AM Feature #3739 (Resolved): osd: repair object size vs object_info_t mismatches
If the object_info_t size doesn't match the on-disk file/object size, we need to repair it. This means proposing a s... Sage Weil
10:02 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
Noah Watkins
10:02 AM CephFS Bug #3726 (Closed): Enforce Ceph's minimum stripe size in the java bindings
Noah Watkins
09:21 AM CephFS Bug #3738 (Resolved): kclient fsx truncate/write multi-client race

This bug is similar to #3681, but occurs only in the non-exclusive case (multiple clients), where a truncate doesn'...
Sam Lang
09:09 AM CephFS Bug #3681: kclient fsx fails nightly
The race here is between a truncate down, and completion of osd write ops triggering a cap flush. The exact order th... Sam Lang
06:30 AM rbd Bug #3737 (Resolved): Higher ping-latency observed in qemu with rbd_cache=true during disk-write
Hi Josh,
as per our short conversation in IRC-#ceph there is an issue with latency/responsiveness with rbd_cache e...
Oliver Francke
04:38 AM Revision 4cfc4903 (ceph): msg/Pipe: encode message inside pipe_lock
This modifies bufferlists in the Message struct, and it is possible
for multiple instances of the Pipe to get referen...
Sage Weil
04:38 AM Revision a058f161 (ceph): msg/Pipe: associate sending msgs to con inside lock
Associate a sending message with the connection inside the pipe_lock.
This way if a racing thread tries to steal thes...
Sage Weil
04:38 AM Revision 2a1eb466 (ceph): msg/Pipe: fix msg leak in requeue_sent()
The sent list owns a reference to each message.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil
04:18 AM rgw Bug #3735: rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
Here's the fix I used on my system to fix the problem. The S3 service is set at the root of the virtual server so "" ... Sylvain Munaut
03:07 AM rgw Bug #3735 (Closed): rgw: Crashes when using a fastCGI front end that doesn't set SCRIPT_URI
I'm using lighttpd as a FastCGI front end for radosgw, and it doesn't set the SCRIPT_URI environment variable.
So the ...
Sylvain Munaut

01/06/2013

10:50 PM Bug #3734 (Fix Under Review): osd/objecter: misdirected op in librados api tests
wip-3734 Sage Weil
10:41 PM Bug #3734: osd/objecter: misdirected op in librados api tests
epoch 328:... Sage Weil
10:15 PM Bug #3734 (Resolved): osd/objecter: misdirected op in librados api tests
... Sage Weil
03:10 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
this was #3731 Sage Weil
02:38 PM Bug #3722: osd: indefinitely hung request on stable cluster
Sage Weil
02:34 PM Bug #3678 (Fix Under Review): osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MN...
YAY, wip-3678 is consistently passing now. Sage Weil
05:37 AM Revision a10950f9 (ceph): os/FileJournal: include limits.h
Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit ce49968938ca3636f48fe5431...
Sage Weil
04:54 AM Revision ce499689 (ceph): os/FileJournal: include limits.h
Needed for IOV_MAX.
Signed-off-by: Sage Weil <sage@inktank.com>
Sage Weil

01/05/2013

09:32 PM Feature #3733 (Closed): osd: update leveldb submodule
Sage Weil
07:17 PM Revision e9efa332 (ceph): java: add stripe unit granularity tests
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:12 PM Revision ececcf57 (ceph): java: update javadoc comments
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:10 PM Revision cdd138da (ceph): java: fix whitespace
Signed-off-by: Noah Watkins <noahwatkins@gmail.com> Noah Watkins
07:08 PM Revision abcda95b (ceph): libcephfs: expose stripe unit granularity
Assists clients in choosing layout parameters.
Signed-off-by: Noah Watkins <noahwatkins@gmail.com>
Noah Watkins
07:08 PM Revision 6954bf33 (ceph): java: add support for get_stripe_unit_granularity
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Noah Watkins <noahwatkins@gmail.com>
Joe Buck
06:47 PM Documentation #3389 (In Progress): doc: crush docs could use a full example crushmap
John Wilkins
10:02 AM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Do we have a test that checks our interfaces to
automatically catch inadvertent protocol changes?
If not, we should.
Alex Elder
09:04 AM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
commit:988a52173522e9a410ba975a4e8b7c25c7801123 Sage Weil
09:04 AM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
commit:28d59d374b28629a230d36b93e60a8474c902aa5 Sage Weil
09:03 AM Bug #3698 (Resolved): filestore: ENOENT on clone
commit:e89b6ade63cdad315ab754789de24008cfe42b37 Sage Weil
08:27 AM Feature #3732 (Resolved): osd/mon: report recovery rate (bytes and objects per sec)
Report the rate of recovery (objects and bytes per second) via the monitor, presumably via 'ceph -w' and similar inte... Sage Weil
04:48 AM Revision 415294c0 (ceph): Merge branch 'next'
Sage Weil
04:47 AM Revision cd194ef3 (ceph): osd: special case CALL op to not have RD bit effects
In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ...
Sage Weil
04:47 AM Revision 921e06de (ceph): Revert "OSD: remove RD flag from CALL ops"
This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati...
Sage Weil
04:46 AM Revision 988a5217 (ceph): osd: special case CALL op to not have RD bit effects
In commit 20496b8d2b2c3779a771695c6f778abbdb66d92a we treat a CALL as
different from a normal "read", but we did not ...
Sage Weil
04:46 AM Revision d3abd0fe (ceph): Revert "OSD: remove RD flag from CALL ops"
This reverts commit 91e941aef9f55425cc12204146f26d79c444cfae.
We cannot change this op code without breaking compati...
Sage Weil
03:51 AM Revision 3a940874 (ceph): libcephfs: delete client after messenger shutdown
Prevents race between messages being dispatched to the client after the
client has been free'd.
Signed-off-by: Noah ...
Noah Watkins
02:02 AM Revision 0978dc49 (ceph): rbd: Don't call ProgressContext's finish() if there's an error.
do_copy was different from the others; call pc.fail() on error and
do not call pc.finish().
Fixes: #3729
Signed-off-...
Dan Mick

01/04/2013

09:45 PM Revision 7513e971 (ceph): ReplicatedPG: remove old-head optimization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:44 PM Revision e89b6ade (ceph): ReplicatedPG: remove old-head optimization from push_to_replica
This optimization allowed the primary to push a clone as a single push in the
case that the head object on the replic...
Samuel Just
09:37 PM Revision 6a3d475c (ceph): Merge remote branch 'origin/wip-rbd-watch'
Reviewed-by: Dan Mick <dan.mick@inktank.com> Josh Durgin
08:32 PM Revision cd5f2bfd (ceph): ObjectCacher: fix off-by-one error in split
This error left a completion that should have been attached
to the right BufferHead on the left BufferHead, which wou...
Josh Durgin
07:54 PM CephFS Bug #3666 (Resolved): Segfault running test_libcephfs
commit:3a9408742a8a6cbc870cba543a208285f1a6cec1 Sage Weil
03:25 PM CephFS Bug #3666: Segfault running test_libcephfs
I pushed a new wip-client-shutdown. This switches the clean-up order of client/messenger in libcephfs, rather than mo... Noah Watkins
01:36 PM CephFS Bug #3666: Segfault running test_libcephfs
Right, I think your fix will work, but it breaks the interface abstraction (messenger is created above the client, de... Sam Lang
01:16 PM CephFS Bug #3666: Segfault running test_libcephfs
This is what I'm running to reproduce the error. It's been running now for an hour on wip-client-shutdown without any... Noah Watkins
12:57 PM CephFS Bug #3666: Segfault running test_libcephfs
Rather than moving messenger shutdown into client shutdown? Noah Watkins
12:48 PM CephFS Bug #3666: Segfault running test_libcephfs
A similar issue was just handled in the ceph_fuse.cc code. There we just delay deleting the client till the end. Yo... Sam Lang
10:41 AM CephFS Bug #3666: Segfault running test_libcephfs
During unmount, the client is shut down and freed before the messenger. If any messages are delivered after the clien... Noah Watkins
07:07 PM Revision 802c486f (ceph): config: change default log_max_recent to 10,000
Commit c34e38bcdc0460219d19b21ca7a0554adf7f7f84 meant to do this but got
the wrong number of zeros.
Signed-off-by: S...
Sage Weil
06:18 PM Revision d6496abf (ceph): remove rbd_header_race test
This no longer works since export does not do a watch, and the race is
being closed a different way not detectable by...
Josh Durgin
06:16 PM Revision 620dd551 (ceph): task: mon_clock_skew_check.py: Check for clock skews on the monitors
Will run for as long as teuthology runs. By default, fails if any clock
skews higher than 0.05 seconds are detected, ...
Joao Eduardo Luis
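The check described in this entry can be sketched as a simple threshold comparison. This is an illustration only, not the teuthology task's code: the function name and the dict-of-skews input are assumptions, and the absolute value is used on the assumption that a monitor may report a skew with either sign.

```python
MAX_SKEW = 0.05  # seconds; the default threshold quoted in the entry above


def find_skewed_mons(skews, max_skew=MAX_SKEW):
    """Return the names of monitors whose skew magnitude exceeds max_skew."""
    # Compare magnitudes so that a negative skew is not silently accepted.
    return sorted(name for name, skew in skews.items() if abs(skew) > max_skew)
```

A caller would fail the run whenever the returned list is non-empty.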
06:11 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
commit:0978dc4963fe441fb67afecb074bc7b01798d59d Dan Mick
03:12 PM rbd Bug #3729 (Resolved): rbd cp command reports 100% completion even on failure
ceph version 0.56-109-gd8940d1 (d8940d15c330d05c8a198ff7dde16df748938b65)
when trying to copy rbd image to an alre...
Tamilarasi muthamizhan
06:06 PM Bug #3702: OSD SIGABRT during startup
Sage Weil wrote:
> Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else?
...
Justin Lott
09:42 AM Bug #3702 (Need More Info): OSD SIGABRT during startup
Sage Weil
05:54 PM Revision 1a878611 (ceph): regression: include nfs suite
Sage Weil
05:50 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
got msgr logs in ubuntu@teuthology:/a/sage-a3/34724, but the crash looked different from the earlier ones (whose logs... Sage Weil
05:40 PM Bug #3731 (Fix Under Review): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompati...
see wip-3731 Sage Weil
05:19 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Agreed. And let's make sure it's fixed for 0.56.1.
Sage Weil
05:15 PM Bug #3731: rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible protocol change
Discussed this with Dan and Sam and I think we just want to roll this patch back and tell people not to use v0.56 for... Greg Farnum
04:34 PM Bug #3731 (Resolved): rados.h: recent change to CEPH_OSD_OP_CALL constitutes an incompatible prot...
CEPH_OSD_OP_CALL changed to remove the CEPH_OSD_OP_MODE_RD bit in
91e941aef9f55425cc12204146f26d79c444cfae; however,...
Dan Mick
05:03 PM Revision e88b909a (ceph): task: ceph_manager: add 'get_mon_health' function
Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com> Joao Eduardo Luis
03:29 PM CephFS Feature #3730 (Closed): Support replication factor in Hadoop
In order to support per-file replication values in Hadoop we need to specify that a new file should be generated in a... Noah Watkins
02:38 PM rbd Bug #3642 (Resolved): librbd: watch is sent with assert version, which fails on resends
commit:6a3d475cf08eb3051e8cdbce10b17b53c92b9cb5 Josh Durgin
11:31 AM rbd Bug #3642 (Fix Under Review): librbd: watch is sent with assert version, which fails on resends
in branch wip-rbd-watch Josh Durgin
01:54 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
Also, name it something along the lines of get_stripe_granularity() and not .._min(imum)_ as that isn't entirely accu... Anonymous
01:40 PM CephFS Bug #3726: Enforce Ceph's minimum stripe size in the java bindings
After a discussion on jabber, the decision is to go with exposing a function call in libcephfs and then using that in... Anonymous
11:09 AM CephFS Bug #3726 (Resolved): Enforce Ceph's minimum stripe size in the java bindings
The Hadoop bindings are using the blocksize as the stripe size. If a block size is explicitly passed down, it ends up... Anonymous
01:00 PM CephFS Bug #3718: multi-client dbench gets stuck over NFS exported cephfs
Heads up, Zheng Yan's patches on the mds fix issues related to running multiclient dbench tests. Sam Lang
12:24 PM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hmm, okay. I wasn't real clear on the previous bugs so I'll need to look at it more if I end up taking this, but soun... Greg Farnum
11:46 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thin...
Sage Weil
11:35 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Hurray, it is. Nobody except the client looks at the trace_bl and setting that is the only thing set_trace() does. Ex... Greg Farnum
11:17 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Greg Farnum wrote:
> Am I reading it correctly that this is just going to be doing the config and wrapper work to no...
Sage Weil
09:01 AM CephFS Feature #3626: mds: debug mode to generate traceless replies to clients
Am I reading it correctly that this is just going to be doing the config and wrapper work to not call set_trace() in ... Greg Farnum
12:20 PM CephFS Feature #3543: mds: new encoding
Sage Weil
12:20 PM CephFS Feature #3728: mds: draft design for lookup by ino
Sage Weil
12:14 PM CephFS Feature #3728 (Resolved): mds: draft design for lookup by ino
Sage Weil
12:20 PM CephFS Feature #3570: teuthology: mds thrasher
Sage Weil
12:06 PM CephFS Feature #3727 (Resolved): mds: refactor EMetablob encoding paths
Right now, the EMetaBlob sub-structures — for performance reasons — use an encoding pattern that doesn't match anythi... Sage Weil
11:42 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
Greg Farnum wrote:
> I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should e...
Sage Weil
09:34 AM CephFS Cleanup #89: mds: put inode dirty fields in dirty_bits_t to reduce memory footprint
I briefly scanned the CInode and inode_t structs and it wasn't obvious to me what this should encompass. Are you talk... Greg Farnum
11:41 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
This was a whiteboard discussion 2 years ago. Nothing was written down. We should reopen new and more detailed issu... Sage Weil
09:29 AM CephFS Subtask #547: mds: define fsck strategy, required metadata
Where are the results of this bug? It's marked resolved but I don't see any fsck references in the git tree, and ther... Greg Farnum
11:39 AM Feature #685: libcephmon: interact with ceph monitors via a library
BTW it may make sense to push the client command stuff in the ceph tool into MonClient, and then wrap that in libceph... Sage Weil
11:38 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Greg Farnum wrote:
> Do we have a separate bug for the library calls this needs?
#685, which would take the clien...
Sage Weil
09:27 AM CephFS Cleanup #3677: libcephfs, mds: test creation/addition of data pools, create policy
Do we have a separate bug for the library calls this needs? Greg Farnum
11:36 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
Greg Farnum wrote:
> And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the...
Sage Weil
09:24 AM CephFS Feature #3244: qa: integrate Ganesha into teuthology testing to regularly exercise Ganesha CephFS...
And for this one as well: setting up Ganesha in teuthology, run tests against it? Not using the Ceph shim or anything... Greg Farnum
11:35 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Greg Farnum wrote:
> Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph moun...
Sage Weil
09:24 AM CephFS Feature #3243: qa: test samba reexport via libcephfs vfs plugin in teuthology
Is this a matter of setting up (via teuthology) a Samba server which sits on top of a Ceph mount and then running tes... Greg Farnum
11:34 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Greg Farnum wrote:
> Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a...
Sage Weil
09:22 AM CephFS Feature #3426: ceph-fuse: build/run on os x
Noah has done some work on this in the wip-osx branch; last I heard you could compile and get a cluster going with vs... Greg Farnum
11:32 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
Greg Farnum wrote:
> What all does this encompass? Design? Implementation? Does it need to be an online switch or ca...
Sage Weil
09:13 AM CephFS Feature #3542: mds: migration path for existing anchors, anchortables, etc.
What all does this encompass? Design? Implementation? Does it need to be an online switch or can it be an offline job? Greg Farnum
11:30 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Greg Farnum wrote:
> Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect ...
Sage Weil
09:12 AM CephFS Feature #3541: mds: robust ino lookup using file backpointers
Is this bug supposed to encompass the anchor table replacement work as well? I wouldn't expect so, but the presence o... Greg Farnum
11:23 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
Josh Durgin
10:32 AM rbd Bug #3725 (Resolved): rbd_header_race script to be fixed in the nightlies
log: ubuntu@teuthology:/a.old/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33734... Tamilarasi muthamizhan
11:23 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Greg Farnum wrote:
> Do we have any kind of design for this? We've talked about it some and it's conceptually simple...
Sage Weil
09:08 AM CephFS Feature #3540: mds: maintain per-file backpointers on first file object
Do we have any kind of design for this? We've talked about it some and it's conceptually simple, but splitting up the... Greg Farnum
11:15 AM CephFS Feature #626 (In Progress): qa: add IOR, rompio, or other parallel workloads suite
Yeah, that's what slang's working on to enable this. Assigning this to him. Sage Weil
08:57 AM CephFS Feature #626: qa: add IOR, rompio, or other parallel workloads suite
SamL has done some work on getting MPI going under teuthology, and on running some multi-client FS tests. I'm not sur... Greg Farnum
11:14 AM Bug #3722: osd: indefinitely hung request on stable cluster
the trigger is a brief osd reset due to an intermittent network outage. no actual ceph-osd daemons restart.
<pr...
Sage Weil
09:39 AM Bug #3722 (Need More Info): osd: indefinitely hung request on stable cluster
Sage Weil
08:36 AM Bug #3722 (Resolved): osd: indefinitely hung request on stable cluster
0.48.2argonaut, rbd workload.
occasional requests are blocked indefinitely.
*may* be osd down/up cycles (due to...
Sage Weil
11:13 AM CephFS Feature #3621 (Resolved): qa: add knfsd reexport tests to qa suite
Sage Weil
10:53 AM Bug #3723: ceph osd down command reports incorrectly
similarly for "ceph osd in" command as well
ubuntu@burnupi06:/etc/ceph$ sudo ceph osd in 2 -k /etc/ceph/ceph.key...
Tamilarasi muthamizhan
09:33 AM Bug #3723 (Can't reproduce): ceph osd down command reports incorrectly
issuing the command: "sudo ceph osd down 2" reports osd.2 is already down but sudo ceph osd stat reports all are up.
...
Ken Franklin
10:21 AM Bug #3698 (In Progress): filestore: ENOENT on clone
Sage Weil
09:43 AM Bug #3699 (Resolved): osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
commit:4ae4dce5c5bb547c1ff54d07c8b70d287490cae9 Sage Weil
09:43 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
Oh, those are librados specific numbers, aren't they. So this bug is to create and expose a libceph version, then. Wh... Greg Farnum
09:35 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
In libcephfs there is a call to get Ceph version (yes, just expose this). But, I recall Sage mentioning that it might... Noah Watkins
09:19 AM CephFS Feature #3399: java: add accessor to Ceph version numbers
This is just exposing the librados version() function to Java, right? Greg Farnum
09:41 AM rgw Bug #3724 (Resolved): docs refer to non-implemented features of the radosgw-admin rest api
The only radosgw-admin API calls currently are *get usage* and *trim usage*. The docs at
http://ceph.com/doc...
caleb miles
09:41 AM CephFS Cleanup #660: mds: use helpers in mknod, mkdir, openc paths
What kind of helpers are you talking about with this? inode fetchers and lock grabbers? In a quick scan over handle_c... Greg Farnum
09:36 AM CephFS Feature #603: mds: repair directory hierarchy
This is part of #82 fsck, right? Do we have a more detailed algorithm anywhere? Greg Farnum
05:02 AM Revision 39a734fb (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
04:17 AM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> Nat, you should be able to install either of libtcmalloc-minimal or libgoogle-perftools — are...
Nat Makarevitch
03:40 AM Revision c63c6646 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
03:00 AM Revision acfa0c9a (ceph): mds: optimize C_MDC_RetryOpenRemoteIno
When opening remote inode, C_MDC_RetryOpenRemoteIno is used as onfinish
context for discovering remote inode. When it...
Yan, Zheng
02:45 AM Revision b03eab22 (ceph): mds: forbid creating file in deleted directory
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 59953257 (ceph): mds: keep dentry lock in sync state as much as possible
Unlike locks of other types, a dentry lock in an unreadable state can block path
traverse, so it should be in sync state a...
Yan, Zheng
02:45 AM Revision f9280cb6 (ceph): mds: fix replica state for LOCK_MIX_LOCK
LOCK_MIX_LOCK state is for gathering local locks and caps, so replica state
should be LOCK_MIX.
Signed-off-by: Yan, ...
Yan, Zheng
02:45 AM Revision 248e4ab8 (ceph): mds: fix cap mask for ifile lock
ifile lock has 8 cap bits, so its cap mask should be 0xff
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Yan, Zheng
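The arithmetic behind the mask value is that a field of n bits is covered by a mask of n low-order ones. A small Python check of that relationship follows; `cap_mask` is a hypothetical helper for illustration and does not appear in the MDS code.

```python
def cap_mask(bits):
    """Return a mask with the lowest `bits` bits set."""
    return (1 << bits) - 1


# Eight cap bits need a mask of eight ones: 0b11111111 == 0xff.
assert cap_mask(8) == 0xff
```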
02:45 AM Revision 420f3355 (ceph): mds: rdlock prepended dest trace when handling rename
rdlock prepended dest trace to prevent them from being xlocked by
someone else.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision ea2fd127 (ceph): mds: check null context in CDir::fetch()
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
02:45 AM Revision 3705c7ca (ceph): mds: drop locks when opening remote dentry
Opening a remote dentry while holding locks may cause a deadlock. For example,
'discover' is blocked by an xlocked dentry...
Yan, Zheng
02:45 AM Revision ca4dc4db (ceph): mds: check if stray dentry is needed
The necessity of stray dentry can change before the request acquires
all locks.
Signed-off-by: Yan, Zheng <zheng.z.y...
Yan, Zheng
02:45 AM Revision acbe6d97 (ceph): mds: don't issue caps while inode is exporting caps
If we issue caps while the inode is exporting caps, the client will drop the
caps soon when it receives the CAP_OP_EXPORT me...
Yan, Zheng
02:45 AM Revision d379ac8e (ceph): mds: disable concurrent remote locking
Current code allows multiple MDRequests to concurrently acquire a
remote lock. But a lock ACK message wakes all reque...
Yan, Zheng
01:15 AM Revision 28d59d37 (ceph): os/FileStore: fix non-btrfs op_seq commit order
The op_seq file is the starting point for journal replay. For stable btrfs
commit mode, which is using a snapshot as...
Sage Weil
12:23 AM Revision 49416619 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
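The difference between signalling one waiter and broadcasting can be sketched with Python's threading.Condition. This is an illustration of the general technique only, not the Ceph log code: the function wake_waiters and its state dictionary are hypothetical names, and the waiters here stand in loosely for the flusher and logging threads the commit message mentions.

```python
import threading


def wake_waiters(num_waiters: int, timeout: float = 2.0) -> int:
    """Start num_waiters threads on one condition and broadcast to all of them.

    Returns how many waiters observed the wake-up (hypothetical helper).
    """
    cond = threading.Condition()
    state = {"ready": False, "woken": 0}

    def waiter() -> None:
        with cond:
            # wait_for re-checks the predicate, so a waiter that starts
            # after the broadcast still proceeds instead of blocking.
            if cond.wait_for(lambda: state["ready"], timeout=timeout):
                state["woken"] += 1  # updated while holding the lock

    threads = [threading.Thread(target=waiter) for _ in range(num_waiters)]
    for t in threads:
        t.start()

    with cond:
        state["ready"] = True
        # Broadcast: cond.notify() here would wake at most one waiter.
        cond.notify_all()

    for t in threads:
        t.join()
    return state["woken"]
```

With a single condition shared by waiters of different kinds, a one-waiter signal can be consumed by a thread that cannot make progress, so broadcasting and letting each waiter re-check its own predicate is the usual remedy.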
12:13 AM Revision f1e0305f (ceph): doc: Removed the --without-tcmalloc flag until further advised.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
12:07 AM Revision 19df2086 (ceph): Merge pull request #30 from rca/master
Minor clarification in docs. Sage Weil

01/03/2013

11:04 PM Revision 5ce47c2a (ceph): ssh_keys.py: pull the keys out of targets entry
rather than the hosts known hosts file.
Signed-off-by: Joe Buck <jbbuck@gmail.com>
Reviewed-by: Sam Lang <sam.lang@i...
Joe Buck
10:51 PM Revision 88af7d18 (ceph): doc: Added defaults for PGs, links to recommended settings, and updated...
Fixes: #3555
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
10:32 PM Revision b8f061dc (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
10:18 PM Revision 4ae4dce5 (ceph): OSD: for old osds, dispatch peering messages immediately
Normally, we batch up peering messages until the end of
process_peering_events to allow us to combine many notifies, ...
Samuel Just
09:30 PM Revision 73bc8ffc (ceph): doc: Added comments on --without-tcmalloc option when building Ceph.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
09:30 PM Revision 37b57cdf (ceph): Update doc/rados/configuration/filesystem-recommendations.rst
Clarified when it's necessary to use the setting:
filestore xattr use omap = true
rca
09:29 PM Revision 43ef6772 (ceph): doc: Added some packages to the copyable line.
Fixes: #3686
Signed-off-by: John Wilkins <john.wilkins@inktank.com>
John Wilkins
09:28 PM Revision 333ae82c (ceph): doc: Fixed syntax error.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
08:57 PM Revision aaa03bbc (ceph): qa: Add knfsd reexport suite
Feature http://tracker.newdream.net/issues/3621
Signed-off-by: David Zafman <david.zafman@inktank.com>
David Zafman
08:55 PM Revision 67968d11 (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
08:54 PM Revision 34266e6b (ceph): osd: let pgs process map advances before booting
The OSD deliberately consumes and processes most OSDMaps from the time it
was down before it marks itself up, as this is c...
Sage Weil
08:53 PM Revision 4034f6c8 (ceph): log: broadcast cond signals
We were using a single cond, and only signalling one waiter. That means
that if the flusher and several logging thre...
Sage Weil
08:53 PM Revision 7e94f6f1 (ceph): Merge remote-tracking branch 'gh/wip-3714-b' into next
Signed-off-by: Samuel Just <sam.just@inktank.com> Sage Weil
08:44 PM Revision 224a33bb (ceph): qa/workunit: Add dbench-short.sh for nfs suite
A multi-client dbench run doesn't work over NFS,
see bug #3718. Make single client dbench available.
Signed...
David Zafman
08:13 PM Documentation #3709 (In Progress): crush-map.rst: claims 'types' are default, not true (must be s...
John Wilkins
02:32 PM Documentation #3709: crush-map.rst: claims 'types' are default, not true (must be specified); spe...
These are "defaults" in the sense that they're generated as part of the default OSD Map. Apparently that needs to be ... Greg Farnum
07:57 PM Documentation #3707 (In Progress): crush-map.rst: syntax error in example
John Wilkins
05:54 PM Bug #3702: OSD SIGABRT during startup
Was the monitor also running 0.48.2argonaut when osd.131 originally crashed? Or something else? Sage Weil
05:45 PM Bug #3721: filestore: op_seq written in wrong order on non-btrfs
Sage Weil
04:02 PM Bug #3721 (Resolved): filestore: op_seq written in wrong order on non-btrfs
see wip-fsync Sage Weil
05:23 PM Revision f8bb4814 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
05:14 PM Revision eee795c0 (ceph): rbd_xfstests.yaml: drop test 186
Stop running test 186. It keeps failing in nightly runs, unable
to unmount the scratch file system during setup. As...
Alex Elder
04:47 PM rgw Documentation #2993 (Resolved): doc: write quick RGW guide (if feasible)
John Wilkins
04:45 PM devops Feature #2884: doc: osd hotplugging
I believe the hotplug event was added, but will confirm. John Wilkins
04:43 PM devops Documentation #2974: doc: update chef docs for mon key distribution
I believe this is done. Will verify. John Wilkins
04:13 PM devops Documentation #3686: install prerequisites (Debian)
Greg Farnum wrote:
> John, can you remove that --without-tcmalloc bit until we hear more?
>
> Nat, you should be ...
John Wilkins
02:48 PM devops Documentation #3686 (In Progress): install prerequisites (Debian)
John, can you remove that --without-tcmalloc bit until we hear more?
Nat, you should be able to install either of ...
Greg Farnum
02:45 PM devops Documentation #3686: install prerequisites (Debian)
Eek. We really, really want people to be using tcmalloc (memory behavior without it is astonishingly atrocious). I kn... Greg Farnum
01:31 PM devops Documentation #3686 (Resolved): install prerequisites (Debian)
Added packages to the copyable lines. Modified the build page to include --without-tcmalloc. John Wilkins
03:50 PM Bug #3698: filestore: ENOENT on clone
Ok. The recovery_qos stuff can allow a client op to reorder past a push. This is a problem since the push might be ... Samuel Just
07:53 AM Bug #3698: filestore: ENOENT on clone
another instance with logs: ubuntu@teuthology:/a/sage-a2/33879 Sage Weil
02:52 PM Documentation #3555 (Resolved): {page-num} in ceph osd pool create is not optional
Updated the document to add "required," the default values, a link to calculating PG values, clarification about PGP,... John Wilkins
02:49 PM Bug #3633: mon: clock drift errors not reported by ceph status
The OSD clocks are actually fairly unimportant. Everything they use that requires precise timing should be based enti... Greg Farnum
10:12 AM Bug #3633: mon: clock drift errors not reported by ceph status
The objective here was to make sure that clock skews on the monitors were detected and reported, as said skews might ... Joao Eduardo Luis
08:46 AM Bug #3633: mon: clock drift errors not reported by ceph status
Reading the patch, it looks like only the clocks of the mons are checked. So the clocks of the osds are not important to ce... Corin Langosch
02:34 PM Bug #3720: Ceph Reporting Negative Number of Degraded objects
Per Josh D's suggestion, I set the tunables and it resolved the issue.
# ceph osd getcrushmap -o /tmp/crush
# cru...
Mike Dawson
01:02 PM Bug #3720 (Duplicate): Ceph Reporting Negative Number of Degraded objects
Changed the replication of two pools from 2x to 3x. Cluster rebalanced to nearly HEALTH_OK but got stuck at:
HEALT...
Mike Dawson
02:32 PM rbd Bug #3697: rbd copy.sh test failing in nightly
When reproducing with lots of error logging to stderr, the error occurs on snapshots because the snap rm/snap info te... Dan Mick
01:59 PM CephFS Bug #3597: ceph-fuse: denying root access
I believe that we can reproduce this error. We are running Ubuntu 12.04 LTS Server on both the client and on the Cep... Graham Hemingway
12:56 PM CephFS Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
logs: ubuntu@teuthology:/a/teuthology-2013-01-02_19:00:03-regression-next-testing-basic/33621... Tamilarasi muthamizhan
12:53 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
commit:7e94f6f1a7b7a865433edacd6a521f6ea1170eac Sage Weil
10:28 AM Bug #3714 (Fix Under Review): osd: new peering code does not consume osdmaps prior to booting
Sage Weil
12:48 PM CephFS Bug #3718 (Rejected): multi-client dbench gets stuck over NFS exported cephfs
When running qa/workunit dbench.sh the dbench 1 passes, but the dbench 10 gets hung up.
We should check this with ...
David Zafman
12:28 PM CephFS Feature #3621 (In Progress): qa: add knfsd reexport tests to qa suite
David Zafman
09:49 AM RADOS Feature #3717 (New): osd: Make Rebalancing Smarter
From Corin Langosch - During recovery/rebalancing it can happen that an osd receives lots of new data before data tha... Ian Colle
09:45 AM Bug #3716: recovery should take osd usage into account
1. My cluster already uses the tuned crushmap "crushtool -i /tmp/crush --set-choose-local-tries 0 --set-choose-local-... Corin Langosch
09:36 AM Bug #3716 (Closed): recovery should take osd usage into account
#1: this is a matter of adjusting the crush tunables. see http://ceph.com/docs/master/rados/operations/crush-map/?hig... Sage Weil
09:08 AM Bug #3716 (Closed): recovery should take osd usage into account
Using argonaut 0.48.2. Yesterday one osd crashed (disk io error) and recovery started as expected. All osds had an us... Corin Langosch
09:44 AM Bug #3550: mon: Ceph fails to work when IP address is changed on the host
Joao,
thanks for the update.
Since mine came about due to a testing environment build on DHCP, I did not have the ...
Anonymous
09:32 AM CephFS Bug #3681: kclient fsx fails nightly
It's most likely all the same bug, but fsx fails in different ways each time (always because of a truncate down). The... Sam Lang
09:27 AM CephFS Feature #3543: mds: new encoding
right. about 80% complete, see wip-mds-encoding. Sage Weil
09:22 AM CephFS Feature #3543: mds: new encoding
What is this task? Switching to use our versioned encoding scheme? Greg Farnum
09:17 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I just disabled test 186 from the list run for the nightly
tests. It's defined in the ceph-qa-suite git repository,...
Alex Elder
06:39 AM Revision a32d6c5d (ceph): osd: move common active vs booting code into consume_map
Push osdmaps to PGs in separate method from activate_map() (whose name
is becoming less and less accurate).
Signed-o...
Sage Weil
06:20 AM Revision 0bfad8ef (ceph): osd: let pgs process map advances before booting
The OSD deliberately consumes and processes most OSDMaps from the time it
was down before it marks itself up, as this is c...
Sage Weil
06:04 AM Revision 5fc94e89 (ceph): osd: drop oldest_last_clean from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
06:04 AM Revision 67f7ee67 (ceph): osd: drop unused variables from activate_map
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
05:09 AM Revision a14a36ed (ceph): OSDMap: fix modifed -> modified typo
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
04:44 AM Revision 9ca69e73 (ceph): ceph: malloc check =3 means we hear on stderr too
Sage Weil
03:58 AM Revision 2141454e (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil
02:13 AM Revision 6b5a89d2 (ceph): Merge remote-tracking branch 'gh/next'
Sage Weil
01:01 AM Revision 43cba617 (ceph): log: fix locking typo/stupid for dump_recent()
We weren't locking m_flush_mutex properly, which in turn was leading to
racing threads calling dump_recent() and garb...
Sage Weil

01/02/2013

11:59 PM Revision 29ff87a5 (ceph): Merge branch 'master' of https://github.com/ceph/ceph
John Wilkins
11:58 PM Revision 64d2760a (ceph): doc: Added a memory profiling section. Ported from the wiki.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:57 PM Revision 5066abf1 (ceph): doc: Added memory profiling to the index.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
11:08 PM Revision 0e9a0cd7 (ceph): qa/workunit: Update pjd script to use new tarball
The pjd script now uses the latest version of pjd
with an additional test for opening a non-existent
file.
Signed-of...
Sam Lang
11:07 PM Bug #3715: Crash during 0.55 -> 0.56 upgrade
is someone sending an MOSDOp that has no ops? init_op_flags() is called before can_*(), so this sounds like an empty... Sage Weil
10:05 PM Bug #3715 (Duplicate): Crash during 0.55 -> 0.56 upgrade
I started upgrading my 0.55.1 cluster to 0.56 and at one point in the middle of the upgrade, all 0.55.1 OSDs started ... Faidon Liambotis
10:38 PM Revision d8940d15 (ceph): fuse: Fix cleanup code path on init failure
With the changes from 856f32ab, the cfuse.init call returns
a _positive_ errno, which was getting ignored. Also, if ...
Sam Lang
10:15 PM Revision c4370ff0 (ceph): librbd: establish watch before reading header
This eliminates a window in which a race could occur when we have an
image open but no watch established. The previou...
Josh Durgin
09:56 PM rbd Bug #3697: rbd copy.sh test failing in nightly
Reproduces OK on plana cluster, indeed. This seems to point toward some sort of OSD bug where committed state isn't ... Dan Mick
09:39 AM rbd Bug #3697 (In Progress): rbd copy.sh test failing in nightly
Sage Weil
09:42 PM Revision 93656013 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 483c6f76adf960017614a8641c4dcdbd7902ce33)
Sage Weil
09:42 PM Revision be0473bb (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit c461e7fc1e34fdddd8ff8833693d067451df906b)
Sage Weil
09:42 PM Revision de619327 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
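The batching idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the FileJournal implementation: split_for_submission and the is_last flag are hypothetical, the IOV_MAX value is passed in as a parameter rather than read from the system, and the flag only models the note that the last aio alone is marked with the seq.

```python
from typing import List, Sequence, Tuple


def split_for_submission(
    segments: Sequence[bytes], iov_max: int
) -> List[Tuple[List[bytes], bool]]:
    """Split segments into batches of at most iov_max - 1 entries.

    Each batch is paired with a flag that is True only for the final batch
    (hypothetical stand-in for "carries the seq").
    """
    limit = iov_max - 1  # stay one below the limit, as the commit describes
    if limit < 1:
        raise ValueError("iov_max must be at least 2")

    batches: List[Tuple[List[bytes], bool]] = []
    for start in range(0, len(segments), limit):
        chunk = list(segments[start:start + limit])
        is_last = start + limit >= len(segments)
        batches.append((chunk, is_last))
    return batches


# Ten one-byte segments with a (deliberately tiny) limit of 4.
for chunk, is_last in split_for_submission([b"x"] * 10, iov_max=4):
    print(len(chunk), is_last)
```

Marking only the final batch means a completion for an earlier batch cannot be mistaken for completion of the whole write.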
09:42 PM Revision ded454c6 (ceph): os/FileJournal: logger is optional
Signed-off-by: Sage Weil <sage@inktank.com>
(cherry picked from commit 076b418c7f03c5c62f811fdc566e4e2b776389b7)
Sage Weil
09:42 PM Revision 9a1cf518 (ceph): Merge branch 'wip-journal-aio' into next
Reviewed-by: Samuel Just <sam.just@inktank.com>
Backport: bobtail
Sage Weil
09:39 PM Revision dda7b651 (ceph): os/FileJournal: limit size of aio submission
Limit size of each aio submission to IOV_MAX-1 (to be safe). Take care to
only mark the last aio with the seq to sig...
Sage Weil
09:39 PM Revision c461e7fc (ceph): test_filejournal: test journaling bl with >IOV_MAX segments
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:39 PM Revision 483c6f76 (ceph): test_filejournal: optionally specify journal filename as an argument
Signed-off-by: Sage Weil <sage@inktank.com> Sage Weil
09:34 PM Bug #3714 (Resolved): osd: new peering code does not consume osdmaps prior to booting
Previously when we handled the old osdmaps catching up (pre-MOSDBoot) we'd do advance_map and the pgs would update th... Sage Weil
08:32 PM Revision e0858fa8 (ceph): Revert "librbd: ensure header is up to date after initial read"
Using assert version for linger ops doesn't work with retries,
since the version will change after the first send.
Th...
Josh Durgin
08:31 PM Revision 06310994 (ceph): ceph: enable malloc debugging for ceph-osd
Sage Weil
07:49 PM Revision 3686371e (ceph): rados: add test_filejournal
This writes to /tmp by default; should be ok on plana, since it's / and not
tmpfs.
Sage Weil
07:24 PM Revision 82297706 (ceph): doc: Minor edits.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
07:15 PM Revision d3b9803e (ceph): doc: Fixed typo, clarified usage.
Signed-off-by: John Wilkins <john.wilkins@inktank.com> John Wilkins
05:23 PM rbd Bug #3685: xfs test 186 fails in the nightlies
It is possible for umount() to return EBUSY. However from
what I can tell that only occurs when the device being
u...
Alex Elder
02:34 PM rbd Bug #3685: xfs test 186 fails in the nightlies
OK I've tried reproducing it manually (on a teuthology node, but
running it using a command line while in an "intera...
Alex Elder
12:06 PM rbd Bug #3685: xfs test 186 fails in the nightlies
Test 184 doesn't touch the scratch device. Looks like the next
one back is 167, which exercises unwritten extent co...
Alex Elder
11:56 AM rbd Bug #3685: xfs test 186 fails in the nightlies
I thought I had updated this but I have not.
Test 186 is exercising activities that at one time caused a
bug in x...
Alex Elder
05:15 PM Bug #3699: osds crashed in ReplicatedPG::sub_op_modify on a mixed node cluster
reproduced this on burnupi21. Tamilarasi muthamizhan
05:00 PM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
with glibc malloc and debug enabled:... Sage Weil
08:57 AM Bug #3678: osd: tcmalloc segfault in PG::CephPeeringEvt::CephPeeringEvt<PG::MNotifyRec>()
another one with full osd logs:... Sage Weil
04:13 PM Documentation #3687 (Resolved): Documentation needs a "memory profiling" section
This has been ported. I haven't added a valgrind use case yet. John Wilkins
01:20 PM Documentation #3687 (In Progress): Documentation needs a "memory profiling" section
John Wilkins
03:51 PM Feature #3713 (Rejected): ceph osd tree should show disk usage
As ceph seems to already monitor the disk usage of each osd, it'd be great to have it displayed in "ceph osd tree". Corin Langosch
03:08 PM rbd Bug #3619: librbd: read_iterate sparse behavior broken
Mitigated somewhat by sparsification efforts in rbd import/export, but still librbd
should be fixed.
Dan Mick
02:11 PM devops Feature #3712 (New): Ceph Commands should provide appropriate responses, when Ceph Service is not...
When the ceph service is not running, running other ceph commands should give a response that makes sense instead of just ... Anonymous
02:02 PM Cleanup #2078: ceph tool: only output response data to stdout
i think we need to phase out all of the first-line nonsense. Sage Weil
01:48 PM Cleanup #2078: ceph tool: only output response data to stdout
This also affects things like ceph pg dump --format=json. You can't pipe it to a pretty printer without ignoring the ... Josh Durgin
01:52 PM Documentation #3711 (Resolved): crush-map.rst: choose firstn talks about "N", but does not clearl...
The implication is that 'N' is "the number of buckets of type 'type' available", but Sam believes it must really be "... Dan Mick
01:40 PM Bug #3684 (Resolved): filejournal: aio vector size is not limited
Sage Weil
01:34 PM rbd Feature #3456 (Closed): make exit code of ceph status commands status dependent
Josh Durgin
01:29 PM rbd Documentation #2992 (Resolved): doc: RBD parent/child snapshot
Josh Durgin
01:26 PM rbd Documentation #2992: doc: RBD parent/child snapshot
This should be resolved. John Wilkins
01:24 PM Documentation #3710 (Closed): crush-map.rst: talks about 'step choose' but does not document it
Dan Mick
01:23 PM Documentation #3411 (Resolved): doc: add introductory detail to the main doc page (index.rst)
John Wilkins
01:21 PM rgw Feature #3207 (In Progress): qa: swift functional tests in nightly
Sage Weil
01:21 PM rgw Feature #3366 (In Progress): rgw: dr: define management api
Sage Weil
01:18 PM Documentation #2980 (Resolved): doc: write upgrading Ceph version
This was checked in and also reviewed by Josh and Sage. John Wilkins
01:16 PM Documentation #3322 (Resolved): doc: Explain multi-tenant CephFS
This has been added to the end of the Ceph Configuration file section. It may benefit from review, as I believe the... John Wilkins
01:12 PM Feature #647 (Duplicate): mon: refactor paxos interaction
Sage Weil
01:11 PM Feature #183 (Resolved): qa: xfstests workunit
Sage Weil
01:10 PM Documentation #3709 (Resolved): crush-map.rst: claims 'types' are default, not true (must be spec...
crush-map.rst claims that the bucket type defaults are as appear in the table, but they're
not defaults; they must b...
Dan Mick
01:09 PM Feature #3376 (Duplicate): use external leveldb package for default builds
Sage Weil
01:08 PM Documentation #3707 (Resolved): crush-map.rst: syntax error in example
example includes:
item ceph-osd-server-1 2.00
this must have 'weight' explicitly in the line:
...
Dan Mick
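A small check for this kind of mistake can be sketched as follows. The corrected line in the report is truncated, so the form used here, `item <name> weight <value>`, is an assumption based on the report's statement that 'weight' must appear explicitly; the function name add_explicit_weight and the regular expression are hypothetical.

```python
import re

# Matches "item <name> <number>" with no "weight" keyword (assumed faulty form).
_ITEM_WITHOUT_WEIGHT = re.compile(r"^(\s*item\s+\S+)\s+(\d+(?:\.\d+)?)\s*$")


def add_explicit_weight(line: str) -> str:
    """Insert the 'weight' keyword into an item line that lacks it.

    Lines that do not match the faulty form are returned unchanged.
    """
    match = _ITEM_WITHOUT_WEIGHT.match(line)
    if match is None:
        return line
    return f"{match.group(1)} weight {match.group(2)}"


print(add_explicit_weight("item ceph-osd-server-1 2.00"))
```

Lines that already carry the keyword pass through untouched, so the helper can be run over a whole map file without changing valid entries.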
01:03 PM Feature #3425 (Resolved): mon workload generator
Sage Weil
12:39 PM Bug #3702: OSD SIGABRT during startup
Attempting to start osd.131 (which was down due to the above noted problems) today resulted in quorum loss. Essential... Justin Lott
12:03 PM rgw Bug #3706 (Resolved): rgw functional test testSlashInName failed in nightly
logs: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33224... Tamilarasi muthamizhan
11:25 AM Revision a79493da (ceph): mds: skip frozen inode when assimilating dirty inodes' rstat
CDir::assimilate_dirty_rstat_inodes() may encounter frozen inodes that
are being renamed. Skip these frozen inodes be...
Yan, Zheng
11:25 AM Revision 2f96b472 (ceph): mds: fix anchor table commit race
Anchor table updates for a given inode are fully serialized on the client side.
But due to network latency, two commit req...
Yan, Zheng
11:25 AM Revision 7e04504d (ceph): mds: fix on-going two-phase commits tracking
The slaves for a two-phase commit should be mdr->more()->witnessed
instead of mdr->more()->slaves. mdr->more()->slaves...
Yan, Zheng
11:25 AM Revision b3796f46 (ceph): mds: introduce DROPLOCKS slave request
In some rare cases, Locker::acquire_locks() drops all acquired locks
in order to auth pin new objects. But Locker::dro...
Yan, Zheng
11:25 AM Revision b2d5005a (ceph): mds: fix lock state transition check
Locker::simple_excl() and Locker::scatter_mix() miss is_rdlocked
check; Locker::file_excl() miss is_rdlocked check an...
Yan, Zheng
11:25 AM Revision fe5936b1 (ceph): mds: remove unnecessary is_xlocked check
Locker::foo_eval() is always called for stable locks, so no need to
check if the lock is xlocked.
Signed-off-by: Yan...
Yan, Zheng
11:25 AM Revision f5ea5c36 (ceph): mds: don't defer processing caps if inode is auth pinned
We should not defer processing caps if the inode is auth pinned by MDRequest,
because the MDRequest may change lock s...
Yan, Zheng
11:25 AM Revision 5e8642a8 (ceph): mds: call maybe_eval_stray after removing a replica dentry
MDCache::handle_cache_expire() processes dentries after inodes, so the
MDCache::maybe_eval_stray() in MDCache::inode_...
Yan, Zheng
11:25 AM Revision 84224743 (ceph): mds: fix rename inode exporter check
Using "srcdn->is_auth() && destdnl->is_primary()" to check if the MDS is
the inode exporter of a rename operation is not reli...
Yan, Zheng
11:25 AM Revision 26279574 (ceph): mds: don't trigger assertion when discover races with rename
Discover reply that adds replica dentry and inode can race with rename
if slave request for rename sends discover and...
Yan, Zheng
11:25 AM Revision 5ae715be (ceph): mds: xlock stray dentry when handling rename or unlink
This prevents MDS from reintegrating stray before rename/unlink finishes
Signed-off-by: Yan, Zheng <zheng.z.yan@inte...
Yan, Zheng
11:25 AM Revision 7a520168 (ceph): mds: don't journal null dentry for overwritten remote linkage
Server::_rename_prepare() adds null dest dentry to the EMetaBlob if
the rename operation overwrites a remote linkage....
Yan, Zheng
11:25 AM Revision fcb9f988 (ceph): mds: use null dentry to find old parent of renamed directory
When replaying a directory rename operation, the MDS needs to find the old parent of
the renamed directory to adjust auth sub...
Yan, Zheng
11:25 AM Revision d9d71473 (ceph): mds: don't trim ambiguous imports in MDCache::trim_non_auth_subtree
Trimming ambiguous imports in MDCache::trim_non_auth_subtree() confuses
MDCache::disambiguate_imports() and causes in...
Yan, Zheng
11:25 AM Revision 3b13d3dc (ceph): mds: only export directory fragments in stray to their auth MDS
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> Yan, Zheng
11:25 AM Revision 61da9b18 (ceph): mds: mark rename inode as ambiguous auth on all involved MDS
When handling cross authority rename, the master first sends OP_RENAMEPREP
slave requests to witness MDS, then sends ...
Yan, Zheng
11:09 AM Linux kernel client Bug #2764 (Closed): xfstest hang; osd socket closed messages
The fix for the warning messages is:
28362986f8743124b3a0fda20a8ed3e80309cce1
libceph: report connection ...
Alex Elder
10:54 AM Bug #3698: filestore: ENOENT on clone
recent log: ubuntu@teuthology:/a/teuthology-2013-01-01_19:00:03-regression-next-testing-basic/33152 Tamilarasi muthamizhan
09:45 AM CephFS Bug #3700: mds: FAILED assert(!item_session_list.is_on_list())
fixed by revert of bad fix, see commit:6711a4c4038dbdf843f9dfe42c7809c5c37ae534 Sage Weil
09:37 AM CephFS Bug #3700 (Resolved): mds: FAILED assert(!item_session_list.is_on_list())
Sage Weil
09:41 AM rbd Bug #3692 (Won't Fix): OSD's abort with "./common/Mutex.h: 89: FAILED assert(nlock == 0)"
This is a known problem with argonaut, but the fix is a rewrite of the whole module and we've chosen not to backport ... Sage Weil
09:09 AM Bug #3705 (Resolved): osd: crash in scrub finalize [argonaut]
... Sage Weil
08:28 AM Feature #3704 (Resolved): mon: add min log level to send cluster msgs to syslog
e.g., WARN and above only, but not INFO. This is for the mon/LogMonitor.cc submission path, not log/Log.cc (for debu... Sage Weil
05:55 AM Revision e10267b5 (ceph): mds: fix Locker::simple_eval()
Locker::simple_eval() checks if the loner wants CEPH_CAP_GEXCL to
decide if it should change the lock to EXCL state, ...
Yan, Zheng
05:54 AM Revision 7e23321b (ceph): mds: don't renew revoking lease
MDS may receive a lease renew request while the lease is being revoked;
just ignore the renew request.
Signed-off-by: Yan...
Yan, Zheng

01/01/2013

06:36 PM Revision eb02eaed (ceph): Merge remote-tracking branch 'gh/wip-bobtail-docs'
Sage Weil
05:35 AM Revision f1196c7e (ceph): Merge branch 'master' of https://github.com/ceph/ceph
Gary Lowell
05:31 AM Revision 5dd6b199 (ceph): Merge branch 'next'
Gary Lowell
02:37 AM Revision 8f77ec7d (ceph): Merge branch 'next'
Sage Weil
02:36 AM Revision 94a5dd6b (ceph): Merge remote-tracking branch 'gh/wip-3675'
Reviewed-by: Josh Durgin <josh.durgin@inktank.com> Sage Weil
01:10 AM Revision 1a32f0a0 (ceph): v0.56
Gary Lowell
 
