Activity

From 02/17/2013 to 03/18/2013

03/18/2013

09:07 PM Bug #4491 (Resolved): mds: assert failure on _purge_forward_pointers
Sage Weil
03:37 PM Bug #4491 (Fix Under Review): mds: assert failure on _purge_forward_pointers
I pushed a proposed fix to wip-4491. Basically we just need to handle the case that the osd returns ENODATA. Sam Lang
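The fix proposed for Bug #4491 amounts to treating ENODATA from the OSD as an expected outcome rather than an assertion failure. A minimal Python sketch of that idea follows. The function name `handle_purge_reply` and its return strings are hypothetical and are not taken from the wip-4491 branch or the MDS source.

```python
import errno
import os


def handle_purge_reply(rc):
    """Classify a return code (0 or a negative errno value).

    Hypothetical helper: it only illustrates tolerating ENODATA
    instead of asserting that the call succeeded.
    """
    if rc == 0:
        return "purged"
    if rc == -errno.ENODATA:
        # Assumed meaning: nothing was stored there, so nothing to purge.
        return "nothing to purge"
    # Any other error is still surfaced to the caller.
    raise OSError(-rc, os.strerror(-rc))
```

Other error codes still raise in this sketch, so only the one tolerated case is handled quietly.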
03:03 PM Bug #4491: mds: assert failure on _purge_forward_pointers
This happens soon after a ceph-fuse mount. I hit this when trying to run the blogbench test. Tamilarasi muthamizhan
02:47 PM Bug #4491 (Resolved): mds: assert failure on _purge_forward_pointers

Joe Buck reported a bug with master:
INFO:teuthology.task.ceph.mds.0.err:mds/MDCache.cc: In function 'void MDCac...
Sam Lang
06:03 PM Bug #4489: ceph fs hangs on file stat
and no other specific events occurred at that moment (such as scrubbing or osd/mds/mon failures). Ivan Kudryavtsev
06:02 PM Bug #4489: ceph fs hangs on file stat
No, it started earlier than #4486, so the cause is something else. Ivan Kudryavtsev
06:00 PM Bug #4489: ceph fs hangs on file stat
I think it could be connected with #4486, because I found about 150 launched cron tasks and every task is launched t... Ivan Kudryavtsev
05:46 PM Bug #4489: ceph fs hangs on file stat
Wrapping cron.d code.... Ivan Kudryavtsev
05:44 PM Bug #4489: ceph fs hangs on file stat
Code that caused the hang, running on two hosts:... Ivan Kudryavtsev
10:20 AM Bug #4489: ceph fs hangs on file stat
I can provide shell access to one of the servers, but I don't know if it can be reproduced easily. Ivan Kudryavtsev
10:17 AM Bug #4489 (Can't reproduce): ceph fs hangs on file stat
hi. I have cephfs (kernel client) mounted from two hosts at /var/www.
I'm trying to do...
Ivan Kudryavtsev
05:46 PM Feature #1448: test hadoop on sepia
Are nodes available for scale testing? The Issdm cluster is withering away. Noah Watkins
05:42 PM Feature #4484 (Resolved): Enable Hadoop bindings to pull configuration options from the monitor
Noah Watkins
04:28 PM Feature #4494 (New): qa: exercise recovery from migration points
In #4493 we checked recovery in an MDS cluster. Now we need to check recovery following each kill point involved in m... Greg Farnum
04:23 PM Feature #4493 (New): qa: trigger each kill_at point related to clustered recovery
Write a workunit using the restart teuthology task interface that handles running several MDS daemons and fully exerc... Greg Farnum
04:18 PM Tasks #4492 (New): mds: Define kill points involved in clustered migration and recovery
We need to define all the separate points at which a break in 1) clustered recovery and 2) migration leaves a differe... Greg Farnum
09:55 AM Bug #4434: looping waiting for quorum after upgrade
Yep! This says that you ran a branch which included an unreleased set of encoding rules on the MDS which would have c... Greg Farnum

03/17/2013

12:12 PM Feature #4484 (Fix Under Review): Enable Hadoop bindings to pull configuration options from the m...
Noah Watkins
12:12 PM Feature #4484: Enable Hadoop bindings to pull configuration options from the monitor
ceph.git wip-4484
hadoop-common.git cephfs/wip-4484
Noah Watkins
11:10 AM Feature #4485 (New): Improve "needsrecover" handling
Jim Schutt reported issues on the mailing list[1] with slow stats that turned out to be due to inodes with the "needs... Greg Farnum

03/16/2013

03:50 PM Feature #4484: Enable Hadoop bindings to pull configuration options from the monitor
I'd lean towards keyring files, but we may want to float this on Monday's stand-up. Anonymous
03:18 PM Feature #4484: Enable Hadoop bindings to pull configuration options from the monitor
Seems like there are keyring files, secret strings, client usernames, etc… Which one(s) should we use?
http://ceph...
Noah Watkins
03:10 PM Feature #4484: Enable Hadoop bindings to pull configuration options from the monitor
This replaces the current approach, which assumes that the host with the JobTracker (likely not an OSD and possibly n... Anonymous
02:59 PM Feature #4484 (Resolved): Enable Hadoop bindings to pull configuration options from the monitor
At present, the Hadoop bindings require several options be specified in xml files.
It would be easier for users if ...
Anonymous

03/15/2013

02:54 PM Feature #4326: qa: add samba + (kclient|ceph-fuse) to suite
I'm going to need to dig into why it doesn't seem to be finishing, but I think it might be exposing some (more) file ... Greg Farnum
10:51 AM Feature #4326: qa: add samba + (kclient|ceph-fuse) to suite
The wip-samba-on-ceph branch has "samba", "cifs-mount", and "smbtorture" tasks.
I notice that smbtorture on ceph-f...
Greg Farnum
02:23 PM Bug #4451: client: Ceph client not releasing cap
Looked at this again briefly. I notice:
1) the inode was previously in the stray directory (before MDS restart)
2) ...
Greg Farnum
09:31 AM Bug #4451: client: Ceph client not releasing cap
For some reason the MDS is sending back an "export" on the caps for that inode (timestamp 2013-03-15 09:07:38.098273)... Greg Farnum
08:59 AM Bug #4451 (Resolved): client: Ceph client not releasing cap

I'm occasionally hitting a hang in my backtrace testing, where unmount never completes. The client log shows a dis...
Sam Lang
01:56 PM Tasks #4467 (New): qa: make ior tasks work
Sage Weil
01:42 PM Fix #3630 (Resolved): mds: broken closed connection cleanup
Sage Weil
01:38 PM Fix #4286: SLES 11 - cfuse: disable 'big_writes' and 'atomic_o_trunc'
also, the invalidate callback code probably needs to be conditional, too! Sage Weil
01:35 PM Fix #2215 (Resolved): ceph-fuse does not invalidate page cache
Sage is turning it on by default now following weeks of testing in the nightlies! Greg Farnum
01:33 PM Feature #2903 (Resolved): ceph-fuse: Support -o noallow_other
Sage Weil

03/14/2013

03:51 PM Bug #4434: looping waiting for quorum after upgrade
This is what was captured at the time the test was run successfully: Ceph Version: 0.57-667-g6a9cda7
The next inst...
Ken Franklin
10:12 AM Bug #4434: looping waiting for quorum after upgrade
Just to make sure I'm tracking these upgrades correctly:
It was created on v0.56.3? (Not a branch.) Then it moved to...
Greg Farnum
09:30 AM Bug #4434: looping waiting for quorum after upgrade
It's quite possible the upgrade was corrupted somewhere along the line. Prior to the issues the system was on 0.56.3... Ken Franklin
01:30 PM Feature #4441 (Resolved): libcephfs: add ceph_get_osd_addr()
Noah Watkins
01:19 PM Feature #4441 (Resolved): libcephfs: add ceph_get_osd_addr()
Noah Watkins
01:20 PM Feature #4442 (Resolved): java: add topology API support
Noah Watkins
09:53 AM Bug #4358 (Resolved): kclient: ENOENT during kernel build on kclient
Sage Weil
08:54 AM Bug #4358: kclient: ENOENT during kernel build on kclient
passed another 100 iterations (modulo a machine lockup on the server side) Sage Weil

03/13/2013

08:21 PM Bug #4405: MDCache::populate_mydir can loop forever
What's interesting is that the MDS server has incoming traffic of ~40MB/s all the time, but no active clients. I found it aft... Ivan Kudryavtsev
08:14 PM Bug #4405: MDCache::populate_mydir can loop forever
OK, I don't know what you mean by the term "start". But actually, the MDS ran the whole time with
debug ms =1 and debug ...
Ivan Kudryavtsev
06:52 PM Bug #4405: MDCache::populate_mydir can loop forever
Hi Ivan-
Looking at the log, it looks like all 3 times the MDS started up it came up within 5 seconds or so. Do y...
Sage Weil
06:14 PM Bug #4358: kclient: ENOENT during kernel build on kclient
20 iterations on testing branch. i ran a bunch on master to make sure i could trigger the old bug, but then couldn't... Sage Weil
04:40 PM Bug #4390 (Resolved): mds: zapping named mds causes client assertion
commit:f67596a44739e8071cc97fb0463f37203502faaa Sage Weil
04:39 PM Bug #4385 (Resolved): mds: refusing connections with high open socket count
commit:8b713371447f9761597457af2c81f0b870d3c4ba Sage Weil
03:03 PM Bug #4434: looping waiting for quorum after upgrade
More details? I'm not sure how the title relates to the bug description or MDS log. The log is crashing on the Sessio... Greg Farnum
02:52 PM Bug #4434: looping waiting for quorum after upgrade
changed the project Ken Franklin
02:41 PM Bug #4434: looping waiting for quorum after upgrade

Part of the bug appears to be in ceph, where the following returns an error, causing an infinite loop in get_key():...
Sam Lang
02:36 PM Bug #4434 (Resolved): looping waiting for quorum after upgrade
How we got here:
Bobtail .56 installed on burnupi60 failed daily upgrade due to new gitbuilder keys.
updated key.
...
Ken Franklin
02:25 PM Bug #3640 (Duplicate): kclient: hang and kernel panic
dup of #3088 Sage Weil
02:24 PM Bug #3088: NULL pointer dereference at ceph_d_prune
this code may be gone now with yan's d_prune changes... Sage Weil
02:06 PM Bug #1945: blogbench hang on caps
Yan, would you mind taking a look at this when you have time? Ian Colle
02:05 PM Bug #3637: client: not issuing caps for clients doing shared writes
Sage Weil

03/12/2013

11:18 PM Feature #4277: Move built hadoop artifacts to download URL
I have a documentation branch pushed up that is waiting for the URLs. Let me know what those are and I can integrate ... Noah Watkins
11:22 AM Feature #4277 (In Progress): Move built hadoop artifacts to download URL
As a starting point, let's post this on the download page as stand-alone jar files. I'll take ownership of doing that... Anonymous
08:57 PM Bug #4385 (Fix Under Review): mds: refusing connections with high open socket count
sounds right. thanks for testing! Sage Weil
08:25 PM Bug #4385: mds: refusing connections with high open socket count
Err, "unclean mounts" = "exiting without unmounting" Noah Watkins
08:23 PM Bug #4385: mds: refusing connections with high open socket count
Well hot damn. That branch seems to solve two problems. First, clients that do a clean unmount don't leave lots of FD... Noah Watkins
07:55 PM Bug #4385: mds: refusing connections with high open socket count
Noah, do you want to try wip-mds-con? Sage Weil
07:45 PM Bug #4385 (In Progress): mds: refusing connections with high open socket count
Sage Weil
01:53 PM Bug #4385: mds: refusing connections with high open socket count
Although the high counts were because of double counting by lsof, the sockets still are not being closed. Without any... Noah Watkins
12:32 PM Bug #4385: mds: refusing connections with high open socket count
Hmm, did we screw up our refactoring work so that replaced sockets are no longer actually closed? That might explain ... Greg Farnum
12:25 PM Bug #4385: mds: refusing connections with high open socket count
Here's some more info after investigating this a bit further.
Open socket counts by category after a fresh MDS reb...
Noah Watkins
11:52 AM Documentation #4422: Typo on Release Process webpage
Make that one fewer "are". Got to love making a typo on a ticket about a typo. Anonymous
11:38 AM Documentation #4422 (Resolved): Typo on Release Process webpage
This sentence (in section 1) needs one less instance of "and":
"The RPM based packages are are built natively, so on...
Anonymous

03/11/2013

09:34 PM Feature #4393 (Resolved): Add apache-hadoop gitbuilder to master gitbuilder webpage
gitbuilder.sepia.com just needed the new gitbuilder added to the proxy config file. Anonymous
05:55 PM Bug #4398: fix kclient_workunit_misc.yaml in the nightlies
ubuntu@teuthology:/a/teuthology-2013-03-11_01:00:04-regression-master-testing-gcov/21326 Tamilarasi muthamizhan
09:38 AM Bug #4398 (Duplicate): fix kclient_workunit_misc.yaml in the nightlies
Ian Colle
03:22 PM Feature #4073 (Resolved): qa: add message delay injection to test suite
Sage Weil
03:20 PM Feature #4190 (Resolved): qa: add mds thrashing to nightly
Sage Weil
09:47 AM Feature #4326 (In Progress): qa: add samba + (kclient|ceph-fuse) to suite
Ian Colle
04:42 AM Bug #4405: MDCache::populate_mydir can loop forever
The log was taken when it was stuck last time. I stopped the MDS, increased the log level, and started it again. Ivan Kudryavtsev

03/10/2013

11:40 PM Bug #4405: MDCache::populate_mydir can loop forever
Log download link: http://pixeltram.com/ceph-mds.1.log.1.gz
Ivan Kudryavtsev
07:43 AM Bug #4405: MDCache::populate_mydir can loop forever
If the stuck startup is reproducible now (by lowering the cache size and restarting), a log with debug ms =1 and debu... Sage Weil
12:31 AM Bug #4405: MDCache::populate_mydir can loop forever
I mounted the ceph root and counted the number of files, and it's less than the default cache size of 100000... Ivan Kudryavtsev
12:05 AM Bug #4405: MDCache::populate_mydir can loop forever
Actually, regarding the initial ticket message: I think the MDS goes into some kind of loop during start when the cache size is sm... Ivan Kudryavtsev

03/09/2013

11:59 PM Bug #4405: MDCache::populate_mydir can loop forever
I think it's important to specify some kind of metrics so everyone can calculate the memory utilization of a specific cac... Ivan Kudryavtsev
11:49 PM Bug #4405: MDCache::populate_mydir can loop forever
regarding q2:
I increased mds cache size to
mds cache size = 100000000
and it started in seconds.
I don't...
Ivan Kudryavtsev
11:27 PM Bug #4405 (Resolved): MDCache::populate_mydir can loop forever
I had an unusual MDS failure. My server NIC started to flap and as a result (finally)
my CEPH FS started to recover an...
Ivan Kudryavtsev
10:35 PM Bug #4390: mds: zapping named mds causes client assertion
ran this through the fs suite and it passed. i would expect breakage in mds thrashing and multimds situations, thoug... Sage Weil
07:24 AM Bug #4358: kclient: ENOENT during kernel build on kclient
That might work, as long as we don't need to update the flags and i_release_count atomically... that'd have to become... Sage Weil
06:12 AM Bug #4358: kclient: ENOENT during kernel build on kclient
Any idea how to fix the locking issue? Use atomic bit operations to modify i_ceph_flags? Zheng Yan

03/08/2013

05:27 PM Bug #4398: fix kclient_workunit_misc.yaml in the nightlies
looks like the test failed due to,
2013-03-06T06:38:55.270 INFO:teuthology.task.workunit.client.0.out:
2013-03-06...
Tamilarasi muthamizhan
05:16 PM Bug #4398 (Duplicate): fix kclient_workunit_misc.yaml in the nightlies
log: ubuntu@teuthology:/a/teuthology-2013-03-06_01:00:04-regression-master-testing-gcov/16995... Tamilarasi muthamizhan
04:34 PM Bug #4385: mds: refusing connections with high open socket count
Log file fun. Here is the MDS log up until it stopped accepting connections.
http://piha.soe.ucsc.edu/ceph-mds.a.l...
Noah Watkins
10:57 AM Bug #4385: mds: refusing connections with high open socket count
Would you like the logs up to the point that the MDS stops accepting connections, or just a snapshot after the FD list ... Noah Watkins
10:34 AM Bug #4385: mds: refusing connections with high open socket count
can you reproduce with debug ms = 20 and debug mds = 20 ? those logs would be helpful Sage Weil
10:30 AM Bug #4385: mds: refusing connections with high open socket count
To Greg's question, it seems as though the connections were not timing out. I'd toss out a rough estimate of about 45... Noah Watkins
10:29 AM Bug #4385: mds: refusing connections with high open socket count
Is there anything I can do to get more information for this ticket? Noah Watkins
09:56 AM Bug #4385: mds: refusing connections with high open socket count
It might be contributing, but I believe the sockets should still be getting closed after a timeout period, right? Greg Farnum
09:45 AM Bug #4385: mds: refusing connections with high open socket count
I bet #3630 is contributing here. Sage Weil
07:11 AM Bug #4385: mds: refusing connections with high open socket count
I had this thought that the set of FDs in the logs would be >> the set shown in lsof, and that we'd want to cros... Noah Watkins
02:11 PM Bug #4390: mds: zapping named mds causes client assertion
pushed wip-4390-b, which solves this on the client side.
i don't really want to delay the mark-down/failing in the...
Sage Weil
08:48 AM Bug #4390: mds: zapping named mds causes client assertion
That approach was breaking the monitor. Just pushed a new approach that queues the zap for later. Sam Lang
06:37 AM Bug #4390 (Fix Under Review): mds: zapping named mds causes client assertion
Sam Lang
06:37 AM Bug #4390: mds: zapping named mds causes client assertion
Proposed fix in wip-4390. Should we also cleanup the client code to wait till the mdsmap contains up members? Separ... Sam Lang
06:31 AM Bug #4390 (Resolved): mds: zapping named mds causes client assertion

Hit the following assertion on the client with backtrace testing:
../../src/mds/MDSMap.h: In function 'const ent...
Sam Lang
09:29 AM Feature #4393 (Resolved): Add apache-hadoop gitbuilder to master gitbuilder webpage
I brought a new gitbuilder online at gitbuilder-precise-apache-hadoop-amd64.front.sepia.ceph.com and ran the command ... Anonymous
09:01 AM Bug #4358: kclient: ENOENT during kernel build on kclient
I hit this today while testing. Sorry, I don't remember
which test but Sage says he knows what happened.
http://pa...
Alex Elder
08:46 AM Fix #2215: ceph-fuse does not invalidate page cache
Those tests are part of the full regression test suite. Sam Lang

03/07/2013

10:32 PM Fix #4286 (In Progress): SLES 11 - cfuse: disable 'big_writes' and 'atomic_o_trunc'
big_writes was added in fuse 2.8; SLES has fuse version 2.7.2
atomic_o_trunc requires fuse > 2.2 and kernel > 2.6.2...
Anonymous
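The version constraint noted for Fix #4286 (the option was added in fuse 2.8, while SLES 11 ships 2.7.2) can be expressed as a simple version comparison. The sketch below is illustrative only: `parse_version` and `supports_big_writes` are hypothetical names, and the check for atomic_o_trunc is left out because its kernel requirement is truncated in the comment.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.7.2' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def supports_big_writes(fuse_version):
    """Return True when the fuse version is at least 2.8.

    The 2.8 threshold comes from the ticket comment; everything else
    here is an assumption made for illustration.
    """
    return parse_version(fuse_version) >= (2, 8)


if __name__ == "__main__":
    for version in ("2.7.2", "2.8", "2.9.1"):
        print(version, supports_big_writes(version))
```

A client could use a check like this to pass the option only when the installed fuse library accepts it, instead of failing at mount time with "unknown option".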
09:48 PM Bug #4385: mds: refusing connections with high open socket count
Doesn't /proc tell you whether the fd is a socket or not? Or do you mean correlate activity?
In any case, all the ...
Greg Farnum
07:05 PM Bug #4385: mds: refusing connections with high open socket count
Err, bump up the level on the MDS... Noah Watkins
07:04 PM Bug #4385: mds: refusing connections with high open socket count
I'll test out the ulimit as a workaround, and presumably to verify the open fd limit theory.
I checked all my clie...
Noah Watkins
06:53 PM Bug #4385: mds: refusing connections with high open socket count
The direct cause of this is almost certainly an open fd limit coming from the OS, which you can probably work around ... Greg Farnum
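One way to check the open fd limit theory from Bug #4385 is to compare a process's open descriptor count with its RLIMIT_NOFILE limits. The Python sketch below does this for the current process on Linux. The helper names `fd_usage` and `near_limit` and the 0.9 threshold are assumptions for illustration; inspecting a running MDS would mean reading that process's own /proc entries instead.

```python
import os
import resource


def fd_usage():
    """Return (open_fds, soft_limit, hard_limit) for the current process.

    open_fds is None when /proc/self/fd is unavailable (non-Linux systems).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    fd_dir = "/proc/self/fd"
    open_fds = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None
    return open_fds, soft, hard


def near_limit(open_fds, soft, threshold=0.9):
    """Report whether the open fd count is within `threshold` of the soft limit."""
    if open_fds is None or soft == resource.RLIM_INFINITY:
        return False
    return open_fds >= threshold * soft


if __name__ == "__main__":
    count, soft, hard = fd_usage()
    print("open fds:", count, "soft limit:", soft, "hard limit:", hard)
    print("near limit:", near_limit(count, soft))
```

If the count sits at or near the soft limit, new connections fail because accept() cannot allocate a descriptor, which would match the symptom described in the ticket.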
06:04 PM Bug #4385 (Resolved): mds: refusing connections with high open socket count
My MDS has become unresponsive after a long period of map-reduce jobs. The MDS process is idle, but is eating up 16 G... Noah Watkins
09:06 PM Fix #4034 (Resolved): mds: fix replayed ino creation extra_bl
commit:3a7233bc8b199c97fbde9c1e44370353f0504af8 Sage Weil
05:46 PM Fix #4034: mds: fix replayed ino creation extra_bl
There's still a bad comment in 0c0313c6f6d4e2733fcf972b49456bf1faad9255, but the rest looks good! Greg Farnum
03:49 PM Fix #4034: mds: fix replayed ino creation extra_bl
Reviewed on Github. Greg Farnum
09:06 PM Feature #4074 (Resolved): qa: add traceless reply test to fs suite
commit:de62a79589fc4feed4243ac278d365b6363bfa2b fixed ceph.git bugs. added tests to fs suite. Sage Weil
03:49 PM Feature #4074: qa: add traceless reply test to fs suite
Edit; wrong bug, sort of. Greg Farnum
08:41 PM Cleanup #4387 (Resolved): mds: EMetaBlob::client_reqs doesn't need to be a list
It is either set or not set at all, currently. Sage Weil
07:21 PM Feature #4386 (Resolved): kclient: Mount error message when no MDS present
Right now you either get an input/output error or a message about not being able to find the superblock when trying t... Mark Nelson
01:09 PM Bug #4358: kclient: ENOENT during kernel build on kclient
An initial patch from Yan is in our testing branch and should fix this issue. (Or at least fixes one cause.) It may g... Greg Farnum
09:35 AM Bug #4358: kclient: ENOENT during kernel build on kclient
Let's see if this happens in testing branch after Yan's patches are all applied. Ian Colle
01:01 AM Bug #4358: kclient: ENOENT during kernel build on kclient
Got the following message for the kernel build error: "find: `./include/generated': No such file or directory".
It's strange ...
Zheng Yan
12:38 PM Bug #4370 (New): mds: high-cpu utilization in memorymodel:_sample
Shortly after running some fs workloads on a 1-mds/16-osd cluster, cpu utilization spikes and never returns to normal... Noah Watkins

03/06/2013

05:16 PM Feature #4361 (Resolved): Setup another gitbuilder VM for building external Hadoop git repo(s)
One of the first things I did today was create it but it was taking a while and I started working on some other stuff... Sandon Van Ness
11:55 AM Feature #4361 (Resolved): Setup another gitbuilder VM for building external Hadoop git repo(s)
We're moving from building our own monolithic Hadoop packages to building a Hadoop/Ceph library and then running that... Anonymous
03:34 PM Feature #4356 (Closed): libcephfs: expose osd topology
Noah Watkins
07:31 AM Bug #4358 (Resolved): kclient: ENOENT during kernel build on kclient
... Sage Weil

03/05/2013

07:47 PM Feature #4356 (Fix Under Review): libcephfs: expose osd topology
Noah Watkins
07:46 PM Feature #4356 (Closed): libcephfs: expose osd topology
wip-expose-topo
e8da4bf and 6b3fce1
Noah Watkins
07:14 PM Feature #4074 (Fix Under Review): qa: add traceless reply test to fs suite
Sage Weil
07:14 PM Fix #4034 (Fix Under Review): mds: fix replayed ino creation extra_bl
Sage Weil
03:47 PM Feature #4355 (New): uclient: add perfcounters
The client currently has 3 perfcounters: average latency of replies, of processing a request, and of a file write.
...
Greg Farnum
03:46 PM Feature #4354 (Resolved): mds: add an equivalent to the OSD OpTracker
Like it says — we want to be able to get information about ops-in-flight and their current status in a lot of differe... Greg Farnum
01:29 PM Cleanup #4166 (Resolved): ceph: simplify ceph_sync_write() page_align
This has been committed to the ceph-client testing branch.
038832c ceph: simplify ceph_sync_write() page_align cal...
Alex Elder
11:12 AM Bug #4350 (Rejected): ceph-fuse: lockup from 40g loopback mkfs.ext3
The underlying RADOS cluster in this report isn't fully healthy. I'm pretty sure that's all there is. Unless we hear ... Greg Farnum
09:00 AM Bug #4350 (Rejected): ceph-fuse: lockup from 40g loopback mkfs.ext3
... Sage Weil

03/04/2013

02:10 PM Feature #4326 (Resolved): qa: add samba + (kclient|ceph-fuse) to suite
Sage Weil
11:01 AM Documentation #3796 (Resolved): FUSE mount documentation needs some corrections for v0.56.x
Page has been updated with instructions, and a hyperlink to Cephx Configuration Reference. See http://ceph.com/docs/m... John Wilkins
10:06 AM Documentation #3796 (In Progress): FUSE mount documentation needs some corrections for v0.56.x
John Wilkins
10:38 AM Cleanup #4166: ceph: simplify ceph_sync_write() page_align
This patch (2/3 below) has been posted for review, along with
a few others I include here for context. Marking for ...
Alex Elder
08:50 AM Cleanup #4166 (In Progress): ceph: simplify ceph_sync_write() page_align
I'm reopening this after all.
It turns out that the original patch was fine. The only part
that was bad was due ...
Alex Elder
07:14 AM Feature #4277: Move built hadoop artifacts to download URL
Thanks for the info Gary! Let me do a little bit more research on how users want to obtain the artifacts. I think tha... Noah Watkins

03/01/2013

04:11 PM Fix #2215: ceph-fuse does not invalidate page cache
Which automatic tests actually run those? I'm not sure that the nightlies do so right now. Greg Farnum
04:00 PM Fix #2215: ceph-fuse does not invalidate page cache
I added the fuse_use_invalidate_cb: true option in the ceph-qa-suite to the basic and verify fs suites (in the btrfs.ya... Sam Lang
02:36 PM Feature #3819: mds: re-add snaptests to qa suite
De-prioritizing in light of ongoing discussions. Greg Farnum
02:35 PM Bug #4134: mds: request locking hang under snaptests
De-prioritizing this for now based on our meetings and discussion yesterday. Greg Farnum
02:18 PM Documentation #3113: Ceph FS Options Could Use Some Additional Information
Any discussion of cephfs options (which really don't have much to do with mount!) should also refer to the equivalent... Greg Farnum
01:45 PM Documentation #3113: Ceph FS Options Could Use Some Additional Information
John - could you move this up your queue? Ian Colle
02:05 PM Fix #4034: mds: fix replayed ino creation extra_bl
Sage Weil
01:44 PM Documentation #2206: Need a control command to gracefully shutdown an active MDS prior to planned...
'ceph mds fail 0' Sage Weil
11:41 AM Bug #4308 (Won't Fix): ceph-fuse crashed during blogbench test (argonaut)
Log: ubuntu@teuthology:/a/teuthology-2013-02-28_20:00:04-regression-argonaut-master-basic/14324... Tamilarasi muthamizhan

02/28/2013

04:51 PM Feature #4074 (In Progress): qa: add traceless reply test to fs suite
Sage Weil
10:48 AM Feature #4295: mds: Actually purge deleted directories
And Sage points out the reason it's not done so far is because fragmentation mucks it up a bit.... Greg Farnum
10:48 AM Feature #4295 (Resolved): mds: Actually purge deleted directories
Right now we never actually delete a directory object when it gets unlinked in the filesystem.
1) Identify the reaso...
Greg Farnum
09:33 AM Fix #4286: SLES 11 - cfuse: disable 'big_writes' and 'atomic_o_trunc'
Need to figure out what version of fuse added those flags. Ian Colle

02/27/2013

10:42 PM Tasks #4222 (Resolved): Add libcephfs-test.jar to ceph-test deb package
Resolved with commit:
commit b65ca564b6ec7324ba43e4d35a2d114b4f7220ea
Author: Gary Lowell <glowell@inktank.com>
...
Anonymous
11:10 AM Fix #4286 (Rejected): SLES 11 - cfuse: disable 'big_writes' and 'atomic_o_trunc'
Encountered an error when mounting a cfuse volume on SLES 11.
unknown option: big-writes
unknown option: atomic_o_trun...
Ken Franklin

02/26/2013

10:32 PM Feature #4277: Move built hadoop artifacts to download URL
The output of the gitbuilders is staged to gitbuilder.ceph.com, the release process (different set of scripts) produ... Anonymous
02:17 PM Feature #4277 (Closed): Move built hadoop artifacts to download URL
We are at the point now where we are building some useable jar files that we'd like to reference from the Hadoop docu... Noah Watkins
05:01 PM Bug #4134: mds: request locking hang under snaptests
Oh, and actually the client is the one with the seq of 2 while the MDS only thinks it should have a seq of 1. Curious... Greg Farnum
02:48 PM Bug #4134 (In Progress): mds: request locking hang under snaptests
The request is attempting to read a snapshotted inode, but the inode is still in the LOCK_SNAP_SYNC state from when i... Greg Farnum
04:32 PM Bug #4280: mds: crash on lookupsnap
Logs and core in kai:/home/gregf/logs/4280 Greg Farnum
04:31 PM Bug #4280 (Resolved): mds: crash on lookupsnap
While running qa/workunits/snaps/snaptest-snap-rm-cmp.sh:... Greg Farnum
09:36 AM Bug #4266 (Won't Fix): crush placement from crushtool mismatches observed behavior for pools othe...
The --x is only coincidentally related to the second part of the pgid (currently). If you want to map a specific pgi... Sage Weil
08:05 AM Bug #4266 (Won't Fix): crush placement from crushtool mismatches observed behavior for pools othe...
Because OSDs add the pool id to the pg number to form the x passed to crush_do_rule, whereas crushtool just gets an x... Alexandre Oliva

02/25/2013

01:51 PM Bug #4248: mds: replay does not correctly update CInode::first and ::last members
This isn't currently journaled correctly, in addition to not setting its snapid ranges correctly. There's nothing tha... Greg Farnum
11:19 AM Bug #4248 (In Progress): mds: replay does not correctly update CInode::first and ::last members
Okay, that didn't fix the problem. The stray dir now has snapids [3, HEAD] instead of [4, HEAD], but the inode remain... Greg Farnum
09:51 AM Feature #3821 (New): qa: run backuppc as part of qa suite
Ian Colle

02/22/2013

03:32 PM Bug #3794 (Resolved): uclient: reports sizes wrong in some cases
Sage Weil
03:10 PM Bug #3794: uclient: reports sizes wrong in some cases
Assuming we aren't pushing this into a stable release without more testing, both the patch and backport are
Review...
Greg Farnum
02:58 PM Bug #3794 (Fix Under Review): uclient: reports sizes wrong in some cases
wip-statvfs Sage Weil
03:31 PM Bug #3793 (Resolved): wrong size reported in some distributions/toolchains
'ceph: fix statvfs fr_size' in kernel tree. Sage Weil
02:58 PM Bug #3793: wrong size reported in some distributions/toolchains
I pushed a wip-statvfs which fixes this for ceph-fuse. Sage Weil
03:22 PM Bug #4248 (Fix Under Review): mds: replay does not correctly update CInode::first and ::last members
wip-4248-snapid-journaling
Also pushing out a v0.57-based branch to the user to try out.
Greg Farnum
03:17 PM Bug #4248 (Resolved): mds: replay does not correctly update CInode::first and ::last members
This came in over the mailing list.... Greg Farnum
01:14 PM Bug #4035 (Rejected): Ceph doesn't recover from fault on Opensuse (cfuse tests & rbd-cli tests)
Sage Weil
01:11 PM Bug #3818 (Duplicate): kclient: fsx fails in mapread
Ian Colle
01:09 PM Bug #3681: kclient fsx fails nightly
Should review entire kernel locking around truncate. Ian Colle
01:08 PM Bug #3681: kclient fsx fails nightly
Ian Colle
12:45 PM Bug #2277: qa: flock test broken
It's not an issue because these tests aren't being run any more. They should be run, though. ;)
That said, this wa...
Greg Farnum
12:30 PM Bug #2277 (Closed): qa: flock test broken
No longer an issue. Ian Colle
12:35 PM Bug #4241 (Duplicate): SELinux fails because it can't set xattrs
... Greg Farnum
12:23 PM Bug #1874 (New): Running `git gc` on a bare git repository hosted by ceph results in a bus error.
Ian Colle

02/21/2013

02:22 PM Bug #4220 (Resolved): MDS is inconsistent about whether layouts are allowed on the root directory
Right, this is in the generic xattr code, not the new vxattr stuff. Sage says this is because the root inode used to ... Greg Farnum
02:03 PM Bug #4220: MDS is inconsistent about whether layouts are allowed on the root directory
If you go through the "handle_client_setdirlayout" interface (ie, cephfs tool), the MDS will let you set a layout on ... Greg Farnum
10:47 AM Bug #4220 (Resolved): MDS is inconsistent about whether layouts are allowed on the root directory
I can't seem to set layouts on the root directory with ceph-fuse. Perhaps I'm doing something else wrong, but check o... Greg Farnum
01:50 PM Bug #1435 (Resolved): mds: loss of layout policies upon mds restart
Change pushed to bobtail (commit:36ed407e0f939a9bca57c3ffc0ee5608d50ab7ed) and next (commit:6bd8781dda524f04bb56bcdac... Greg Farnum
01:49 PM Cleanup #1499 (Resolved): mds: clean up directory layouts
Okay, I tested this according to the procedure described below (I wanted to write some scripts, but the non-manual po... Greg Farnum
11:28 AM Tasks #4222 (In Progress): Add libcephfs-test.jar to ceph-test deb package
Anonymous
11:26 AM Tasks #4222 (Resolved): Add libcephfs-test.jar to ceph-test deb package
The file libcephfs-test.jar is not present in any of the deb packages.
The file ceph.spec was changed to include thi...
Anonymous
10:48 AM Bug #4221 (Resolved): MDS: LogEvent::decode needs to respect mds_log_skip_corrupt_events for DECO...
By far the most common form of corrupt event is one that's somehow the wrong size — and that hits an assert in LogEve... Greg Farnum

02/20/2013

05:35 PM Bug #1435 (Fix Under Review): mds: loss of layout policies upon mds restart
Okay, after discussing with Sage we've decided that placing the directory layouts into the inode_t::layout field is t... Greg Farnum
11:36 AM Bug #1435 (Won't Fix): mds: loss of layout policies upon mds restart
This patch to use the inode layout instead of default_layout shouldn't be helping — directory layouts aren't written ... Greg Farnum
05:33 PM Feature #4215 (Resolved): mds: support setfattr on ceph.dir.layout
Ah, it was just a lost commit! Resolved by commit:5551aa5b3b5c2e9e7006476b9cd8cc181d2c9a04, among others.
(It will p...
Greg Farnum
02:43 PM Feature #4215: mds: support setfattr on ceph.dir.layout
Oh, as a feature this isn't any particular priority. Just an optimization, really. Greg Farnum
02:40 PM Feature #4215: mds: support setfattr on ceph.dir.layout
Oh, and indeed, looking at the source code:... Greg Farnum
02:37 PM Feature #4215 (Resolved): mds: support setfattr on ceph.dir.layout
... Greg Farnum
04:52 PM Cleanup #1499 (Fix Under Review): mds: clean up directory layouts
Okay, this patch is required to fix the loss of directory layouts issue (#1435) in Bobtail, so I rebased it on that b... Greg Farnum
12:04 PM Bug #4213 (Resolved): mds: old_parents is never cleaned up
The CInode::old_parents map is never cleaned up, even if all the snapshots for which a parent is valid have all been ... Greg Farnum
12:01 PM Bug #4212 (Closed): mds: open_snap_parents isn't called all the times it needs to be
Prompted by a bug report/paper-over patch from Alexandre, I had a talk with Sage about snapshots. It came out that we... Greg Farnum
10:01 AM Feature #3730 (Closed): Support replication factor in Hadoop
Woot. Noah Watkins
09:43 AM Feature #4208: Add more replication pool tests for Hadoop / Ceph bindings
Something like this added to the current custom test should do the trick?... Noah Watkins
09:32 AM Feature #4208 (Rejected): Add more replication pool tests for Hadoop / Ceph bindings
We should add a test (or several) for the scenario where the user specifies a level of replication that does not exac... Anonymous

02/19/2013

05:48 PM Cleanup #4166: ceph: simplify ceph_sync_write() page_align
This commit caused regressions. I pulled it out and
I'm not going to re-open it again.
Alex Elder
11:55 AM Fix #4191 (Resolved): qa: multiple mds in nightly (non-failure case)
Sage Weil
11:46 AM Feature #4190 (Resolved): qa: add mds thrashing to nightly
Sage Weil
11:43 AM Feature #4002 (Resolved): mds: design fsck
Okay, this is as done as it needed to be. Greg Farnum
10:18 AM Bug #1435: mds: loss of layout policies upon mds restart
Oh, n/m, hurray for our sysadmins fixing hasty upgrades. :) Greg Farnum
10:16 AM Feature #3730: Support replication factor in Hadoop
Yes. I should have closed this and opened a separate ticket for the tests. I'm planning to close it as soon as Joe wi... Noah Watkins
10:14 AM Feature #3730 (In Progress): Support replication factor in Hadoop
Isn't this now merged or something? :) Greg Farnum
09:43 AM Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
I was thinking the offset for readdir might be 32-bits.. but i may be wrong there. Sage Weil
03:07 AM Bug #4188 (Can't reproduce): mds crashes when cow-ing entries in formerly snapshotted dir
I have a dir that I used to snapshot before I switched to hardlink trees. I create files and subdirs in it, then, wh... Alexandre Oliva

02/18/2013

05:00 PM Cleanup #4166 (Resolved): ceph: simplify ceph_sync_write() page_align
commit 29c3c8b721a96b4a82f4224527c7103e5e910b80
Author: Alex Elder <elder@inktank.com>
Date: Fri Feb 15 22:10:16 ...
Alex Elder
 
