Activity

From 10/20/2011 to 11/18/2011

11/18/2011

02:37 PM Bug #1682: mds: segfault in CInode::authority
Another crash in CInode::authority happened today, although with a different backtrace.
From teuthology:~teuthworker/arc...
Josh Durgin
02:35 PM Bug #1737 (Resolved): ceph-fuse crash in xlist::remove
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-18-2/2645/remote/ubuntu@sepia13.ceph.dreamhost.com/log/... Josh Durgin

11/17/2011

03:06 PM Bug #1728 (Resolved): multiple cfuse tests failing with non-empty directories
fixed by commit:ef5ca293a7eee6fd37c1ea8e8027a5f6d83b66da Sage Weil
02:13 PM Bug #1728: multiple cfuse tests failing with non-empty directories
My guess is the warning cleanup patch that added an error check in the readdir code, commit:cd90061239a598f6fca94326b... Sage Weil

11/16/2011

05:59 PM Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened again on 11/16, 2056 kclient_workunit_kernel_untar_build
2011-11-16T00:36:30.996 INFO:teuthology.task....
Anonymous
05:51 PM Bug #1728 (Resolved): multiple cfuse tests failing with non-empty directories
All from the 11/16 nightlies:
2044 cfuse_workunit_snaps ...
2011-11-16T00:05:11.781 INFO:teuthology.task.workunit...
Anonymous

11/11/2011

12:37 AM Bug #1702: Ceph MDS crash + client mount problem
Yes, I am stopping the clients and remounting... but if I'm doing a mkcephfs, I make sure to umount all the clients befo... Gokul Krishnan

11/10/2011

03:29 PM Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> Thank you for reverting back so quickly.
>
> Well in my scenario, i just have one Ceph se...
Sage Weil
03:29 PM Bug #1702: Ceph MDS crash + client mount problem
Gokul Krishnan wrote:
> by the way,
> you have assigned a target version as v0.39...but in the site i can find only...
Sage Weil
01:50 AM Bug #1702: Ceph MDS crash + client mount problem
by the way,
you have assigned a target version as v0.39...but in the site i can find only the source for v0.37...
e...
Gokul Krishnan
12:45 AM Bug #1702: Ceph MDS crash + client mount problem
Thank you for reverting back so quickly.
Well in my scenario, i just have one Ceph server running. And yes, every ...
Gokul Krishnan
02:57 PM Feature #1448: test hadoop on sepia
The following benchmark, TestDFSIO, is for 12 OSDs, 1 MDS/MON. There is a single ext4 disk per node dedicated to Ceph... Noah Watkins

11/09/2011

03:00 PM Bug #1702: Ceph MDS crash + client mount problem
Ok, so generally speaking, the only time you should see fsid mismatches like that is if you have daemons from multipl... Sage Weil
02:55 PM Bug #1702: Ceph MDS crash + client mount problem
Hello,
thank you for the reply.
no, unfortunately i am not able to reproduce the error using debug ms = 20(for MD...
Gokul Krishnan
01:23 PM Bug #1702 (Need More Info): Ceph MDS crash + client mount problem
Are you able to reproduce this with 'debug mds = 20' and 'debug ms = 20' in your ceph.conf [mds section]?
Not sure...
Sage Weil
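For reference, the settings requested in the comment above could look like this in ceph.conf. This is a minimal sketch, assuming they are merged into an existing [mds] section as the comment indicates; the comment lines are annotations and not part of the original request.

```ini
[mds]
    ; verbose MDS logging, as requested in the comment
    debug mds = 20
    ; verbose messenger logging
    debug ms = 20
```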
12:51 PM Bug #1702 (Can't reproduce): Ceph MDS crash + client mount problem
Hello,
i have configured ceph using a configuration as shown here: http://pastebin.com/sQb8WZbx.
The Ceph serve...
Gokul Krishnan
07:19 AM Bug #1472: cfuse hangs with v0.34
Some of the hangs we've been seeing on the client may have been related to having two nics on each node. We had seen... Sam Lang

11/07/2011

02:59 PM Feature #1693: libcephfs: Support TRIM (hole punching)
Kernelside ceph.ko ticket is #591. Let this ticket stand for the userspace libcephfs (and ceph-fuse) support. Anonymous
02:12 PM Feature #1693 (Resolved): libcephfs: Support TRIM (hole punching)
Anonymous

11/04/2011

02:47 PM Bug #1472: cfuse hangs with v0.34
We're seeing similar hangs again. One thing I didn't mention in my previous posts, we are always adjusting the repli... Sam Lang
12:55 PM Bug #1682 (Resolved): mds: segfault in CInode::authority
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-04/1469/teuthology.log:... Josh Durgin
09:53 AM Feature #1680 (New): support reflink (cheap file copy/clone)
It seems the API is still fs-specific ioctls, but there's repeated discussion about reflink(2).
If a nice common API...
Anonymous

11/03/2011

04:39 PM Bug #1663 (Resolved): Hadoop: file ownership/permission not available in hadoop
This is still a pretty cheap fix :), but I think it's enough to close out this bug. Greg Farnum
04:12 PM Bug #1663: Hadoop: file ownership/permission not available in hadoop
a79b7e17ebbc70cedae80216986ae5fd52a1c0b7 provides an OK fix for now. Basically it makes any file look like the curren... Noah Watkins
04:08 PM Bug #1666: hadoop: time-related meta-data problems
Bummer. Well... for the time being it may be sufficient to force FileStatus.getModificationTime() to go directly to t... Noah Watkins
03:58 PM Bug #1666: hadoop: time-related meta-data problems
Yeah, it's not impossible, I just would have thought that one of the other updates would have prompted the server to ... Greg Farnum
03:52 PM Bug #1666: hadoop: time-related meta-data problems
Do you mean that you are surprised that client-1's inode didn't get updated from the server's change before the stat ... Noah Watkins
03:49 PM Bug #1666: hadoop: time-related meta-data problems
If that's the case then I'm surprised the mtime didn't get updated at an earlier time. If nothing else we can probabl... Greg Farnum
03:44 PM Bug #1666: hadoop: time-related meta-data problems
Greg Farnum wrote:
> So the "bad" mtime is the same time the inode was created on the MDS server?
I think so. Her...
Noah Watkins
03:35 PM Bug #1666: hadoop: time-related meta-data problems
So the "bad" mtime is the same time the inode was created on the MDS server? Greg Farnum
03:30 PM Bug #1666: hadoop: time-related meta-data problems
If Client-1 is seeing a cached copy of the inode's mtime, then the following server-side scenario may explain what's ... Noah Watkins
02:44 PM Bug #1666: hadoop: time-related meta-data problems
Grepping for the inode number got me this:... Greg Farnum
01:20 PM Bug #1666: hadoop: time-related meta-data problems
Sage Weil wrote:
> If you can generate client logs for C1 and C2 (debug ms = 1, debug client = 10) that should tell ...
Noah Watkins
11:44 AM Bug #1666: hadoop: time-related meta-data problems
If you can generate client logs for C1 and C2 (debug ms = 1, debug client = 10) that should tell us everything. Sage Weil
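A minimal sketch of how the client-side debug levels named above might be set in ceph.conf on C1 and C2. The [client] section is an assumption; the comment gives only the two settings, not where they go.

```ini
; assumed placement: the comment does not name a section
[client]
    ; messenger logging at level 1
    debug ms = 1
    ; client logging at level 10
    debug client = 10
```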
11:07 AM Bug #1666: hadoop: time-related meta-data problems
Just ran a little experiment that may shed some light on this.... Noah Watkins
03:49 PM Bug #1677: mds interval_set.h: 385: FAILED assert(p->first <= start)
Here is the log from the MDS that caused this. I have from the other mds's, mon, and osd if it is relevant -- but not... Noah Watkins
03:44 PM Bug #1677 (Resolved): mds interval_set.h: 385: FAILED assert(p->first <= start)
Noah got this and sent it to the mailing list on Oct 28, 2011:... Greg Farnum
11:54 AM Bug #1675 (Can't reproduce): mds: failed rstat assert
This happened during the multiple_rsync workunit.
From teuthology:~teuthworker/archive/nightly_coverage_2011-11-03/1...
Josh Durgin

11/02/2011

08:45 PM Bug #1666: hadoop: time-related meta-data problems
Something like this would make the most sense to me. (I'd have to check the specifics of mtime updating to see exactly.) Greg Farnum
08:30 PM Bug #1666: hadoop: time-related meta-data problems
Formatting oops:... Noah Watkins
08:29 PM Bug #1666: hadoop: time-related meta-data problems
You're right about that last point Greg, it doesn't quite add up--not thinking straight today.
Here is what happen...
Noah Watkins
07:46 PM Bug #1666: hadoop: time-related meta-data problems
I'd have to look at the specifics again -- but it probably can't be done. If the client buffers a write and then flus... Greg Farnum
06:39 PM Bug #1666: hadoop: time-related meta-data problems
So, I think I've got this nailed down. The good news is that the error was a clock sync issue. The bad news is that i... Noah Watkins

11/01/2011

11:08 AM Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
Someone needs to try to reproduce this with logs. fwiw metropolis:~sage/src/teuthology/hammer.sh is what i've been u... Sage Weil
10:22 AM Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened after the misc workunit today. Josh Durgin

10/31/2011

05:32 PM Bug #1666: hadoop: time-related meta-data problems
It looks like the check is equality of timestamps. So, I think Hadoop is setting an explicit timestamp, and sometime ... Noah Watkins
05:30 PM Bug #1666: hadoop: time-related meta-data problems
All of the local clocks on the nodes look good. The code is comparing timestamps (I assume since epoch), so maybe the... Noah Watkins
05:06 PM Bug #1666: hadoop: time-related meta-data problems
Neither of these errors is in code that's remotely familiar to me. So my first favorite question is:
Are your clock...
Greg Farnum
04:55 PM Bug #1666 (Resolved): hadoop: time-related meta-data problems
The following exceptions are being thrown. It looks like something related to lstat?
java.io.IOException: Th...
Noah Watkins
11:35 AM Bug #1661 (Resolved): Hadoop: expected system directories not present
Apparently this was actually the result of an API mismatch. Fixed by Noah's patch in commit:f9b7ecdb5bba1439dc4c13005... Greg Farnum

10/28/2011

03:46 PM Bug #1661: Hadoop: expected system directories not present
Blindly creating directories is definitely not the proper solution. Somebody will need to take the time to figure out... Greg Farnum
03:32 PM Bug #1661: Hadoop: expected system directories not present
In this particular instance it is a map-reduce specific directory. I suspect that MapReduce is responsible for this, ... Noah Watkins
03:22 PM Bug #1661: Hadoop: expected system directories not present
Sounds to me like CephFileSystem should just create the directory if it doesn't exist. Sage Weil
03:13 PM Bug #1661: Hadoop: expected system directories not present
Good to know. I think at this point I need to paper over many things, but want to record all these issues. I'll just ... Noah Watkins
03:08 PM Bug #1661: Hadoop: expected system directories not present
I remember running into this issue when developing things and deciding to just paper over it at the time -- I couldn'... Greg Farnum
03:05 PM Bug #1661: Hadoop: expected system directories not present
Adding: when this directory is created by hand before map reduce starts the error is gone. Noah Watkins
03:04 PM Bug #1661 (Resolved): Hadoop: expected system directories not present
Hadoop complains that directories within the file system that are expected to be present are not present. Hadoop may ... Noah Watkins
03:24 PM Bug #1663: Hadoop: file ownership/permission not available in hadoop
Noah Watkins wrote:
> This is a very simple hack that will make hadoop ignore the permission for the time being:
...
Noah Watkins
03:23 PM Bug #1663: Hadoop: file ownership/permission not available in hadoop
This is a very simple hack that will make hadoop ignore the permission for the time being:
diff --git a/src/mapred...
Noah Watkins
03:16 PM Bug #1663 (Resolved): Hadoop: file ownership/permission not available in hadoop
Hadoop complains about incorrect file ownership. An 'ls' via Hadoop FS interface reveals no permission information, b... Noah Watkins

10/27/2011

10:46 AM Bug #1549 (Need More Info): mds: zeroed root CDir* vtable in scatter_writebehind_finish
bleh. need logs... i'll start this up in a loop again. Sage Weil
10:33 AM Bug #1549: mds: zeroed root CDir* vtable in scatter_writebehind_finish
This happened again today after fsstress. From teuthology:~teuthworker/archive/nightly_coverage_2011-10-27/1083/teuth... Josh Durgin

10/26/2011

04:03 PM Bug #1656: Hadoop client unit test failures
Sounds good to me -- which patches we want to keep in the tree are probably a management decision but I'm happy to pu... Greg Farnum
03:55 PM Bug #1656: Hadoop client unit test failures
Alright, so I think at this point I'd like to see two patches:
1) A patch against the downloadable tarball (much e...
Noah Watkins
03:49 PM Bug #1656: Hadoop client unit test failures
I believe the patch was made against the then-current svn 0.21 branch (which is now very dead). I pushed changes to t... Greg Farnum
03:39 PM Bug #1656: Hadoop client unit test failures
This was hadoop-0.20.205.0 with the latest Ceph master branch.
It looked like the patch in src/client/hadoop was o...
Noah Watkins
03:30 PM Bug #1656: Hadoop client unit test failures
What versions of the systems were you running when these failed?
I don't remember how they're set up but they migh...
Greg Farnum
01:59 PM Bug #1656 (Won't Fix): Hadoop client unit test failures
The Ceph Hadoop File System passes all but a few of its tests. I've included the test log below that shows the... Noah Watkins

10/25/2011

04:48 PM Bug #1114 (Need More Info): NFS export extreme slowdown
Need to reproduce this on the current trunk and fully characterize what is going on.
- the nfs server in sync ...
Sage Weil
03:50 PM Bug #1585 (Can't reproduce): mds crash during shutdown
Sage Weil

10/21/2011

10:59 AM Bug #1640 (Need More Info): mds: failed assert(trim_to > trimming_pos)
need logs with 'debug journaler = 20' and 'debug ms = 1' on the mds for this one Sage Weil
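A minimal sketch of the logging settings requested above, assuming "on the mds" means the [mds] section of ceph.conf on the affected node; the comment lines are annotations only.

```ini
[mds]
    ; verbose journaler logging, as requested in the comment
    debug journaler = 20
    ; messenger logging at level 1
    debug ms = 1
```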
10:57 AM Bug #1509 (Need More Info): cfuse sometimes hangs after unmount
Sage Weil
10:56 AM Bug #1596 (Need More Info): mds crash during ffsb on kernel client in CInode::is_frozen
Sage Weil
10:52 AM Bug #1603 (Need More Info): ceph-fuse crash during unmount
have this one going in a loop to catch it with logs Sage Weil

10/20/2011

05:22 PM Bug #1640 (Resolved): mds: failed assert(trim_to > trimming_pos)
This happened with bonnie++ on cfuse in teuthology:~teuthworker/archive/nightly_coverage_2011-10-20/729/remote/ubuntu... Josh Durgin
 
