Activity

From 09/24/2014 to 10/23/2014

10/23/2014

01:47 PM Bug #9869 (Pending Backport): Client: not handling cap_flush_ack messages properly
I tested this manually with a patch that sets the starting tid value to 65535 and looked at the logs. That causes im... Greg Farnum
12:47 PM Bug #9870: kernel: not handling cap_flush_ack messages properly
Zheng Yan

10/22/2014

05:34 PM Bug #9870 (Resolved): kernel: not handling cap_flush_ack messages properly
This is the analogue to #9869, which Zheng tells me is also a problem in the kernel. We need to downcast the message ... Greg Farnum
05:30 PM Bug #9869: Client: not handling cap_flush_ack messages properly
Waiting for this to build so it can be tested. Greg Farnum
05:28 PM Bug #9869 (Resolved): Client: not handling cap_flush_ack messages properly
We saw a log segment that contained this:... Greg Farnum

10/21/2014

03:22 PM Feature #9557 (Fix Under Review): mds: verify backtrace on fetch_dir
Zheng Yan
10:44 AM Feature #9557 (In Progress): mds: verify backtrace on fetch_dir
Greg Farnum
11:43 AM Bug #8809 (Can't reproduce): uclient: memory leak
maybe fixed by 2313ce1d024361fd7f4d2cbca789010f0fe0faad Zheng Yan
10:55 AM Bug #9674: nightly failed multiple_rsync.sh
commit:477073aba1da880dfd0b8c82f4792788579f28b9 in master and commit:44ce33c12443909b02c7ee451ad45400f55d53c9 in giant Greg Farnum

10/20/2014

01:23 PM Feature #414 (Resolved): ceph-fuse: implement file locking
Zheng Yan
01:22 PM Bug #8576: teuthology: nfs tests failing on umount
teuthology commit:4f2957c42d0f76a399cb26c660ede9243c095779 runs those commands as well as the previous ones. Greg Farnum
01:02 PM Bug #9679 (Closed): Ceph hadoop terasort job failure
Fixed in cephfs-hadoop repo. Noah Watkins
11:15 AM Bug #9800: client-limits test is not passing

Same failure:
http://pulpito.front.sepia.ceph.com/teuthology-2014-10-17_23:04:02-fs-giant-distro-basic-multi/555...
John Spray

10/19/2014

07:20 PM Bug #9341 (Pending Backport): MDS: very slow rejoin
Hmm, we didn't put this in Giant initially because we were trying not to perturb it. Master hasn't been run through t... Greg Farnum
06:45 PM Bug #9341 (Fix Under Review): MDS: very slow rejoin
Please include this fix in 0.87, which is affected just as badly as 0.80.x.
On 0.87 MDS stuck in "rejoin" for hours a...
Dmitry Smirnov

10/16/2014

01:54 PM Bug #9800 (Resolved): client-limits test is not passing
/a/teuthology-2014-10-13_23:04:01-fs-giant-distro-basic-multi/547170
The client isn't dropping its caps:...
Greg Farnum
10:50 AM Feature #4137: MDS: Implement a forward-scrubbing mechanism.
I realized today that we probably want to optionally scrub directories that were renamed into place following a scrub... Greg Farnum

10/15/2014

12:39 AM Bug #8576: teuthology: nfs tests failing on umount
I notice that if I execute 'service nfs stop' first, unmounting cephfs always succeeds. 'service nfs stop' runs two c... Zheng Yan

10/14/2014

06:15 PM Bug #9674: nightly failed multiple_rsync.sh
rsync asks us to see previous errors ;) Yes, I think sudo should work. Zheng Yan
02:36 PM Bug #9674: nightly failed multiple_rsync.sh
Well, that would make sense. How did you find those in the log?
We should probably just run this as sudo or someth...
Greg Farnum
06:30 AM Bug #9674: nightly failed multiple_rsync.sh
... Zheng Yan
06:20 AM Feature #9755: Fence late clients during reconnect timeout
There can be certain cases where a client can reconnect after being evicted, e.g. if:
* the client didn't hold an...
John Spray

10/13/2014

04:50 PM Feature #414 (Fix Under Review): ceph-fuse: implement file locking
Zheng Yan
12:52 PM Feature #9755: Fence late clients during reconnect timeout
Hmm, I like the basic thrust of this, but I'm a little concerned as well — we have other tickets to let clients recon... Greg Farnum
03:39 AM Feature #9755 (Resolved): Fence late clients during reconnect timeout

During reconnect, MDSs terminate the sessions of any clients which fail to reconnect within the window. Because wh...
John Spray
03:16 AM Feature #9754 (Resolved): A 'fence and evict' client eviction command

Currently the "session evict" operation on the MDS admin socket will terminate the session, and release any capabil...
John Spray

10/10/2014

04:11 PM Bug #9679: Ceph hadoop terasort job failure
I do believe that Hadoop kills the clients after they reach a point that the run-time believes everything has been fl... Noah Watkins
02:02 PM Bug #9679: Ceph hadoop terasort job failure
Looking at the bad client (11139), the first thing I notice is that the messaging is way backed up. What's the networ... Greg Farnum
09:13 AM Bug #9679: Ceph hadoop terasort job failure
Here is the directory listing. All of the files should be the same size.... Noah Watkins
07:18 AM Bug #9692 (Resolved): ACL workunit syntax error
Zheng Yan

10/09/2014

12:07 PM Bug #9679: Ceph hadoop terasort job failure
empty fs:... Noah Watkins
08:21 AM Bug #9679: Ceph hadoop terasort job failure
Thanks Huamin. Yeah, it looks like some writes are being lost, probably due to an unclean shutdown. I'll get some trac... Noah Watkins
08:06 AM Bug #9679: Ceph hadoop terasort job failure
For comparison, teragen files on CephFS
./hadoop/bin/hadoop fs -ls /in-dir-3
14/10/09 08:05:05 WARN util.NativeC...
Huamin Chen
07:04 AM Bug #9679: Ceph hadoop terasort job failure
Ran the same tests on HDFS 2.4.1, though on a different setup. Terasort finished without any problems.
Cmd:
./hado...
Huamin Chen

10/08/2014

11:08 PM Bug #9679: Ceph hadoop terasort job failure
missing one of these?... Noah Watkins
10:46 PM Bug #9679: Ceph hadoop terasort job failure
My bet at this point is on the generation of the input data set. Teragen creates a file with X 100byte entries. When ... Noah Watkins
07:28 AM Feature #9437 (Resolved): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
... John Spray

10/07/2014

07:28 PM Bug #9692 (Resolved): ACL workunit syntax error
http://pulpito.ceph.com/gregf-2014-10-06_19:59:42-kcephfs-wip-9628-testing-basic-multi/531900... Greg Farnum
07:26 PM Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
Merged to master in commit:1b7fae7b2953649564a9e226b4abedad0ce652cc Greg Farnum
09:54 AM Bug #9679: Ceph hadoop terasort job failure
https://issues.apache.org/jira/browse/MAPREDUCE-2018 Noah Watkins
09:53 AM Bug #9679: Ceph hadoop terasort job failure
https://svn.apache.org/repos/asf/hadoop/common/branches/MAPREDUCE-233/src/examples/org/apache/hadoop/examples/terasor... Noah Watkins
09:39 AM Bug #9679: Ceph hadoop terasort job failure
Teragen command:
./hadoop/bin/hadoop jar ./hadoop-2.4.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar t...
Huamin Chen
09:22 AM Bug #9679: Ceph hadoop terasort job failure
Thanks for adding this. What command did you use to generate the input? Noah Watkins
09:04 AM Bug #9679 (Closed): Ceph hadoop terasort job failure
Hadoop version: 2.4.1
Ceph version:
ceph --version
ceph version 0.85-986-g031ef05 (031ef0551ebc98d824075558e884...
Huamin Chen
07:03 AM Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client
Greg Farnum
07:02 AM Bug #9562 (Resolved): Lockdep assertion in Filer purge
Backported to giant:... John Spray
07:02 AM Bug #8576 (Need More Info): teuthology: nfs tests failing on umount
Greg Farnum

10/06/2014

06:27 PM Bug #9674: nightly failed multiple_rsync.sh
rsync return codes aren't standard error codes. The man page says that 23 means... Greg Farnum
05:59 PM Bug #9674: nightly failed multiple_rsync.sh
#define ENFILE 23 /* File table overflow */
maybe we should adjust ulimit
Zheng Yan
02:23 PM Bug #9674 (Resolved): nightly failed multiple_rsync.sh
http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-03_23:04:01-fs-giant-distro-basic-multi/527949/... Greg Farnum
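
The exchange on Bug #9674 above turns on whether the value 23 returned by rsync should be read as a C errno (ENFILE) or as an rsync exit value. The sketch below is only an illustration of that distinction, not part of the ticket: the helper names are hypothetical, and the table is a small subset assumed to match the EXIT VALUES section of the rsync(1) man page.

```python
import errno
import os

# Subset of rsync exit values, assumed to match the EXIT VALUES
# section of rsync(1). This is not the full table.
RSYNC_EXIT_VALUES = {
    0: "Success",
    23: "Partial transfer due to error",
    24: "Partial transfer due to vanished source files",
}


def describe_rsync_exit(code: int) -> str:
    """Interpret a process exit status using rsync's own table, not errno."""
    return RSYNC_EXIT_VALUES.get(code, f"rsync exit value {code} (see rsync(1))")


def describe_errno(code: int) -> str:
    """Interpret the same number as a C errno, for comparison only."""
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    # The same integer has two unrelated meanings depending on the namespace.
    print("as rsync exit value:", describe_rsync_exit(23))
    print("as errno:           ", describe_errno(23))
```

Read this way, an rsync exit of 23 says only that some files failed to transfer; the underlying cause has to be found in the earlier error lines of the rsync output, which is consistent with the later comments on the ticket.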

10/03/2014

02:50 PM Feature #9659 (Duplicate): MDS: support cache eviction
It would be really useful when writing certain kinds of tests (eg, for scrubbing) to be able to know that a particula... Greg Farnum
06:52 AM Bug #9636: segfault in CInode::get_caps_allowed_for_client
looks like it's the same as #9628 Zheng Yan

10/02/2014

02:15 PM Bug #9514 (Resolved): ceph-fuse pjd test is failing in giant nightlies
Dumpling commit:5f601f099be98c2b061cc94fb06917e7543f3efe
Firefly commit:9fee8de25ab5c155cd6a3d32a71e45630a5ded15
Greg Farnum

10/01/2014

10:42 AM Bug #9636 (Duplicate): segfault in CInode::get_caps_allowed_for_client

While doing ad-hoc killing of clients stuck on full cluster: unchecked dereference of session connection....
John Spray
06:18 AM Feature #7317 (In Progress): mds: behave with fs fills (e.g., allow deletion)
John Spray
06:15 AM Feature #9437 (Fix Under Review): make 'ceph tell mds.* ...' work, deprecate 'ceph mds tell * ...'
John Spray

09/30/2014

10:29 AM Bug #9562 (Pending Backport): Lockdep assertion in Filer purge
This is popping up in Giant as well, which I believe has the new code that was the proximate cause. :) Greg Farnum
10:27 AM Bug #9514 (Pending Backport): ceph-fuse pjd test is failing in giant nightlies
In giant as commit:0ea20a668cf859881c49b33d1b6db4e636eda18a.
Needs to go to firefly as well.
Greg Farnum
12:08 AM Bug #9628: mds: race between ms_handle_accept() and ms_handle_reset()
https://github.com/ceph/ceph/pull/2596 Zheng Yan
12:08 AM Bug #9628 (Resolved): mds: race between ms_handle_accept() and ms_handle_reset()
ceph version 0.85-1003-g3ae673c (3ae673c764a4fac6e554e05722f0179566ed3fb3)
1: (ceph::BackTrace::BackTrace(int)+0x2...
Zheng Yan

09/29/2014

08:18 PM Bug #9562 (Resolved): Lockdep assertion in Filer purge
Zheng Yan
04:43 PM Bug #9341: MDS: very slow rejoin
John Spray wrote:
> The userspace change and test for this are merged into master. Is the kernel side all done too?...
Dmitry Smirnov
01:07 PM Bug #9341: MDS: very slow rejoin
The userspace change and test for this are merged into master. Is the kernel side all done too? John Spray
04:33 PM Bug #9514: ceph-fuse pjd test is failing in giant nightlies
Greg Farnum
03:49 PM Bug #9514: ceph-fuse pjd test is failing in giant nightlies
So here's a question: why does the client (temporarily) remember its ctime as being 2014-09-26 19:22:06.889397, but n... Greg Farnum
02:58 PM Bug #9514 (In Progress): ceph-fuse pjd test is failing in giant nightlies
Hah, we got the failure with logs in /a/sage-2014-09-26_17:51:11-smoke-giant-distro-basic-multi/513914
All of the ...
Greg Farnum
01:15 PM Bug #8576: teuthology: nfs tests failing on umount
Trying the sync on Sage's go-ahead. :)
commit:56223ce98b659fe7b25b55161ef8163495f438fc in teuthology.
Greg Farnum
10:45 AM Bug #8576: teuthology: nfs tests failing on umount
Is there any chance that just running a sync on the node prior to trying to "exportfs -au" might prevent this? I'm he... Greg Farnum

09/26/2014

03:32 PM Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
Sage believes this is a bug with readahead that got fixed in subsequent releases. Greg Farnum
06:51 AM Bug #8427 (Won't Fix): ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to relea...
Sage Weil

09/25/2014

06:23 PM Feature #541 (Resolved): mds: tempsync
this is implemented... TSYN and related states Sage Weil
05:47 PM Feature #630 (Resolved): release caps on inodes unlinked by other clients
Zheng Yan
05:47 PM Feature #630: release caps on inodes unlinked by other clients
dup of #5039. already fixed by commit f8a947d92 client: trim deleted inode Zheng Yan
04:34 PM Bug #9514: ceph-fuse pjd test is failing in giant nightlies
This hasn't reproduced since we turned on debug logging. :(
But I did see it on a run without any logging: /a/gregf-...
Greg Farnum
03:31 AM Bug #9562 (Fix Under Review): Lockdep assertion in Filer purge
https://github.com/ceph/ceph/pull/2572 John Spray
12:56 AM Bug #9563 (Resolved): kcephfs crash in ceph_mdsc_do_request
Zheng Yan
12:55 AM Bug #9564 (Resolved): kcephfs crash in _nfs4_do_open
The bug is fixed by upstream commit f39c0104 (NFS: remove BUG possibility in nfs4_open_and_get_state). I rebased the tes... Zheng Yan

09/24/2014

07:47 PM Bug #6613: samba is crashing in teuthology
Still happening
/a/teuthology-2014-09-22_23:14:01-samba-giant-testing-basic-multi/50607
Greg Farnum
07:43 PM Bug #8427: ceph-fuse: Dumpling "cache still has 0+1 items, waiting (for caps to release?)" on shu...
/a/teuthology-2014-09-22_19:06:01-fs-dumpling-testing-basic-multi/505408
Grabbed all the logs out of /var/log/ceph...
Greg Farnum
02:18 PM Bug #8576: teuthology: nfs tests failing on umount
https://github.com/ceph/teuthology/pull/336 Greg Farnum
10:00 AM Cleanup #2378 (Resolved): "ceph -s" MDS output is confusing
We don't print mds status if there's not an FS any more. Greg Farnum
 
