Activity

From 04/24/2013 to 05/23/2013

05/23/2013

11:16 PM Bug #5162 (Can't reproduce): File is locked unexpected and not released anymore
I deployed a ceph cluster and mounted cephfs via the kernel module. After using it for a few days, when I ls a particular f... joe huang
10:35 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
I have attached the logs from two nodes of my MDS cluster.
I started mds.0 first. When I started mds.1, mds.0 crashed.
Walter Huf
05:54 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Sage Weil wrote:
> Argh.. i don't have a log after all.
>
> Yan, dropping the assert avoids the crash, but it see...
Zheng Yan
09:20 AM Bug #4832: mds: failed auth_unpin assert
ubuntu@teuthology:/a/teuthology-2013-05-23_01:00:08-rados-next-testing-basic/20276 Sage Weil

05/22/2013

02:12 PM Bug #5031 (Need More Info): mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Sage Weil
02:12 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Walter: can you produce a log? 'debug mds = 20', 'debug ms = 1', restart the mds and wait for it to crash.
I have...
Sage Weil
02:10 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
Argh.. i don't have a log after all.
Yan, dropping the assert avoids the crash, but it seems like the real issue i...
Sage Weil

05/21/2013

07:37 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
I also have encountered this. Under Bobtail, I had it running with 2 active nodes and a passive node. Now, I can only... Walter Huf
12:57 PM Bug #4832: mds: failed auth_unpin assert
Sage Weil
12:57 PM Bug #4832: mds: failed auth_unpin assert
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-21_01:00:40-fs-next-testing-basic/18590 Sage Weil

05/20/2013

01:49 PM Bug #5104 (Can't reproduce): MDS crashed in Objecter::handle_osd_op_reply
if there is a decode error, find the parent frame with the bl and p bl._len to see how big it is. usually it is fall... Sage Weil
01:44 PM Bug #5104: MDS crashed in Objecter::handle_osd_op_reply
Sadly, no logs on this guy any more, sorry.
If it happens again, I'll collect that frame 15 info.
Do you have a...
Jim Schutt
01:23 PM Bug #5104 (Need More Info): MDS crashed in Objecter::handle_osd_op_reply
any logs? would love to see value of 'r' and bl._len in frame 15.. Sage Weil
09:38 AM Bug #5105: mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
probably a dup of #4832? Sage Weil

05/17/2013

03:10 PM Bug #5105 (Duplicate): mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
While trying to reproduce #4999, I collected this in an MDS.
I was running next branch (commit c80c6a032c) merged ...
Jim Schutt
02:22 PM Bug #5104 (Can't reproduce): MDS crashed in Objecter::handle_osd_op_reply
While trying to reproduce #4999, I collected this in an MDS.
I was running next branch (commit c80c6a032c) merged ...
Jim Schutt
02:14 PM Bug #5103 (Rejected): mds: hung getattrs after restart
this was an osd issue. Sage Weil
11:06 AM Bug #5103 (Rejected): mds: hung getattrs after restart
logs on cephdrop ceph-mds.1.log
hung requests are...
Sage Weil

05/16/2013

11:46 PM Documentation #3672 (Resolved): doc: how to mount ceph-fuse from fstab
John Wilkins
04:40 PM Bug #5021: ceph-fuse: crash on traceless reply
Sage Weil
04:39 PM Bug #5021: ceph-fuse: crash on traceless reply
500 passes of the job on commit:1f65594c23309b527d74afe648c888c69a3c2acd wip-5021 Sage Weil
01:05 PM Bug #4965 (Resolved): libcephfs-java test failure
Sage Weil
09:43 AM Bug #5079 (Resolved): assert in MDCache::_recovered()
thanks, this one was easy to fix.
commit:64871e093159ad06d84fb2a84c7808a81800dfc4
Sage Weil

05/15/2013

11:37 AM Bug #5079 (Resolved): assert in MDCache::_recovered()
While trying to reproduce 4999 with the requested logging, I got this MDS assert.
I'm running cuttlefish branch @ ...
Jim Schutt
10:54 AM Bug #5021: ceph-fuse: crash on traceless reply
... Sage Weil
10:32 AM Bug #5021: ceph-fuse: crash on traceless reply
... Sage Weil

05/14/2013

10:05 PM Bug #5021 (In Progress): ceph-fuse: crash on traceless reply
Sage Weil
05:15 PM Bug #4832 (Need More Info): mds: failed auth_unpin assert
cranked up mds logs in qa.. should get useful info next time we hit this. Sage Weil
05:07 PM Bug #4832: mds: failed auth_unpin assert
recent logs: ubuntu@teuthology:/a/teuthology-2013-05-14_01:00:46-kernel-next-testing-basic/13128 Tamilarasi muthamizhan
01:45 AM Bug #5037: Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
Our ceph cluster is in production, yeah. We are only using rbd, not CephFS or RadosGW, though. SJust and Sage are familiar with ... Christopher Kunz
01:11 AM Bug #5036: `ls` hangs on random folder
By turning on the debug mode of MDS:... Quan Tong Anh

05/13/2013

12:37 PM Feature #4326 (Resolved): qa: add samba + (kclient|ceph-fuse) to suite
Sage Weil
11:07 AM Bug #5021: ceph-fuse: crash on traceless reply
Never mind that comment, I was just looking at the job it happened on, not the actual failure... Greg Farnum
10:20 AM Bug #5021: ceph-fuse: crash on traceless reply
Will come back for another pass and verify, but I assume this is the disconnected inode error. Greg Farnum
10:54 AM Bug #5033: oops in ceph_put_wrbuffer_cap_refs
plana47 died with:
[0]kdb> bt
Stack traceback for pid 25102
0xffff88001c499f90 25102 23405 1 0 R 0x...
Sandon Van Ness
10:17 AM Bug #5030 (Resolved): libcephfs xattr test failure
Sage Weil
10:11 AM Bug #5037: Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
It couldn't find the actual table object in RADOS. We've seen this pop up a few times, but I believe it's always been... Greg Farnum
01:43 AM Bug #5037 (Can't reproduce): Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
After upgrading our Ceph setup to 0.56.6 from 0.56.2, the MDS processes assert() on start and will not work.
This i...
Christopher Kunz
09:01 AM Bug #5039 (Resolved): client: unlinking files leaves the cached entry behind
http://comments.gmane.org/gmane.comp.file-systems.ceph.user/1277
When unlinking a file, the client should make an ...
Mike Bryant
04:11 AM Bug #5036: `ls` hangs on random folder
As you can see, the @ls@ process is stuck in D state:
*@/proc/10297/status@*...
Quan Tong Anh
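
A process in D state is in uninterruptible sleep, and the kernel reports this on the "State:" line of /proc/<pid>/status. The following is a minimal sketch of how that line can be checked. The sample text is made up for illustration and is not the elided output from this report, and the helper names are hypothetical.

```python
def process_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            # fields[0] is "State:", fields[1] is the code (R, S, D, Z, ...)
            if len(fields) > 1:
                return fields[1]
    return None


def is_uninterruptible(status_text):
    """True if the status text describes a process in D (uninterruptible) state."""
    return process_state(status_text) == "D"


# Illustrative sample only; field layout follows the usual /proc status format.
sample_status = "Name:\tls\nState:\tD (disk sleep)\nPid:\t10297\n"
```

On a live system the text would come from reading /proc/<pid>/status for the stuck process.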
12:16 AM Bug #5036 (Resolved): `ls` hangs on random folder
strace hangs at "getdents(3,": https://clbin.com/LktUw
The information when dumping via SysRq:...
Quan Tong Anh
01:31 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
FYI: I have code that finds the missing inode by using backtrace. The code is under test, will send out soon. Zheng Yan
01:11 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
The items left in reconnected_snaprealms should be other MDS's mdsdir. I comment out that line when running tests. Zheng Yan

05/11/2013

05:41 PM Bug #5033 (Can't reproduce): oops in ceph_put_wrbuffer_cap_refs
... Sage Weil
05:27 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
logs copied to logs/ subdir Sage Weil
05:27 PM Bug #5031 (Resolved): mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
... Sage Weil
05:22 PM Bug #5030 (Resolved): libcephfs xattr test failure
2013-05-11T02:29:39.882 INFO:teuthology.task.workunit.client.0.out:[ RUN ] LibCephFS.Xattrs
2013-05-11T02:29:39...
Sage Weil

05/10/2013

11:13 PM Bug #5025 (Resolved): samba smbtorture lock test fails on kclient
... Sage Weil
10:19 PM Bug #5022 (Resolved): samba: smbtorture failures
Sage Weil
05:27 PM Bug #5022 (Resolved): samba: smbtorture failures
logs: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10437... Tamilarasi muthamizhan
08:21 PM Bug #4965: libcephfs-java test failure
Commit a095075fe4dcdac817895dac316100e733ab4698 has a patch that I believe fixes this issue. If it resolves things in... Anonymous
05:31 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
... Tamilarasi muthamizhan
05:31 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
log: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10442... Tamilarasi muthamizhan
05:23 PM Bug #5021 (Resolved): ceph-fuse: crash on traceless reply
logs: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10448... Tamilarasi muthamizhan
04:58 PM Bug #4832: mds: failed auth_unpin assert
ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:06-rados-master-testing-basic/10278... Tamilarasi muthamizhan
04:53 PM Feature #3243 (Resolved): qa: test samba reexport via libcephfs vfs plugin in teuthology
Sage Weil

05/09/2013

04:04 PM Bug #4965 (In Progress): libcephfs-java test failure
Anonymous
02:44 PM Bug #4965: libcephfs-java test failure
Using the YAML file posted above, this test is passing for me. I ran it 4 times on 2 different sets of plana nodes an... Anonymous

05/08/2013

08:57 PM Bug #4965 (Resolved): libcephfs-java test failure
... Sage Weil
04:24 PM Bug #4832: mds: failed auth_unpin assert
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-08_01:00:07-rados-master-testing-basic/8600 Sage Weil
03:13 AM Bug #4241: SELinux fails because it can't set xattrs
Are you sure about that? ceph_file_iops hasn't been changed since 2009, and the methods are there. The problem still ... Carl-Johan Schenström

05/07/2013

04:12 PM Documentation #4422 (Resolved): Typo on Release Process webpage
John Wilkins

05/06/2013

02:57 PM Bug #4920 (Resolved): client: does not respect O_NOFOLLOW
It looks like doing an open() always implicitly follows symlinks, because we call path_walk() with followsym set to t... Greg Farnum
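
For reference, the behavior O_NOFOLLOW is expected to have on a local POSIX filesystem can be sketched as follows. This is not the Ceph client code; it only shows the semantics the client should match. The helper names are hypothetical, and the ELOOP error code is what Linux returns (some other systems report a different errno).

```python
import errno
import os
import tempfile


def open_nofollow(path):
    """Open path read-only, refusing to follow a symlink in the final component."""
    return os.open(path, os.O_RDONLY | os.O_NOFOLLOW)


def demo():
    """Return the errno raised when opening a symlink with O_NOFOLLOW, or None."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target")
        link = os.path.join(d, "link")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, link)

        # Opening the regular file directly succeeds.
        os.close(open_nofollow(target))

        # Opening the symlink should fail instead of following it.
        try:
            fd = open_nofollow(link)
        except OSError as e:
            return e.errno
        os.close(fd)
        return None
```

A client that always follows symlinks would return None from demo() rather than an error code.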

05/05/2013

06:36 AM Bug #4909: mds: stalled/stuck directory (standby)
The directory became accessible only after rebooting one of the nodes (with the stalled mounts), not after merely restarting the ceph daemons. Denis kaganovich

05/04/2013

03:56 AM Bug #4909: mds: stalled/stuck directory (standby)
Sorry, comment 1 is about ctdbd (IMHO), disregard it. Only the main issue applies. Denis kaganovich
03:51 AM Bug #4909: mds: stalled/stuck directory (standby)
Also (without debug 10), the log is now flooding on the other node (mds.4):
2013-05-04 13:47:27.648019 7fe8c59ca700 0 mds.0.serv...
Denis kaganovich

05/03/2013

07:18 PM Bug #4909 (Can't reproduce): mds: stalled/stuck directory (standby)
I interrupted operations many times (debugging a mysql replication script, just multiple dump redirections) writing directly to the directory, m... Denis kaganovich
02:53 PM Bug #4894: mds: standby shut itself down due to not having any data
MDS::boot_create() first starts a new log segment (its ESubtreemap is empty), then use MDCache::create_empty_hierarch... Zheng Yan
10:40 AM Bug #4894: mds: standby shut itself down due to not having any data
You must be racing ahead of me here, Yan — what's your theory? Just that the first active MDS failed to write any log... Greg Farnum
12:24 PM Feature #4906 (Resolved): ceph-fuse: use the Preforker class
Sage wrote a Preforker class for the Monitor. We should switch to using that instead of our own band-aided daemonizat... Greg Farnum

05/02/2013

07:29 PM Bug #4894: mds: standby shut itself down due to not having any data
I think MDS::boot_create() should start a new log segment after creating the fs hierarchy. Zheng Yan
10:56 AM Bug #4894 (Resolved): mds: standby shut itself down due to not having any data
... Greg Farnum
03:05 PM Feature #4326 (Fix Under Review): qa: add samba + (kclient|ceph-fuse) to suite
These changes were part of the samba.py task changes in wip-samba-tasks. An example of use is in ceph-qa-suite:suite... Sam Lang
08:28 AM Bug #4832: mds: failed auth_unpin assert
hit this again:... Sage Weil

05/01/2013

03:06 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
This is happening with the argonaut ceph-fuse daemon, not a cuttlefish one. Going to turn this down to High again and... Greg Farnum
02:19 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
steps to reproduce:
bring up a cluster of 2 nodes running argonaut, run blogbench workload on it from client.
upg...
Tamilarasi muthamizhan
12:46 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
Are you still running old clients when you hit this? Greg Farnum
12:04 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
hitting this pretty consistently when upgrading mds from argonaut to cuttlefish.
have this reproduced on burnupi39...
Tamilarasi muthamizhan
02:27 PM Bug #4105 (Resolved): mds: fix up the Dumper
Merged into next in commit:dfacd1bd805ebb730b5206c9830b28f47cc7f9cf. Hurray! Greg Farnum
02:20 PM Bug #4105 (Fix Under Review): mds: fix up the Dumper
wip-4105-mds-dumper
Wasn't actually that complicated; it's just the locking expectations around the Objecter chang...
Greg Farnum
02:23 PM Feature #4886 (Resolved): teuthology: add tests that use the MDS dumper
We want to prevent the Dumper from bitrotting like it has been. Figure out a simple and effective way to test the dum... Greg Farnum
02:20 PM Feature #4885 (Resolved): dumper: do an incremental log dump
Right now we read it all into memory and then dump it out into a file. So far that's been okay, but we probably want ... Greg Farnum
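
The general idea of an incremental dump, as opposed to reading everything into memory first, can be sketched as a chunked copy. This is only an illustration under assumed names: dump_incrementally, the file-like src and dst, and the 4 MiB default chunk size are all hypothetical and are not taken from the Dumper code.

```python
import io


def dump_incrementally(src, dst, chunk_size=4 * 1024 * 1024):
    """Copy src to dst in fixed-size chunks; return the number of bytes written.

    At most chunk_size bytes are held in memory at any time.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # end of input
        dst.write(chunk)
        total += len(chunk)
    return total


# Usage with in-memory streams standing in for the journal and the dump file.
source = io.BytesIO(b"x" * 10)
sink = io.BytesIO()
copied = dump_incrementally(source, sink, chunk_size=3)
```

Peak memory use is then bounded by the chunk size rather than by the size of the log.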
11:28 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
Hmm, I thought we handled renames properly since they involve changing the caps state. But maybe we don't propagate t... Greg Farnum
11:13 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-01_01:00:37-fs-next-testing-basic/4534 Sage Weil
04:41 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
I think this is a general issue. When handling MClientReconnect, if an inode is not in the cache, the MDS tries fetch... Zheng Yan

04/30/2013

01:26 PM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
The attached files include the complete client log, along with the mds logs that include 10000000004 (one of the indo... Sam Lang
09:39 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
We can't revoke on unlink because the file might still be held open with something accessing it. :) Greg Farnum
09:38 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
This looks like the client creates a file, then unlinks it, but it never removes it from its cache, because it still ... Sam Lang
05:41 AM Bug #4850 (In Progress): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
Sam Lang

04/29/2013

04:58 PM Bug #4853 (Resolved): ceph-fuse hang on mount getattr
commit:ee553ac279664b7f1b527a0b1b56768134cf5157 Sage Weil
12:43 PM Bug #4853: ceph-fuse hang on mount getattr
this is not a new race, and is only triggered when a mds session open and request race with an mds restart. not a cu... Sage Weil
10:47 AM Bug #4853 (Fix Under Review): ceph-fuse hang on mount getattr
fix in wip-up
here is the client-side log that shows we send the getattr twice. we only process the first reply, ...
Sage Weil
09:21 AM Bug #4853: ceph-fuse hang on mount getattr
Ignore that, wrong bug — sorry. Greg Farnum
09:20 AM Bug #4853: ceph-fuse hang on mount getattr
/a/teuthology-2013-04-28_21:32:40-fs-next-testing-basic/2662
That's an fsstress run that got hung, I copied the cl...
Greg Farnum
09:02 AM Bug #4853 (In Progress): ceph-fuse hang on mount getattr
Sage Weil
08:38 AM Bug #4853 (Resolved): ceph-fuse hang on mount getattr
100% reproducible with this job file... Sage Weil
02:26 PM Bug #4861 (Rejected): Alter Java components to build against Java 1.6 (or 1.7)
The Java packages use -source 1.5 to specify that they should use that version of the API. This is being done for com... Anonymous
09:21 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
/a/teuthology-2013-04-28_21:32:40-fs-next-testing-basic/2662
That's an fsstress run that got hung, I copied the cl...
Greg Farnum

04/28/2013

08:51 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
have full log.. put a copy in the run dir Sage Weil
08:50 AM Bug #4850 (Resolved): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
... Sage Weil

04/26/2013

11:19 AM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
/a/teuthology-2013-04-26_02:29:14-fs-next-testing-basic/1450 Greg Farnum
11:17 AM Bug #4832 (Resolved): mds: failed auth_unpin assert
... Greg Farnum
10:44 AM Bug #4829 (Closed): client: handling part of MClientForward incorrectly?
(In reference to a backwards check for is_replay when doing encode_cap_releases())... Greg Farnum
09:52 AM Bug #4742 (Resolved): mds: stuck clientreplay request
commit:5121e56c255c079569f02e0ee852e469f38f470e Sage Weil

04/25/2013

06:34 PM Bug #4742: mds: stuck clientreplay request
Yeah, we've discussed this some on github around wip-4742 and on irc. :) Greg Farnum
06:31 PM Bug #4742: mds: stuck clientreplay request
Looks like a client bug, it may add cap releases to the replay requests. (encode_cap_releases() should be called when... Zheng Yan
10:38 AM Bug #4742: mds: stuck clientreplay request
Logs for two runs, one is stuck in replay from a setattr, the other is stuck in replay from a rename.
Sam Lang
 
