Activity
From 04/24/2013 to 05/23/2013
05/23/2013
- 11:16 PM Bug #5162 (Can't reproduce): File is locked unexpected and not released anymore
- I deployed a Ceph cluster and mounted CephFS via the kernel module. After using it for a few days, when I ls a particular f...
- 10:35 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- I have attached the logs from two nodes of my MDS cluster.
I started mds.0 first. When I started mds.1, mds.0 crashed.
- 05:54 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- Sage Weil wrote:
> Argh.. i don't have a log after all.
>
> Yan, dropping the assert avoids the crash, but it see...
- 09:20 AM Bug #4832: mds: failed auth_unpin assert
- ubuntu@teuthology:/a/teuthology-2013-05-23_01:00:08-rados-next-testing-basic/20276
05/22/2013
- 02:12 PM Bug #5031 (Need More Info): mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- 02:12 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- Walter: can you produce a log? 'debug mds = 20', 'debug ms = 1', restart the mds and wait for it to crash.
I have...
- 02:10 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- Argh.. i don't have a log after all.
Yan, dropping the assert avoids the crash, but it seems like the real issue i...
05/21/2013
- 07:37 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- I also have encountered this. Under Bobtail, I had it running with 2 active nodes and a passive node. Now, I can only...
- 12:57 PM Bug #4832: mds: failed auth_unpin assert
- 12:57 PM Bug #4832: mds: failed auth_unpin assert
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-21_01:00:40-fs-next-testing-basic/18590
05/20/2013
- 01:49 PM Bug #5104 (Can't reproduce): MDS crashed in Objecter::handle_osd_op_reply
- if there is a decode error, find the parent frame with the bl and p bl._len to see how big it is. usually it is fall...
- 01:44 PM Bug #5104: MDS crashed in Objecter::handle_osd_op_reply
- Sadly, no logs on this guy any more, sorry.
If it happens again, I'll collect that frame 15 info.
Do you have a...
- 01:23 PM Bug #5104 (Need More Info): MDS crashed in Objecter::handle_osd_op_reply
- any logs? would love to see value of 'r' and bl._len in frame 15..
- 09:38 AM Bug #5105: mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
- probably a dup of #4832?
05/17/2013
- 03:10 PM Bug #5105 (Duplicate): mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
- While trying to reproduce #4999, I collected this in an MDS.
I was running next branch (commit c80c6a032c) merged ...
- 02:22 PM Bug #5104 (Can't reproduce): MDS crashed in Objecter::handle_osd_op_reply
- While trying to reproduce #4999, I collected this in an MDS.
I was running next branch (commit c80c6a032c) merged ...
- 02:14 PM Bug #5103 (Rejected): mds: hung getattrs after restart
- this was an osd issue.
- 11:06 AM Bug #5103 (Rejected): mds: hung getattrs after restart
- logs on cephdrop ceph-mds.1.log
hung requests are...
05/16/2013
- 11:46 PM Documentation #3672 (Resolved): doc: how to mount ceph-fuse from fstab
- 04:40 PM Bug #5021: ceph-fuse: crash on traceless reply
- 04:39 PM Bug #5021: ceph-fuse: crash on traceless reply
- 500 passes of the job on commit:1f65594c23309b527d74afe648c888c69a3c2acd wip-5021
- 01:05 PM Bug #4965 (Resolved): libcephfs-java test failure
- 09:43 AM Bug #5079 (Resolved): assert in MDCache::_recovered()
- thanks, this one was easy to fix.
commit:64871e093159ad06d84fb2a84c7808a81800dfc4
05/15/2013
- 11:37 AM Bug #5079 (Resolved): assert in MDCache::_recovered()
- While trying to reproduce 4999 with the requested logging, I got this MDS assert.
I'm running cuttlefish branch @ ...
- 10:54 AM Bug #5021: ceph-fuse: crash on traceless reply
- ...
- 10:32 AM Bug #5021: ceph-fuse: crash on traceless reply
- ...
05/14/2013
- 10:05 PM Bug #5021 (In Progress): ceph-fuse: crash on traceless reply
- 05:15 PM Bug #4832 (Need More Info): mds: failed auth_unpin assert
- cranked up mds logs in qa.. should get useful info next time we hit this.
- 05:07 PM Bug #4832: mds: failed auth_unpin assert
- recent logs: ubuntu@teuthology:/a/teuthology-2013-05-14_01:00:46-kernel-next-testing-basic/13128
- 01:45 AM Bug #5037: Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
- Our Ceph cluster is in production, yeah. We are only using rbd, not CephFS or RadosGW, though. SJust and Sage are familiar with ...
- 01:11 AM Bug #5036: `ls` hangs on random folder
- By turning on the debug mode of MDS:...
05/13/2013
- 12:37 PM Feature #4326 (Resolved): qa: add samba + (kclient|ceph-fuse) to suite
- 11:07 AM Bug #5021: ceph-fuse: crash on traceless reply
- Never mind that comment, I was just looking at the job it happened on, not the actual failure...
- 10:20 AM Bug #5021: ceph-fuse: crash on traceless reply
- Will come back for another pass and verify, but I assume this is the disconnected inode error.
- 10:54 AM Bug #5033: oops in ceph_put_wrbuffer_cap_refs
- plana47 died with:
[0]kdb> bt
Stack traceback for pid 25102
0xffff88001c499f90 25102 23405 1 0 R 0x...
- 10:17 AM Bug #5030 (Resolved): libcephfs xattr test failure
- 10:11 AM Bug #5037: Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
- It couldn't find the actual table object in RADOS. We've seen this pop up a few times, but I believe it's always been...
- 01:43 AM Bug #5037 (Can't reproduce): Ceph-MDS asserts after upgrade 0.56.2 -> 0.56.6
- After upgrading our Ceph setup to 0.56.6 from 0.56.2, the MDS processes assert() on start and will not work.
This i...
- 09:01 AM Bug #5039 (Resolved): client: unlinking files leaves the cached entry behind
- http://comments.gmane.org/gmane.comp.file-systems.ceph.user/1277
When unlinking a file, the client should make an ...
- 04:11 AM Bug #5036: `ls` hangs on random folder
- As you can see, the `ls` process is stuck in D state:
/proc/10297/status...
- 12:16 AM Bug #5036 (Resolved): `ls` hangs on random folder
- strace hangs at "getdents(3,": https://clbin.com/LktUw
The information from dumping via SysRq:...
- 01:31 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- FYI: I have code that finds the missing inode by using backtrace. The code is under test, will send out soon.
- 01:11 AM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- The items left in reconnected_snaprealms should be other MDSs' mdsdirs. I comment out that line when running tests.
05/11/2013
- 05:41 PM Bug #5033 (Can't reproduce): oops in ceph_put_wrbuffer_cap_refs
- ...
- 05:27 PM Bug #5031: mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- logs copied to logs/ subdir
- 05:27 PM Bug #5031 (Resolved): mds/MDCache.cc: 5221: FAILED assert(reconnected_snaprealms.empty())
- ...
- 05:22 PM Bug #5030 (Resolved): libcephfs xattr test failure
- 2013-05-11T02:29:39.882 INFO:teuthology.task.workunit.client.0.out:[ RUN ] LibCephFS.Xattrs
2013-05-11T02:29:39...
05/10/2013
- 11:13 PM Bug #5025 (Resolved): samba smbtorture lock test fails on kclient
- ...
- 10:19 PM Bug #5022 (Resolved): samba: smbtorture failures
- 05:27 PM Bug #5022 (Resolved): samba: smbtorture failures
- logs: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10437...
- 08:21 PM Bug #4965: libcephfs-java test failure
- Commit a095075fe4dcdac817895dac316100e733ab4698 has a patch that I believe fixes this issue. If it resolves things in...
- 05:31 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- ...
- 05:31 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- log: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10442...
- 05:23 PM Bug #5021 (Resolved): ceph-fuse: crash on traceless reply
- logs: ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:36-fs-master-testing-basic/10448...
- 04:58 PM Bug #4832: mds: failed auth_unpin assert
- ubuntu@teuthology:/a/teuthology-2013-05-10_01:00:06-rados-master-testing-basic/10278...
- 04:53 PM Feature #3243 (Resolved): qa: test samba reexport via libcephfs vfs plugin in teuthology
05/09/2013
- 04:04 PM Bug #4965 (In Progress): libcephfs-java test failure
- 02:44 PM Bug #4965: libcephfs-java test failure
- Using the YAML file posted above, this test is passing for me. I ran it 4 times on 2 different sets of plana nodes an...
05/08/2013
- 08:57 PM Bug #4965 (Resolved): libcephfs-java test failure
- ...
- 04:24 PM Bug #4832: mds: failed auth_unpin assert
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-08_01:00:07-rados-master-testing-basic/8600
- 03:13 AM Bug #4241: SELinux fails because it can't set xattrs
- Are you sure about that? ceph_file_iops hasn't been changed since 2009, and the methods are there. The problem still ...
05/07/2013
05/06/2013
- 02:57 PM Bug #4920 (Resolved): client: does not respect O_NOFOLLOW
- It looks like doing an open() always implicitly follows symlinks, because we call path_walk() with followsym set to t...
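The behaviour Bug #4920 expects can be illustrated with a minimal sketch. It runs against a local temporary directory, not against CephFS, and assumes a Linux host, where open() with O_NOFOLLOW fails with ELOOP when the final path component is a symlink. The helper names `open_nofollow_errno` and `demo` are hypothetical and do not come from the Ceph client code.

```python
import errno
import os
import tempfile


def open_nofollow_errno(path):
    """Open path with O_NOFOLLOW; return 0 on success, else the errno."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


def demo():
    """Return (errno for a regular file, errno for a symlink to it)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, link)
        return open_nofollow_errno(target), open_nofollow_errno(link)


if __name__ == "__main__":
    file_err, link_err = demo()
    print("regular file:", file_err)
    print("symlink:", errno.errorcode.get(link_err, link_err))
```

A client that respects O_NOFOLLOW should report success for the regular file and an error for the symlink; a client that always follows symlinks would open both.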
05/05/2013
- 06:36 AM Bug #4909: mds: stalled/stuck directory (standby)
- The directory became accessible only after rebooting one of the nodes (with stalled mounts), not after merely restarting the ceph daemons.
05/04/2013
- 03:56 AM Bug #4909: mds: stalled/stuck directory (standby)
- Sorry, comment 1 is about ctdbd (IMHO); disregard it. Only the main issue applies.
- 03:51 AM Bug #4909: mds: stalled/stuck directory (standby)
- And (without debug 10) the log is now flooding on the other node (mds.4):
2013-05-04 13:47:27.648019 7fe8c59ca700 0 mds.0.serv...
05/03/2013
- 07:18 PM Bug #4909 (Can't reproduce): mds: stalled/stuck directory (standby)
- I many times break actions (debug mysql replication script, just multiple dump redirections) directly to directory, m...
- 02:53 PM Bug #4894: mds: standby shut itself down due to not having any data
- MDS::boot_create() first starts a new log segment (its ESubtreemap is empty), then use MDCache::create_empty_hierarch...
- 10:40 AM Bug #4894: mds: standby shut itself down due to not having any data
- You must be racing ahead of me here, Yan — what's your theory? Just that the first active MDS failed to write any log...
- 12:24 PM Feature #4906 (Resolved): ceph-fuse: use the Preforker class
- Sage wrote a Preforker class for the Monitor. We should switch to using that instead of our own band-aided daemonizat...
05/02/2013
- 07:29 PM Bug #4894: mds: standby shut itself down due to not having any data
- I think MDS::boot_create() should start a new log segment after creating the fs hierarchy.
- 10:56 AM Bug #4894 (Resolved): mds: standby shut itself down due to not having any data
- ...
- 03:05 PM Feature #4326 (Fix Under Review): qa: add samba + (kclient|ceph-fuse) to suite
- These changes were part of the samba.py task changes in wip-samba-tasks. An example of use is in ceph-qa-suite:suite...
- 08:28 AM Bug #4832: mds: failed auth_unpin assert
- hit this again:...
05/01/2013
- 03:06 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- This is happening with the argonaut ceph-fuse daemon, not a cuttlefish one. Going to turn this down to High again and...
- 02:19 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- steps to reproduce:
bring up a cluster of 2 nodes running argonaut, run blogbench workload on it from client.
upg...
- 12:46 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- Are you still running old clients when you hit this?
- 12:04 PM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- hitting this pretty consistently when upgrading mds from argonaut to cuttlefish.
have this reproduced on burnupi39...
- 02:27 PM Bug #4105 (Resolved): mds: fix up the Dumper
- Merged into next in commit:dfacd1bd805ebb730b5206c9830b28f47cc7f9cf. Hurray!
- 02:20 PM Bug #4105 (Fix Under Review): mds: fix up the Dumper
- wip-4105-mds-dumper
Wasn't actually that complicated; it's just the locking expectations around the Objecter chang...
- 02:23 PM Feature #4886 (Resolved): teuthology: add tests that use the MDS dumper
- We want to prevent the Dumper from bitrotting like it has been. Figure out a simple and effective way to test the dum...
- 02:20 PM Feature #4885 (Resolved): dumper: do an incremental log dump
- Right now we read it all into memory and then dump it out into a file. So far that's been okay, but we probably want ...
- 11:28 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- Hmm, I thought we handled renames properly since they involve changing the caps state. But maybe we don't propagate t...
- 11:13 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-05-01_01:00:37-fs-next-testing-basic/4534
- 04:41 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- I think this is a general issue. When handling MClientReconnect, if an inode is not in the cache, the MDS tries fetch...
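For Feature #4885 (incremental log dump) in the 05/01/2013 entries above, the general idea of avoiding a full in-memory read can be sketched as a chunked copy. This is only an illustration of the technique, not the Dumper's implementation: the function name `dump_incrementally`, the file-like `src`/`dst` objects, and the 4 MiB default chunk size are all assumptions.

```python
import io


def dump_incrementally(src, dst, chunk_size=4 * 1024 * 1024):
    """Copy src to dst in fixed-size chunks; return the bytes written.

    Only one chunk is held in memory at a time, rather than the whole log.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of input
            break
        dst.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 10)
    sink = io.BytesIO()
    print(dump_incrementally(source, sink, chunk_size=4))
```

Peak memory use is bounded by `chunk_size` regardless of how large the source is.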
04/30/2013
- 01:26 PM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- The attached files include the complete client log, along with the mds logs that include 10000000004 (one of the indo...
- 09:39 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- We can't revoke on unlink because the file might still be held open with something accessing it. :)
- 09:38 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- This looks like the client creates a file, then unlinks it, but it never removes it from its cache, because it still ...
- 05:41 AM Bug #4850 (In Progress): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
04/29/2013
- 04:58 PM Bug #4853 (Resolved): ceph-fuse hang on mount getattr
- commit:ee553ac279664b7f1b527a0b1b56768134cf5157
- 12:43 PM Bug #4853: ceph-fuse hang on mount getattr
- this is not a new race, and is only triggered when a mds session open and request race with an mds restart. not a cu...
- 10:47 AM Bug #4853 (Fix Under Review): ceph-fuse hang on mount getattr
- fix in wip-up
here is the client-side log that shows we send the getattr twice. we only process the first reply, ...
- 09:21 AM Bug #4853: ceph-fuse hang on mount getattr
- Ignore that, wrong bug — sorry.
- 09:20 AM Bug #4853: ceph-fuse hang on mount getattr
- /a/teuthology-2013-04-28_21:32:40-fs-next-testing-basic/2662
That's an fsstress run that got hung, I copied the cl...
- 09:02 AM Bug #4853 (In Progress): ceph-fuse hang on mount getattr
- 08:38 AM Bug #4853 (Resolved): ceph-fuse hang on mount getattr
- 100% reproducible with this job file...
- 02:26 PM Bug #4861 (Rejected): Alter Java components to build against Java 1.6 (or 1.7)
- The Java packages use -source 1.5 to specify that they should use that version of the API. This is being done for com...
- 09:21 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- /a/teuthology-2013-04-28_21:32:40-fs-next-testing-basic/2662
That's an fsstress run that got hung, I copied the cl...
04/28/2013
- 08:51 AM Bug #4850: ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- have full log.. put a copy in the run dir
- 08:50 AM Bug #4850 (Resolved): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- ...
04/26/2013
- 11:19 AM Bug #4565: MDS/client: issue decoding MClientReconnect on MDS
- /a/teuthology-2013-04-26_02:29:14-fs-next-testing-basic/1450
- 11:17 AM Bug #4832 (Resolved): mds: failed auth_unpin assert
- ...
- 10:44 AM Bug #4829 (Closed): client: handling part of MClientForward incorrectly?
- (In reference to a backwards check for is_replay when doing encode_cap_releases())...
- 09:52 AM Bug #4742 (Resolved): mds: stuck clientreplay request
- commit:5121e56c255c079569f02e0ee852e469f38f470e
04/25/2013
- 06:34 PM Bug #4742: mds: stuck clientreplay request
- Yeah, we've discussed this some on github around wip-4742 and on irc. :)
- 06:31 PM Bug #4742: mds: stuck clientreplay request
- Looks like a client bug, it may add cap releases to the replay requests. (encode_cap_releases() should be called when...
- 10:38 AM Bug #4742: mds: stuck clientreplay request
- Logs for two runs, one is stuck in replay from a setattr, the other is stuck in replay from a rename.