Activity
From 06/10/2013 to 07/09/2013
07/09/2013
- 09:37 AM Feature #5520: osdc: should handle namespaces
- This is needed before librbd or cephfs can use namespaces
07/08/2013
- 05:55 PM Bug #5036: `ls` hangs on random folder
- was your mds compiled from the newest source code? was there an mds restart before you saw the hang? if there was, the b...
- 01:03 PM Bug #5036: `ls` hangs on random folder
- Yan,
Even after rebuilding my 3.10 kernel with the missing fix (libceph: call r_unsafe_callback when unsafe reply ...
- 02:47 PM Feature #5520 (Rejected): osdc: should handle namespaces
- As a follow on to 4982/4983 we should implement namespace handling in the ObjectCacher.
07/03/2013
- 10:11 PM Bug #4850 (Resolved): ceph-fuse: disconnected inode on shutdown with fsstress + mds thrashing
- i think this is resolved now...
- 09:56 PM Bug #5453 (Resolved): kclient: multiple_rsync tee output partially zeroed
- thanks, added this to the test suite
- 06:32 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- my reproducer...
- 10:31 AM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- patch is in testing branch (tho i'm tracking down a different regression in that branch).
btw, yan, were you able ...
- 06:03 AM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- Sage Weil wrote:
> in combination with the mds s/wrlock/xlock/ change?
the kclient patch fixes this issue alone. ...
- 06:19 PM Bug #5036: `ls` hangs on random folder
- I suggest trying the test branch of ceph-client. At least two bugs that can cause a hang like this have been fixed.
- 04:08 PM Bug #5036: `ls` hangs on random folder
- ...
- 02:25 PM Bug #5036: `ls` hangs on random folder
- Milosz, I think you've run into #2019, which I've reopened. Quan might have seen the same issue but there's not the r...
- 02:15 PM Bug #5036: `ls` hangs on random folder
- I'm experiencing the same issue here when trying to ls one directory in our ceph cluster on nodes. Using both vanilla...
- 02:53 PM Bug #2019: mds: CInode::filelock stuck in sync->mix
- it's a kclient bug, probably already fixed by 'libceph: call r_unsafe_callback when unsafe reply is received'
- 02:31 PM Bug #2019: mds: CInode::filelock stuck in sync->mix
- I'm running into this as part of bug #5036...
- 02:27 PM Bug #2019: mds: CInode::filelock stuck in sync->mix
- See #5036.
- 12:48 PM Fix #5498 (New): client: report actual file count instead of object count
- Right now we fill in the statvfs struct's f_files member with the number of objects. Instead we should use the mount po...
- 09:33 AM Bug #5458: mds: standby-replay -> replay takeover does not handle racing expire/trim
- Saw this backtrace again at /a/teuthology-2013-07-02_01:00:48-fs-next-testing-basic/52429/teuthology.log
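For context on Fix #5498 above: f_files is the statvfs field that tools such as `df -i` read as the total file-node count. The following is a minimal sketch of how a caller sees that field, using Python's standard os.statvfs. The helper name inode_counts and the example path "/" are assumptions for illustration; on a CephFS client the argument would be the mount point. It does not show the proposed fix itself.

```python
import os


def inode_counts(path):
    """Return (f_files, f_ffree) as reported by statvfs for `path`.

    f_files is the value Fix #5498 is about: whatever number the
    filesystem client fills in here is what userspace tools report
    as the total file count.
    """
    st = os.statvfs(path)
    return st.f_files, st.f_ffree


if __name__ == "__main__":
    # "/" is only an example path; substitute the mount point of interest.
    total, free = inode_counts("/")
    print(f"f_files={total} f_ffree={free}")
```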
07/02/2013
- 04:57 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- btw the mds change passed 11 iterations before it stopped because of a chef/network hiccup.
- 09:42 AM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- in combination with the mds s/wrlock/xlock/ change?
- 09:30 AM Support #5491 (Rejected): Use of RBD kernel module
- Questions like this should be addressed to the ceph-users list, please. :)
http://ceph.com/resources/mailing-list-irc/
- 04:02 AM Support #5491 (Rejected): Use of RBD kernel module
- I would be grateful if you could explain a couple of items as these are not immediately clear to me from the Ceph doc...
07/01/2013
- 09:44 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- patch "ceph: fix pending vmtruncate race" should fix the issue.
- 12:36 PM Feature #5486 (Resolved): kclient: make it work with selinux
- see #5477 for the latest failed attempt
- 12:34 PM Bug #5477 (Resolved): Unable to create files on CephFS on Fedora 18 using kernel module
- 12:19 PM Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
- Many thanks for the responses Sage and Greg. You were right - once I disabled SElinux this worked.
Chris
- 10:51 AM Bug #5485 (Can't reproduce): failed cifs mount
- In teuthology, logs at /a/teuthology-2013-07-01_01:00:46-fs-master-testing-basic/51619...
06/29/2013
- 05:35 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- please check if the attached patch solves this issue
06/28/2013
- 06:31 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- i hit it after just a couple iterations of the teuthology test. i'll capture the osd log...
- 06:08 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- I can't reproduce this locally. how difficult is it to reproduce? what's the backend fs for the osd?
- 05:32 PM Bug #5411: teuthology: bad object dereference
- IME that's what this kind of error from gevent/eventlet etc. means - once the thread exits in a certain abnormal way,...
- 03:28 PM Bug #5411: teuthology: bad object dereference
- Yeah, I am/somebody will need to spend some time digging into this when we have some time free. There's another issue...
- 03:24 PM Bug #5411: teuthology: bad object dereference
- I think this is just a symptom of the mds_thrasher crashing, but not logging the exception since this join happens bef...
- 02:25 PM Bug #5381 (Pending Backport): ceph-fuse: stuck with disconnected inodes on shutdown
- commit:946a838cffa0927d1237489e8c2c143e87d66892
- 09:31 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- Wow, that is a much simpler test case than I would expect to be required. I can reproduce with a single file and this...
- 02:24 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- This is all in the lab at present.
We have been doing some additional testing, and have now confirmed that this...
- 09:23 AM Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
- And you don't need any kernel support to run the Ceph daemons. You should also check the permissions — it's possible ...
- 06:51 AM Bug #5477: Unable to create files on CephFS on Fedora 18 using kernel module
- I suspect this is SElinux or something similar getting in the way...
- 04:47 AM Bug #5477 (Resolved): Unable to create files on CephFS on Fedora 18 using kernel module
- I have mounted a CephFS filesystem on a Fedora 18 system, which succeeds as follows:
[root@e8c4-dl360g7-03 ceph]# ...
06/27/2013
- 09:39 PM Bug #5381 (Fix Under Review): ceph-fuse: stuck with disconnected inodes on shutdown
- 09:22 AM Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
06/26/2013
- 11:15 PM Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
- this is sufficient to reproduce. i think this is a problem with unlinked inodes in the client cache not getting clea...
- 10:08 PM Bug #5453: kclient: multiple_rsync tee output partially zeroed
- putting the tee'd file in /tmp fixes the problem, implying this is a kclient/cephfs bug of some sort. moving this in...
06/25/2013
- 07:48 PM Bug #5450 (Resolved): mds: failed CDir::_fetched() assert
- nice! cherry-picked to commit:ccb3dd5ad5533ca4e9b656b4e3df31025a5f2017
- 07:08 PM Bug #5450: mds: failed CDir::_fetched() assert
- 0.61.4-5-gd572cf6 ? probably already fixed by commit:81d073fecb (mds: fix underwater dentry cleanup)
- 10:20 AM Bug #5450 (Resolved): mds: failed CDir::_fetched() assert
- ...
- 07:19 PM Bug #5418: kceph: crash in remove_session_caps
- Zheng Yan wrote:
> I still don't figure out that root cause of the crash, infinite loop in iterate_session_caps(), B...
- 07:01 PM Bug #5418: kceph: crash in remove_session_caps
- I still haven't figured out the cause of the crash, infinite loop in iterate_session_caps(), BUG_ON(session->s_nr_caps >...
- 01:00 PM Bug #5418: kceph: crash in remove_session_caps
- ubuntu@teuthology:/a/teuthology-2013-06-25_01:00:47-kernel-next-testing-basic/45603
- 06:01 PM Bug #5458 (Duplicate): mds: standby-replay -> replay takeover does not handle racing expire/trim
- not sure this is the right diagnosis since i only looked at this briefly, but:...
- 12:13 PM Bug #5453 (In Progress): kclient: multiple_rsync tee output partially zeroed
- 12:09 PM Bug #5453 (Resolved): kclient: multiple_rsync tee output partially zeroed
- latest run:...
06/24/2013
- 05:54 PM Bug #5381 (Need More Info): ceph-fuse: stuck with disconnected inodes on shutdown
- 01:03 PM Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
- next time we see this (or any other ceph-fuse shutdown hang), grab the logs manually via scp before nuking, and note ...
- 10:58 AM Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
- done, commit:f046dab88fcfeda23391bcd694abc65ff1ed8cd8
- 10:12 AM Bug #5333 (Pending Backport): mds: segfault in MDLog::standby_trim_segments
- I saw this crash under teuthology in the next branch as well; can we put it there?
- 10:44 AM Bug #5411: teuthology: bad object dereference
- #5333 is what I was referring to. There's a whole string of failures which are hitting both that and this.
- 10:08 AM Bug #5411: teuthology: bad object dereference
- Josh, I went back and looked at the first instance (/a/teuthology-2013-06-18_01\:00\:37-fs-next-testing-basic/38877/)...
- 10:05 AM Bug #5411: teuthology: bad object dereference
- Happened again...
- 09:45 AM Bug #5411: teuthology: bad object dereference
- If you look at the message from the first exception, it says the mds failed:...
- 09:36 AM Bug #5382: mds: failed objecter assert on shutdown
- /a/teuthology-2013-06-23_20:00:47-fs-cuttlefish-testing-basic/43843/teuthology.log
- 09:10 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- Unfortunately this is an area where CephFS needs some hardening and some recovery tools — part of why we don't recomm...
- 05:49 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- We have hit a very similar problem with V0.61.2. We are unable to start any MDS daemons following testing that involv...
06/23/2013
- 10:12 PM Bug #5021: ceph-fuse: crash on traceless reply
- btw wip-5021 still hasn't merged because it failed the smbtorture test. i'll rebase on master and retest to see wher...
- 10:09 PM Bug #5105 (Duplicate): mds/CInode.cc: 1996: FAILED assert(auth_pins >= 0)
- #4832
- 10:06 PM Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
- commit:abd0ff64e108b7670a062b3fa39baaf3d3e48fb3
- 04:30 PM Bug #5430 (Duplicate): newfs makes ceph-mds segfault in suicide
- #5432
- 10:57 AM Bug #5430: newfs makes ceph-mds segfault in suicide
- ...
- 10:52 AM Bug #5430 (Duplicate): newfs makes ceph-mds segfault in suicide
- ...
06/21/2013
- 12:02 PM Bug #5418: kceph: crash in remove_session_caps
- kdb dumpall attached
- 12:02 PM Bug #5418 (Resolved): kceph: crash in remove_session_caps
- ...
06/20/2013
- 09:33 PM Fix #5399: timestamp changes on replayed mds request (pjd link 71)
- probably need to extend the replayed request message to include the timestamps we got for the inode and dir so that t...
- 09:33 PM Fix #5399: timestamp changes on replayed mds request (pjd link 71)
- - we send a create to mds
- get an ack, but it isn't journaled
- pjd stats the mtime/ctime/etc.
- mds restarts
- w...
- 09:12 PM Bug #5290: mds: crash whilst trying to reconnect
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_13:32:57-fs-master-testing-basic/41231
logs in ...
- 06:45 PM Bug #5333 (Fix Under Review): mds: segfault in MDLog::standby_trim_segments
- wip-5333
this looks like a simple matter of not crashing if the segment list is empty. that at least covers this ...
- 12:53 PM Bug #5333: mds: segfault in MDLog::standby_trim_segments
- Just a note: maybe we missed a spot, but I remember doing a re-read head object, retry journal read whenever we get a...
- 12:47 PM Bug #5333: mds: segfault in MDLog::standby_trim_segments
- ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-06-20_01:00:49-fs-next-testing-basic/40965
with ful...
- 06:15 PM Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
- 12:30 PM Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
- 02:42 PM Bug #5411 (Resolved): teuthology: bad object dereference
- ...
- 01:30 PM Fix #5268: mds: fix/clean up file size/mtime recovery code
- See also #4485.
- 01:30 PM Feature #4485: Improve "needsrecover" handling
- See also #5268.
- 01:24 PM Feature #1693 (In Progress): libcephfs: Support TRIM (hole punching)
- See "[PATCH] Ceph-fuse: Punch hole support" from Li Wang.
- 01:17 PM Feature #3541 (In Progress): mds: robust ino lookup using file backpointers
- A bunch of this got done, but Sage isn't sure if the client -> LOOKUPINO messages are wired up to that infrastructure...
06/19/2013
- 10:48 PM Bug #5289: mds closing stale session
- Sage Weil wrote:
> this is caused when the client is not talking to the mds. can you verify the network is working, ...
- 08:08 PM Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
- The patch only fixes the root cause. It doesn't help if objects already have wrong size.
- 04:02 PM Fix #5399 (New): timestamp changes on replayed mds request (pjd link 71)
- Hmm, Sage points out this might be something else; reopening.
- 03:56 PM Fix #5399 (Rejected): timestamp changes on replayed mds request (pjd link 71)
- It's a time stamp check for things going backwards, and is failing due to out-of-sync clocks (over a network) being h...
- 03:44 PM Fix #5399 (Resolved): timestamp changes on replayed mds request (pjd link 71)
- teuthology-2013-06-19_10:46:59-fs-cuttlefish-master-basic 40138 40141
- 11:43 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- I'm still using the cluster with the modified ceph-mds program, it still works. I caused another power outage (this i...
06/18/2013
- 12:51 PM Bug #5289 (Can't reproduce): mds closing stale session
- this is caused when the client is not talking to the mds. can you verify the network is working, and ceph-fuse is hea...
- 09:34 AM Bug #5379 (Resolved): mds/ceph-fuse hang on mount
06/17/2013
- 09:15 PM Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
- This is different from #4850. In issue #4850, disconnected inodes have no cap. In this issue, all disconnected inodes...
- 01:32 PM Bug #5381: ceph-fuse: stuck with disconnected inodes on shutdown
- Good chance this is a duplicate of #4850 (though that's fsstress, so maybe not).
- 01:22 PM Bug #5381 (Resolved): ceph-fuse: stuck with disconnected inodes on shutdown
- Seen this at least 2x in the last few days:...
- 05:43 PM Bug #5380: osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
- see commit a41bad1a9b(ceph: re-calculate truncate_size for strip object)
- 01:18 PM Bug #5380 (Resolved): osdc/Filer.cc: 163: FAILED assert(probe->known_size[p->oid] <= shouldbe)
- on mds shutdown...
- 04:44 PM Bug #5379: mds/ceph-fuse hang on mount
- 12:52 PM Bug #5379 (Resolved): mds/ceph-fuse hang on mount
- have observed several times ceph-fuse hanging on getattr(#1). latest job was...
- 02:09 PM Bug #5382: mds: failed objecter assert on shutdown
- Sorry, logs at /a/teuthology-2013-06-15_01:00:44-fs-next-testing-basic/36375
- 02:07 PM Bug #5382 (Can't reproduce): mds: failed objecter assert on shutdown
- I haven't been through this completely, but it looks like the mds went laggy, and then it received a SIGTERM (the tes...
- 12:24 PM Bug #5368 (Resolved): ceph-fuse: fsx-mpi hangs in _sync_read
- commit:ee40c217e373b538e227f7218b09c1c794b4124a
06/16/2013
- 05:50 AM Bug #5367: multiclient tests: kernel mount gets EPERM
- kclient and MDS never return -EACCES. was ior executed with root privilege?
06/15/2013
- 07:46 PM Bug #5367: multiclient tests: kernel mount gets EPERM
- mpi-fsx also gets EPERM.
- 07:15 PM Bug #5367 (Resolved): multiclient tests: kernel mount gets EPERM
- ...
- 07:45 PM Bug #5368 (Resolved): ceph-fuse: fsx-mpi hangs in _sync_read
- infinite loop in _sync_read() due to a short read. see wip-client-sync.
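For context on Bug #5368 above: a read loop that keeps asking for the remaining bytes can spin forever if a short read returns no data and the loop never checks for lack of progress. The sketch below is a generic illustration of a loop that tolerates short reads and stops at end of data. It is not the Ceph client code and not the fix in wip-client-sync; read_fully and ShortReader are hypothetical names introduced only for this example.

```python
import io


class ShortReader:
    """Hypothetical stream that never returns more than max_chunk bytes per read."""

    def __init__(self, data, max_chunk):
        self._buf = io.BytesIO(data)
        self._max = max_chunk

    def read(self, n):
        return self._buf.read(min(n, self._max))


def read_fully(stream, size):
    """Read up to `size` bytes, retrying after short reads and stopping at EOF."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # No progress is possible (end of data), so break out
            # instead of retrying the same read forever.
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(read_fully(ShortReader(b"abcdefgh", 3), 5))
    print(read_fully(ShortReader(b"abc", 2), 10))
```

The key design point is the explicit empty-read check: the loop's exit condition must not depend solely on the requested byte count being satisfied.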
06/14/2013
- 12:50 PM Bug #5360 (Rejected): ceph-fuse: failing smbtorture tests
- We're failing the maxfid test when samba is backed by a ceph-fuse mount. It seems to be an inconsistent (this is the ...
06/13/2013
- 07:43 PM Bug #5333: mds: segfault in MDLog::standby_trim_segments
- I think it's an old race. The standby MDS gets the pos of journal head, then reads the corresponding journal object. ...
- 02:02 PM Bug #5333: mds: segfault in MDLog::standby_trim_segments
- I see that Yan changed one line in this function recently (which shouldn't have had any impact), but other than that ...
06/12/2013
- 01:23 PM Bug #5333 (Resolved): mds: segfault in MDLog::standby_trim_segments
- ...
- 06:10 AM Bug #5290: mds: crash whilst trying to reconnect
- Hi Zheng,
Is this what you mean?
06/11/2013
- 08:55 AM Bug #5303 (Resolved): OSD segfaults on SIGINT
- This was a missed backport for an old fix. I pushed it to the cuttlefish branch and it will be included in .4. Thanks!
- 08:41 AM Bug #5303: OSD segfaults on SIGINT
- Without debugger:...
- 08:38 AM Bug #5303 (Resolved): OSD segfaults on SIGINT
- This is not the first time but interrupting the OSD with SIGINT (CTRL+C) causes a segmentation fault.
Cuttlefish 0...
- 07:19 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- Removing the assert worked around the problem:...
- 06:32 AM Bug #5250: ceph-mds 0.61.2 aborts on start
- I noticed that resetting the MDS journal using ceph-mds -i 1 --reset-journal 0 -d hangs there....
06/10/2013
- 10:28 PM Bug #5290: mds: crash whilst trying to reconnect
- looks like session map corruption.
Damien, please upload the session map. you can find where it is by "ceph osd ma...
- 02:16 AM Bug #5290 (Can't reproduce): mds: crash whilst trying to reconnect
- Hi,
Recently I experienced an issue with the mds servers in my cluster, the cluster storage would be absolutely fi...
- 09:42 AM Bug #5287 (Resolved): the permission of file in CephFS