Activity
From 02/10/2014 to 03/11/2014
03/11/2014
- 01:57 PM Bug #7685 (Can't reproduce): hung/failed teuthology test: cfuse_workunit_misc
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-07_23:00:50-fs-firefly-testing-basic-plana/122094
http://qa-p...
- 01:46 PM Bug #7684 (Resolved): failed cfuse_workunit_kernel_untar_build.yaml test
- http://qa-proxy.ceph.com/teuthology/teuthology-2014-03-09_23:00:25-fs-firefly-testing-basic-plana/124157/
The teut...
- 07:18 AM Bug #7474 (Won't Fix): Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
- This looks like it's the writeback deadlock when trying to flush from the client to the OSD on a single memory-constr...
- 07:17 AM Bug #6599 (Resolved): client: invalid iterator dereference in Client::trim_caps
- 07:12 AM Feature #5486: kclient: make it work with selinux
- Hmm, Sage notes that maybe it'll work now we support ACLs. Or maybe we can use a special mount option?
- 07:00 AM Bug #2187 (Can't reproduce): pjd chown/00.t failed test 97
- 06:59 AM Bug #2740 (Resolved): mds: crash in Objecter when shutting down too early
03/10/2014
- 12:58 PM Bug #1318: directories disappear across multiple rsyncs
- By “this” I meant files with different timestamps from what they were last set to, as in the first paragraph of comme...
- 12:51 PM Bug #1318: directories disappear across multiple rsyncs
- I'm afraid this still occurs quite often with ceph 0.77 and ceph.ko 3.13.6-gnu. I have a slightly better understandi...
03/06/2014
- 02:22 PM Feature #7324 (Resolved): qa: kcephfs + ACLs (new pjd tests?)
- 07:30 AM Bug #7613: mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
- Can you upload the core dump and ceph-mds binary somewhere?
03/05/2014
- 09:59 AM Bug #7613: mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
- I shouldn't have said anything -- minutes later the problem is now happening again.
- 09:47 AM Bug #7613: mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
- Stopping the clients that were attempting to mount cephfs (by restarting them) appeared to let the mds start with no ...
- 08:48 AM Bug #7613 (Can't reproduce): mds/MDCache.cc: 216: FAILED assert(inode_map.count(in->vino()) == 0)
- ...
- 03:36 AM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- __queue_cap_release has code which limits the size of the cap release message.
03/04/2014
- 11:27 PM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- You think the msg pointer is invalid, and so it's overflowing?
I'm a little concerned at just closing this unless we...
- 06:52 PM Bug #4722 (Can't reproduce): kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- It's more likely there is no pre-allocated message. The variable 'msg' is pointing to the pre-allocated message list.
03/03/2014
- 10:48 PM Bug #2679 (Can't reproduce): POSIX file lock not released on process termination
- 09:20 PM Bug #2288: libcephfs: setxattr returns EEXIST following removexattr
- commit:48e55d9 is in master and contains the simpler handle_client_setxattr() fix discussed above. Thanks Zheng!
- 06:31 PM Bug #2288 (Resolved): libcephfs: setxattr returns EEXIST following removexattr
- 06:17 PM Bug #2445 (Can't reproduce): crash when removing a non-empty directory
- 06:16 PM Bug #1877 (Can't reproduce): ceph.ko (3.1.6) oopses upon cephfs set_layout of a symlink to a dir
- 06:55 AM Bug #1874 (Can't reproduce): Running `git gc` on a bare git repository hosted by ceph results in ...
- 06:31 AM Bug #1874: Running `git gc` on a bare git repository hosted by ceph results in a bus error.
- Hi,
I have since moved on, but (as it happens) am currently investigating the production use of Ceph at the Univer...
02/28/2014
- 03:08 PM Bug #5382: mds: failed objecter assert on shutdown
- Okay, leaving this alone for the moment — that patch is a good start, and I think the MDLog stuff is actually okay (b...
- 02:02 PM Bug #7565: Failed assert in check_rstats
- What's the bug with check_rstats? Is num_head_items just not expected to be valid at this stage of replay?
Either ...
- 01:44 PM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- Unless this part has been fixed by a newer kernel, we still need to deal with it. In particular we were concerned tha...
02/27/2014
- 09:29 PM Cleanup #3742 (Resolved): Remove old Hadoop wrappers and configuration options
- 09:27 PM Bug #3318: java: lock access to CephStat, CephStatVFS from native
- Actually, yeah, I'll look at this.
- 03:30 PM Bug #3318: java: lock access to CephStat, CephStatVFS from native
- Is this still an issue, Noah?
- 09:23 PM Bug #4861 (Rejected): Alter Java components to build against Java 1.6 (or 1.7)
- 09:23 PM Bug #4861: Alter Java components to build against Java 1.6 (or 1.7)
- Closing. I'm not sure what the problem is here.. it looks like I am saying that the code builds for a super old versi...
- 03:26 PM Bug #4861: Alter Java components to build against Java 1.6 (or 1.7)
- Do you know the state of the Java code right now, Noah? I wonder if this got done already or is still a bug requiring...
- 07:29 PM Bug #4023: kclient: d_revalidate is abusing d_parent
- The race still exists, but I don't think it's a big problem, because even if ceph_get_dentry_parent_inode() returns a w...
- 04:03 PM Bug #4023: kclient: d_revalidate is abusing d_parent
- Is this still a problem?
- 05:14 PM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- Who cares about the 3.5 kernel?
- 03:28 PM Bug #4722: kernel BUG at fs/ceph/caps.c:1006 invalid opcode: 0000
- Sounds like this might require some protocol work and it's in the kernel client — high!
- 05:13 PM Bug #7565: Failed assert in check_rstats
- It's a CDir::check_rstats() bug, not rstat corruption.
- 04:46 PM Bug #7565 (Resolved): Failed assert in check_rstats
This is odd, because it's happening very reproducibly, is not unique to the tip of master, but apparently isn't hap...
- 04:51 PM Bug #1181: mds: old_inodes crash
- Snapshots
See also #4248, which may or may not have anything to do with this.
- 04:50 PM Bug #926: mds: fix rename between snaprealms
- Snapshots
- 04:50 PM Bug #1552 (Duplicate): qa: file locking test fails
- #7326
- 04:48 PM Bug #2740: mds: crash in Objecter when shutting down too early
- I'm pretty sure this is fixed, but let's check it out and make sure.
- 04:45 PM Bug #3596: ceph-fuse: crash in mds rejoin
- Snapshots
- 04:44 PM Bug #2187: pjd chown/00.t failed test 97
- This is either an MDS or protocol bug since we've seen it across clients.
- 04:43 PM Bug #2863: client: does not tolerate traceless replies from mds
- uclient failure-case: low priority.
I believe we've established that the kclient does not suffer from this issue, ...
- 04:42 PM Bug #2288: libcephfs: setxattr returns EEXIST following removexattr
- Confirmed MDS bug!
- 04:40 PM Bug #2679: POSIX file lock not released on process termination
- Let's see if we can reproduce this as it's some combination of kclient, MDS, or protocol bug.
- 04:38 PM Bug #1666: hadoop: time-related meta-data problems
- Also see #7564
But low priority, for this is Hadoop
- 04:36 PM Bug #4212: mds: open_snap_parents isn't called all the times it needs to be
- Snapshots
- 04:36 PM Bug #4213: mds: old_parents is never cleaned up
- Snapshots
- 04:35 PM Fix #7564: synchronize MDS and client times in a way that makes pjd happy even under clock skew
- See also #1666
- 04:28 PM Fix #7564 (Duplicate): synchronize MDS and client times in a way that makes pjd happy even under ...
- See #854. We have ops happen on both the client and the MDS, and so sometimes one time wins and sometimes the other d...
- 04:29 PM Bug #854 (Duplicate): unsynchronized clocks between kernel-client/cmds cause PJD fstest failures
- I'm closing this in favor of fix ticket #7564.
- 04:13 PM Bug #1874: Running `git gc` on a bare git repository hosted by ceph results in a bus error.
- So basically two things could have gone wrong here:
1) The OSD replied with a bad tid (unlikely)
2) the client forg...
- 04:02 PM Bug #4370: mds: high-cpu utilization in memorymodel:_sample
- Figure out if the current MemoryModel is actually useful for anything — I think it might not be. All the lovely ticke...
- 04:01 PM Bug #3935: kclient: Big directory access bugs (multiple), mixed 32- and 64-bit clients
- The hangs sound like generic cap and request waitlisting issues to me. The empty directory is tickling something i...
- 03:57 PM Bug #4248: mds: replay does not correctly update CInode::first and ::last members
- I'm going to leave this at normal even though it's a snapshotting issue — the problem's diagnosed and it's a bug in t...
- 03:53 PM Bug #4134: mds: request locking hang under snaptests
- snapshots = low
- 03:52 PM Bug #3719 (Can't reproduce): pjd test 145 failed in the nightly runs
- These logs are gone.
- 03:45 PM Bug #4280: mds: crash on lookupsnap
- Snapshots = low priority
- 03:38 PM Bug #2445: crash when removing a non-empty directory
- Let's validate behavior here — there's a good chance Zheng or somebody fixed whatever bug caused this, and we want to...
- 03:32 PM Bug #1877: ceph.ko (3.1.6) oopses upon cephfs set_layout of a symlink to a dir
- Kernel client layout crash = high. Identify if this is still a problem, and if we can trigger it using the vxattrs as...
- 03:30 PM Bug #4738: libceph: unlink vs. readdir (and other dir orders)
- Need more info, samba, uclient, etc.
- 03:27 PM Bug #4732: uclient: client/Inode.cc: 126: FAILED assert(cap_refs[c] > 0)
- The blocker bug is low, so this one can't have a higher priority.
- 03:25 PM Bug #4920: client: does not respect O_NOFOLLOW
- uclient = low priority, for now.
- 03:25 PM Bug #4188: mds crashes when cow-ing entries in formerly snapshotted dir
- Snapshots = low priority. *sigh*
- 03:21 PM Bug #5360: ceph-fuse: failing smbtorture tests
- Samba against ceph-fuse (not even using libcephfs) = low priority.
- 03:20 PM Feature #5486: kclient: make it work with selinux
- I don't know anything about SELinux, nor its users. What needs to work for us to support SELinux, and how big of a st...
- 03:19 PM Bug #5762: teuthology: Failed MPI runs lead to a hung test instead of a failure
- It's a test which we can't use properly. High priority!
- 03:18 PM Bug #6458 (Need More Info): journaler: journal too short during replay
- I've bumped up #4708, so if that's the cause of this it'll be fixed when that is. If not, we need more info.
- 03:17 PM Fix #4708: MDS: journaler pre-zeroing is dangerous
- #6458 could be a result of this issue, so I'm bumping up the priority.
- 03:14 PM Bug #5950: kcephfs: cephfs set_layout -p 4 gets EINVAL
- We want to use the virtual xattrs moving forward, so downgrading a bug in the cephfs tool.
- 01:38 PM Bug #7474: Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
- I wasn't on 3.8, it was 3.11. Unfortunately I can't use the machines I was experimenting with for this purpose anymor...
- 01:19 PM Bug #7474: Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
- Zheng, do you have a specific bug you think this is so we can close it out?
- 01:24 PM Bug #6741: failed snaptest-2.sh; got ENOTEMPTY on should-be empty dir
- Downgrading: ceph-fuse and snapshots.
- 01:23 PM Bug #6609: teuthology rsync workunit failure
- I haven't noticed this in a while, but upgrading as it was a failure across both clients.
- 01:22 PM Bug #5864: cfuse_workunit_suites_ffsb suite on Centos hangs with *** Got Signal Interrupt ***
- This is passing in the nightlies, so if there is a bug it has to do with not only ceph-fuse, but ceph-fuse specifical...
- 01:20 PM Bug #7206 (Need More Info): Ceph MDS Hang on hadoop workloads
- 01:18 PM Bug #7485 (Resolved): Killing MDS during 'creating' breaks subsequent startup (no snaptable)
- We merged this to master in commit:9a040bfd46d141712c32aaa0fa8fc5de93336306, but I guess we missed closing out the ti...
- 06:50 AM Feature #7325: mds: tool to examine (later, manipulate) dirfrag objects
- Is this intended to be an online thing (modifying live MDS state), or something that operates on the RADOS objects (i...
- 06:35 AM Bug #5382: mds: failed objecter assert on shutdown
- There was an earlier patch that introduced an "I'm in dispatch" flag, and a more recent one (https://github.com/ceph/...
02/26/2014
- 05:01 PM Bug #4746: client: invalidate callback can deadlock
- Demoted due to ceph-fuse and FUSE interface work.
- 05:00 PM Bug #4829: client: handling part of MClientForward incorrectly?
- Demoting due to uclient and multi-mds.
- 04:58 PM Bug #5787: client/Client.cc: 2081: FAILED assert(!unclean) in put_inode
- Demoting due to uclient and Need More Info.
- 04:57 PM Bug #6473: multimds + ceph-fuse: fsstress gets ENOTEMPTY on final rm -r
- Demoting due to multi-mds.
- 04:57 PM Bug #5765: kclient: High CPU due to raw_spin_lock in ceph_cap_string
- Demoting due to performance, not correctness.
- 04:56 PM Bug #5021: ceph-fuse: crash on traceless reply
- Demoting due to uclient.
- 04:55 PM Bug #5382: mds: failed objecter assert on shutdown
- I'm pretty sure we had a discussion about your patch, but I can't find the comments and I don't remember the outcome....
- 04:48 PM Bug #6608: samba teuthology dbench failure
- Demoting priority on samba.
- 04:47 PM Bug #7011: ENOTEMPTY on ceph-fuse + snaptest-? test
- Demoting priority on ceph-fuse and snapshots.
- 04:47 PM Bug #6613: samba is crashing in teuthology
- Demoting priority on samba.
- 04:37 PM Feature #7326: qa: fix flock tests
- I don't remember which tests these are; the locktest ones that are racy, or something else?
- 04:35 PM Feature #7352: mds: make classes encode/decode-able
- We've already merged in the MDSTable and Journaler header dumping stuff; I think that's all the stuff that you were t...
- 04:29 PM Feature #4001 (Resolved): Implement the migration path from using the AnchorTable to using lookup...
- Again, Zheng got this done.
- 04:26 PM Cleanup #3742: Remove old Hadoop wrappers and configuration options
- This is already done, isn't it Noah? At least, the old stuff isn't where it used to be and I didn't see it with the n...
- 04:18 PM Feature #118: kclient: clean pages when throwing out dirty metadata on session teardown
- I can't find the referenced ticket anywhere. Anybody know what this is supposed to be and if it still applies? (I thi...
- 07:15 AM Bug #7530: mds: failed anchor assert on replay
- commit:7ba3200f1e91d803cdf84f96777641f7d18d3c01
- 01:08 AM Feature #7531 (Closed): MDS: support required feature sets like the OSD and monitor
- MDS map contains CompatSet::FeatureSet
02/25/2014
- 06:27 PM Bug #7530 (Resolved): mds: failed anchor assert on replay
- 09:14 AM Bug #7530: mds: failed anchor assert on replay
- config used was (suites/fs/thrash/): ceph/base.yaml ceph-thrash/default.yaml clusters/mds-1active-1standby.yaml debug...
- 09:09 AM Bug #7530: mds: failed anchor assert on replay
- Crashed on first try, log at debug-mds=10 attached
- 07:04 AM Bug #7530 (In Progress): mds: failed anchor assert on replay
- 07:25 AM Bug #7503: mds start and oops after access to cephfs
- Fine, OK for ticket #7531. This one should be closed.
- 07:02 AM Feature #3863 (In Progress): implement a tool to lookup inode numbers without holding their path
02/24/2014
- 10:31 PM Bug #7503 (Won't Fix): mds start and oops after access to cephfs
- Ah, it sounds like this is happening because the MDS doesn't currently have a good versioning system to prevent too-o...
- 10:30 PM Feature #7531 (Closed): MDS: support required feature sets like the OSD and monitor
- This'll be a little interesting because the MDS doesn't have local storage. Evaluate if feature sets are best stored ...
- 10:11 PM Bug #7530 (Resolved): mds: failed anchor assert on replay
- ...
- 05:10 PM Feature #4000 (Resolved): Design a migration path from using the AnchorTable to using lookup-by-ino
- 03:16 PM Feature #4000: Design a migration path from using the AnchorTable to using lookup-by-ino
- Did this already get done with Zheng's work to remove the AnchorTable?
- 05:10 PM Feature #7323 (Resolved): mds: fix and merge pending libcephfs changes
- 05:09 PM Feature #3999 (Resolved): update CDir encoding
- This was revved as part of Zheng's omap stuff.
- 12:58 AM Feature #7315 (Closed): review and merge zheng's dirfrag series
02/21/2014
- 03:15 PM Bug #7503: mds start and oops after access to cephfs
- OK for the explanation, and as already said, all that data was test data, so I can lose it without problems. I also full...
- 08:45 AM Bug #7503: mds start and oops after access to cephfs
- MDS is getting an ENFILE (object lost) from the OSD while trying to read the OMAP from one of its stray directory obj...
- 07:18 AM Bug #7503 (Won't Fix): mds start and oops after access to cephfs
- This is a follow-up to http://tracker.ceph.com/issues/7367, which explains the scenario.
I now attach the mds.log
- 08:30 AM Bug #7485 (Fix Under Review): Killing MDS during 'creating' breaks subsequent startup (no snaptable)
- 08:29 AM Bug #7485: Killing MDS during 'creating' breaks subsequent startup (no snaptable)
- https://github.com/ceph/ceph/pull/1283
- 08:14 AM Bug #7485: Killing MDS during 'creating' breaks subsequent startup (no snaptable)
MDS -1 gid 1 starts in BOOTING, sends a beacon
MON prepare_beacon records its existence and puts it into state STA...
02/20/2014
- 06:23 AM Bug #7485 (Resolved): Killing MDS during 'creating' breaks subsequent startup (no snaptable)
Pretty easy to reproduce: start MDS for first time on fresh cluster (I'm using vstart here), ctrl-c it promptly, tr...
02/19/2014
- 10:14 PM Bug #6608: samba teuthology dbench failure
- We're still seeing occasional samba test failures, but I haven't diagnosed them carefully enough to know if they're this fai...
- 07:23 PM Bug #7474: Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
- Are you using a 3.8 kernel? If so, please try 3.12 or 3.13.
- 01:07 AM Bug #7474 (Won't Fix): Kernel oops with cephfs [ceph_write_begin -> *x8 -> wait_on_page_read]
- I'm on Ubuntu 13.10 and I've installed the packages distributed with it (ceph-deploy 1.2.3-0ubuntu1 and `ceph` 0.67.4...
02/18/2014
- 11:07 PM Bug #6608: samba teuthology dbench failure
- Are you still seeing the issue?
02/17/2014
- 01:37 PM Bug #7422 (In Progress): client/barrier.h uses boost's interval set library, which is not availab...
- The barrier code has been disabled to fix the build. Matt said he will follow up. http://marc.info/?l=ceph-devel&m=...
- 01:34 PM Bug #7373 (Resolved): kcephfs nfs file create failes with EOPNOTSUPP
- 08:44 AM Bug #7424 (Rejected): Cannot read from zero-length file
- Pavel Veretennikov wrote:
> * Strange that it worked without permission. Where had it stored the data?
It was onl...
- 07:32 AM Bug #7424: Cannot read from zero-length file
- * Strange that it worked without permission. Where had it stored the data?
- 07:31 AM Bug #7424: Cannot read from zero-length file
- Yes, the problem was resolved after I gave the client access to the default data pool
rwx pool=data
Strange that it work...
- 06:52 AM Bug #7424: Cannot read from zero-length file
- Does the client have permission to access the data pool? Try using the admin's keyring to mount the fs.
- 01:19 AM Bug #7424: Cannot read from zero-length file
- As far as I know, Ubuntu doesn't use SELinux. /selinux lib is empty, only one related selinux package is present - libselinux...
02/16/2014
- 10:43 PM Bug #7372 (Closed): kcephfs: pjd tests fail
- 09:57 PM Bug #7372: kcephfs: pjd tests fail
- well, as far as i can tell, the pjd tests also fail on ext4 in the same way they do on ceph:...
02/15/2014
- 06:07 PM Bug #6791 (Won't Fix): mds assert after startup - CDir::commit error (want > commited version)
02/14/2014
- 06:25 AM Bug #7424: Cannot read from zero-length file
- Do you have SELinux enabled?
- 02:16 AM Bug #7424 (Rejected): Cannot read from zero-length file
- Ubuntu 12.04 LTS 3.8.0-35-generic x64
Ceph 0.72.2-1precise from http://ceph.com/debian-emperor/
cluster b8...
- 12:45 AM Bug #7422 (Resolved): client/barrier.h uses boost's interval set library, which is not available ...
- http://gitbuilder.sepia.ceph.com/gitbuilder-centos6-amd64/log.cgi?log=9cbbc883e225b08b3e31cd2cf6e766688795886b
Thi...
02/12/2014
- 12:10 PM Bug #5382: mds: failed objecter assert on shutdown
- I haven't looked at any of the code involved for real, but that sounds like a good plan to me. *thumbs up*
- 11:22 AM Bug #5382: mds: failed objecter assert on shutdown
- What's happening is that suicide() is getting called from another thread while the dispatch thread is inside _dispatc...