Activity

From 06/16/2016 to 07/15/2016

07/15/2016

07:18 PM Bug #16668: client: nlink count is not maintained correctly
I set up a ganesha + ceph test rig today and was able to reproduce the problem. Interestingly, it does not reproduce ... Jeff Layton
04:24 PM Bug #16592: Jewel: monitor asserts on "mon/MDSMonitor.cc: 2796: FAILED assert(info.state == MDSMa...
So, rambling brain dump of my current thoughts on this:
I haven't been able to reproduce this problem. There are t...
Patrick Donnelly
03:19 PM Backport #16697 (Fix Under Review): jewel: ceph-fuse is not linked to libtcmalloc
PR for jewel is https://github.com/ceph/ceph/pull/10303 Ken Dreyer
09:36 AM Backport #16697 (Resolved): jewel: ceph-fuse is not linked to libtcmalloc
https://github.com/ceph/ceph/pull/10303 Nathan Cutler
03:28 AM Bug #16655: ceph-fuse is not linked to libtcmalloc
https://github.com/ceph/ceph/pull/10303 Zheng Yan
02:25 AM Bug #16691: sepia LRC lost directories
what do you mean they are old? what does 'rados stat xxxx' show? Zheng Yan

07/14/2016

11:33 PM Documentation #16664 (Resolved): Standby Replay configuration doc is wrong
Greg Farnum
04:12 PM Documentation #16664: Standby Replay configuration doc is wrong
Backport: https://github.com/ceph/ceph/pull/10298
I can't mark this issue as Resolved for some reason.
Patrick Donnelly
11:18 PM Bug #16655: ceph-fuse is not linked to libtcmalloc
tcmalloc is also missing from `ldd /usr/bin/ceph-fuse` in ceph-fuse-0.94.7-0.el7, FYI, so this has gone on for quite ... Ken Dreyer
01:39 PM Bug #16655 (Pending Backport): ceph-fuse is not linked to libtcmalloc
Kefu Chai
09:48 PM Bug #16691 (Resolved): sepia LRC lost directories
If you log in to the sepia long-running cluster, it has 37 directories whose objects it lost.
I spot-checked one o...
Greg Farnum
05:39 PM Bug #16640 (New): libcephfs: Java bindings failing to load on CentOS
Let's leave this open to work out if there is a change to the build we can make to avoid the java bindings requiring ... John Spray
04:01 PM Bug #16640: libcephfs: Java bindings failing to load on CentOS
Noah, John, I'm guessing the Java bindings ought to link to the versioned libcephfs_jni.so.1.0.0 instead of the unver... Ken Dreyer
05:38 PM Feature #4139 (Resolved): MDS: forward scrub: add scrub_stamp infrastructure and a function to sc...
I think Greg meant to mark this Resolved. Nathan Cutler
01:11 AM Feature #4139: MDS: forward scrub: add scrub_stamp infrastructure and a function to scrub a singl...
This bit has been done forever: we have admin socket interfaces to scrub a dentry or recursive folder. Greg Farnum
01:44 PM Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread
looks like ObjectCacher::bh_write_adjacencies() passed an empty list to ObjectCacher::bh_write_scattered(). Maybe the... Zheng Yan
01:25 PM Bug #16668: client: nlink count is not maintained correctly
It also occurred to me yesterday that I was using the path-based calls, whereas ganesha would likely be using the ll ... Jeff Layton
10:39 AM Bug #8255 (Resolved): mds: directory with missing object cannot be removed
This kind of issue should be handled cleanly (MDS will raise 'damaged' health alert, specifics in "damage ls") as of ... John Spray
01:14 AM Feature #12275 (Duplicate): Handle metadata migration during forward scrub
#4143 and #4144 Greg Farnum
01:03 AM Feature #12141: cephfs-data-scan: File size correction from backward scan
This was discussed elsewhere, but we need to be able to disable file size correction as well – via a config option at... Greg Farnum

07/13/2016

11:37 PM Bug #13271 (Resolved): Missing dentry in cache when doing readdirs under cache pressure (?????s i...
Zheng fixed this. Greg Farnum
11:28 PM Feature #14427: qa: run snapshot tests under thrashing
https://github.com/ceph/ceph/pull/9955 improves snapshots and https://github.com/ceph/ceph-qa-suite/pull/1073 enables... Greg Farnum
11:25 PM Bug #10834 (Closed): SAMBA VFS module: Timestamps revert back to 01-01-1970
Closing in favor of #16679, since this is really about birthtime and we're adding a real one. Greg Farnum
11:25 PM Bug #16679 (New): Samba: hook up to birthtime correctly
https://github.com/ceph/ceph/pull/9965 is adding birthtime to Ceph internally. Once done, we need to plug samba in to... Greg Farnum
11:20 PM Feature #12671: Enforce cache limit during dirfrag load during open_ino (during rejoin)
If we do #13688, we probably won't need this one or can put it off. Greg Farnum
11:17 PM Fix #5268 (New): mds: fix/clean up file size/mtime recovery code
Greg Farnum
11:15 PM Bug #15379 (Closed): ceph mds continiously crashes and going into laggy state (stray purging prob...
We have open tickets about improving purge, and the specific issue here seems to have been addressed. Greg Farnum
11:09 PM Feature #3314: client: client interfaces should take a set of group ids
This is a natural part of what I'm already doing for #16367. Greg Farnum
11:07 PM Bug #8090 (New): multimds: mds crash in check_rstats
Greg Farnum
05:28 AM Bug #8090: multimds: mds crash in check_rstats
There may no longer be an issue now that #8094 is resolved? Greg Farnum
11:06 PM Feature #7321 (Duplicate): qa: multimds thrasher
#10792 Greg Farnum
10:49 PM Bug #16668 (In Progress): client: nlink count is not maintained correctly
Noted on irc that cap handling that involves the root directory (so, anything in root and frequently things in its im... Greg Farnum
06:21 PM Bug #16668: client: nlink count is not maintained correctly
I rolled up a testcase for this:... Jeff Layton
12:44 PM Bug #16668: client: nlink count is not maintained correctly
MDS revokes CEPH_CAP_LINK_EXCL when unlinking files. It's odd, but I can't see how it causes a problem. Zheng Yan
10:47 PM Feature #10498 (New): ObjectCacher: order wakeups when write calls block on throttling
Greg Farnum
10:38 PM Feature #15393 (Resolved): ceph-fuse: Request for logrotate for client side log files
ceph-fuse was included in Jewel! commit:98744fdf9bda9d3b14bbf7f528f05ba50a923f97 Greg Farnum
10:34 PM Feature #10060: uclient: warn about stuck cap flushes
This should be pretty simple by looking at each session->flushing_caps_tids! Greg Farnum
10:30 PM Feature #6511 (Rejected): MDS: add special purging options for testing
This is kind of vague now and will get caught up in our future purge fixes anyway. Greg Farnum
10:23 PM Feature #15067 (Resolved): mon: client: multifs: enable clients to map a filesystem name to a FSCID
Greg Farnum
10:22 PM Feature #15068 (In Progress): fsck: multifs: enable repair tools to read from one filesystem and ...
I think Doug is working on this as well as #15069?
(Reset if not.)
Greg Farnum
10:15 PM Bug #16640 (Resolved): libcephfs: Java bindings failing to load on CentOS
Greg Farnum
12:01 PM Bug #16640: libcephfs: Java bindings failing to load on CentOS
I suppose the convention of putting the unversioned libraries into -dev packages is based on the idea that built code... John Spray
09:57 PM Feature #6290 (Resolved): Journaler: warn and shut down if we hit end of journal too early
Looks like this got fixed in our Journaler refactor. Greg Farnum
09:50 PM Feature #16676: flush dirty data to journal on SIGTERM
We sort of assume that there's a standby and the client who will replay the op, but if we've lost the client it's (al... Greg Farnum
09:36 PM Feature #16676 (New): flush dirty data to journal on SIGTERM
When it receives SIGTERM, the MDS should commit unsafe data to its journal before terminating. Douglas Fuller
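The request above (commit unsafe data to the journal when SIGTERM arrives) follows a common shutdown pattern. A minimal sketch of that pattern, assuming a hypothetical in-memory `Journal` class; none of these names come from the MDS code, and the real journaler is far more involved:

```python
import signal


class Journal:
    """Hypothetical stand-in for a journal; not the MDS Journaler."""

    def __init__(self):
        self.pending = []    # entries accepted but not yet committed
        self.committed = []  # entries that have been made durable

    def append(self, entry):
        self.pending.append(entry)

    def flush(self):
        # Commit every pending entry, preserving order.
        self.committed.extend(self.pending)
        self.pending.clear()


def make_sigterm_handler(journal):
    """Build a signal handler that flushes the journal before exiting."""
    def handler(signum, frame):
        journal.flush()
        raise SystemExit(0)
    return handler


def install(journal):
    # Must be called from the main thread of the process.
    signal.signal(signal.SIGTERM, make_sigterm_handler(journal))
```

The handler flushes first and only then raises `SystemExit`, so nothing still pending is dropped on a clean termination request.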
03:36 PM Feature #15615 (Fix Under Review): CephFSVolumeClient: List authorized IDs by share
https://github.com/ceph/ceph/pull/9864
https://github.com/ceph/ceph-qa-suite/pull/1080
Ramana Raja
03:31 PM Feature #15406 (Fix Under Review): Add versioning to CephFSVolumeClient interface
https://github.com/ceph/ceph/pull/9864
https://github.com/ceph/ceph-qa-suite/pull/1080
Ramana Raja
06:13 AM Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread
Alas, the gdb log does not give us much more to go on.
Thread 1 (Thread 0x7f891cdfa700 (LWP 5467)):
#0 0x00007f8...
Brad Hubbard
05:48 AM Cleanup #13868 (Resolved): mds: MDCache::cap_import_paths is never used
This member no longer turns up when grepping. Greg Farnum
05:43 AM Bug #7206 (Can't reproduce): Ceph MDS Hang on hadoop workloads
If this was a time issue, we fixed a bunch of weird stuff in the switch to solely client-directed mtime updates. Greg Farnum
05:42 AM Bug #6458 (Can't reproduce): journaler: journal too short during replay
The journal format is different now too; this is probably not useful any more. Greg Farnum
05:39 AM Bug #8405: multimds: FAILED assert(dir->is_frozen_tree_root())
I don't think we've run many multi-mds tests in a while so this is probably still an issue? Greg Farnum
05:28 AM Fix #8094 (Resolved): MDS: be accurate about stats in check_rstats
Zheng fixed this ages ago. Greg Farnum
05:26 AM Bug #10996 (Can't reproduce): dumpling MDS: failed MDLog assert
Dumpling is old and we don't seem to have seen the error again. Greg Farnum
05:18 AM Bug #14641 (Duplicate): don't let users specify 0 on stripe count or object size
Greg Farnum
05:10 AM Bug #8255: mds: directory with missing object cannot be removed
John, much of this is handled now with the metadata damaged flags. What's left? Greg Farnum
01:12 AM Feature #13688: mds: performance: journal inodes with capabilities to limit rejoin time on failover
This might have already been done...Zheng, maybe? Greg Farnum
12:57 AM Cleanup #3677 (Closed): libcephfs, mds: test creation/addition of data pools, create policy
Things have changed a lot and we definitely test adding multiple data pools now. Greg Farnum
12:51 AM Bug #4023: kclient: d_revalidate is abusing d_parent
Is this still an issue? Greg Farnum
12:50 AM Bug #6770: ceph fscache: write file more than a page size to orignal file cause cachfiles bug on EOF
fscache has been through a lot of changes; anybody know if this is still a problem? Greg Farnum
12:41 AM Bug #5950 (Rejected): kcephfs: cephfs set_layout -p 4 gets EINVAL
I think we actually got rid of the cephfs tool at last. Greg Farnum
12:20 AM Bug #7685 (Can't reproduce): hung/failed teuthology test: cfuse_workunit_misc
Greg Farnum
12:19 AM Bug #11294: samba: DISCONNECTED inode warning
This doesn't look anything like #11835 to me; I've not been tracking closely enough to know if we're still seeing han... Greg Farnum
12:01 AM Bug #12895: Failure in TestClusterFull.test_barrier
Is this still a problem? It looks to me like the code is still there but I don't think the test has been failing. Greg Farnum

07/12/2016

11:52 PM Bug #5360 (Rejected): ceph-fuse: failing smbtorture tests
We have other tickets about smbtorture but we also fixed a bunch; who knows which one this was. Greg Farnum
11:51 PM Feature #4906 (Resolved): ceph-fuse: use the Preforker class
See auto-associated revision 66f0704c; this got done years ago. Greg Farnum
11:48 PM Bug #5731 (Can't reproduce): failed pjd link permissions check
So much stuff has changed and we haven't linked any other failures to this ticket. Greg Farnum
11:43 PM Bug #11499 (Can't reproduce): ceph-fuse: don't try and remount during shutdown
We haven't seen this again. Greg Farnum
11:42 PM Fix #13126 (Resolved): qa: ceph-fuse flushes very slowly in some workunits
I spot-checked one now that the ObjectCacher is coalescing IOs to a single object. It looks like things have gotten f... Greg Farnum
11:32 PM Bug #14735 (Resolved): ceph-fuse does not mount at boot on Debian Jessie
I don't think there are likely to be any more infernalis releases now that Jewel is out. Greg Farnum
11:28 PM Feature #16467: ceph-fuse: Exclude ceph.* xattr namespace in listxattr
This applies to the kernel client as well, right? Greg Farnum
11:16 PM Feature #15634 (Resolved): Enable fuse_use_invalidate_cb by default
This got merged beginning of June. Greg Farnum
09:33 PM Bug #16668: client: nlink count is not maintained correctly
I suspect the kclient has a similar problem. I'll test it out when I get a chance. I do agree that we probably ought ... Jeff Layton
08:53 PM Bug #16668 (Resolved): client: nlink count is not maintained correctly
Frank reported in #ceph-devel that we don't seem to update nlink correctly from the Client. Looking through the sourc... Greg Farnum
08:31 PM Bug #16655: ceph-fuse is not linked to libtcmalloc
Okay, I guess it was just introduced in some autotools refactor or update then. Thanks! Greg Farnum
08:23 PM Bug #16655: ceph-fuse is not linked to libtcmalloc
Greg Farnum wrote:
> But the fix is on top of master, which should already work?
It looks like the bug is in t...
Nathan Cutler
05:47 PM Bug #16655: ceph-fuse is not linked to libtcmalloc
The link is definitely missing in v10.2.2.... Ken Dreyer
05:35 PM Bug #16655: ceph-fuse is not linked to libtcmalloc
I'm a little confused about the cause here. Ken says
>confirmed that /usr/bin/ceph-fuse is linked to libtcmalloc.so....
Greg Farnum
10:12 AM Bug #16655: ceph-fuse is not linked to libtcmalloc
https://github.com/ceph/ceph/pull/10258 John Spray
08:27 AM Bug #16655 (Fix Under Review): ceph-fuse is not linked to libtcmalloc
Kefu Chai
03:43 AM Bug #16655 (Resolved): ceph-fuse is not linked to libtcmalloc
For ceph-fuse binary at http://download.ceph.com/rpm-jewel/el7/x86_64/ceph-fuse-10.2.2-0.el7.x86_64.rpm
[root@zh...
Zheng Yan
08:29 PM Documentation #16664 (Fix Under Review): Standby Replay configuration doc is wrong
Nathan Cutler
07:46 PM Documentation #16664: Standby Replay configuration doc is wrong
PR: https://github.com/ceph/ceph/pull/10268 Patrick Donnelly
07:33 PM Documentation #16664 (Resolved): Standby Replay configuration doc is wrong
The config settings here are wrong:
http://docs.ceph.com/docs/master/cephfs/standby/
The settings should be pre...
Patrick Donnelly
04:09 PM Bug #16640: libcephfs: Java bindings failing to load on CentOS
I saw this before with Debian. It looks like it's now showing up with rhelish stuff. The non-devel package includes... Noah Watkins
11:25 AM Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread
Just another update after further investigation and discussion in the mailing list.
1. I have tried to run the app...
Goncalo Borges
10:42 AM Bug #16643 (Won't Fix): MDS memory leak in hammer integration testing
MDS leaks are ignored by default in the valgrind task, so presumably you're only seeing this because something else f... John Spray
10:32 AM Feature #16656: mount.ceph: enable consumption of ceph keyring files
I'll go ahead and grab this one. Not a high priority but definitely a nice-to-have from a usability perspective. Jeff Layton
09:42 AM Feature #16656 (Resolved): mount.ceph: enable consumption of ceph keyring files
Jeff pointed this out in doc review:
> we really ought to fix up the mount helper to use the same sort of keyring ...
John Spray
09:46 AM Feature #16570 (Fix Under Review): MDS health warning for failure to enforce cache size limit
https://github.com/ceph/ceph/pull/10245 John Spray

07/11/2016

06:46 PM Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread
Just as a quick update, we're waiting on some more information from Goncalo concerning the possibility of nodes runni... Patrick Donnelly
01:54 PM Feature #15619 (In Progress): Repair InoTable during forward scrub
Vishal Kanaujia

07/09/2016

07:45 AM Bug #16643 (Won't Fix): MDS memory leak in hammer integration testing
Lots of Leak_PossiblyLost and Leak_DefinitelyLost in
smithfarm@teuthology:/a/smithfarm-2016-07-08_15:27:38-fs-ham...
Nathan Cutler

07/08/2016

10:36 PM Feature #16419: add statx-like interface to libcephfs
Possibly. The thing is that the btime should only ever change due to a deliberate setattr call. It's unlike the othe... Jeff Layton
10:20 PM Feature #16419: add statx-like interface to libcephfs
We need to be able to serve an accurate btime. I suppose we could break our rules and assume it won't get changed in ... Greg Farnum
09:54 PM Feature #16419: add statx-like interface to libcephfs
Aside from the stuff Greg noticed in his latest review pass, I noticed a number of flaws in the original patchset and... Jeff Layton
08:02 PM Feature #16419: add statx-like interface to libcephfs
Changing the description since this has ballooned a bit in scope. We want to add btime support and a change_attribute... Jeff Layton
10:21 PM Bug #16640 (Won't Fix): libcephfs: Java bindings failing to load on CentOS
http://qa-proxy.ceph.com/teuthology/jspray-2016-07-08_05:19:56-fs-master-distro-basic-mira/302088/teuthology.log
<...
John Spray
12:49 PM Feature #16631 (New): ObjectCacher cache size stats for ceph-fuse
Currently the perf stats from ObjectCacher don't include the actual size of the cache (get_stat_clean, get_stat_dirty... John Spray
08:26 AM Bug #16588 (Fix Under Review): ceph mds dump show incorrect number of metadata pools.
https://github.com/ceph/ceph/pull/10202 Xiaoxi Chen
07:28 AM Backport #16625 (In Progress): jewel: Failing file operations on kernel based cephfs mount point ...
Nathan Cutler
07:18 AM Backport #16625 (Resolved): jewel: Failing file operations on kernel based cephfs mount point lea...
https://github.com/ceph/ceph/pull/10199 Nathan Cutler
07:27 AM Backport #16626 (In Progress): hammer: Failing file operations on kernel based cephfs mount point...
Nathan Cutler
07:18 AM Backport #16626 (Resolved): hammer: Failing file operations on kernel based cephfs mount point le...
https://github.com/ceph/ceph/pull/10198 Nathan Cutler
07:06 AM Bug #16013: Failing file operations on kernel based cephfs mount point leaves unaccessible file b...
*master PR*: https://github.com/ceph/ceph/pull/8778 Nathan Cutler
07:05 AM Bug #16013 (Pending Backport): Failing file operations on kernel based cephfs mount point leaves ...
Nathan Cutler

07/07/2016

09:53 PM Backport #16621 (Resolved): jewel: mds: `session evict` tell command blocks forever with async me...
https://github.com/ceph/ceph/pull/10501 Loïc Dachary
09:53 PM Backport #16620 (Resolved): jewel: Fix shutting down mds timed-out due to deadlock
https://github.com/ceph/ceph/pull/10500 Loïc Dachary
08:58 PM Bug #16592: Jewel: monitor asserts on "mon/MDSMonitor.cc: 2796: FAILED assert(info.state == MDSMa...
Should note that this is maybe related to: http://tracker.ceph.com/issues/15591 Patrick Donnelly
05:44 PM Bug #16610: Jewel: segfault in ObjectCacher::FlusherThread
Log is now here: /ceph/post/i16610/client.log Patrick Donnelly
02:04 PM Bug #16610 (Resolved): Jewel: segfault in ObjectCacher::FlusherThread
... Patrick Donnelly
03:10 PM Feature #15942: MDS: use FULL_TRY Objecter flag instead of relying on an exemption from full chec...
Related: https://github.com/ceph/ceph/pull/9087 John Spray
03:09 PM Cleanup #16144 (Resolved): Remove cephfs-data-scan tmap_upgrade
John Spray
03:08 PM Cleanup #16195 (In Progress): mds: Don't spam log with standby_replay_restart messages
John Spray
03:05 PM Bug #16288 (Pending Backport): mds: `session evict` tell command blocks forever with async messen...
John Spray
03:04 PM Bug #16396 (Pending Backport): Fix shutting down mds timed-out due to deadlock
John Spray
01:05 PM Feature #16570 (In Progress): MDS health warning for failure to enforce cache size limit
John Spray
01:04 PM Bug #15485 (Duplicate): drop /usr/bin/cephfs
John Spray
11:28 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
h3. original description
Ceph mds dump shows metadata pool count as 2, even though only one metadata pool is prese...
Nathan Cutler
08:54 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
Hi Xiaoxi,
You are right about the bug. The metadata_pool field should be left blank. I have changed the descripti...
Rohith Radhakrishnan
08:48 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
Rohith Radhakrishnan wrote:
> Ceph mds dump shows metadata_pool id as 0. When no FS is present, then metadata_pool ...
Rohith Radhakrishnan
08:34 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
Hmm, yes, this is because metadata_pool is initialized to 0; this seems worth fixing.
The bug is, when no FS pr...
Xiaoxi Chen
06:50 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
ceph osd pool stats
*there are no pools!*
ems@rack2-client-3:~$ ceph mds dump
dumped fsmap epoch 3
fs_name ceph...
Rohith Radhakrishnan
06:42 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
on what basis is the pool id generated? There are no existing pools. So shouldn't the count start with 0 or 1?
Als...
Rohith Radhakrishnan

07/06/2016

03:33 PM Feature #15406 (In Progress): Add versioning to CephFSVolumeClient interface
Ramana Raja
06:29 AM Bug #16588 (Rejected): ceph mds dump show incorrect number of metadata pools.
This is not a bug.
The numbers following "data_pools" and "metadata_pool" are not counts, but the pool ids.
root...
Xiaoxi Chen
03:44 AM Bug #16588: ceph mds dump show incorrect number of metadata pools.
Xiaoxi Chen

07/05/2016

09:00 PM Bug #16042 (Fix Under Review): MDS Deadlock on shutdown active rank while busy with metadata IO
PR: https://github.com/ceph/ceph/pull/10142 Patrick Donnelly
05:44 PM Bug #16592 (Need More Info): Jewel: monitor asserts on "mon/MDSMonitor.cc: 2796: FAILED assert(in...
We've seen a few reports on the ceph-user mailing lists of the latest jewel.... Greg Farnum
11:42 AM Bug #16588 (Resolved): ceph mds dump show incorrect number of metadata pools.
Ceph mds dump shows metadata_pool id as 0. When no FS is present, then metadata_pool id should be left blank.
ceph...
Rohith Radhakrishnan

07/02/2016

07:48 AM Backport #16320 (In Progress): jewel: fs: fuse mounted file systems fails SAMBA CTDB ping_pong rw...
Xiaoxi Chen
07:35 AM Backport #16313 (In Progress): jewel: client: FAILED assert(root_ancestor->qtree == __null)
Xiaoxi Chen
07:31 AM Backport #16215 (In Progress): jewel: client: crash in unmount when fuse_use_invalidate_cb is ena...
Xiaoxi Chen
07:29 AM Backport #16515 (In Progress): jewel: Session::check_access() is buggy
Xiaoxi Chen
07:26 AM Backport #16560 (In Progress): jewel: mds: enforce a dirfrag limit on entries
Xiaoxi Chen
07:22 AM Backport #16037: jewel: MDSMonitor::check_subs() is very buggy
QA suite backported in https://github.com/ceph/ceph-qa-suite/pull/1075 Xiaoxi Chen
07:11 AM Backport #16037 (In Progress): jewel: MDSMonitor::check_subs() is very buggy
Xiaoxi Chen

07/01/2016

08:28 PM Feature #15069 (In Progress): MDS: multifs: enable two filesystems to point to same pools if one ...
Douglas Fuller
08:19 PM Cleanup #16144 (Fix Under Review): Remove cephfs-data-scan tmap_upgrade
https://github.com/ceph/ceph/pull/10100 Douglas Fuller
07:39 PM Cleanup #16144 (In Progress): Remove cephfs-data-scan tmap_upgrade
Douglas Fuller
12:07 PM Bug #16556: LibCephFS.InterProcessLocking failing on master and jewel
Thanks Kefu, I guess the lockdep one is either a cephfs or msgr issue so we'll keep this ticket open to look into it. John Spray
03:39 AM Bug #16556: LibCephFS.InterProcessLocking failing on master and jewel
LibCephFS.Fchown is fixed by https://github.com/ceph/ceph/pull/10081,
but we still have...
Kefu Chai
11:11 AM Feature #15066: multifs: Allow filesystems to be assigned RADOS namespace as well as pool for met...
Just in case I lose it, the draft code for splitting messengers was here: https://github.com/jcsp/ceph/tree/wip-15399... John Spray
10:48 AM Feature #16570 (Resolved): MDS health warning for failure to enforce cache size limit

This can have many causes, but it is a sign that something is wrong, and a possible precursor to the MDS dying ...
John Spray

06/30/2016

08:47 PM Bug #16042: MDS Deadlock on shutdown active rank while busy with metadata IO
I'm able to reproduce this with vstart.sh and `cp -a /usr ...`. I'm seeing this every 10 seconds:... Patrick Donnelly
04:35 PM Backport #16560 (Resolved): jewel: mds: enforce a dirfrag limit on entries
https://github.com/ceph/ceph/pull/10104 Nathan Cutler
02:21 PM Bug #16556: LibCephFS.InterProcessLocking failing on master and jewel
Jeff points out that we can also get it to blow up with just a passing test like bin/ceph_test_libcephfs --gtest_fil... John Spray
01:53 PM Bug #16556: LibCephFS.InterProcessLocking failing on master and jewel
... Kefu Chai
01:51 PM Bug #16556 (New): LibCephFS.InterProcessLocking failing on master and jewel
Maybe related to https://github.com/ceph/ceph/pull/9995 ?
Failures on master here: http://pulpito.ceph.com/jspray-...
John Spray
02:19 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
Ahh, the reason I could reproduce this yesterday is because the client box was running a v4.5 kernel. With a v4.7-rc5... Jeff Layton
11:47 AM Bug #16164 (Pending Backport): mds: enforce a dirfrag limit on entries
John Spray

06/29/2016

06:26 PM Support #16528: Stuck with CephFS with 1M files in one dir
Thank you!
Raised "mds cache size" to 3M and it took a couple of minutes to list this dir.
elder one
05:40 PM Support #16528 (Closed): Stuck with CephFS with 1M files in one dir
Assuming your MDS server has enough memory (it probably does), turn up the "mds cache size" to a number larger than 1... Greg Farnum
04:48 PM Support #16528 (Closed): Stuck with CephFS with 1M files in one dir
I'm pretty much stuck with cephfs (jewel 10.2.2) with 1 million 0 byte files in one dir left behind from unsuccessful... elder one
06:24 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
The fio threads at this point are all sitting in ceph_get_caps:... Jeff Layton
05:38 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
Ok, the mds session evict command definitely did the trick. Once I issued that (while running a fio test in another s... Jeff Layton
05:38 PM Support #16526: cephfs client side quotas - nfs-ganesha
How are you evaluating that the quotas are ignored? There isn't any integration, certainly, but the Ceph client libra... Greg Farnum
02:56 PM Support #16526 (Resolved): cephfs client side quotas - nfs-ganesha
I am not sure if this is best logged on the nfs-ganesha project or here.
Ceph quotas are configured using virtual ...
sean redmond
10:45 AM Feature #16523 (Resolved): Assert directory fragmentation is occuring during stress tests
Currently we enable fragmentation and set a low (100) frag size limit, but nothing actually validates that there is a... John Spray
06:40 AM Backport #16515 (Resolved): jewel: Session::check_access() is buggy
https://github.com/ceph/ceph/pull/10105 Loïc Dachary
02:10 AM Bug #16358: Session::check_access() is buggy
Yes, it could happen in the normal case (newly created file). We should backport it. Zheng Yan
12:11 AM Bug #16358: Session::check_access() is buggy
Whoops, yes. Luckily only for users of hard links, but that's good enough reason! Greg Farnum
12:02 AM Bug #16358 (Pending Backport): Session::check_access() is buggy
Seems like this could be serious enough to backport (Zheng: this could happen in normal use, right?) John Spray
01:11 AM Bug #16367 (In Progress): libcephfs: UID parsing breaks root squash (Ganesha FSAL)
My basic approach here is to just stop automatically setting UID/GID within the Client class code base at all. It cur... Greg Farnum

06/28/2016

09:53 PM Bug #16358 (Resolved): Session::check_access() is buggy
Greg Farnum
08:55 PM Bug #16407 (Rejected): LibCephFS.UseUnmounted failed
Greg Farnum
01:10 AM Bug #16407: LibCephFS.UseUnmounted failed
@John Spray, This is my fault, please close it.
Thanks.
Zezhu Zhang
07:57 PM Feature #11171: Path filtering on "dump cache" asok
For test, see https://github.com/ceph/ceph-qa-suite/pull/1066 Douglas Fuller
02:27 PM Bug #16397 (Can't reproduce): nfsd selinux denials causing knfs tests to fail
Ok, talked with Bruce (knfsd maintainer) and the SELinux folks and the consensus is that we have no clue as to why th... Jeff Layton
01:11 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
Anyway, the first AVC denial is here:
avc: denied { add_name } for pid=22038 comm="rpc.mountd" name="channel" ...
Jeff Layton
12:38 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
Ok, looking at the log, I do see the SELinux denials. I am new to teuthology though...
So you have ubuntu boxes that ...
Jeff Layton
01:21 PM Cleanup #15923 (In Progress): MDS: remove TMAP2OMAP check and move Objecter into MDSRank
John Spray
01:21 PM Cleanup #16035 (In Progress): Remove "cephfs" CLI
John Spray

06/27/2016

08:01 PM Bug #16288 (In Progress): mds: `session evict` tell command blocks forever with async messenger (...
Still no reproducer, but
https://github.com/ceph/ceph/pull/9971
may help.
Douglas Fuller
01:44 PM Bug #16407: LibCephFS.UseUnmounted failed
Can you update us? Where are you seeing the issue and is there a new fix PR? John Spray
09:13 AM Bug #16042: MDS Deadlock on shutdown active rank while busy with metadata IO
Could it be reaching MDSDaemon::ms_handle_reset() via the following paths, like async msgr?
One mds thread: ... -> Simpl...
Zhi Zhang
03:44 AM Bug #16186: kclient: drops requests without poking system calls on reconnect
there is a 'ceph daemon mds.xxx session evict' command, which makes mds close client session. (use 'ceph daemon mds.x... Zheng Yan

06/25/2016

05:32 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
Ok, I tried reproducing this by issuing a stat() while outbound traffic from the client was blocked (on a v4.7-rc4 ke... Jeff Layton

06/24/2016

08:21 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
I don't suppose we have a way to reproduce this, do we? Maybe drive a lot of MDS ops and continually stop and restart... Jeff Layton
05:08 PM Feature #11171 (Fix Under Review): Path filtering on "dump cache" asok
https://github.com/ceph/ceph/pull/9925 Douglas Fuller
10:15 AM Bug #16042: MDS Deadlock on shutdown active rank while busy with metadata IO
Interesting, #16396 is with async messenger (and is probably the issue we're seeing in current master testing), but w... John Spray
03:12 AM Bug #16042: MDS Deadlock on shutdown active rank while busy with metadata IO
Hi guys,
Looks like this issue is very similar to this one here: http://tracker.ceph.com/issues/16396
Zhi Zhang
10:07 AM Feature #16468 (Resolved): kclient: Exclude ceph.* xattr namespace in listxattr
See this thread: http://www.spinics.net/lists/ceph-devel/msg30948.html
Some userspace tools (notably rsync) try t...
John Spray
10:06 AM Feature #16467 (New): ceph-fuse: Exclude ceph.* xattr namespace in listxattr
See this thread: http://www.spinics.net/lists/ceph-devel/msg30948.html
Some userspace tools (notably rsync) try t...
John Spray
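Both tickets above ask for the same behavior: names in the ceph.* namespace should be left out of listxattr results. A minimal sketch of that filtering, assuming a hypothetical helper `visible_xattrs`; it only illustrates the idea and is not code from ceph-fuse or the kernel client:

```python
def visible_xattrs(names, hidden_prefix="ceph."):
    """Return the attribute names that are outside the hidden namespace."""
    return [name for name in names if not name.startswith(hidden_prefix)]


# Illustrative input only; the attribute names are examples.
all_names = ["user.comment", "ceph.dir.layout", "security.selinux"]
listed = visible_xattrs(all_names)
```

Filtering applies only to the listing; a name that is hidden this way can still be requested explicitly.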

06/23/2016

07:44 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
Well, if we have unsafe requests the MDS will in fact have committed them (assuming the MDS didn't crash or something... Greg Farnum
01:53 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
If the mds has torn down the client's session, then I don't see what can reasonably be done other than to return an e... Jeff Layton
06:33 PM Bug #16288: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClie...
Not to take away Doug's thunder, but I gather he's been unable to reproduce it. The AsyncMessenger may have already b... Greg Farnum
05:44 PM Bug #15921: segfault in cephfs-journal-tool (TestJournalRepair failure)
As far as I can tell, we don't even have the backtrace of the segfault in either of those logs, and the sha1 isn't av... Greg Farnum
01:20 PM Bug #16013 (Resolved): Failing file operations on kernel based cephfs mount point leaves unaccess...
Zheng Yan
11:59 AM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)
I don't know if I should open a new issue for this, but it looks like even with another ID something is still wrong:
...
Kenneth Waegeman
04:51 AM Bug #16396: Fix shutting down mds timed-out due to deadlock
https://github.com/ceph/ceph/pull/9884 Zhi Zhang

06/22/2016

09:09 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
But if we restart requests from scratch, we're dramatically re-ordering them. We can seemingly send files back in tim... Greg Farnum
09:01 PM Bug #16186: kclient: drops requests without poking system calls on reconnect
I think it is working the way it is supposed to work.
We skip unsafe requests because the mds already got them and...
Sage Weil
08:59 PM Bug #16407: LibCephFS.UseUnmounted failed
You appear to have closed your own PR. And generally speaking we pass around negative error numbers, so readdir() is ... Greg Farnum
08:44 AM Bug #16407: LibCephFS.UseUnmounted failed
https://github.com/ceph/ceph/pull/9860 huanwen ren
07:36 AM Bug #16407 (Rejected): LibCephFS.UseUnmounted failed
2016-06-22T15:03:06.176 INFO:tasks.workunit.client.0.plana146.stdout:[ RUN ] LibCephFS.StripeUnitGran
2016-06-2...
Zezhu Zhang
08:55 PM Support #16043 (Closed): MDS is crashed
Greg Farnum
07:40 PM Feature #16228: Create teuthology task for Samba ping_pong test
(Copied from #16417) See Greg's draft https://github.com/gregsfortytwo/ceph-qa-suite/tree/wip-pingpong John Spray
07:40 PM Feature #16417 (Duplicate): test pingpong on ceph-fuse
John Spray
05:10 PM Feature #16417 (Duplicate): test pingpong on ceph-fuse
See #12653. We should integrate pingpong into our nightly test suite, to verify consistency on the kernel client and ... Greg Farnum
06:10 PM Feature #16419: add statx-like interface to libcephfs
Yeah, that's what I mean. We have ceph_ll_getattr now (afaict), so we need something like a ceph_ll_getattrx (that na... Jeff Layton
06:01 PM Feature #16419: add statx-like interface to libcephfs
Jeff Layton wrote:
> What I'm thinking is that we should add something along the lines of what David Howells has pro...
Sage Weil
05:39 PM Feature #16419: add statx-like interface to libcephfs
What I'm thinking is that we should add something along the lines of what David Howells has proposed for the new stat... Jeff Layton
05:35 PM Feature #16419 (Resolved): add statx-like interface to libcephfs
samba, in particular, can make use of the birthtime for an inode. Have ceph track the btime in the inode and provide ... Jeff Layton
01:01 PM Feature #15615: CephFSVolumeClient: List authorized IDs by share
https://github.com/ceph/ceph/pull/9864 Ramana Raja

06/21/2016

02:03 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
Ahh, hmm -- just noticed the "add name" denial too. Does the path "/proc/net/rpc/auth.unix.ip/channel" even exist? Ma... Jeff Layton
01:46 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
Looks unrelated to anything ceph-specific. My guess is that this is an selinux policy bug, since rpc.mountd should be... Jeff Layton
11:56 AM Bug #16397 (Resolved): nfsd selinux denials causing knfs tests to fail
http://pulpito.ceph.com/teuthology-2016-06-20_17:35:01-knfs-master-testing-basic-mira/267607/ John Spray
11:26 AM Support #16043: MDS is crashed
I execute... Andrey Matyashov
06:05 AM Support #16043: MDS is crashed
Yes, I tried resetting the journal and sessions.
I run:...
Andrey Matyashov
01:34 AM Support #16043: MDS is crashed
Yep. So looking through the log, I now see
>mds.2.journal ESession.replay sessionmap 0 < 18884 close client.166758...
Greg Farnum
09:36 AM Bug #16396: Fix shutting down mds timed-out due to deadlock
-https://github.com/ceph/ceph/pull/9841- Zhi Zhang
09:31 AM Bug #16396 (Resolved): Fix shutting down mds timed-out due to deadlock
This issue was found in jewel when restarting/stopping the mds. It took a long time for the mds to completely stop until mds th... Zhi Zhang
09:02 AM Bug #16288 (New): mds: `session evict` tell command blocks forever with async messenger (TestVolu...
Oops, I meant to paste to begin with. I think it was this one:
/a/jspray-2016-06-13_14:56:46-fs-wip-jcsp-testing-qu...
John Spray

06/20/2016

08:12 PM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Greg Farnum
07:57 PM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Yeah, I expect that Frank's report is the root cause, but wanted to see to make sure. :) Greg Farnum
08:56 AM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Now easier to read:... Kenneth Waegeman
08:55 AM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)

I have ceph mounted under /mnt/nfs/ceph:
[root@test2202 test]# pwd
/mnt/nfs/ceph/test
[root@test2202 test]# ls ...
Kenneth Waegeman
08:08 PM Bug #16288 (Need More Info): mds: `session evict` tell command blocks forever with async messenge...
Greg Farnum
08:08 PM Bug #16288: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClie...
John, do you have any logs? The only failure of this test I can find is http://qa-proxy.ceph.com/teuthology/teutholog... Greg Farnum
07:12 PM Bug #16288 (In Progress): mds: `session evict` tell command blocks forever with async messenger (...
Douglas Fuller
01:31 PM Bug #16042 (In Progress): MDS Deadlock on shutdown active rank while busy with metadata IO
Patrick Donnelly
09:35 AM Support #16043: MDS is crashed
Greg, I sent a message with a link to my debug log to your email. The ceph-post-file service has become unstable... Andrey Matyashov

06/17/2016

08:46 PM Bug #16164: mds: enforce a dirfrag limit on entries
PR here: https://github.com/ceph/ceph/pull/9789 Patrick Donnelly
05:50 PM Bug #16367 (Need More Info): libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Greg Farnum
05:49 PM Bug #16367: libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Can you please:
1) run ls -lha on the directory you're testing in
2) do your tests
3) run ls -lha on all the releva...
Greg Farnum
03:15 PM Bug #16367 (Resolved): libcephfs: UID parsing breaks root squash (Ganesha FSAL)
Testing with ganesha 2.4-o-dev20 and libcephfs 10.2.1:
I did set root squash on in the ganesha.conf, but as root I c...
Kenneth Waegeman
05:14 PM Support #16043: MDS is crashed
Please set "debug mds = 20" and "debug mds log = 20" in your ceph.conf, turn it on, and then upload the mds log file ... Greg Farnum
04:04 AM Bug #16358 (Fix Under Review): Session::check_access() is buggy
https://github.com/ceph/ceph/pull/9769 Zheng Yan
03:53 AM Bug #16358 (Resolved): Session::check_access() is buggy
It calls CInode::make_path_string(path, false, in->get_projected_parent_dn()). The second argument 'false' makes the ... Zheng Yan

06/16/2016

03:14 PM Bug #16255: ceph-create-keys: sometimes blocks forever if mds "allow" is set
> The loop you're seeing presumably is only occurring when /etc/ceph/ceph.client-admin.keyring has been removed.
e...
Dietmar Maurer
03:05 PM Bug #16255: ceph-create-keys: sometimes blocks forever if mds "allow" is set
The difference between "allow" and "allow *" is that the "*" is necessary in more recent versions to issue 'tel... John Spray
02:39 PM Fix #16276: Update TestSessionMap.test_mount_conn_close for async messenger
NB back out part of https://github.com/ceph/ceph-qa-suite/pull/1054 when fixing this, it's switched back to simple me... John Spray
02:29 PM Fix #16276: Update TestSessionMap.test_mount_conn_close for async messenger
http://pulpito.ceph.com/gregf-2016-06-10_19:20:53-fs-greg-fs-testing-610---basic-mira/250875/ Greg Farnum
02:39 PM Bug #16288: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClie...
NB back out part of https://github.com/ceph/ceph-qa-suite/pull/1054 when fixing this, it's switched back to simple me... John Spray
02:38 PM Bug #16288: mds: `session evict` tell command blocks forever with async messenger (TestVolumeClie...
This deadlocks and lockdep makes it crash in our nightlies; we should fix it quickly! :) Greg Farnum
02:37 PM Feature #14271 (Resolved): directory listing: do not reset when fragmenting
Greg Farnum
02:33 PM Support #16043: MDS is crashed
... Andrey Matyashov
02:31 PM Support #16043: MDS is crashed
I upgraded my cluster to 10.2.2; the situation has not changed. Andrey Matyashov
01:57 PM Support #16043 (Need More Info): MDS is crashed
This probably isn't an issue any more, but if it is, upgrade to 10.2.2 and report back if it's still an issue. Greg Farnum
02:26 PM Feature #11171 (In Progress): Path filtering on "dump cache" asok
Douglas Fuller
02:21 PM Backport #16284 (Resolved): jewel: directory listing: do not reset when fragmenting
This was done as part of #16251. Greg Farnum
11:54 AM Bug #16298 (Resolved): mds: failure in tasks/migration.yaml
John Spray
11:15 AM Bug #16322: ceph mds getting killed for no reason
$gdb /usr/local/bin/ceph-mds
If gdb does not say "no debugging symbols found", the debug package is properly insta...
Zheng Yan
09:45 AM Bug #16322: ceph mds getting killed for no reason
Zheng Yan wrote:
> Your ceph-mds does not contain debuginfo, please install debuginfo package first. then start ceph...
Joao Castro
02:20 AM Bug #16322: ceph mds getting killed for no reason
Your ceph-mds does not contain debuginfo; please install the debuginfo package first. Then start ceph-mds manually with c... Zheng Yan
07:39 AM Backport #16136: jewel: MDSMonitor fixes
Original description:
These two commits:
https://github.com/ceph/ceph/pull/9418/commits/24b82bafffced97384135e5...
Loïc Dachary
 
