Activity

From 09/16/2015 to 10/15/2015

10/15/2015

01:35 PM Bug #13364 (In Progress): LibCephFS locking tests are failing and/or lockdep asserting
Looking at this now, it's weird. Greg Farnum

10/14/2015

04:13 AM Bug #13486 (Closed): ceph-fuse crush
There are several patches in v0.80.10 that fix bugs you might have hit. Please upgrade and re-open if you see this ag... Greg Farnum
03:59 AM Bug #13486 (Closed): ceph-fuse crush
cluster b23b48bf-373a-489c-821a-31b60b5b5af0
health HEALTH_OK
monmap e1: 3 mons at {node1=192.168.0.2...
domain tian
12:32 AM Bug #13437 (Resolved): TestClientLimits failing: not seeing the expected MDS log from stuck clien...
Cherry picked onto infernalis. John Spray

10/13/2015

09:52 PM Bug #13437 (Pending Backport): TestClientLimits failing: not seeing the expected MDS log from stu...
Merged the fix to master, needs backport to infernalis John Spray
08:54 AM Bug #13437 (Fix Under Review): TestClientLimits failing: not seeing the expected MDS log from stu...
https://github.com/ceph/ceph-qa-suite/pull/624 Zheng Yan
07:58 AM Bug #13472 (Duplicate): ceph-fuse crash
mds log:
2015-10-13 15:26:14.736061 7fdf39c4e700 0 mds.0.server handle_client_file_setlock: start: 1073741826, leng...
domain tian

10/12/2015

01:41 PM Bug #13437: TestClientLimits failing: not seeing the expected MDS log from stuck client tid updates
Yeah, https://github.com/ceph/ceph/pull/4791 is in the infernalis branch! Greg Farnum
04:35 AM Bug #13437: TestClientLimits failing: not seeing the expected MDS log from stuck client tid updates
It's failing on master too: http://pulpito.ceph.com/teuthology-2015-10-09_23:04:01-fs-master---basic-multi/1098621/
...
Greg Farnum
06:30 AM Bug #13443 (Fix Under Review): Ceph-fuse won't start correctly when the option log_max_new in cep...
Nathan Cutler

10/10/2015

06:50 AM Bug #13443: Ceph-fuse won't start correctly when the option log_max_new in ceph.conf set to zero
I have made a pull request here https://github.com/ceph/ceph/pull/6224 Wenjun Huang
06:28 AM Bug #13443 (Resolved): Ceph-fuse won't start correctly when the option log_max_new in ceph.conf s...
We found a bug in ceph-fuse when we set *log_max_new = 0* in ceph.conf, which causes the ceph-fuse process to hang when i... Wenjun Huang
01:56 AM Bug #12806 (Resolved): nfs restart failures
Zheng Yan
01:39 AM Bug #13437: TestClientLimits failing: not seeing the expected MDS log from stuck client tid updates
https://github.com/ceph/ceph-qa-suite/pull/615 Zheng Yan
01:32 AM Bug #13437: TestClientLimits failing: not seeing the expected MDS log from stuck client tid updates
The cause is that the infernalis branch of ceph does not contain the client cap flush changes, but the infernalis branch of c... Zheng Yan

10/09/2015

07:40 PM Bug #13439 (Rejected): TestDamage does not wait for MDS to be running
This wasn't running with the vstart branch fixes, which I assume is the cause of this. Greg Farnum
07:38 PM Bug #13439 (Rejected): TestDamage does not wait for MDS to be running
http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-05_23:04:01-fs-master---basic-multi/1091491/... Greg Farnum
07:13 PM Bug #13437 (Resolved): TestClientLimits failing: not seeing the expected MDS log from stuck clien...
Here's one of them: http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-06_23:04:02-fs-infernalis---basic-multi/10... Greg Farnum
02:31 AM Bug #13067 (Resolved): MDSRank unhealthy on hammer -> infernalis upgrade
Zheng Yan
02:31 AM Bug #13166 (Resolved): MDS: standby-replay does not change client_incarnation properly
Zheng Yan
02:30 AM Bug #13268: Secondary groups are not read /w SAMBA 4.2.2 & CEPHFS VFS module
development version of ceph Zheng Yan

10/08/2015

10:06 PM Bug #13334 (Resolved): delayed revoke warning in test_client_recovery test
https://github.com/ceph/ceph/pull/6170 Greg Farnum
09:42 PM Bug #13364: LibCephFS locking tests are failing and/or lockdep asserting
If somebody else wants to investigate this, feel free, but otherwise I'll try and get to it. Greg Farnum
09:42 PM Bug #13364: LibCephFS locking tests are failing and/or lockdep asserting
Sage says this passes when run against a vstart cluster. Greg Farnum
04:22 AM Bug #13364: LibCephFS locking tests are failing and/or lockdep asserting
https://github.com/ceph/ceph/pull/6200
fixes the path check.
These still fail:
[ RUN ] LibCephFS.InterP...
Sage Weil

10/07/2015

10:18 PM Bug #13364: LibCephFS locking tests are failing and/or lockdep asserting
Run: http://pulpito.ceph.com/teuthology-2015-10-07_05:00:08-smoke-master-distro-basic-multi/
Logs: http://qa-proxy.c...
Yuri Weinstein
08:58 PM Bug #13256 (Resolved): I/O error with cephfs accessing root .snap directory on v9.0.3
Greg Farnum
08:57 PM Bug #13256: I/O error with cephfs accessing root .snap directory on v9.0.3
Greg Farnum

10/06/2015

06:00 PM Bug #13364: LibCephFS locking tests are failing and/or lockdep asserting
Sage Weil
02:19 PM Support #13267 (Closed): mds heap stats cause warn message but working
Unless there are other symptoms, this log warning is not a problem. It's an indication that a client disconnected, wh... Greg Farnum

10/05/2015

04:09 PM Bug #13364 (Resolved): LibCephFS locking tests are failing and/or lockdep asserting
Run: http://qa-proxy.ceph.com/teuthology/teuthology-2015-10-04_05:00:08-smoke-master-distro-basic-multi
Job: 1088065...
Yuri Weinstein

10/02/2015

09:11 PM Bug #13334: delayed revoke warning in test_client_recovery test
Oh, it's even simpler. Can just switch the order of locker->tick and server->find_idle_sessions to get rid of this b... John Spray
09:04 PM Bug #13334: delayed revoke warning in test_client_recovery test
Yeah, this is racy because mds_session_timeout is 60s, and so is the threshold for emitting that warning.
Actually...
John Spray
05:59 AM Bug #13334 (Resolved): delayed revoke warning in test_client_recovery test
http://pulpito.ceph.com/teuthology-2015-09-29_23:04:01-fs-infernalis---basic-multi/1077881/... Greg Farnum
08:16 AM Bug #12297 (Resolved): ceph-fuse 0.94.2-1trusty segfaults / aborts
Loïc Dachary

10/01/2015

11:14 AM Feature #13316 (Closed): cephfs-data-scan: delete unlinked file objects
Oh, never mind. I just realised that we never have this case, because during scan_extents we'll always create the 0t... John Spray
11:09 AM Feature #13316 (Closed): cephfs-data-scan: delete unlinked file objects

Two types:
1 Non-0th data objects who have no 0th object
2 0th data objects that were not tagged in the latest fo...
John Spray
05:35 AM Support #13267: mds heap stats cause warn message but working
Still waiting for a solution. Sergey Mir

09/30/2015

09:42 PM Documentation #13311: explain user permission syntax, details
object_prefix is not mentioned in this doc at all Dan Mick
09:36 PM Documentation #13311: explain user permission syntax, details
joshd notes that the ceph-authtool manpage has more explanation; maybe at least a "see also", but perhaps some of the... Dan Mick
09:33 PM Documentation #13311 (Resolved): explain user permission syntax, details
AFAICT http://docs.ceph.com/docs/master/rados/operations/user-management/ doesn't explain:
# for an OSD, 'class-re...
Dan Mick
05:39 AM Support #13055: Problem with disconnect fuse by mds
thx it works. Sergey Mir

09/29/2015

03:02 PM Bug #13268: Secondary groups are not read /w SAMBA 4.2.2 & CEPHFS VFS module
I have cloned https://github.com/ceph/samba.git but it doesn't work with secondary groups Dennis Kramer
01:58 PM Bug #13268: Secondary groups are not read /w SAMBA 4.2.2 & CEPHFS VFS module
You mean the development version of SAMBA? Which version? Dennis Kramer
01:42 PM Bug #13268: Secondary groups are not read /w SAMBA 4.2.2 & CEPHFS VFS module
Secondary groups should work if you use the newest development code. I'm working on POSIX ACL support. Zheng Yan
06:58 AM Bug #13268 (Resolved): Secondary groups are not read /w SAMBA 4.2.2 & CEPHFS VFS module
Tested with a Windows 7 & 2008 client.
When I'm using the ceph_vfs module in SAMBA it seems secondary groups are n...
Dennis Kramer
02:17 PM Bug #13271: Missing dentry in cache when doing readdirs under cache pressure (?????s in ls-l)
I think this is 62dd63761701a7e0f7ce39f4071dcabc19bb1cf4, which doesn't appear to be in a release yet. See #12297.
...
Greg Farnum
02:02 PM Bug #13271 (New): Missing dentry in cache when doing readdirs under cache pressure (?????s in ls-l)
The original reporter confirmed that he saw this with 0.94 John Spray
01:22 PM Bug #13271: Missing dentry in cache when doing readdirs under cache pressure (?????s in ls-l)
It's fixed by a5984ba34cb684dae623df22e338f350c8765ba5. Updating to v0.87.1 should fix this problem. Zheng Yan
01:09 PM Bug #13271 (Closed): Missing dentry in cache when doing readdirs under cache pressure (?????s in ...
The above circumstance is not supposed to happen, because dirp->buffer holds references to inodes.
I found it's bug i...
Zheng Yan
10:53 AM Bug #13271 (Resolved): Missing dentry in cache when doing readdirs under cache pressure (?????s i...

Fuse sends us a series of readdirs, because it only accepts so many entries in each. We only do an MDS request on ...
John Spray
10:54 AM Feature #13272 (New): Run fuse client workloads under cache pressure

We should run at least some workloads in situations where we set client_cache_size to something fairly low, so that...
John Spray
06:44 AM Support #13267 (Closed): mds heap stats cause warn message but working
When I use the command ceph mds.node2 (or any of the mds nodes) heap stats, I see output of memory usage, but in a log file... Sergey Mir
05:25 AM Bug #11783: protocol: flushing caps on MDS restart can go bad
I guess I should note that I only saw this the once and it also included a (slightly outdated) version of the vstart ... Greg Farnum
05:16 AM Bug #11783: protocol: flushing caps on MDS restart can go bad
I merged this by checking that it's working manually, but the testing isn't behaving properly so I haven't merged tha... Greg Farnum
05:21 AM Bug #13167 (Resolved): mds: replay gets stuck (on out-of-order journal replies?)
Greg Farnum

09/28/2015

01:33 PM Bug #13256 (Fix Under Review): I/O error with cephfs accessing root .snap directory on v9.0.3
https://github.com/ceph/ceph/pull/6095 Zheng Yan
11:52 AM Feature #13259 (New): Option to disable always writing backtraces to the default data pool
Currently, this is how we deal with hardlinks vs. layouts specifying non-default data pools: files in the non-default... John Spray

09/26/2015

05:07 PM Bug #13256 (Resolved): I/O error with cephfs accessing root .snap directory on v9.0.3
I am running a test Ceph cluster using Ceph v9.0.3 with all Kernels at 4.2.0 on Ubuntu Trusty. I have enabled snapsho... Eric Eastman

09/25/2015

02:03 PM Support #13211 (Closed): profiler and getting some memory info with it
I think this got handled in irc. Greg Farnum
01:39 PM Support #13211: profiler and getting some memory info with it
There is only root, no other users. Even if I run it from localhost there is the same error - 1 mds.0.39 handle_command: re... Sergey Mir
08:24 AM Support #13211: profiler and getting some memory info with it
Greg Farnum wrote:
> Apparently you're not using a client with enough caps on the MDS to give it instructions. The c...
Sergey Mir
05:29 AM Support #13211: profiler and getting some memory info with it
I ran it even from the mds node whose heap stats I'm trying to get, so I guess the problem here is not in the client...
what...
Sergey Mir

09/24/2015

08:21 PM Bug #12674 (Resolved): Semi-reproducible crash of ceph-fuse
Nathan Cutler
08:12 PM Feature #13231: kclient: support SELinux
IMHO, it might be the missing hooks like security_inode_init_security() calls. Huamin Chen
08:03 PM Feature #13231: kclient: support SELinux
Greg, from the first 2nd test, ceph fs was able to set xattr (thanks to #1878). But ceph failed to set security.secur... Huamin Chen
07:43 PM Feature #13231: kclient: support SELinux
See #5486, #1878, and others in the tracker — I think CephFS is ready for support now, but SELinux needs to get modif... Greg Farnum
07:19 PM Feature #13231 (Duplicate): kclient: support SELinux
I cannot set SELinux labels on a ceph mount.
Environment:
[root@host16-rack08 ~]# modinfo ceph
filename: /l...
Huamin Chen
06:35 PM Support #13211: profiler and getting some memory info with it
Apparently you're not using a client with enough caps on the MDS to give it instructions. The client.admin key that's... Greg Farnum
01:20 PM Support #13211: profiler and getting some memory info with it
Same situation with the mds memory check on virtual machines (the same osd and mon checks go fine):
root@node1:~# ceph ...
Sergey Mir
10:05 AM Feature #12334: nfs-ganesha: handle client cache pressure in NFS Ganesha FSAL
Pinged Matt & Adam about this yesterday, Matt's planning to work on it at some stage. In some cases we may want to s... John Spray

09/23/2015

04:30 PM Feature #12334: nfs-ganesha: handle client cache pressure in NFS Ganesha FSAL
It looks like the ganesha FSAL interface already includes the function `up_async_invalidate` for this sort of thing, ... John Spray
02:55 PM Support #13211: profiler and getting some memory info with it
p.s.
mon and ceph info works fine:
ceph tell osd.0 heap stats
ceph tell mon.mon00 heap stats
Sergey Mir
02:26 PM Support #13211 (Closed): profiler and getting some memory info with it
ceph version 0.94.3 (95cefea9fd9ab740263bf8bb4796fd864d9afe2b)
3.13.0-61-generic #100-Ubuntu 2015 x86_64 x86_64 x86...
Sergey Mir
07:17 AM Bug #12674: Semi-reproducible crash of ceph-fuse
I can confirm that the issue is gone with the latest ceph source from Git. Sorry for taking so long to build and test... Jörg Henne
03:20 AM Bug #13166: MDS: standby-replay does not change client_incarnation properly
For firefly, the standby-replay MDS also uses 0 as client_incarnation; its ID is MDS.x.0. Zheng Yan
01:15 AM Bug #13166: MDS: standby-replay does not change client_incarnation properly
Zheng, can you dig up a firefly test run and make sure the behavior of standby-replay daemons there is the same as it... Greg Farnum

09/22/2015

01:49 PM Bug #13129 (Rejected): qa: not starting ceph-mds daemons
Greg Farnum
08:33 AM Bug #13167 (Fix Under Review): mds: replay gets stuck (on out-of-order journal replies?)
Zheng Yan
08:33 AM Bug #13167: mds: replay gets stuck (on out-of-order journal replies?)
https://github.com/ceph/ceph/pull/6025 Zheng Yan
02:40 AM Bug #12674: Semi-reproducible crash of ceph-fuse
It's likely been fixed by pull request https://github.com/ceph/ceph/pull/4753 (it's a large change, we haven't back-por... Zheng Yan
02:30 AM Support #13055: Problem with disconnect fuse by mds
It's likely been fixed by pull request https://github.com/ceph/ceph/pull/4753. Could you try compiling ceph-fuse from... Zheng Yan

09/21/2015

11:14 PM Feature #12204 (Resolved): ceph-fuse: warn and shut down when there is no MDS present
https://github.com/ceph/ceph/pull/5416
Thanks Yuan Zhou!
Greg Farnum
11:13 PM Bug #12971 (Resolved): TestQuotaFull fails
https://github.com/ceph/ceph/pull/5942 Greg Farnum
11:10 PM Bug #11835 (Resolved): FuseMount.umount_wait can hang
Greg Farnum
10:51 PM Bug #12506 (Resolved): "Fuse mount failed to populate" error
Greg Farnum
09:18 PM Bug #13167: mds: replay gets stuck (on out-of-order journal replies?)
We should be detecting holes in the journal and shutting down with a nice message or clear assert or something instea... Greg Farnum
12:59 PM Bug #13167 (Duplicate): mds: replay gets stuck (on out-of-order journal replies?)
The journal's write_pos seems to point somewhere in object 200.00000002, but the size of object 200.00000001 is 313... Zheng Yan
08:31 PM Feature #13193 (Duplicate): qa: test behavior on cache pools
John Spray
06:48 PM Feature #13193 (Duplicate): qa: test behavior on cache pools
We don't run the FS on any cache pools right now. We should start doing so on an appropriate assortment (TBD). Probab... Greg Farnum
01:44 PM Bug #12674: Semi-reproducible crash of ceph-fuse
Please also note that this crash can be triggered with a simple operation of the style
> find . -name \*.txt | xa...
Jörg Henne
01:20 PM Bug #12674: Semi-reproducible crash of ceph-fuse
PR https://github.com/ceph/ceph/pull/4753 may fix this issue. Could you try compiling ceph-fuse from the newest ceph ... Zheng Yan
11:10 AM Bug #12674: Semi-reproducible crash of ceph-fuse
> Status changed from New to Need More Info
What kind of info are you looking for? As I've just hit another situat...
Jörg Henne
01:32 PM Bug #13166: MDS: standby-replay does not change client_incarnation properly
Zheng Yan
07:44 AM Bug #13166 (Fix Under Review): MDS: standby-replay does not change client_incarnation properly
https://github.com/ceph/ceph/pull/6003 Zheng Yan
01:25 PM Support #13055: Problem with disconnect fuse by mds
Attached the latest logs.
The last error message is different from the previous one, so maybe I should attach others when they happen?
Sergey Mir
01:16 PM Support #13055: Problem with disconnect fuse by mds
I can't open the issue either, but this seems to be a duplicate of #12674. Zheng Yan
01:08 PM Support #13055: Problem with disconnect fuse by mds
We use clients with kernel version 3.13.0 (current client host which got that error have 3.13.0-58-generic, others ha... Sergey Mir
03:05 AM Support #13055: Problem with disconnect fuse by mds
We never saw this backtrace before. Which version of kernel are you using? Please apply the attached debug patch to t... Zheng Yan

09/20/2015

05:57 PM Backport #12097 (In Progress): kernel_untar_build fails on EL7
Nathan Cutler

09/19/2015

12:07 AM Bug #13167 (Resolved): mds: replay gets stuck (on out-of-order journal replies?)
ubuntu-2015-09-17_16:55:52-fs-greg-fs-testing---basic-multi/1061690/ceph-mds.a.log
This MDS went in and out of rep...
Greg Farnum
12:04 AM Bug #13166: MDS: standby-replay does not change client_incarnation properly
If we need more logs, I copied the standby MDS log to ubuntu-2015-09-17_16:55:52-fs-greg-fs-testing---basic-multi/106... Greg Farnum

09/18/2015

11:52 PM Bug #13166: MDS: standby-replay does not change client_incarnation properly
Obvious fix is to have MDSRank check the incarnation and update, but I want us to look more deeply at how the replayi... Greg Farnum
11:24 PM Bug #13166: MDS: standby-replay does not change client_incarnation properly
Hmm, I don't think we should actually be doing operations as mds.0.0 when we're a standby for the real mds.0.0 either... Greg Farnum
11:22 PM Bug #13166: MDS: standby-replay does not change client_incarnation properly
Greg Farnum
10:59 PM Bug #13166 (Resolved): MDS: standby-replay does not change client_incarnation properly
... Greg Farnum
01:02 PM Support #13055: Problem with disconnect fuse by mds
here it is:
(gdb) bt
#0  0x00007f9afcad779b in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#...
Sergey Mir
12:50 PM Support #13055: Problem with disconnect fuse by mds
The coredump file is useless without the executable file. Could you use gdb to check the coredump file and send the backtrace to us ... Zheng Yan
06:16 AM Support #13055: Problem with disconnect fuse by mds
http://148.251.180.58/files/cores/core.ceph-fuse.15542.tar.gz - core dump
attached file-log
Sergey Mir
05:40 AM Support #13055: Problem with disconnect fuse by mds
Greetings.
We have the same problem after compiling and merging from git. Still ceph-fuse disconnects. after some ti...
Sergey Mir
01:36 AM Bug #12506 (In Progress): "Fuse mount failed to populate" error
https://github.com/ceph/ceph/pull/5966#issuecomment-141302916 Greg Farnum

09/17/2015

04:03 PM Bug #12506 (Fix Under Review): "Fuse mount failed to populate" error
https://github.com/ceph/ceph/pull/5966 Zheng Yan
12:48 PM Bug #12506: "Fuse mount failed to populate" error
This timeout only happens for jobs that contain clusters/standby-replay.yaml. I reproduced this issue locally by setting "... Zheng Yan
02:52 PM Bug #13129 (Need More Info): qa: not starting ceph-mds daemons
That commit is now about 16 hours old, and these jobs were started about 24 hours ago — although they didn't get mach... Greg Farnum
07:52 AM Bug #13129: qa: not starting ceph-mds daemons
The bug was introduced by commit 685d76a77ca16ca601a99148ef507cfde1fb3593 "ceph: wait for CephFS to be healthy before ... Zheng Yan
12:56 PM Bug #13138: Segfault shutting down python-cephfs
Zheng Yan
12:48 PM Bug #13138 (Fix Under Review): Segfault shutting down python-cephfs
https://github.com/ceph/ceph/pull/5964 John Spray
10:36 AM Bug #13138: Segfault shutting down python-cephfs
So perhaps we weren't seeing this in automated tests because you have to keep the process alive for a while after shu... John Spray
10:17 AM Bug #13138 (Resolved): Segfault shutting down python-cephfs
... John Spray

09/16/2015

10:42 PM Bug #13129 (Rejected): qa: not starting ceph-mds daemons
http://pulpito.ceph.com/teuthology-2015-09-14_23:10:02-knfs-master-testing-basic-multi/1057697
http://pulpito.ceph.c...
Greg Farnum
10:23 PM Bug #12506: "Fuse mount failed to populate" error
This is continuing to cause trouble in the nightlies. http://pulpito.ceph.com/teuthology-2015-09-14_23:04:02-fs-maste... Greg Farnum
09:50 PM Fix #13126: qa: ceph-fuse flushes very slowly in some workunits
This is blocked by/equivalent to #13127 in the Ceph project. Greg Farnum
09:45 PM Fix #13126 (Resolved): qa: ceph-fuse flushes very slowly in some workunits
Our ffsb and fsync workunits both generate lots of small IOs at random offsets, each of which becomes its own Op to t... Greg Farnum
 
