Activity

From 01/18/2017 to 02/16/2017

02/16/2017

09:58 PM Backport #18708 (In Progress): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_n...
Nathan Cutler
02:43 PM Backport #18708 (Fix Under Review): jewel: failed filelock.can_read(-1) assertion in Server::_dir...
https://github.com/ceph/ceph/pull/13459 Zheng Yan
09:16 PM Feature #17980 (In Progress): MDS should reject connections from OSD-blacklisted clients
There's a jcsp/wip-17980 with a first cut of this. John Spray
03:32 PM Feature #18490: client: implement delegation support in userland cephfs
Zheng asked a pointed question about this today, so to be clear...
This would be 100% an opportunistic thing. You ...
Jeff Layton
08:31 AM Bug #18953 (Fix Under Review): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
https://github.com/ceph/ceph/pull/13455 Zheng Yan
08:13 AM Bug #18953 (Resolved): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
clients should be able to acquire/release file locks when fs is full Zheng Yan

02/15/2017

10:54 PM Backport #18950 (Resolved): kraken: mds/StrayManager: avoid reusing deleted inode in StrayManager...
https://github.com/ceph/ceph/pull/14570 Loïc Dachary
10:54 PM Backport #18949 (Resolved): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManager:...
https://github.com/ceph/ceph/pull/14670 Loïc Dachary
10:47 PM Backport #18100 (Resolved): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jewel 10.2.3
Loïc Dachary
10:47 PM Backport #18679 (Resolved): jewel: failed to reconnect caps during snapshot tests
Loïc Dachary
06:23 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Jeff Layton wrote:
> Greg Farnum wrote:
> > Gah. I've run out of time to work on this right now. I've got a branch ...
Greg Farnum
11:53 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Greg Farnum wrote:
> Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsf...
Jeff Layton
01:39 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsfortytwo/ceph.git which... Greg Farnum
04:02 AM Bug #18941 (Fix Under Review): buffer overflow in test LibCephFS.DirLs
https://github.com/ceph/ceph/pull/13429 Zheng Yan
03:57 AM Bug #18941 (Resolved): buffer overflow in test LibCephFS.DirLs
http://pulpito.ceph.com/jspray-2017-02-14_02:39:19-fs-wip-jcsp-testing-20170214-distro-basic-smithi/812889 Zheng Yan
12:26 AM Feature #18490: client: implement delegation support in userland cephfs
John Spray wrote:
> So the "client completely unresponsive but only evict it when someone else wants its caps" case ...
Jeff Layton

02/14/2017

10:25 PM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
John Spray
10:24 PM Bug #18877 (Pending Backport): mds/StrayManager: avoid reusing deleted inode in StrayManager::_pu...
John Spray
10:21 PM Feature #18490: client: implement delegation support in userland cephfs
So the "client completely unresponsive but only evict it when someone else wants its caps" case is http://tracker.cep... John Spray
04:27 PM Feature #18490: client: implement delegation support in userland cephfs
I started taking a look at this. One thing we have to solve first, is that I don't think there is any automatic resol... Jeff Layton
06:25 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
Actually, it doesn't fail with stale. Instead the NFS mount command eventually times out.
mount.nfs: Connection timed out
c sights
06:23 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
Happens for me also:
Debian Jessie with backported kernel
Linux drbl 4.8.0-0.bpo.2-amd64 #1 SMP Debian 4.8.15-2~bpo...
c sights
11:10 AM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
John Spray
12:29 AM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
Oh yeah, the client does have code that's meant to be doing that, and on the client side it's a wait_for_latest. So ... John Spray

02/13/2017

07:05 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
That's odd; I thought clients validated pools before passing them to the mds. Maybe that's wrong or undesirable for o... Greg Farnum
12:29 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
Hmm, so this is happening because volume client creates a pool, then tries to use it as a layout at a time before its... John Spray
07:43 AM Bug #18914 (Resolved): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client....
http://qa-proxy.ceph.com/teuthology/teuthology-2017-02-12_10:10:02-fs-jewel---basic-smithi/808861/ Zheng Yan
02:38 PM Bug #18838 (Fix Under Review): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
https://github.com/ceph/ceph/pull/13394 Kefu Chai
12:04 PM Bug #18915: valgrind causes ceph-fuse mount_wait timeout
Probably duplicate of http://tracker.ceph.com/issues/18797 John Spray
08:15 AM Bug #18915 (New): valgrind causes ceph-fuse mount_wait timeout
http://pulpito.ceph.com/teuthology-2017-02-11_17:15:02-fs-master---basic-smithi/
http://qa-proxy.ceph.com/teutholo...
Zheng Yan
08:58 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
make check seems to get stuck after PASS: unittest_log on unittest_throttle.
_edit_ The machine has only very littl...
Jan Fajerski
07:22 AM Backport #18900 (Resolved): jewel: Test failure: test_open_inode
https://github.com/ceph/ceph/pull/14669 Loïc Dachary
07:22 AM Backport #18899 (Resolved): kraken: Test failure: test_open_inode
https://github.com/ceph/ceph/pull/14569 Loïc Dachary

02/11/2017

12:27 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
The commit is from the SUSE repo. It's part of the ses4 branch: https://github.com/SUSE/ceph/commits/ses4. Sorry shoul... Jan Fajerski
12:22 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
PPC clients! Wondering if you've tried running any of the automated tests (the unit tests, or teuthology suites?) on... John Spray

02/10/2017

10:23 PM Bug #18816: MDS crashes with log disabled
For some reason we still let people disable the MDS log. That's...bad. I think it only existed for some cheap benchma... Greg Farnum
10:18 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
Well, the problem is clearly indicated by the client... Greg Farnum
05:48 PM Bug #18661 (Pending Backport): Test failure: test_open_inode
John Spray
02:44 PM Bug #18883 (New): qa: failures in samba suite

First Samba run in ages:
http://pulpito.ceph.com/zack-2017-02-08_12:21:51-samba-master---basic-smithi/
Let's ge...
John Spray
12:20 PM Bug #18882 (New): StrayManager::advance_delayed() can use tens of seconds
I saw mds become laggy when running blogbench in a loop. The command I ran is "while `true`; do ls | xargs -P8 -n1 rm... Zheng Yan
03:26 AM Bug #18877: mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stray_logged
https://github.com/ceph/ceph/pull/13347 Zhi Zhang
03:25 AM Bug #18877 (Resolved): mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stra...
This issue was found by testing another PR (https://github.com/ceph/ceph/pull/12792), which makes the MDS directly use T... Zhi Zhang

02/09/2017

03:36 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
Also daemon commands don't return anything. That is for the client mds_requests and objecter_requests and ops_in_flig... Jan Fajerski
03:27 PM Bug #18872 (Resolved): write to cephfs mount hangs, ceph-fuse and kernel
When trying to write to a cephfs mount using 'dd' the client hangs indefinitely. The kernel client can be <ctrl-c>'ed... Jan Fajerski
09:37 AM Bug #18816: MDS crashes with log disabled
/// * Updated by Ahmed Akhuraidah in ML
The issue can be reproduced with upstream Ceph packages.
ahmed@ubcephno...
Shinobu Kinjo

02/07/2017

09:51 PM Bug #18850: Leak in MDCache::handle_dentry_unlink
Attached full valgrind from mds.a in jspray-2017-02-07_16:25:53-multimds-wip-jcsp-testing-20170206-testing-basic-smit... John Spray
09:49 PM Bug #18850 (Rejected): Leak in MDCache::handle_dentry_unlink

While there are various bits of valgrind noise going around at the moment, this one does look like a multimds speci...
John Spray
08:34 PM Bug #18845: valgrind failure in fs suite
also at http://pulpito.ceph.com/abhi-2017-02-07_15:12:56-fs-wip-luminous-2---basic-smithi/795353/ Abhishek Lekshmanan
03:38 PM Bug #18845 (New): valgrind failure in fs suite
Saw a valgrind failure on fs suite on the master branch as of fc2df15, run http://pulpito.ceph.com/abhi-2017-02-07_09... Abhishek Lekshmanan

02/06/2017

11:37 PM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
I've also seen this one rear its ugly head again in my last fs run.
I looked at this one:
/a/jspray-2017-02-06_11...
John Spray
11:07 PM Bug #18797: valgrind jobs hanging in fs suite
After noticing http://tracker.ceph.com/issues/18838, I now also notice that the suspect commit range where this start... John Spray
11:05 PM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
http://pulpito.ceph.com/jspray-2017-02-06_11:13:20-fs-wip-jcsp-testing-20170204-distro-basic-smithi/789908... John Spray
07:04 PM Bug #16397 (Resolved): nfsd selinux denials causing knfs tests to fail
I've not heard of any further reports of this since the new package went into production. I'm going to declare victor... Jeff Layton
02:20 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
OK we'll try to get that next time. Dan van der Ster
02:19 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Hmm, client's failing to participate in failover is probably not the same as #18757, as that one was the result of cl... John Spray
11:59 AM Bug #18830 (Fix Under Review): Coverity: bad iterator dereference in Locker::acquire_locks
https://github.com/ceph/ceph/pull/13272 John Spray
11:31 AM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
... John Spray

02/05/2017

05:04 AM Feature #10792 (Fix Under Review): qa: enable thrasher for MDS cluster size (vary max_mds)
PR: https://github.com/ceph/ceph/pull/13262 Patrick Donnelly
01:36 AM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
This exact error popped up in a branch of mine (http://pulpito.ceph.com/gregf-2017-02-04_03:30:50-fs-wip-17594---basi... Greg Farnum

02/04/2017

07:12 AM Bug #18816: MDS crashes with log disabled
test Ahmed Akhuraidah
06:51 AM Bug #18816 (Resolved): MDS crashes with log disabled
Note the "mds_log = false" below. If you do that, this happens:
The MDS daemon crashed while executing different...
Ahmed Akhuraidah

02/03/2017

05:30 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Cache dump is here:
https://knowledgenetworkbc-my.sharepoint.com/personal/darrelle_knowledge_ca/_layouts/15/guesta...
Darrell Enns
07:24 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Does the mds cache dump need to be done while it is hung? It's a production system, so I wasn't able to leave it in a... Darrell Enns
06:57 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy Zheng Yan
01:40 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
please run 'ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0' and upload /tmp/cachedump.0. Besides, please check... Zheng Yan
12:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Comparing logs I noticed that MDS clock is ~30s behind client. ntpd was dead on one of test servers... Will try to co... Henrik Korkuc
10:17 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Actually, one more thing: following this crash, the clients did not fail over to the standby MDS. Processes accessing... Dan van der Ster
09:57 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Excellent! Sorry that my search of the tracker didn't find that issue.
We'll apply that backport when it's ready.
Dan van der Ster
09:50 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
duplicate of http://tracker.ceph.com/issues/18578. The fix is pending backport Zheng Yan
09:29 AM Bug #18802 (New): Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc:...
A user just did:... Dan van der Ster

02/02/2017

07:13 PM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
I've had two occurrences in the past 3 weeks where filesystem activity hangs, with the MDS reporting a client "failing t... Darrell Enns
05:15 PM Bug #18797: valgrind jobs hanging in fs suite
They're going dead during teardown when teuthology tries to list cephtest/ but something went wrong much earlier with... John Spray
05:08 PM Bug #18797: valgrind jobs hanging in fs suite
Ignore the last passing link above, that was actually a hammer run (which shows up as fs-master for some reason).
...
John Spray
04:55 PM Bug #18797 (Duplicate): valgrind jobs hanging in fs suite

First one to show the issue was:
http://pulpito.ceph.com/teuthology-2017-01-28_17:15:01-fs-master---basic-smithi
...
John Spray
11:12 AM Bug #18754 (Fix Under Review): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
https://github.com/ceph/ceph/pull/13227/commits/3b899fb0c6153c30ad6ef782499f79ba5c7b2a22 Zheng Yan
09:53 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
forgot to add a timeline:
08:08:29 mounted
08:09:54 iptables up
08:10:50 ls stuck
08:16:02 iptables down
08:17:5...
Henrik Korkuc
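The timeline above implies how long the client was cut off from the MDS. A minimal sketch of that arithmetic, using only the four fully visible timestamps (the truncated fifth entry is left out); the event labels are copied from the comment, while the helper name `seconds_between` is illustrative and not from the tracker:

```python
from datetime import datetime

# Timestamps copied from the timeline in the comment above.
# The last (truncated) entry is deliberately omitted.
events = [
    ("mounted", "08:08:29"),
    ("iptables up", "08:09:54"),
    ("ls stuck", "08:10:50"),
    ("iptables down", "08:16:02"),
]


def seconds_between(start, end):
    """Return whole seconds from start to end (same-day HH:MM:SS strings)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


times = dict(events)
# How long the firewall rules blocked traffic between client and MDS.
outage = seconds_between(times["iptables up"], times["iptables down"])
# How soon after the block started the ls command was seen to hang.
stuck_after = seconds_between(times["iptables up"], times["ls stuck"])
print(f"firewall active for {outage} s; ls hung {stuck_after} s after it went up")
```

Comparing `outage` with the configured "mds session timeout" is one way to tell whether a session could have expired during the test; the sketch does not assume any particular value for that setting.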
08:25 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Attaching client and MDS logs. This time I was mounting from another server and firewalling both input and output to ... Henrik Korkuc
04:10 AM Bug #18755 (Fix Under Review): multimds: MDCache.cc: 4735: FAILED assert(in)
https://github.com/ceph/ceph/pull/13227/commits/097dc86b330392c4aa89f0033ce6e40528481857 Zheng Yan

02/01/2017

11:14 PM Bug #18757 (New): Jewel ceph-fuse does not recover after lost connection to MDS
Hmm, now that I actually read the log (like a reasonable person :-)) it is a little bit strange that the server is se... John Spray
10:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Yes, I am reproducing it artificially after short network blip caused permanent mount point hangs of multiple servers... Henrik Korkuc
07:59 PM Bug #18757 (Rejected): Jewel ceph-fuse does not recover after lost connection to MDS
Clients which lose connectivity to the MDS are evicted after a timeout given by the "mds session timeout" setting. E... John Spray
12:26 PM Bug #18757 (Resolved): Jewel ceph-fuse does not recover after lost connection to MDS
After ceph-fuse loses connection to the MDS for a few minutes, it does not recover - accessing the mountpoint hangs processes.
...
Henrik Korkuc
06:53 PM Feature #18763 (New): Ganesha instances should restart themselves when blacklisted

If I evict the client session that a ganesha instance was using, then the libcephfs instance will start to see sess...
John Spray
02:25 PM Bug #18759 (Fix Under Review): multimds suite tries to run norstats tests against kclient
https://github.com/ceph/ceph/pull/13089 John Spray
01:11 PM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient

You get failures on the workunit like:
jspray-2017-02-01_02:51:18-multimds-wip-jcsp-testing-20170201-testing-basic...
John Spray
03:33 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
Zheng Yan
03:32 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
... Zheng Yan
01:11 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
There are actually 3 assertion failures (on 3 different MDS) in this run!... Patrick Donnelly
01:07 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
... Patrick Donnelly
02:37 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
Zheng Yan
02:19 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
ceph-mds.a.log... Zheng Yan
12:55 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
... Patrick Donnelly

01/31/2017

01:34 PM Bug #17801 (Resolved): Cleanly reject "session evict" command when in replay
Nathan Cutler
01:34 PM Backport #18010 (Resolved): jewel: Cleanly reject "session evict" command when in replay
Nathan Cutler
01:25 PM Bug #17193: truncate can cause unflushed snapshot data lose
Re-enable test: https://github.com/ceph/ceph/pull/13200 Nathan Cutler
01:17 PM Bug #17193 (Resolved): truncate can cause unflushed snapshot data lose
Nathan Cutler
01:17 PM Backport #18103 (Resolved): jewel: truncate can cause unflushed snapshot data lose
Nathan Cutler
01:16 PM Bug #18408 (Resolved): lookup of /.. in jewel returns -ENOENT
Nathan Cutler
01:16 PM Backport #18413 (Resolved): jewel: lookup of /.. in jewel returns -ENOENT
Nathan Cutler
01:15 PM Bug #18519 (Resolved): speed up readdir by skipping unwanted dn
Nathan Cutler
01:15 PM Backport #18520 (Resolved): jewel: speed up readdir by skipping unwanted dn
Nathan Cutler
01:14 PM Backport #18565 (Resolved): jewel: MDS crashes on missing metadata object
Nathan Cutler
01:14 PM Backport #18551 (Resolved): jewel: ceph-fuse crash during snapshot tests
Nathan Cutler
01:12 PM Backport #18282 (Resolved): jewel: monitor cannot start because of "FAILED assert(info.state == M...
Nathan Cutler
01:12 PM Bug #18086 (Resolved): cephfs: fix missing ll_get for ll_walk
Nathan Cutler
01:11 PM Backport #18195 (Resolved): jewel: cephfs: fix missing ll_get for ll_walk
Nathan Cutler
01:11 PM Bug #17954 (Resolved): standby-replay daemons can sometimes miss events
Nathan Cutler
01:10 PM Backport #18192 (Resolved): jewel: standby-replay daemons can sometimes miss events
Nathan Cutler
01:01 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Current status of lab cluster is:
* Fixed the "missing dirfrag object" damage with a script that removed the offe...
John Spray
11:59 AM Bug #18743 (Resolved): Scrub considers dirty backtraces to be damaged, puts in damage table even ...

Two things are wrong here:
* When running scrub_path /teuthology-archive on the lab filesystem, I get a flurry ...
John Spray
07:40 AM Bug #9935 (Resolved): client: segfault on ceph_rmdir path "/"
Nathan Cutler
07:39 AM Backport #18611 (Resolved): jewel: client: segfault on ceph_rmdir path "/"
Nathan Cutler

01/30/2017

07:28 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Would that have caused ESTALE? Dan Mick
01:41 PM Bug #18730 (Closed): mds: backtrace issues getxattr for every file with cap on rejoin

In Server::handle_client_reconnect, inode numbers that had client caps but were not in cache are passed into MDCa...
John Spray

01/28/2017

11:17 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Dan Mick wrote:
> [...]
teuthology-2016-12-18_02:01:14-rbd-master-distro-basic-smithi is not in root directory,...
Zheng Yan
04:37 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Tried them again tonight after repairing the broken stray object, and they worked this time. <shrug>
I guess the ...
Dan Mick

01/27/2017

10:55 PM Bug #17236 (Resolved): MDS goes damaged on blacklist (failed to read JournalPointer: -108 ((108) ...
Nathan Cutler
10:54 PM Bug #17832 (Resolved): "[ FAILED ] LibCephFS.InterProcessLocking" in jewel v10.2.4
Nathan Cutler
10:53 PM Bug #17270 (Resolved): [cephfs] fuse client crash when adding a new osd
Nathan Cutler
10:53 PM Bug #17275 (Resolved): MDS long-time blocked ops. ceph-fuse locks up with getattr of file
Nathan Cutler
10:52 PM Bug #17611 (Resolved): mds: false "failing to respond to cache pressure" warning
Nathan Cutler
10:51 PM Bug #17716 (Resolved): MDS: false "failing to respond to cache pressure" warning
Nathan Cutler
10:51 PM Bug #18131 (Resolved): ceph-fuse not clearing setuid/setgid bits on chown
Nathan Cutler
10:48 PM Bug #18361 (Resolved): Test failure: test_session_reject (tasks.cephfs.test_sessionmap.TestSessio...
Nathan Cutler
06:56 PM Bug #18717 (Resolved): multimds: FAILED assert(0 == "got export_cancel in weird state")
... Patrick Donnelly
06:31 PM Backport #18708 (Resolved): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_none...
https://github.com/ceph/ceph/pull/13459 Nathan Cutler
06:31 PM Backport #18707 (Resolved): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_non...
https://github.com/ceph/ceph/pull/13555 Nathan Cutler
06:31 PM Backport #18706 (Resolved): kraken: fragment space check can cause replayed request fail
https://github.com/ceph/ceph/pull/14568 Nathan Cutler
06:31 PM Backport #18705 (Resolved): jewel: fragment space check can cause replayed request fail
https://github.com/ceph/ceph/pull/14668 Nathan Cutler
04:44 PM Backport #18700: kraken: client: fix the cross-quota rename boundary check conditions
master one: https://github.com/ceph/ceph/pull/12489 Nathan Cutler
04:34 PM Backport #18700 (Resolved): kraken: client: fix the cross-quota rename boundary check conditions
https://github.com/ceph/ceph/pull/14567 John Spray
04:40 PM Bug #18660 (Pending Backport): fragment space check can cause replayed request fail
John Spray
04:38 PM Bug #11124 (Resolved): MDSMonitor: refuse to do "fs new" on metadata pools containing objects
John Spray
04:37 PM Bug #18578 (Pending Backport): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
John Spray
04:37 PM Backport #18699: jewel: client: fix the cross-quota rename boundary check conditions
master one: https://github.com/ceph/ceph/pull/12489 Nathan Cutler
04:34 PM Backport #18699 (Resolved): jewel: client: fix the cross-quota rename boundary check conditions
https://github.com/ceph/ceph/pull/14667 John Spray
01:18 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
They all return ESTALE. Not sure what else I need to be doing
Dan Mick
01:07 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
... Dan Mick
01:04 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Duh, ls was fine:
ls -ld * | sort -n -k 5
drwxrwxr-x 1 1001 1001 18446744057908416832 Jan 23 02:23 teuthology-201...
Dan Mick
01:00 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
I would have sworn Greg directed me to try that, but perhaps we didn't include 'force'. Shrug. Thanks for the help.... Dan Mick

01/26/2017

11:28 PM Bug #18691 (New): multimds thrash: FAILED assert(_head.empty())
... Patrick Donnelly
05:30 PM Backport #18100 (In Progress): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jewel ...
Here it is: https://github.com/ceph/ceph/pull/13139 John Spray
09:18 AM Backport #18100 (Need More Info): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jew...
@John - In http://tracker.ceph.com/issues/17837#note-18 you mentioned that you backported this to jewel. If you still... Nathan Cutler
04:57 PM Bug #18661 (Fix Under Review): Test failure: test_open_inode
https://github.com/ceph/ceph/pull/13137 John Spray
04:16 PM Bug #18661: Test failure: test_open_inode
You're right, that piece of test code is racy :-/ Need to get the remote python code running inside open_background ... John Spray
09:14 AM Bug #18661: Test failure: test_open_inode
Zheng Yan
08:32 AM Bug #18661: Test failure: test_open_inode
... Zheng Yan
09:23 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Zheng Yan
07:13 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
"ceph daemon mds.mira049 scrub_path / repair recursive force" will find and fix any other issue. But it will take ver... Zheng Yan
07:06 AM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Fixed by:
ceph daemon mds.mira049 scrub_path /teuthology-archive/sage-2016-11-12_02:26:45-rados-wip-sage-testing--...
Zheng Yan
09:09 AM Backport #18192 (In Progress): jewel: standby-replay daemons can sometimes miss events
Nathan Cutler
09:08 AM Backport #18195 (In Progress): jewel: cephfs: fix missing ll_get for ll_walk
Nathan Cutler
09:03 AM Bug #18675 (Fix Under Review): client: during multimds thrashing FAILED assert(session->requests....
https://github.com/ceph/ceph/pull/13124 Zheng Yan
09:03 AM Backport #18282 (In Progress): jewel: monitor cannot start because of "FAILED assert(info.state =...
Nathan Cutler
06:28 AM Backport #18652 (Resolved): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessionma...
Loïc Dachary
05:49 AM Bug #18680 (Resolved): multimds: cluster can assign active mds beyond max_mds during failures
From: http://pulpito.ceph.com/pdonnell-2017-01-25_22:42:21-multimds:thrash-wip-multimds-tests-testing-basic-mira/7483... Patrick Donnelly
04:18 AM Backport #18551 (In Progress): jewel: ceph-fuse crash during snapshot tests
Nathan Cutler
04:12 AM Backport #18565 (In Progress): jewel: MDS crashes on missing metadata object
Nathan Cutler

01/25/2017

11:52 PM Backport #18678: kraken: failed to reconnect caps during snapshot tests
http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-07_17:15:02-fs-master---basic-smithi/698957/
Loïc Dachary
11:20 PM Backport #18678 (In Progress): kraken: failed to reconnect caps during snapshot tests
https://github.com/ceph/ceph/pull/13112 John Spray
11:18 PM Backport #18678 (Resolved): kraken: failed to reconnect caps during snapshot tests
https://github.com/ceph/ceph/pull/13112 John Spray
11:52 PM Backport #18679: jewel: failed to reconnect caps during snapshot tests
http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-07_17:15:02-fs-master---basic-smithi/698957/ Loïc Dachary
11:23 PM Backport #18679 (In Progress): jewel: failed to reconnect caps during snapshot tests
https://github.com/ceph/ceph/pull/13113 John Spray
11:22 PM Backport #18679 (Resolved): jewel: failed to reconnect caps during snapshot tests
https://github.com/ceph/ceph/pull/13113 John Spray
11:35 PM Bug #18574 (Resolved): cephfs test failures (ceph.com/qa is broken, should be download.ceph.com/qa)
John Spray
11:34 PM Backport #18604 (Resolved): kraken: cephfs test failures (ceph.com/qa is broken, should be downlo...
John Spray
11:32 PM Bug #18309 (Resolved): TestVolumeClient.test_evict_client failure creating pidfile
John Spray
11:32 PM Backport #18439 (Resolved): kraken: TestVolumeClient.test_evict_client failure creating pidfile
John Spray
11:31 PM Backport #18540 (Resolved): kraken: Test failure: test_session_reject (tasks.cephfs.test_sessionm...
John Spray
11:29 PM Backport #18612 (Resolved): kraken: client: segfault on ceph_rmdir path "/"
John Spray
11:28 PM Backport #18531 (Resolved): kraken: speed up readdir by skipping unwanted dn
John Spray
11:27 PM Bug #18311 (Resolved): Decode errors on backtrace will crash MDS
John Spray
11:26 PM Backport #18463 (Resolved): kraken: Decode errors on backtrace will crash MDS
John Spray
10:11 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
I don't know how to repair this or even identify other instances. Dan Mick
06:57 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
I'll plan to leave this open for the next week or two and we can see if any failures crop up between now and then. If... Jeff Layton
05:21 PM Bug #18675 (Resolved): client: during multimds thrashing FAILED assert(session->requests.empty())
... Patrick Donnelly
05:07 PM Backport #17705: jewel: ceph_volume_client: recovery of partial auth update is broken
The test commits from https://github.com/ceph/ceph-qa-suite/pull/1221 were moved into the main ceph/ceph.git PR after... Nathan Cutler
02:04 PM Backport #17705 (Resolved): jewel: ceph_volume_client: recovery of partial auth update is broken
John Spray
03:49 PM Bug #18670 (Duplicate): cephfs get strange permission deny when playing with git.
Aha, yes, removing the caps works.
Thanks for pointing that out so quickly :)
Xiaoxi Chen
03:24 PM Bug #18670: cephfs get strange permission deny when playing with git.
It is likely that you are hitting http://tracker.ceph.com/issues/17858
Try modifying your client's auth caps to re...
John Spray
03:12 PM Bug #18670 (Duplicate): cephfs get strange permission deny when playing with git.
Can reproduce it reliably via git pull <any_project>. The failed file changed between each attempt, but always... Xiaoxi Chen
02:16 PM Bug #18646 (Fix Under Review): mds: rejoin_import_cap FAILED assert(session)
Zheng Yan
02:16 PM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
https://github.com/ceph/ceph/pull/12974/commits/eef9e568dcccc0bb84322527c45028d3dc275c6b Zheng Yan
02:06 PM Backport #17974 (Resolved): jewel: ceph/Client segfaults in handle_mds_map when switching mds
John Spray
02:06 PM Bug #17858 (Resolved): Cannot create deep directories when caps contain "path=/somepath"
John Spray
02:05 PM Backport #18008 (Resolved): jewel: Cannot create deep directories when caps contain "path=/somepath"
John Spray
02:05 PM Bug #17216 (Resolved): ceph_volume_client: recovery of partial auth update is broken
John Spray
02:04 PM Backport #18615 (Resolved): jewel: segfault in handle_client_caps
John Spray
02:04 PM Bug #17800 (Resolved): ceph_volume_client.py : Error: Can't handle arrays of non-strings
John Spray
02:03 PM Backport #18026 (Resolved): jewel: ceph_volume_client.py : Error: Can't handle arrays of non-strings
John Spray
02:03 PM Bug #17798 (Resolved): Clients without pool-changing caps shouldn't be allowed to change pool_nam...
John Spray
02:03 PM Backport #17956 (Resolved): jewel: Clients without pool-changing caps shouldn't be allowed to cha...
John Spray
02:02 PM Backport #18603 (Resolved): jewel: cephfs test failures (ceph.com/qa is broken, should be downloa...
John Spray
01:59 PM Backport #18462 (Resolved): jewel: Decode errors on backtrace will crash MDS
John Spray
12:32 PM Bug #18663 (Fix Under Review): teuthology teardown hangs if kclient umount fails
https://github.com/ceph/ceph/pull/13099 John Spray
12:09 PM Bug #18663 (Resolved): teuthology teardown hangs if kclient umount fails

http://qa-proxy.ceph.com/teuthology/jspray-2017-01-25_02:52:36-multimds-wip-jcsp-testing-20170124-testing-basic-smi...
John Spray
10:53 AM Bug #18662: TestClientLimits hang on teuthology teardown
... John Spray
08:53 AM Bug #18662 (New): TestClientLimits hang on teuthology teardown
http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-16_17:25:01-kcephfs-master-testing-basic-mira/722735/teutholog... Zheng Yan
08:43 AM Bug #18661 (Resolved): Test failure: test_open_inode
http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-15_10:10:01-fs-jewel---basic-smithi/719557/ Zheng Yan
07:01 AM Bug #18660 (Fix Under Review): fragment space check can cause replayed request fail
https://github.com/ceph/ceph/pull/13095 Zheng Yan
06:57 AM Bug #18660 (Resolved): fragment space check can cause replayed request fail
Zheng Yan

01/24/2017

11:36 PM Bug #18600 (Fix Under Review): multimds suite tries to run quota tests against kclient, fails
https://github.com/ceph/ceph/pull/13089 John Spray
10:09 PM Bug #16397 (Fix Under Review): nfsd selinux denials causing knfs tests to fail
Patch to unpin from ubuntu, waiting for that test to go through https://github.com/ceph/ceph/pull/13088 John Spray
10:04 PM Bug #16397: nfsd selinux denials causing knfs tests to fail
... John Spray
09:41 PM Feature #16219: test: smallfile benchmark tool
Just saw this. I don't know teuthology, but I wrote smallfile. Ben Turner in QE has automated Gluster performance reg... Ben England
04:14 PM Backport #18652 (In Progress): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessio...
Nathan Cutler
02:21 PM Backport #18652 (Fix Under Review): jewel: Test failure: test_session_reject (tasks.cephfs.test_s...
https://github.com/ceph/ceph/pull/13085 John Spray
02:18 PM Backport #18652 (Resolved): jewel: Test failure: test_session_reject (tasks.cephfs.test_sessionma...
https://github.com/ceph/ceph/pull/13085 John Spray
02:18 PM Bug #18361: Test failure: test_session_reject (tasks.cephfs.test_sessionmap.TestSessionMap)
Oops, this needed backporting to jewel as well. John Spray
01:28 PM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
I've been thinking a bit about how we handle eviction in the multimds case, and whether we perhaps ought to centraliz... John Spray
06:40 AM Bug #18646: mds: rejoin_import_cap FAILED assert(session)
Yes, it's reasonable. If the client did not close the session voluntarily, it's likely the session was killed (due to tim... Zheng Yan
04:43 AM Bug #18646 (Resolved): mds: rejoin_import_cap FAILED assert(session)
... Patrick Donnelly

01/23/2017

09:25 PM Bug #18641 (Can't reproduce): mds: stalled clients apparently due to stale sessions
4/16 clients building the kernel with 2 active MDS blocked on IO. After digging into the ceph-fuse log, I found that ... Patrick Donnelly
05:35 PM Backport #18615: jewel: segfault in handle_client_caps
Thanks for the backport, Alexey! Nathan Cutler
05:34 PM Backport #18615 (In Progress): jewel: segfault in handle_client_caps
Nathan Cutler
08:44 AM Backport #18615 (Fix Under Review): jewel: segfault in handle_client_caps
Alexey Sheplyakov
08:43 AM Backport #18615: jewel: segfault in handle_client_caps
https://github.com/ceph/ceph/pull/13060 Alexey Sheplyakov
02:15 PM Bug #18600 (In Progress): multimds suite tries to run quota tests against kclient, fails
John Spray
02:13 PM Feature #18537: libcephfs cache invalidation upcalls
iirc, we still have the file handle cache, even if we disable caching in general, right?
Is it the case that we wo...
John Spray
11:54 AM Backport #18602 (Resolved): hammer: cephfs test failures (ceph.com/qa is broken, should be downlo...
Nathan Cutler
09:58 AM Bug #17370: knfs ffsb hang on master
new one http://qa-proxy.ceph.com/teuthology/teuthology-2017-01-21_17:35:01-knfs-master-testing-basic-mira/736357/
...
Zheng Yan
06:00 AM Bug #17594 (In Progress): cephfs: permission checking not working (MDS should enforce POSIX permi...
Greg Farnum

01/22/2017

06:20 AM Bug #16768 (Fix Under Review): multimds: check_rstat assertion failure
https://github.com/ceph/ceph/pull/13052 Zheng Yan

01/20/2017

06:26 PM Bug #17531 (Resolved): mds fails to respawn if executable has changed
Patrick Donnelly
06:25 PM Bug #17670 (Resolved): multimds: mds entering up:replay and processing down mds aborts
Patrick Donnelly
06:24 PM Bug #17518 (Resolved): monitor assertion failure when deactivating mds in (invalid) fscid 0
Patrick Donnelly
05:36 PM Backport #18612 (In Progress): kraken: client: segfault on ceph_rmdir path "/"
Nathan Cutler
03:43 PM Backport #18612 (Resolved): kraken: client: segfault on ceph_rmdir path "/"
https://github.com/ceph/ceph/pull/13030 Nathan Cutler
05:16 PM Backport #18611 (In Progress): jewel: client: segfault on ceph_rmdir path "/"
Nathan Cutler
03:43 PM Backport #18611 (Resolved): jewel: client: segfault on ceph_rmdir path "/"
https://github.com/ceph/ceph/pull/13029 Nathan Cutler
05:03 PM Backport #18531 (In Progress): kraken: speed up readdir by skipping unwanted dn
Nathan Cutler
05:02 PM Backport #18531: kraken: speed up readdir by skipping unwanted dn
deleted description Nathan Cutler
03:42 PM Backport #18531 (New): kraken: speed up readdir by skipping unwanted dn
Nathan Cutler
04:12 PM Backport #18604 (In Progress): kraken: cephfs test failures (ceph.com/qa is broken, should be dow...
Nathan Cutler
03:41 PM Backport #18604 (Resolved): kraken: cephfs test failures (ceph.com/qa is broken, should be downlo...
https://github.com/ceph/ceph/pull/13024 Nathan Cutler
04:03 PM Backport #18603 (In Progress): jewel: cephfs test failures (ceph.com/qa is broken, should be down...
Nathan Cutler
03:41 PM Backport #18603 (Resolved): jewel: cephfs test failures (ceph.com/qa is broken, should be downloa...
https://github.com/ceph/ceph/pull/13023 Nathan Cutler
03:52 PM Backport #18602 (In Progress): hammer: cephfs test failures (ceph.com/qa is broken, should be dow...
Nathan Cutler
03:41 PM Backport #18602 (Resolved): hammer: cephfs test failures (ceph.com/qa is broken, should be downlo...
https://github.com/ceph/ceph/pull/13022 Nathan Cutler
03:45 PM Backport #18616 (Resolved): kraken: segfault in handle_client_caps
https://github.com/ceph/ceph/pull/14566 Nathan Cutler
03:45 PM Backport #18615 (Resolved): jewel: segfault in handle_client_caps
https://github.com/ceph/ceph/pull/13060 Nathan Cutler
03:12 PM Bug #18600 (Resolved): multimds suite tries to run quota tests against kclient, fails
http://pulpito.ceph.com/jspray-2017-01-19_17:59:40-multimds-wip-jcsp-testing-20170119b-testing-basic-smithi/731274/
...
John Spray
02:29 PM Bug #18487 (Resolved): Crash in MDCache::split_dir -- FAILED assert(dir->is_auth())
John Spray
02:24 PM Bug #16768: multimds: check_rstat assertion failure
Hmm, well it didn't grab the logs for some reason but I did get the crashing MDS's log before the test tore down. It... John Spray
12:27 PM Bug #16768: multimds: check_rstat assertion failure
I noticed that failure while the test was still stuck trying to unmount the kernel client, so I went in and killed th... John Spray
12:19 PM Bug #16768: multimds: check_rstat assertion failure
Another instance:
http://qa-proxy.ceph.com/teuthology/jspray-2017-01-19_17:59:40-multimds-wip-jcsp-testing-20170119b...
John Spray
12:29 PM Bug #8090 (Duplicate): multimds: mds crash in check_rstats
This is very similar to #16768 (which is happening in more recent runs), so I'm going to close this. John Spray

01/19/2017

08:16 AM Bug #9935 (Pending Backport): client: segfault on ceph_rmdir path "/"
John Spray
08:10 AM Bug #8808 (Resolved): multimds: stale NFS file handle on delete
https://github.com/ceph/ceph/commit/530bd6e67a0842733edff7ac036f18ed990788f9 Zheng Yan
07:52 AM Bug #4829: client: handling part of MClientForward incorrectly?
I can't see why we shouldn't encode cap releases after receiving MClientForward. The old releases are for the old MDS,... Zheng Yan
07:31 AM Bug #18487 (Fix Under Review): Crash in MDCache::split_dir -- FAILED assert(dir->is_auth())
https://github.com/ceph/ceph/pull/12994 Zheng Yan
05:18 AM Bug #18589 (Duplicate): ceph_volume_client.py doesn't create enough mds caps
Assuming that you encountered this issue with the kernel client, the bug was http://tracker.ceph.com/issues/17191, which wa... John Spray
02:12 AM Bug #18578 (Fix Under Review): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
https://github.com/ceph/ceph/pull/12973 Zheng Yan

01/18/2017

06:37 PM Bug #18589: ceph_volume_client.py doesn't create enough mds caps
This report applies to mounting with the kernel, not ceph-fuse, right? I think that makes this a kernel issue where i... Greg Farnum
06:02 PM Bug #18589: ceph_volume_client.py doesn't create enough mds caps
Pull request at https://github.com/ceph/ceph/pull/12985 Huamin Chen
05:48 PM Bug #18589 (Duplicate): ceph_volume_client.py doesn't create enough mds caps
In _authorize_ceph() at https://github.com/ceph/ceph/blob/master/src/pybind/ceph_volume_client.py#L1032, the caps is ... Huamin Chen
02:47 PM Bug #18579: Fuse client has "opening" session to nonexistent MDS rank after MDS cluster shrink
I'll take this one. Patrick Donnelly
09:23 AM Bug #18579: Fuse client has "opening" session to nonexistent MDS rank after MDS cluster shrink
Logs in /home/jspray/18579 (should be world readable) on teuthology John Spray
08:58 AM Bug #18579 (Resolved): Fuse client has "opening" session to nonexistent MDS rank after MDS cluste...
... John Spray
08:30 AM Bug #18578 (Resolved): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
A user said he encountered this assertion while running vdbench Zheng Yan
 
