Activity

From 01/31/2017 to 03/01/2017

03/01/2017

11:30 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with a large number of caps/inodes
I have that already. I did set the beacon_grace to 600s to work around the bug and bring the cluster back.
Seems r...
Xiaoxi Chen
11:16 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with a large number of caps/inodes
I have that already. I did set the beacon_grace to 600s to work around the bug and bring the cluster back.
In firs...
Xiaoxi Chen
03:52 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with a large number of caps/inodes
May be related to http://tracker.ceph.com/issues/18730, although perhaps not since that one shouldn't be causing the ... John Spray
May be related to http://tracker.ceph.com/issues/18730, although perhaps not since that one shouldn't be causing the ... John Spray
03:36 PM Bug #19118 (Resolved): MDS heartbeat timeout during rejoin, when working with a large number of cap...
We set an alarm every OPTION(mds_beacon_grace, OPT_FLOAT, 15) seconds; if mds_rank doesn't finish its task within this... Xiaoxi Chen
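A minimal shell sketch of the workaround described in the comments above; the 600s value comes from the report, while applying it via injectargs to both mons and MDS daemons is an assumption:

    # Temporarily raise the beacon grace from its 15s default so the mon
    # does not mark the MDS laggy/failed while it churns through rejoin:
    ceph tell mon.* injectargs '--mds_beacon_grace 600'
    ceph tell mds.* injectargs '--mds_beacon_grace 600'
    # revert to the default once the cluster is healthy again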
05:36 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Zheng Yan wrote:
> probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0...
Darrell Enns
03:31 PM Feature #16523 (In Progress): Assert directory fragmentation is occurring during stress tests
John Spray
11:09 AM Bug #18883: qa: failures in samba suite
open ticket for samba build http://tracker.ceph.com/issues/19117 Zheng Yan
08:21 AM Bug #17828 (Need More Info): libceph setxattr returns 0 without setting the attr
ceph.quota.max_files is a hidden xattr. It doesn't show up in listxattr; you need to get it explicitly (getfattr -n ceph.qu... Zheng Yan
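For example, a minimal sketch of the above (the mount point /mnt/cephfs/mydir is a placeholder):

    # Hidden: the quota xattr does not appear even when listing ceph.* xattrs...
    getfattr -d -m 'ceph.*' /mnt/cephfs/mydir
    # ...but an explicit get by name works:
    getfattr -n ceph.quota.max_files /mnt/cephfs/mydir
    # and it can be set the usual way:
    setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/mydir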

02/28/2017

02:12 PM Bug #19103: cephfs: Out of space handling
Looking again now that I'm a few coffees into my day -- all the cephfs enospc stuff is just aimed at providing a slic... John Spray
08:58 AM Bug #19103: cephfs: Out of space handling
Believe it or not I sneezed and somehow that caused me to select some affected versions... John Spray
08:58 AM Bug #19103: cephfs: Out of space handling
Also, I'm not sure we actually need to do sync writes when the cluster is near full -- we already have machinery that... John Spray
08:56 AM Bug #19103: cephfs: Out of space handling
Could the "ENOSPC on failsafe_full_ratio" behaviour be the default? It seems like any application layer that wants t... John Spray
12:13 AM Bug #19103 (Won't Fix): cephfs: Out of space handling

Cephfs needs to be more careful on a cluster with almost full OSDs. There is a delay in OSDs reporting stats, a MO...
David Zafman
01:37 PM Feature #19109 (Resolved): Use data pool's 'df' for statfs instead of global stats, if there is o...

The client sends a MStatfs to the mon to get the info for a statfs system call. Currently the mon gives it the glo...
John Spray
09:03 AM Bug #17828: libceph setxattr returns 0 without setting the attr
This ticket didn't get noticed because it was filed in the 'mgr' component instead of the 'fs' component.
Chris: d...
John Spray

02/27/2017

09:14 PM Bug #19101 (Closed): "samba3error [Unknown error/failure. Missing torture_fail() or torture_asser...
This is jewel point v10.2.6
Run: http://pulpito.ceph.com/yuriw-2017-02-24_20:42:46-samba-jewel---basic-smithi/
Jo...
Yuri Weinstein
12:47 PM Bug #18883: qa: failures in samba suite
Both the ubuntu and centos samba packages have no dependency on libcephfs. It works when libcephfs1 happens to be installed. Zheng Yan
11:21 AM Bug #18883: qa: failures in samba suite
Which packages were you seeing the linkage issue on? The centos ones? John Spray
09:25 AM Bug #18883: qa: failures in samba suite
... Zheng Yan
07:56 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
I updated the PR to do _closed_mds_session(s).
As for the config option, I would expect the client to reconnect automagically ...
Henrik Korkuc

02/24/2017

02:29 PM Feature #19075 (Fix Under Review): Extend 'p' mds auth cap to cover quotas and all layout fields
https://github.com/ceph/ceph/pull/13628 John Spray
02:21 PM Feature #19075 (Resolved): Extend 'p' mds auth cap to cover quotas and all layout fields
Re. mailing list thread "quota change restriction" http://marc.info/?l=ceph-devel&m=148769159329755&w=2
We should ...
John Spray
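For illustration, a hedged sketch of a cap string using the 'p' flag (the client name and pool are placeholders):

    # Without 'p', a client can read/write file data but may not change
    # layouts or quotas; 'p' grants those vxattr operations as well.
    ceph auth get-or-create client.manila \
        mds 'allow rwp' \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_data'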
02:12 PM Bug #18600 (Resolved): multimds suite tries to run quota tests against kclient, fails
Zheng Yan
02:11 PM Bug #17990 (Resolved): newly created directory may get fragmented before it gets journaled
Zheng Yan
02:10 PM Bug #16768 (Resolved): multimds: check_rstat assertion failure
Zheng Yan
02:10 PM Bug #18159 (Resolved): "Unknown mount option mds_namespace"
Zheng Yan
02:09 PM Bug #18646 (Resolved): mds: rejoin_import_cap FAILED assert(session)
Zheng Yan
09:51 AM Bug #18953 (Resolved): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
Zheng Yan
09:50 AM Bug #18663 (Resolved): teuthology teardown hangs if kclient umount fails
Zheng Yan
09:48 AM Bug #18675 (Resolved): client: during multimds thrashing FAILED assert(session->requests.empty())
Zheng Yan
01:47 AM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient
Zheng Yan

02/23/2017

11:24 PM Feature #9754: A 'fence and evict' client eviction command
Underway on jcsp/wip-17980 along with #17980 John Spray
12:07 PM Feature #9754 (In Progress): A 'fence and evict' client eviction command
John Spray
09:42 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
you can use 'ceph daemon client.xxx kick_stale_sessions' to recover from this issue. Maybe we should add a config option to ... Zheng Yan
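Concretely, that command runs against the fuse client's admin socket on the client host; the socket path below is a typical default, not taken from the comment:

    # Kick sessions that went stale after the MDS connection was lost:
    ceph daemon /var/run/ceph/ceph-client.admin.asok kick_stale_sessions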
09:30 AM Backport #19045 (Resolved): kraken: buffer overflow in test LibCephFS.DirLs
https://github.com/ceph/ceph/pull/14571 Loïc Dachary
09:30 AM Backport #19044 (Resolved): jewel: buffer overflow in test LibCephFS.DirLs
https://github.com/ceph/ceph/pull/14671 Loïc Dachary

02/22/2017

02:45 PM Bug #19033: cephfs: mds crashes after setting about 400 64KB xattr kv pairs on a file
Thank you for the very detailed report John Spray
02:44 PM Bug #19033 (Fix Under Review): cephfs: mds crashes after setting about 400 64KB xattr kv pairs ...
John Spray
01:04 PM Bug #19033: cephfs: mds crashes after setting about 400 64KB xattr kv pairs on a file
h1. Fix proposal
https://github.com/ceph/ceph/pull/13587
Honggang Yang
12:22 PM Bug #19033 (Resolved): cephfs: mds crashes after setting about 400 64KB xattr kv pairs on a file
h1. 1. Problem
After I set about 400 64KB xattr kv pairs on a file,
the mds crashed. Every time I try to star...
Honggang Yang
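A minimal shell sketch of that reproducer (mount point and xattr names are hypothetical):

    f=/mnt/cephfs/xattr-victim
    touch "$f"
    v=$(head -c 65536 /dev/zero | tr '\0' 'x')   # one 64KB value
    # ~400 xattrs of 64KB each, per the report above:
    for i in $(seq 1 400); do
        setfattr -n "user.big$i" -v "$v" "$f"
    done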
10:05 AM Bug #18941 (Pending Backport): buffer overflow in test LibCephFS.DirLs
It's a rare thing, but let's backport so that we don't have to re-diagnose it in the future. John Spray
09:48 AM Bug #18964 (Resolved): mon: fs new is no longer idempotent
John Spray
09:36 AM Bug #19022 (Fix Under Review): Crash in Client::queue_cap_snap when thrashing
https://github.com/ceph/ceph/pull/13579 Zheng Yan
09:36 AM Bug #18914 (Fix Under Review): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume...
https://github.com/ceph/ceph/pull/13580 Zheng Yan

02/21/2017

05:15 PM Feature #18490: client: implement delegation support in userland cephfs
Greg Farnum wrote:
> I guess I'm not sure what you're going for with the Fb versus Fc here. Sure, if you have Fwb an...
Jeff Layton
04:14 PM Feature #18490: client: implement delegation support in userland cephfs
I guess I'm not sure what you're going for with the Fb versus Fc here. Sure, if you have Fwb and then get an Fr read ... Greg Farnum
04:04 PM Feature #18490: client: implement delegation support in userland cephfs
Greg Farnum wrote:
> > BTW: CEPH_CAP_FILE_BUFFER does also imply CEPH_CAP_FILE_CACHE, doesn't it?
>
> No, I don't ...
Jeff Layton
01:21 AM Feature #18490: client: implement delegation support in userland cephfs
> BTW: CEPH_CAP_FILE_BUFFER does also imply CEPH_CAP_FILE_CACHE, doesn't it?
No, I don't think it does. In practice...
Greg Farnum
05:15 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
Of course, you're right. John Spray
02:43 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
I think the cephfs python binding calls ceph_setxattr instead of ceph_ll_setxattr. There is no such code in Client::setxattr. Zheng Yan
12:31 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
The thing that's confusing me is that Client::ll_setxattr has this block:... John Spray
09:05 AM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
The error is because the MDS had an outdated osdmap and thought the newly created pool did not exist. (MDS has code that m... Zheng Yan
02:15 PM Bug #19022 (In Progress): Crash in Client::queue_cap_snap when thrashing
Zheng Yan
09:30 AM Backport #18707 (In Progress): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_...
Nathan Cutler
09:28 AM Backport #18708 (Resolved): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_none...
Nathan Cutler

02/20/2017

11:56 PM Bug #19022: Crash in Client::queue_cap_snap when thrashing
http://pulpito.ceph.com/jspray-2017-02-20_15:59:37-fs-master---basic-smithi 837749 and 837668 John Spray
11:55 PM Bug #19022 (Resolved): Crash in Client::queue_cap_snap when thrashing
Seen on master. Mysterious regression.... John Spray
02:04 PM Bug #18883: qa: failures in samba suite
Latest run:
http://pulpito.ceph.com/jspray-2017-02-20_12:27:44-samba-master-testing-basic-smithi/
Now we're seein...
John Spray
08:01 AM Bug #18883: qa: failures in samba suite
fix for the ceph-fuse bug: https://github.com/ceph/ceph/pull/13532 Zheng Yan
11:13 AM Bug #1656 (Won't Fix): Hadoop client unit test failures
I'm told that there is a newer hdfs test suite that we would adopt if refreshing the hdfs support, so this ticket is ... John Spray
08:00 AM Bug #18995 (Fix Under Review): ceph-fuse always fails if pid file is non-empty and run as daemon
https://github.com/ceph/ceph/pull/13532 Zheng Yan
07:58 AM Bug #18995 (Resolved): ceph-fuse always fails if pid file is non-empty and run as daemon
It always fails with message like:... Zheng Yan
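A sketch of the failure mode as described (paths are placeholders; the exact error text is elided above):

    echo 12345 > /var/run/ceph/ceph-fuse.pid      # leftover, non-empty pid file
    # Daemonizing with that pid file present is what reportedly fails:
    ceph-fuse --pid-file /var/run/ceph/ceph-fuse.pid /mnt/cephfs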

02/19/2017

09:50 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
I created https://github.com/ceph/ceph/pull/13522
This resolves hang and allows work with mountpoint in this test ...
Henrik Korkuc

02/17/2017

06:04 PM Bug #18883: qa: failures in samba suite
Testing fix for the fuse thing here: https://github.com/ceph/ceph/pull/13498
Haven't looked into the smbtorture fa...
John Spray
05:52 PM Bug #18883: qa: failures in samba suite
The weird "failed to lock pidfile" ones are all with the "mount/native.yaml" fragment John Spray
05:44 PM Bug #18883: qa: failures in samba suite
Some of them are something different, too:
http://qa-proxy.ceph.com/teuthology/zack-2017-02-08_12:21:51-samba-mast...
John Spray
05:32 PM Bug #18883: qa: failures in samba suite
Weird, that was supposed to be fixed so that ceph-fuse never tries to create a pidfile: https://github.com/ceph/ceph/... John Spray
11:23 AM Bug #18883: qa: failures in samba suite
all failures have similar errors... Zheng Yan
04:35 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
make check finishes with 2 failed suites.
* FAIL: test/osd/osd-scrub-repair.sh
* FAIL: test/osd/osd-scrub-snaps.s...
Jan Fajerski
11:28 AM Bug #17939: non-local cephfs quota changes not visible until some IO is done
I just realised that rstats have the same problem. Client A is adding data to a Manila share, and Client B doesn't se... Dan van der Ster
02:36 AM Bug #18964 (Fix Under Review): mon: fs new is no longer idempotent
PR: https://github.com/ceph/ceph/pull/13471 Patrick Donnelly
02:33 AM Bug #18964 (Resolved): mon: fs new is no longer idempotent
... Patrick Donnelly

02/16/2017

09:58 PM Backport #18708 (In Progress): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_n...
Nathan Cutler
02:43 PM Backport #18708 (Fix Under Review): jewel: failed filelock.can_read(-1) assertion in Server::_dir...
https://github.com/ceph/ceph/pull/13459 Zheng Yan
09:16 PM Feature #17980 (In Progress): MDS should reject connections from OSD-blacklisted clients
There's a jcsp/wip-17980 with a first cut of this. John Spray
03:32 PM Feature #18490: client: implement delegation support in userland cephfs
Zheng asked a pointed question about this today, so to be clear...
This would be 100% an opportunistic thing. You ...
Jeff Layton
08:31 AM Bug #18953 (Fix Under Review): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
https://github.com/ceph/ceph/pull/13455 Zheng Yan
08:13 AM Bug #18953 (Resolved): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
clients should be able to acquire/release file locks when fs is full Zheng Yan
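In other words, something like the following should still succeed on a full filesystem, since taking an advisory lock writes no data (the path is hypothetical):

    flock /mnt/cephfs/existing-file -c 'echo lock acquired and released'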

02/15/2017

10:54 PM Backport #18950 (Resolved): kraken: mds/StrayManager: avoid reusing deleted inode in StrayManager...
https://github.com/ceph/ceph/pull/14570 Loïc Dachary
10:54 PM Backport #18949 (Resolved): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManager:...
https://github.com/ceph/ceph/pull/14670 Loïc Dachary
10:47 PM Backport #18100 (Resolved): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jewel 10.2.3
Loïc Dachary
10:47 PM Backport #18679 (Resolved): jewel: failed to reconnect caps during snapshot tests
Loïc Dachary
06:23 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Jeff Layton wrote:
> Greg Farnum wrote:
> > Gah. I've run out of time to work on this right now. I've got a branch ...
Greg Farnum
11:53 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Greg Farnum wrote:
> Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsf...
Jeff Layton
01:39 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsfortytwo/ceph.git which... Greg Farnum
04:02 AM Bug #18941 (Fix Under Review): buffer overflow in test LibCephFS.DirLs
https://github.com/ceph/ceph/pull/13429 Zheng Yan
03:57 AM Bug #18941 (Resolved): buffer overflow in test LibCephFS.DirLs
http://pulpito.ceph.com/jspray-2017-02-14_02:39:19-fs-wip-jcsp-testing-20170214-distro-basic-smithi/812889 Zheng Yan
12:26 AM Feature #18490: client: implement delegation support in userland cephfs
John Spray wrote:
> So the "client completely unresponsive but only evict it when someone else wants its caps" case ...
Jeff Layton

02/14/2017

10:25 PM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
John Spray
10:24 PM Bug #18877 (Pending Backport): mds/StrayManager: avoid reusing deleted inode in StrayManager::_pu...
John Spray
10:21 PM Feature #18490: client: implement delegation support in userland cephfs
So the "client completely unresponsive but only evict it when someone else wants its caps" case is http://tracker.cep... John Spray
04:27 PM Feature #18490: client: implement delegation support in userland cephfs
I started taking a look at this. One thing we have to solve first is that I don't think there is any automatic resol... Jeff Layton
06:25 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
Actually, it does not fail with stale. Instead the NFS mount command eventually times out.
mount.nfs: Connection timed out
c sights
06:23 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
Happens for me also:
Debian Jessie with backported kernel
Linux drbl 4.8.0-0.bpo.2-amd64 #1 SMP Debian 4.8.15-2~bpo...
c sights
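For reference, a hedged sketch of the setup being reported (hostnames, paths, and export options are placeholders):

    # On the re-exporting server: mount CephFS with the kernel client...
    mount -t ceph mon1:6789:/ /mnt/cephfs
    # ...then export a sub-directory via /etc/exports, e.g.:
    #   /mnt/cephfs/subdir  *(rw,no_subtree_check,fsid=99)
    exportfs -ra
    # On the NFS client, this is the mount that reportedly times out:
    mount -t nfs server:/mnt/cephfs/subdir /mnt/nfs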
11:10 AM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
John Spray
12:29 AM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
Oh yeah, the client does have code that's meant to be doing that, and on the client side it's a wait_for_latest. So ... John Spray

02/13/2017

07:05 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
That's odd; I thought clients validated pools before passing them to the mds. Maybe that's wrong or undesirable for o... Greg Farnum
12:29 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
Hmm, so this is happening because volume client creates a pool, then tries to use it as a layout at a time before its... John Spray
07:43 AM Bug #18914 (Resolved): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client....
http://qa-proxy.ceph.com/teuthology/teuthology-2017-02-12_10:10:02-fs-jewel---basic-smithi/808861/ Zheng Yan
02:38 PM Bug #18838 (Fix Under Review): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
https://github.com/ceph/ceph/pull/13394 Kefu Chai
12:04 PM Bug #18915: valgrind causes ceph-fuse mount_wait timeout
Probably duplicate of http://tracker.ceph.com/issues/18797 John Spray
08:15 AM Bug #18915 (New): valgrind causes ceph-fuse mount_wait timeout
http://pulpito.ceph.com/teuthology-2017-02-11_17:15:02-fs-master---basic-smithi/
http://qa-proxy.ceph.com/teutholo...
Zheng Yan
08:58 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
make check seems to get stuck on unittest_throttle, after PASS: unittest_log.
_edit_ The machine has only very littl...
Jan Fajerski
07:22 AM Backport #18900 (Resolved): jewel: Test failure: test_open_inode
https://github.com/ceph/ceph/pull/14669 Loïc Dachary
07:22 AM Backport #18899 (Resolved): kraken: Test failure: test_open_inode
https://github.com/ceph/ceph/pull/14569 Loïc Dachary

02/11/2017

12:27 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
The commit is from the SUSE repo. It's part of the ses4 branch: https://github.com/SUSE/ceph/commits/ses4. Sorry, shoul... Jan Fajerski
12:22 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
PPC clients! Wondering if you've tried running any of the automated tests (the unit tests, or teuthology suites?) on... John Spray

02/10/2017

10:23 PM Bug #18816: MDS crashes with log disabled
For some reason we still let people disable the MDS log. That's...bad. I think it only existed for some cheap benchma... Greg Farnum
10:18 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
Well, the problem is clearly indicated by the client... Greg Farnum
05:48 PM Bug #18661 (Pending Backport): Test failure: test_open_inode
John Spray
02:44 PM Bug #18883 (New): qa: failures in samba suite

First Samba run in ages:
http://pulpito.ceph.com/zack-2017-02-08_12:21:51-samba-master---basic-smithi/
Let's ge...
John Spray
12:20 PM Bug #18882 (New): StrayManager::advance_delayed() can use tens of seconds
I saw mds become laggy when running blogbench in a loop. The command I ran is "while `true`; do ls | xargs -P8 -n1 rm... Zheng Yan
03:26 AM Bug #18877: mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stray_logged
https://github.com/ceph/ceph/pull/13347 Zhi Zhang
03:25 AM Bug #18877 (Resolved): mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stra...
This issue was found by testing another PR (https://github.com/ceph/ceph/pull/12792), which makes MDS directly use T... Zhi Zhang

02/09/2017

03:36 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
Also, daemon commands don't return anything - that is, for the client, mds_requests and objecter_requests and ops_in_flig... Jan Fajerski
03:27 PM Bug #18872 (Resolved): write to cephfs mount hangs, ceph-fuse and kernel
When trying to write to a cephfs mount using 'dd' the client hangs indefinitely. The kernel client can be <ctrl-c>'ed... Jan Fajerski
09:37 AM Bug #18816: MDS crashes with log disabled
(Updated by Ahmed Akhuraidah on the mailing list)
The issue can be reproduced with upstream Ceph packages.
ahmed@ubcephno...
Shinobu Kinjo

02/07/2017

09:51 PM Bug #18850: Leak in MDCache::handle_dentry_unlink
Attached full valgrind from mds.a in jspray-2017-02-07_16:25:53-multimds-wip-jcsp-testing-20170206-testing-basic-smit... John Spray
09:49 PM Bug #18850 (Rejected): Leak in MDCache::handle_dentry_unlink

While there are various bits of valgrind noise going around at the moment, this one does look like a multimds speci...
John Spray
08:34 PM Bug #18845: valgrind failure in fs suite
also at http://pulpito.ceph.com/abhi-2017-02-07_15:12:56-fs-wip-luminous-2---basic-smithi/795353/ Abhishek Lekshmanan
03:38 PM Bug #18845 (New): valgrind failure in fs suite
Saw a valgrind failure on fs suite on the master branch as of fc2df15, run http://pulpito.ceph.com/abhi-2017-02-07_09... Abhishek Lekshmanan

02/06/2017

11:37 PM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
I've also seen this one rear its ugly head again in my last fs run.
I looked at this one:
/a/jspray-2017-02-06_11...
John Spray
11:07 PM Bug #18797: valgrind jobs hanging in fs suite
After noticing http://tracker.ceph.com/issues/18838, I now also notice that the suspect commit range where this start... John Spray
11:05 PM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
http://pulpito.ceph.com/jspray-2017-02-06_11:13:20-fs-wip-jcsp-testing-20170204-distro-basic-smithi/789908... John Spray
07:04 PM Bug #16397 (Resolved): nfsd selinux denials causing knfs tests to fail
I've not heard of any further reports of this since the new package went into production. I'm going to declare victor... Jeff Layton
02:20 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
OK we'll try to get that next time. Dan van der Ster
02:19 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Hmm, client's failing to participate in failover is probably not the same as #18757, as that one was the result of cl... John Spray
11:59 AM Bug #18830 (Fix Under Review): Coverity: bad iterator dereference in Locker::acquire_locks
https://github.com/ceph/ceph/pull/13272 John Spray
11:31 AM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
... John Spray

02/05/2017

05:04 AM Feature #10792 (Fix Under Review): qa: enable thrasher for MDS cluster size (vary max_mds)
PR: https://github.com/ceph/ceph/pull/13262 Patrick Donnelly
01:36 AM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
This exact error popped up in a branch of mine (http://pulpito.ceph.com/gregf-2017-02-04_03:30:50-fs-wip-17594---basi... Greg Farnum

02/04/2017

07:12 AM Bug #18816: MDS crashes with log disabled
test Ahmed Akhuraidah
06:51 AM Bug #18816 (Resolved): MDS crashes with log disabled
Note the "mds_log = false" below. If you do that, this happens:
Have crushed MDS daemon during executing different...
Ahmed Akhuraidah
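The setting in question, as a ceph.conf fragment reconstructed from the quoted option (a sketch, and emphatically not a recommendation):

    # Disabling the MDS journal is what triggers the crash in this ticket.
    cat >> /etc/ceph/ceph.conf <<'EOF'
    [mds]
        mds_log = false
    EOF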

02/03/2017

05:30 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Cache dump is here:
https://knowledgenetworkbc-my.sharepoint.com/personal/darrelle_knowledge_ca/_layouts/15/guesta...
Darrell Enns
07:24 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Does the mds cache dump need to be done while it is hung? It's a production system, so I wasn't able to leave it in a... Darrell Enns
06:57 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61c Zheng Yan
01:40 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
please run 'ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0' and upload /tmp/cachedump.0. Besides, please check... Zheng Yan
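Both commands below run against the MDS admin socket on the MDS host; the session listing is an illustrative addition, not from the comment:

    ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0
    ceph daemon mds.kb-ceph03 session ls    # hypothetical extra: inspect client sessions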
12:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Comparing logs I noticed that the MDS clock is ~30s behind the client. ntpd was dead on one of the test servers... Will try to co... Henrik Korkuc
10:17 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Actually, one more thing: following this crash, the clients did not fail over to the standby MDS. Processes accessing... Dan van der Ster
09:57 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
Excellent! Sorry that my search of the tracker didn't find that issue.
We'll apply that backport when it's ready.
Dan van der Ster
09:50 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
duplicate of http://tracker.ceph.com/issues/18578. The fix is pending backport Zheng Yan
09:29 AM Bug #18802 (New): Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc:...
A user just did:... Dan van der Ster

02/02/2017

07:13 PM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
I've had two occurrences in the past 3 weeks where filesystem activity hangs, with the MDS reporting a client "failing t... Darrell Enns
05:15 PM Bug #18797: valgrind jobs hanging in fs suite
They're going dead during teardown when teuthology tries to list cephtest/ but something went wrong much earlier with... John Spray
05:08 PM Bug #18797: valgrind jobs hanging in fs suite
Ignore the last passing link above, that was actually a hammer run (which shows up as fs-master for some reason).
...
John Spray
04:55 PM Bug #18797 (Duplicate): valgrind jobs hanging in fs suite

First one to show the issue was:
http://pulpito.ceph.com/teuthology-2017-01-28_17:15:01-fs-master---basic-smithi
...
John Spray
11:12 AM Bug #18754 (Fix Under Review): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
https://github.com/ceph/ceph/pull/13227/commits/3b899fb0c6153c30ad6ef782499f79ba5c7b2a22 Zheng Yan
09:53 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
forgot to add a timeline:
08:08:29 mounted
08:09:54 iptables up
08:10:50 ls stuck
08:16:02 iptables down
08:17:5...
Henrik Korkuc
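A shell sketch of that repro timeline (the MDS address and timings are placeholders):

    MDS=192.0.2.10                                # hypothetical MDS address
    iptables -A OUTPUT -d "$MDS" -p tcp -j DROP   # 08:09:54 iptables up
    iptables -A INPUT  -s "$MDS" -p tcp -j DROP
    sleep 360                                     # roughly the blackout in the timeline
    iptables -D OUTPUT -d "$MDS" -p tcp -j DROP   # 08:16:02 iptables down
    iptables -D INPUT  -s "$MDS" -p tcp -j DROP
    # the mount should recover here, but per this report it stays hung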
08:25 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Attaching client and MDS logs. This time I was mounting from another server and firewalling both input and output to ... Henrik Korkuc
04:10 AM Bug #18755 (Fix Under Review): multimds: MDCache.cc: 4735: FAILED assert(in)
https://github.com/ceph/ceph/pull/13227/commits/097dc86b330392c4aa89f0033ce6e40528481857 Zheng Yan

02/01/2017

11:14 PM Bug #18757 (New): Jewel ceph-fuse does not recover after lost connection to MDS
Hmm, now that I actually read the log (like a reasonable person :-)) it is a little bit strange that the server is se... John Spray
10:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
Yes, I am reproducing it artificially after a short network blip caused permanent mount point hangs on multiple servers... Henrik Korkuc
07:59 PM Bug #18757 (Rejected): Jewel ceph-fuse does not recover after lost connection to MDS
Clients which lose connectivity to the MDS are evicted after a timeout given by the "mds session timeout" setting. E... John Spray
12:26 PM Bug #18757 (Resolved): Jewel ceph-fuse does not recover after lost connection to MDS
After ceph-fuse loses connection to the MDS for a few minutes, it does not recover: accessing the mountpoint hangs processes.
...
Henrik Korkuc
06:53 PM Feature #18763 (New): Ganesha instances should restart themselves when blacklisted

If I evict the client session that a ganesha instance was using, then the libcephfs instance will start to see sess...
John Spray
02:25 PM Bug #18759 (Fix Under Review): multimds suite tries to run norstats tests against kclient
https://github.com/ceph/ceph/pull/13089 John Spray
01:11 PM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient

You get failures on the workunit like:
jspray-2017-02-01_02:51:18-multimds-wip-jcsp-testing-20170201-testing-basic...
John Spray
03:33 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
Zheng Yan
03:32 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
... Zheng Yan
01:11 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
There are actually 3 assertion failures (on 3 different MDS) in this run!... Patrick Donnelly
01:07 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
... Patrick Donnelly
02:37 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
Zheng Yan
02:19 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
ceph-mds.a.log... Zheng Yan
12:55 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
... Patrick Donnelly

01/31/2017

01:34 PM Bug #17801 (Resolved): Cleanly reject "session evict" command when in replay
Nathan Cutler
01:34 PM Backport #18010 (Resolved): jewel: Cleanly reject "session evict" command when in replay
Nathan Cutler
01:25 PM Bug #17193: truncate can cause unflushed snapshot data loss
Re-enable test: https://github.com/ceph/ceph/pull/13200 Nathan Cutler
01:17 PM Bug #17193 (Resolved): truncate can cause unflushed snapshot data loss
Nathan Cutler
01:17 PM Backport #18103 (Resolved): jewel: truncate can cause unflushed snapshot data loss
Nathan Cutler
01:16 PM Bug #18408 (Resolved): lookup of /.. in jewel returns -ENOENT
Nathan Cutler
01:16 PM Backport #18413 (Resolved): jewel: lookup of /.. in jewel returns -ENOENT
Nathan Cutler
01:15 PM Bug #18519 (Resolved): speed up readdir by skipping unwanted dn
Nathan Cutler
01:15 PM Backport #18520 (Resolved): jewel: speed up readdir by skipping unwanted dn
Nathan Cutler
01:14 PM Backport #18565 (Resolved): jewel: MDS crashes on missing metadata object
Nathan Cutler
01:14 PM Backport #18551 (Resolved): jewel: ceph-fuse crash during snapshot tests
Nathan Cutler
01:12 PM Backport #18282 (Resolved): jewel: monitor cannot start because of "FAILED assert(info.state == M...
Nathan Cutler
01:12 PM Bug #18086 (Resolved): cephfs: fix missing ll_get for ll_walk
Nathan Cutler
01:11 PM Backport #18195 (Resolved): jewel: cephfs: fix missing ll_get for ll_walk
Nathan Cutler
01:11 PM Bug #17954 (Resolved): standby-replay daemons can sometimes miss events
Nathan Cutler
01:10 PM Backport #18192 (Resolved): jewel: standby-replay daemons can sometimes miss events
Nathan Cutler
01:01 PM Bug #18532: mds: forward scrub failing to repair dir stats (was: subdir with corrupted dirstat is...
Current status of lab cluster is:
* Fixed the "missing dirfrag object" damage with a script that removed the offe...
John Spray
11:59 AM Bug #18743 (Resolved): Scrub considers dirty backtraces to be damaged, puts in damage table even ...

Two things are wrong here:
* When running scrub_path /teuthology-archive on the lab filesystem, I get a flurry ...
John Spray
07:40 AM Bug #9935 (Resolved): client: segfault on ceph_rmdir path "/"
Nathan Cutler
07:39 AM Backport #18611 (Resolved): jewel: client: segfault on ceph_rmdir path "/"
Nathan Cutler