Activity
From 02/01/2017 to 03/02/2017
03/02/2017
- 11:52 AM Bug #17656: cephfs: high concurrent causing slow request
- jichao sun wrote:
> I have the same problem too!!!
- 08:27 AM Bug #17656: cephfs: high concurrent causing slow request
- I have the same problem too!!!
- 08:33 AM Bug #18995 (Resolved): ceph-fuse always fails If pid file is non-empty and run as daemon
- 08:11 AM Bug #18730: mds: backtrace issues getxattr for every file with cap on rejoin
- Zheng Yan wrote:
> I think we should design a new mechanism to track in-use inodes (current method isn't scalable be...
- 03:40 AM Bug #18730: mds: backtrace issues getxattr for every file with cap on rejoin
- I think we should design a new mechanism to track in-use inodes (current method isn't scalable because it journals al...
03/01/2017
- 11:30 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with large amount of caps/inodes
- I have that already. I did set the beacon_grace to 600s to work around the bug and bring the cluster back.
Seems r...
- 11:16 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with large amount of caps/inodes
- I have that already. I did set the beacon_grace to 600s to work around the bug and bring the cluster back.
In firs...
- 03:52 PM Bug #19118: MDS heartbeat timeout during rejoin, when working with large amount of caps/inodes
- May be related to http://tracker.ceph.com/issues/18730, although perhaps not since that one shouldn't be causing the ...
- 03:36 PM Bug #19118 (Resolved): MDS heartbeat timeout during rejoin, when working with large amount of cap...
- We set an alarm every OPTION(mds_beacon_grace, OPT_FLOAT, 15) seconds; if mds_rank doesn't finish its task within this...
- 05:36 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Zheng Yan wrote:
> probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0...
- 03:31 PM Feature #16523 (In Progress): Assert directory fragmentation is occuring during stress tests
- 11:09 AM Bug #18883: qa: failures in samba suite
- open ticket for samba build http://tracker.ceph.com/issues/19117
- 08:21 AM Bug #17828 (Need More Info): libceph setxattr returns 0 without setting the attr
- ceph.quota.max_files is a hidden xattr. It doesn't show up in listxattr; you need to get it explicitly (getfattr -n ceph.qu...
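The Bug #17828 comment above says that ceph.quota.max_files is not reported by listxattr and has to be requested by name. A minimal Python sketch of that explicit lookup follows. The helper names and the injectable `getter` parameter are hypothetical, added only for illustration, and the sketch assumes the value comes back as a decimal string in bytes:

```python
import errno
import os


def read_xattr_explicitly(path, name, getter=os.getxattr):
    """Fetch one xattr by name; return None if it is unset or unsupported.

    `getter` is a hypothetical injection point (defaults to os.getxattr,
    which is Linux-only) so the logic can be exercised without CephFS.
    """
    try:
        return getter(path, name)
    except OSError as exc:
        # ENODATA: the attribute is not set on this file or directory.
        # ENOTSUP: this filesystem does not know the attribute namespace.
        if exc.errno in (errno.ENODATA, errno.ENOTSUP):
            return None
        raise


def quota_max_files(path, getter=os.getxattr):
    """Read ceph.quota.max_files by name rather than via a listing.

    Assumption: the value is a decimal string encoded as bytes.
    """
    raw = read_xattr_explicitly(path, "ceph.quota.max_files", getter)
    return int(raw.decode()) if raw else None
```

On a CephFS mount, `quota_max_files("/mnt/cephfs/some_dir")` would ask for the attribute directly, which is the same idea as `getfattr -n` with an explicit name; the mount path here is only a placeholder.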
02/28/2017
- 02:12 PM Bug #19103: cephfs: Out of space handling
- Looking again now that I'm a few coffees into my day -- all the cephfs enospc stuff is just aimed at providing a slic...
- 08:58 AM Bug #19103: cephfs: Out of space handling
- Believe it or not I sneezed and somehow that caused me to select some affected versions...
- 08:58 AM Bug #19103: cephfs: Out of space handling
- Also, I'm not sure we actually need to do sync writes when the cluster is near full -- we already have machinery that...
- 08:56 AM Bug #19103: cephfs: Out of space handling
- Could the "ENOSPC on failsafe_full_ratio" behaviour be the default? It seems like any application layer that wants t...
- 12:13 AM Bug #19103 (Won't Fix): cephfs: Out of space handling
Cephfs needs to be more careful on a cluster with almost full OSDs. There is a delay in OSDs reporting stats, a MO...
- 01:37 PM Feature #19109 (Resolved): Use data pool's 'df' for statfs instead of global stats, if there is o...
The client sends a MStatfs to the mon to get the info for a statfs system call. Currently the mon gives it the glo...
- 09:03 AM Bug #17828: libceph setxattr returns 0 without setting the attr
- This ticket didn't get noticed because it was filed in the 'mgr' component instead of the 'fs' component.
Chris: d...
02/27/2017
- 09:14 PM Bug #19101 (Closed): "samba3error [Unknown error/failure. Missing torture_fail() or torture_asser...
- This is jewel point v10.2.6
Run: http://pulpito.ceph.com/yuriw-2017-02-24_20:42:46-samba-jewel---basic-smithi/
Jo...
- 12:47 PM Bug #18883: qa: failures in samba suite
- Neither the Ubuntu nor the CentOS samba packages declare a dependency on libcephfs. It works when libcephfs1 happens to be installed.
- 11:21 AM Bug #18883: qa: failures in samba suite
- Which packages were you seeing the linkage issue on? The centos ones?
- 09:25 AM Bug #18883: qa: failures in samba suite
- ...
- 07:56 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- I updated PR to do _closed_mds_session(s).
As for config option, I would expect client to reconnect automagically ...
02/24/2017
- 02:29 PM Feature #19075 (Fix Under Review): Extend 'p' mds auth cap to cover quotas and all layout fields
- https://github.com/ceph/ceph/pull/13628
- 02:21 PM Feature #19075 (Resolved): Extend 'p' mds auth cap to cover quotas and all layout fields
- Re. mailing list thread "quota change restriction" http://marc.info/?l=ceph-devel&m=148769159329755&w=2
We should ...
- 02:12 PM Bug #18600 (Resolved): multimds suite tries to run quota tests against kclient, fails
- 02:11 PM Bug #17990 (Resolved): newly created directory may get fragmented before it gets journaled
- 02:10 PM Bug #16768 (Resolved): multimds: check_rstat assertion failure
- 02:10 PM Bug #18159 (Resolved): "Unknown mount option mds_namespace"
- 02:09 PM Bug #18646 (Resolved): mds: rejoin_import_cap FAILED assert(session)
- 09:51 AM Bug #18953 (Resolved): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
- 09:50 AM Bug #18663 (Resolved): teuthology teardown hangs if kclient umount fails
- 09:48 AM Bug #18675 (Resolved): client: during multimds thrashing FAILED assert(session->requests.empty())
- 01:47 AM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient
02/23/2017
- 11:24 PM Feature #9754: A 'fence and evict' client eviction command
- Underway on jcsp/wip-17980 along with #17980
- 12:07 PM Feature #9754 (In Progress): A 'fence and evict' client eviction command
- 09:42 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- You can use 'ceph daemon client.xxx kick_stale_sessions' to recover from this issue. Maybe we should add a config option to ...
- 09:30 AM Backport #19045 (Resolved): kraken: buffer overflow in test LibCephFS.DirLs
- https://github.com/ceph/ceph/pull/14571
- 09:30 AM Backport #19044 (Resolved): jewel: buffer overflow in test LibCephFS.DirLs
- https://github.com/ceph/ceph/pull/14671
02/22/2017
- 02:45 PM Bug #19033: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
- Thank you for the very detailed report
- 02:44 PM Bug #19033 (Fix Under Review): cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs ...
- 01:04 PM Bug #19033: cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
- h1. Fix proposal
https://github.com/ceph/ceph/pull/13587
- 12:22 PM Bug #19033 (Resolved): cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs to a file
- h1. 1. Problem
After I set about 400 64KB xattr kv pairs on a file,
the mds crashed. Every time I try to star...
- 10:05 AM Bug #18941 (Pending Backport): buffer overflow in test LibCephFS.DirLs
- It's a rare thing, but let's backport so that we don't have to re-diagnose it in the future.
- 09:48 AM Bug #18964 (Resolved): mon: fs new is no longer idempotent
- 09:36 AM Bug #19022 (Fix Under Review): Crash in Client::queue_cap_snap when thrashing
- https://github.com/ceph/ceph/pull/13579
- 09:36 AM Bug #18914 (Fix Under Review): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume...
- https://github.com/ceph/ceph/pull/13580
02/21/2017
- 05:15 PM Feature #18490: client: implement delegation support in userland cephfs
- Greg Farnum wrote:
> I guess I'm not sure what you're going for with the Fb versus Fc here. Sure, if you have Fwb an...
- 04:14 PM Feature #18490: client: implement delegation support in userland cephfs
- I guess I'm not sure what you're going for with the Fb versus Fc here. Sure, if you have Fwb and then get an Fr read ...
- 04:04 PM Feature #18490: client: implement delegation support in userland cephfs
- Greg Farnum wrote:
> > BTW: CEPH_CAPFILE_BUFFER does also imply CEPH_CAP_FILE_CACHE, doesn't it?
>
> No, I don't ...
- 01:21 AM Feature #18490: client: implement delegation support in userland cephfs
- > BTW: CEPH_CAPFILE_BUFFER does also imply CEPH_CAP_FILE_CACHE, doesn't it?
No, I don't think it does. In practice...
- 05:15 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- Of course, you're right.
- 02:43 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- I think the cephfs python binding calls ceph_setxattr instead of ceph_ll_setxattr. There is no such code in Client::setxattr
- 12:31 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- The thing that's confusing me is that Client::ll_setxattr has this block:...
- 09:05 AM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- The error is because the MDS had an outdated osdmap and thought the newly created pool does not exist. (MDS has code that m...
- 02:15 PM Bug #19022 (In Progress): Crash in Client::queue_cap_snap when thrashing
- 09:30 AM Backport #18707 (In Progress): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_...
- 09:28 AM Backport #18708 (Resolved): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_none...
02/20/2017
- 11:56 PM Bug #19022: Crash in Client::queue_cap_snap when thrashing
- http://pulpito.ceph.com/jspray-2017-02-20_15:59:37-fs-master---basic-smithi 837749 and 837668
- 11:55 PM Bug #19022 (Resolved): Crash in Client::queue_cap_snap when thrashing
- Seen on master. Mysterious regression....
- 02:04 PM Bug #18883: qa: failures in samba suite
- Latest run:
http://pulpito.ceph.com/jspray-2017-02-20_12:27:44-samba-master-testing-basic-smithi/
Now we're seein...
- 08:01 AM Bug #18883: qa: failures in samba suite
- fix for the ceph-fuse bug: https://github.com/ceph/ceph/pull/13532
- 11:13 AM Bug #1656 (Won't Fix): Hadoop client unit test failures
- I'm told that there is a newer hdfs test suite that we would adopt if refreshing the hdfs support, so this ticket is ...
- 08:00 AM Bug #18995 (Fix Under Review): ceph-fuse always fails If pid file is non-empty and run as daemon
- https://github.com/ceph/ceph/pull/13532
- 07:58 AM Bug #18995 (Resolved): ceph-fuse always fails If pid file is non-empty and run as daemon
- It always fails with message like:...
02/19/2017
- 09:50 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- I created https://github.com/ceph/ceph/pull/13522
This resolves hang and allows work with mountpoint in this test ...
02/17/2017
- 06:04 PM Bug #18883: qa: failures in samba suite
- Testing fix for the fuse thing here: https://github.com/ceph/ceph/pull/13498
Haven't looked into the smbtorture fa...
- 05:52 PM Bug #18883: qa: failures in samba suite
- The weird "failed to lock pidfile" ones are all with the "mount/native.yaml" fragment
- 05:44 PM Bug #18883: qa: failures in samba suite
- Some of them are something different, too:
http://qa-proxy.ceph.com/teuthology/zack-2017-02-08_12:21:51-samba-mast...
- 05:32 PM Bug #18883: qa: failures in samba suite
- Weird, that was supposed to be fixed so that ceph-fuse never tries to create a pidfile: https://github.com/ceph/ceph/...
- 11:23 AM Bug #18883: qa: failures in samba suite
- All failures have similar errors...
- 04:35 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- make check finishes with 2 failed suites.
* FAIL: test/osd/osd-scrub-repair.sh
* FAIL: test/osd/osd-scrub-snaps.s...
- 11:28 AM Bug #17939: non-local cephfs quota changes not visible until some IO is done
- I just realised that rstats have the same problem. Client A is adding data to a Manila share, and Client B doesn't se...
- 02:36 AM Bug #18964 (Fix Under Review): mon: fs new is no longer idempotent
- PR: https://github.com/ceph/ceph/pull/13471
- 02:33 AM Bug #18964 (Resolved): mon: fs new is no longer idempotent
- ...
02/16/2017
- 09:58 PM Backport #18708 (In Progress): jewel: failed filelock.can_read(-1) assertion in Server::_dir_is_n...
- 02:43 PM Backport #18708 (Fix Under Review): jewel: failed filelock.can_read(-1) assertion in Server::_dir...
- https://github.com/ceph/ceph/pull/13459
- 09:16 PM Feature #17980 (In Progress): MDS should reject connections from OSD-blacklisted clients
- There's a jcsp/wip-17980 with a first cut of this.
- 03:32 PM Feature #18490: client: implement delegation support in userland cephfs
- Zheng asked a pointed question about this today, so to be clear...
This would be 100% an opportunistic thing. You ...
- 08:31 AM Bug #18953 (Fix Under Review): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
- https://github.com/ceph/ceph/pull/13455
- 08:13 AM Bug #18953 (Resolved): mds applies 'fs full' check for CEPH_MDS_OP_SETFILELOCK
- clients should be able to acquire/release file locks when fs is full
02/15/2017
- 10:54 PM Backport #18950 (Resolved): kraken: mds/StrayManager: avoid reusing deleted inode in StrayManager...
- https://github.com/ceph/ceph/pull/14570
- 10:54 PM Backport #18949 (Resolved): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManager:...
- https://github.com/ceph/ceph/pull/14670
- 10:47 PM Backport #18100 (Resolved): jewel: ceph-mon crashed after upgrade from hammer 0.94.7 to jewel 10.2.3
- 10:47 PM Backport #18679 (Resolved): jewel: failed to reconnect caps during snapshot tests
- 06:23 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
- Jeff Layton wrote:
> Greg Farnum wrote:
> > Gah. I've run out of time to work on this right now. I've got a branch ...
- 11:53 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
- Greg Farnum wrote:
> Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsf...
- 01:39 AM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
- Gah. I've run out of time to work on this right now. I've got a branch at git@github.com:gregsfortytwo/ceph.git which...
- 04:02 AM Bug #18941 (Fix Under Review): buffer overflow in test LibCephFS.DirLs
- https://github.com/ceph/ceph/pull/13429
- 03:57 AM Bug #18941 (Resolved): buffer overflow in test LibCephFS.DirLs
- http://pulpito.ceph.com/jspray-2017-02-14_02:39:19-fs-wip-jcsp-testing-20170214-distro-basic-smithi/812889
- 12:26 AM Feature #18490: client: implement delegation support in userland cephfs
- John Spray wrote:
> So the "client completely unresponsive but only evict it when someone else wants its caps" case ...
02/14/2017
- 10:25 PM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
- 10:24 PM Bug #18877 (Pending Backport): mds/StrayManager: avoid reusing deleted inode in StrayManager::_pu...
- 10:21 PM Feature #18490: client: implement delegation support in userland cephfs
- So the "client completely unresponsive but only evict it when someone else wants its caps" case is http://tracker.cep...
- 04:27 PM Feature #18490: client: implement delegation support in userland cephfs
- I started taking a look at this. One thing we have to solve first, is that I don't think there is any automatic resol...
- 06:25 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
- Actually, does fail with stale. Instead the NFS mount command eventually times out.
mount.nfs: Connection timed out
- 06:23 PM Bug #7750: Attempting to mount a kNFS export of a sub-directory of a CephFS filesystem fails with...
- Happens for me also:
Debian Jessie with backported kernel
Linux drbl 4.8.0-0.bpo.2-amd64 #1 SMP Debian 4.8.15-2~bpo...
- 11:10 AM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
- 12:29 AM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- Oh yeah, the client does have code that's meant to be doing that, and on the client side it's a wait_for_latest. So ...
02/13/2017
- 07:05 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- That's odd; I thought clients validated pools before passing them to the mds. Maybe that's wrong or undesirable for o...
- 12:29 PM Bug #18914: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client.TestVolumeC...
- Hmm, so this is happening because volume client creates a pool, then tries to use it as a layout at a time before its...
- 07:43 AM Bug #18914 (Resolved): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume_client....
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-02-12_10:10:02-fs-jewel---basic-smithi/808861/
- 02:38 PM Bug #18838 (Fix Under Review): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
- https://github.com/ceph/ceph/pull/13394
- 12:04 PM Bug #18915: valgrind causes ceph-fuse mount_wait timeout
- Probably duplicate of http://tracker.ceph.com/issues/18797
- 08:15 AM Bug #18915 (New): valgrind causes ceph-fuse mount_wait timeout
- http://pulpito.ceph.com/teuthology-2017-02-11_17:15:02-fs-master---basic-smithi/
http://qa-proxy.ceph.com/teutholo...
- 08:58 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- make check seems to get stuck after PASS: unittest_log on unittest_throttle.
_edit_ The machine has only very littl...
- 07:22 AM Backport #18900 (Resolved): jewel: Test failure: test_open_inode
- https://github.com/ceph/ceph/pull/14669
- 07:22 AM Backport #18899 (Resolved): kraken: Test failure: test_open_inode
- https://github.com/ceph/ceph/pull/14569
02/11/2017
- 12:27 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- The commit is from the SUSE repo. It's part of the ses4 branch: https://github.com/SUSE/ceph/commits/ses4. Sorry shoul...
- 12:22 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- PPC clients! Wondering if you've tried running any of the automated tests (the unit tests, or teuthology suites?) on...
02/10/2017
- 10:23 PM Bug #18816: MDS crashes with log disabled
- For some reason we still let people disable the MDS log. That's...bad. I think it only existed for some cheap benchma...
- 10:18 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- Well, the problem is clearly indicated by the client...
- 05:48 PM Bug #18661 (Pending Backport): Test failure: test_open_inode
- 02:44 PM Bug #18883 (New): qa: failures in samba suite
First Samba run in ages:
http://pulpito.ceph.com/zack-2017-02-08_12:21:51-samba-master---basic-smithi/
Let's ge...
- 12:20 PM Bug #18882 (New): StrayManager::advance_delayed() can use tens of seconds
- I saw mds become laggy when running blogbench in a loop. The command I ran is "while `true`; do ls | xargs -P8 -n1 rm...
- 03:26 AM Bug #18877: mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stray_logged
- https://github.com/ceph/ceph/pull/13347
- 03:25 AM Bug #18877 (Resolved): mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stra...
- This issue was found by testing another PR (https://github.com/ceph/ceph/pull/12792), which makes the MDS directly use T...
02/09/2017
- 03:36 PM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- Also daemon commands don't return anything. That is for the client mds_requests and objecter_requests and ops_in_flig...
- 03:27 PM Bug #18872 (Resolved): write to cephfs mount hangs, ceph-fuse and kernel
- When trying to write to a cephfs mount using 'dd' the client hangs indefinitely. The kernel client can be <ctrl-c>'ed...
- 09:37 AM Bug #18816: MDS crashes with log disabled
- /// * Updated by Ahmed Akhuraidah in ML
The issue can be reproduced with upstream Ceph packages.
ahmed@ubcephno...
02/07/2017
- 09:51 PM Bug #18850: Leak in MDCache::handle_dentry_unlink
- Attached full valgrind from mds.a in jspray-2017-02-07_16:25:53-multimds-wip-jcsp-testing-20170206-testing-basic-smit...
- 09:49 PM Bug #18850 (Rejected): Leak in MDCache::handle_dentry_unlink
While there are various bits of valgrind noise going around at the moment, this one does look like a multimds speci...
- 08:34 PM Bug #18845: valgrind failure in fs suite
- also at http://pulpito.ceph.com/abhi-2017-02-07_15:12:56-fs-wip-luminous-2---basic-smithi/795353/
- 03:38 PM Bug #18845 (New): valgrind failure in fs suite
- Saw a valgrind failure on fs suite on the master branch as of fc2df15, run http://pulpito.ceph.com/abhi-2017-02-07_09...
02/06/2017
- 11:37 PM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
- I've also seen this one rear its ugly head again in my last fs run.
I looked at this one:
/a/jspray-2017-02-06_11...
- 11:07 PM Bug #18797: valgrind jobs hanging in fs suite
- After noticing http://tracker.ceph.com/issues/18838, I now also notice that the suspect commit range where this start...
- 11:05 PM Bug #18838 (Resolved): valgrind: Leak_StillReachable in libceph-common __tracepoints__init
- http://pulpito.ceph.com/jspray-2017-02-06_11:13:20-fs-wip-jcsp-testing-20170204-distro-basic-smithi/789908...
- 07:04 PM Bug #16397 (Resolved): nfsd selinux denials causing knfs tests to fail
- I've not heard of any further reports of this since the new package went into production. I'm going to declare victor...
- 02:20 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- OK we'll try to get that next time.
- 02:19 PM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- Hmm, client's failing to participate in failover is probably not the same as #18757, as that one was the result of cl...
- 11:59 AM Bug #18830 (Fix Under Review): Coverity: bad iterator dereference in Locker::acquire_locks
- https://github.com/ceph/ceph/pull/13272
- 11:31 AM Bug #18830 (Resolved): Coverity: bad iterator dereference in Locker::acquire_locks
- ...
02/05/2017
- 05:04 AM Feature #10792 (Fix Under Review): qa: enable thrasher for MDS cluster size (vary max_mds)
- PR: https://github.com/ceph/ceph/pull/13262
- 01:36 AM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
- This exact error popped up in a branch of mine (http://pulpito.ceph.com/gregf-2017-02-04_03:30:50-fs-wip-17594---basi...
02/04/2017
- 07:12 AM Bug #18816: MDS crashes with log disabled
- test
- 06:51 AM Bug #18816 (Resolved): MDS crashes with log disabled
- Note the "mds_log = false" below. If you do that, this happens:
Have crushed MDS daemon during executing different...
02/03/2017
- 05:30 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Cache dump is here:
https://knowledgenetworkbc-my.sharepoint.com/personal/darrelle_knowledge_ca/_layouts/15/guesta...
- 07:24 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Does the mds cache dump need to be done while it is hung? It's a production system, so I wasn't able to leave it in a...
- 06:57 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- probably fixed by https://github.com/ceph/ceph-client/commit/10a2699426a732cbf3fc9e835187e8b914f0c61cy
- 01:40 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- please run 'ceph daemon mds.kb-ceph03 dump cache /tmp/cachedump.0' and upload /tmp/cachedump.0. Besides, please check...
- 12:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Comparing logs I noticed that MDS clock is ~30s behind client. ntpd was dead on one of test servers... Will try to co...
- 10:17 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- Actually, one more thing: following this crash, the clients did not fail over to the standby MDS. Processes accessing...
- 09:57 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- Excellent! Sorry that my search of the tracker didn't find that issue.
We'll apply that backport when it's ready.
- 09:50 AM Bug #18802: Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc: 6003:...
- duplicate of http://tracker.ceph.com/issues/18578. The fix is pending backport
- 09:29 AM Bug #18802 (New): Jewel fuse client not connecting to new MDS after failover (was: mds/Server.cc:...
- A user just did:...
02/02/2017
- 07:13 PM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
- I've had two occurrences in the past 3 weeks where filesystem activity hangs, with the MDS reporting a client "failing t...
- 05:15 PM Bug #18797: valgrind jobs hanging in fs suite
- They're going dead during teardown when teuthology tries to list cephtest/ but something went wrong much earlier with...
- 05:08 PM Bug #18797: valgrind jobs hanging in fs suite
- Ignore the last passing link above, that was actually a hammer run (which shows up as fs-master for some reason).
...
- 04:55 PM Bug #18797 (Duplicate): valgrind jobs hanging in fs suite
First one to show the issue was:
http://pulpito.ceph.com/teuthology-2017-01-28_17:15:01-fs-master---basic-smithi
...
- 11:12 AM Bug #18754 (Fix Under Review): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- https://github.com/ceph/ceph/pull/13227/commits/3b899fb0c6153c30ad6ef782499f79ba5c7b2a22
- 09:53 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- forgot to add a timeline:
08:08:29 mounted
08:09:54 iptables up
08:10:50 ls stuck
08:16:02 iptables down
08:17:5...
- 08:25 AM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Attaching client and MDS logs. This time I was mounting from another server and firewalling both input and output to ...
- 04:10 AM Bug #18755 (Fix Under Review): multimds: MDCache.cc: 4735: FAILED assert(in)
- https://github.com/ceph/ceph/pull/13227/commits/097dc86b330392c4aa89f0033ce6e40528481857
02/01/2017
- 11:14 PM Bug #18757 (New): Jewel ceph-fuse does not recover after lost connection to MDS
- Hmm, now that I actually read the log (like a reasonable person :-)) it is a little bit strange that the server is se...
- 10:30 PM Bug #18757: Jewel ceph-fuse does not recover after lost connection to MDS
- Yes, I am reproducing it artificially after a short network blip caused permanent mount point hangs of multiple servers...
- 07:59 PM Bug #18757 (Rejected): Jewel ceph-fuse does not recover after lost connection to MDS
- Clients which lose connectivity to the MDS are evicted after a timeout given by the "mds session timeout" setting. E...
- 12:26 PM Bug #18757 (Resolved): Jewel ceph-fuse does not recover after lost connection to MDS
- After ceph-fuse loses connection to the MDS for a few minutes, it does not recover - accessing the mountpoint hangs processes.
...
- 06:53 PM Feature #18763 (New): Ganesha instances should restart themselves when blacklisted
If I evict the client session that a ganesha instance was using, then the libcephfs instance will start to see sess...
- 02:25 PM Bug #18759 (Fix Under Review): multimds suite tries to run norstats tests against kclient
- https://github.com/ceph/ceph/pull/13089
- 01:11 PM Bug #18759 (Resolved): multimds suite tries to run norstats tests against kclient
You get failures on the workunit like:
jspray-2017-02-01_02:51:18-multimds-wip-jcsp-testing-20170201-testing-basic...
- 03:33 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- 03:32 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- ...
- 01:11 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- There are actually 3 assertion failures (on 3 different MDS) in this run!...
- 01:07 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
- ...
- 02:37 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
- 02:19 AM Bug #18717: multimds: FAILED assert(0 == "got export_cancel in weird state")
- ceph-mds.a.log...
- 12:55 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- ...