Activity
From 04/11/2017 to 05/10/2017
05/10/2017
- 01:25 PM Bug #19903 (Resolved): LibCephFS.ClearSetuid fails
- All libcephfs/test.sh failures in http://pulpito.ceph.com/pdonnell-2017-05-10_02:32:15-fs-wip-pdonnell-integration-di...
- 08:59 AM Bug #19891 (Fix Under Review): Test failure: test_full_different_file
- https://github.com/ceph/ceph/pull/15026/
- 08:12 AM Bug #19891: Test failure: test_full_different_file
- The mds didn't get osd op replies for the purge queue log prezero operations. It seems the osd dropped these requests...
- 02:22 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
- 02:21 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
- Seems like a test case issue. The asok command can take several seconds, which is enough time to delete all the files.
- 01:02 AM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
- 12:35 AM Bug #19890 (Fix Under Review): src/test/pybind/test_cephfs.py fails
- 12:34 AM Bug #19890: src/test/pybind/test_cephfs.py fails
- https://github.com/ceph/ceph/pull/15018
05/09/2017
- 09:47 PM Bug #19896: client: test failure for O_RDWR file open
- This problem also appears to be affecting a few other tests:...
- 09:42 PM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
- We have a test failure in test_cephfs.test_open (and test_cephfs.test_mount_unmount): http://pulpito.ceph.com/pdonnel...
- 01:37 PM Bug #19450 (Resolved): PurgeQueue read journal crash
- 01:23 PM Bug #19893 (Resolved): test_rebuild_simple_altpool fails
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:25:02-kcephfs-master-testing-basic-smithi/1113182/
- 01:15 PM Bug #19892 (Resolved): Test failure: test_purge_queue_op_rate fails
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:25:02-kcephfs-master-testing-basic-smithi/1107011/
- 01:05 PM Bug #19891 (Resolved): Test failure: test_full_different_file
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:15:02-fs-master---basic-smithi/1106624/
- 12:52 PM Bug #19890 (Resolved): src/test/pybind/test_cephfs.py fails
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:15:05-fs-master---basic-smithi/1113004/teuthology.log
...
- 02:49 AM Bug #19426 (Can't reproduce): knfs blogbench hang
05/08/2017
- 03:24 PM Backport #19846 (In Progress): jewel: write to cephfs mount hangs, ceph-fuse and kernel
- 02:00 PM Backport #19845 (In Progress): kraken: write to cephfs mount hangs, ceph-fuse and kernel
- 09:46 AM Bug #19426: knfs blogbench hang
- Jeff Layton wrote:
> Sorry I didn't see this sooner. Is this still cropping up?
>
> So what might be helpful the ...
- 09:38 AM Bug #19426: knfs blogbench hang
- It seems the crash no longer happens after rebasing the testing branch against the 4.11 kernel.
- 09:25 AM Bug #19828 (Fix Under Review): mds: valgrind InvalidRead detected in Locker
- https://github.com/ceph/ceph/pull/14991
- 03:27 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
- What does 'written in part' mean? The application wrote ~23G and failed to write the rest, or the application wrote 26G but the ...
- 02:59 AM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
- "ceph: try getting buffer capability for readahead/fadvise" has backported into 4.9.x
05/04/2017
- 07:06 PM Feature #19862 (New): mds: add LTTnG tracepoints for each type of MDS operation
- It would be nice to know the latency of different file system operations so we can see which operations:
- scale poo...
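A sketch of what such a tracepoint could look like with lttng-ust; the provider and event names here are made up for illustration and are not Ceph's actual tracing code:

    /* mds_tp.h -- hypothetical tracepoint provider header */
    #undef TRACEPOINT_PROVIDER
    #define TRACEPOINT_PROVIDER mds_ops

    #undef TRACEPOINT_INCLUDE
    #define TRACEPOINT_INCLUDE "./mds_tp.h"

    #if !defined(MDS_TP_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
    #define MDS_TP_H

    #include <lttng/tracepoint.h>
    #include <stdint.h>

    TRACEPOINT_EVENT(
        mds_ops,      /* provider */
        op_latency,   /* event */
        TP_ARGS(const char *, op_type, uint64_t, latency_us),
        TP_FIELDS(
            ctf_string(op_type, op_type)
            ctf_integer(uint64_t, latency_us, latency_us)
        )
    )

    #endif /* MDS_TP_H */

    #include <lttng/tracepoint-event.h>

A handler for, say, mkdir would then record tracepoint(mds_ops, op_latency, "mkdir", elapsed_us) on completion, and the resulting trace can be sliced per op type offline.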
- 09:47 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
- Uploaded the client log file.
- 09:41 AM Bug #19854 (Duplicate): ceph-fuse write a big file,The file is only written in part
- The application wrote a big file (26GB) to cephfs; the file was only partially written (23GB).
ceph version: 10.2.6 (656b5b6...
05/03/2017
- 08:20 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
- https://github.com/ceph/ceph/pull/15000
- 08:20 PM Backport #19845 (Resolved): kraken: write to cephfs mount hangs, ceph-fuse and kernel
- https://github.com/ceph/ceph/pull/14998
- 05:46 PM Feature #19362: mds: add perf counters for each type of MDS operation
- https://github.com/ceph/ceph/pull/14938
- 03:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Zheng Yan wrote:
> elder one wrote:
> > One difference I noticed between 4.4 and 4.9 kernels
> > - with 4.4 kerne...
- 02:27 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Just reporting that no errors after using patched (commit 2b1ac852) cephfs kernel module.
>
> ...
- 02:06 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> One difference I noticed between 4.4 and 4.9 kernels
> - with 4.4 kernel on cephfs directory si...
- 11:04 AM Backport #18699 (Resolved): jewel: client: fix the cross-quota rename boundary check conditions
- 07:47 AM Bug #18872 (Pending Backport): write to cephfs mount hangs, ceph-fuse and kernel
- 05:25 AM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
- This is pretty infeasible right now -- nothing tracks which bits of a file are allocated or exist. The closest it com...
- 03:27 AM Bug #19828 (Resolved): mds: valgrind InvalidRead detected in Locker
- ...
- 01:55 AM Bug #17819 (Can't reproduce): MDS crashed while performing snapshot creation and deletion in a loop
- 01:53 AM Bug #17819: MDS crashed while performing snapshot creation and deletion in a loop
- run the test overnight, can't reproduce the crash.
05/02/2017
- 09:40 AM Bug #17408 (Can't reproduce): Possible un-needed wait on rstats when listing dir?
- Can't reproduce. Probably fixed by ...
- 09:21 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Just reporting that no errors after using patched (commit 2b1ac852) cephfs kernel module.
Also upgraded ceph to 1...
- 08:29 AM Feature #19820 (Fix Under Review): ceph-fuse: use userspace permission check by default
- https://github.com/ceph/ceph/pull/14907
- 07:39 AM Feature #19820 (Resolved): ceph-fuse: use userspace permission check by default
- 07:32 AM Bug #19812: client: not swapping directory caps efficiently leads to very slow create chains
- The reason for the slow creation/deletion is that ceph-fuse sends a getattr request (to check permission on the test directory) b...
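For context, the userspace check that #19820 enables by default is governed by the existing fuse_default_permissions client option: when it is false, ceph-fuse does the permission check itself from the caps it already holds instead of letting the FUSE kernel code enforce permissions via fresh getattrs. In ceph.conf terms:

    [client]
    fuse_default_permissions = false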
- 06:17 AM Feature #19819 (Rejected): Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
- The purpose of this is to have a standard interface for examining the allocated extents of files. For example, xfs_i...
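For reference, FS_IOC_FIEMAP is the stock Linux ioctl in question; a minimal sketch of how a caller maps the first few extents of a file on a filesystem that supports it (standard kernel API, nothing CephFS-specific):

    #include <cstdio>
    #include <cstdlib>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>       /* FS_IOC_FIEMAP */
    #include <linux/fiemap.h>   /* struct fiemap, FIEMAP_* */

    int main(int argc, char** argv) {
        if (argc != 2) return 1;
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        const unsigned count = 32;  /* room for 32 extents */
        struct fiemap* fm = (struct fiemap*)calloc(1,
            sizeof(*fm) + count * sizeof(struct fiemap_extent));
        fm->fm_start = 0;
        fm->fm_length = FIEMAP_MAX_OFFSET;   /* whole file */
        fm->fm_flags = FIEMAP_FLAG_SYNC;     /* flush before mapping */
        fm->fm_extent_count = count;

        /* filesystems without FIEMAP support fail with EOPNOTSUPP */
        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("fiemap"); return 1; }

        for (unsigned i = 0; i < fm->fm_mapped_extents; i++)
            printf("logical=%llu physical=%llu len=%llu flags=0x%x\n",
                   (unsigned long long)fm->fm_extents[i].fe_logical,
                   (unsigned long long)fm->fm_extents[i].fe_physical,
                   (unsigned long long)fm->fm_extents[i].fe_length,
                   (unsigned)fm->fm_extents[i].fe_flags);
        free(fm);
        close(fd);
        return 0;
    }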
04/28/2017
- 10:09 PM Bug #19812 (New): client: not swapping directory caps efficiently leads to very slow create chains
- https://www.mail-archive.com/ceph-users@lists.ceph.com/msg34200.html
In short: if you have a ceph-fuse and a kerne...
04/27/2017
- 07:27 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
- PR: https://github.com/ceph/ceph/pull/14822
- 03:21 AM Bug #19789 (New): FAIL: test_evict_client (tasks.cephfs.test_misc.TestMisc)
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-24_03:15:02-fs-master---basic-smithi/
04/25/2017
- 09:03 AM Bug #19755 (Fix Under Review): MDS became unresponsive when truncating a very large file
- https://github.com/ceph/ceph/pull/14769
- 03:39 AM Bug #19755 (Resolved): MDS became unresponsive when truncating a very large file
- We were trying to copy a very large file (7TB exactly) between two directories through Samba/CephFS, and cancelled it...
- 06:52 AM Backport #19763 (Resolved): kraken: non-local cephfs quota changes not visible until some IO is done
- https://github.com/ceph/ceph/pull/16108
- 06:52 AM Backport #19762 (Resolved): jewel: non-local cephfs quota changes not visible until some IO is done
- https://github.com/ceph/ceph/pull/15466
04/24/2017
- 09:14 PM Bug #19583 (Resolved): mds: change_attr not inc in Server::handle_set_vxattr
- 09:13 PM Fix #19691 (Resolved): Remove journaler_allow_split_entries option
- 09:13 PM Bug #18816 (Resolved): MDS crashes with log disabled
- 09:12 PM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
- The userspace piece (https://github.com/ceph/ceph/pull/14317) has merged.
Zheng: please resolve the ticket when th...
- 09:11 PM Feature #17855 (Resolved): Don't evict a slow client if it's the only client
- 01:43 PM Bug #19706 (In Progress): Laggy mon daemons causing MDS failover (symptom: failed to set counters...
- We should disable this check for the misc workunit; it was mainly intended for the workunits that have more sustained lo...
- 12:27 PM Feature #18425 (Resolved): mds: add the option to use tcmalloc directly
- 10:13 AM Bug #17939 (Pending Backport): non-local cephfs quota changes not visible until some IO is done
- 07:08 AM Support #16884 (Closed): rename() doesn't work between directories
- 07:02 AM Support #16884: rename() doesn't work between directories
- 03:01 AM Bug #19635 (Fix Under Review): Deadlock on two ceph-fuse clients accessing the same file
- https://github.com/ceph/ceph/pull/14743
04/22/2017
- 02:10 AM Bug #19583 (Fix Under Review): mds: change_attr not inc in Server::handle_set_vxattr
- https://github.com/ceph/ceph/pull/14726
04/21/2017
- 03:41 AM Bug #19734 (Resolved): mds: subsystems like ceph_subsys_mds_balancer do not log correctly
- Our logging for subsystems is relying on a removed configuration macro:...
- 03:36 AM Bug #19589 (Fix Under Review): greedyspill.lua: :18: attempt to index a nil value (field '?')
- https://github.com/ceph/ceph/pull/14704
04/20/2017
- 09:54 PM Backport #19709 (In Progress): jewel: Enable MDS to start when session ino info is corrupt
- 12:13 PM Backport #19709 (Resolved): jewel: Enable MDS to start when session ino info is corrupt
- https://github.com/ceph/ceph/pull/14700
- 09:49 PM Backport #19679 (In Progress): jewel: MDS: damage reporting by ino number is useless
- 09:29 PM Backport #19677 (In Progress): jewel: Jewel ceph-fuse does not recover after lost connection to MDS
- 05:55 PM Backport #19466 (Need More Info): jewel: mds: log rotation doesn't work if mds has respawned
- Needs 3ba63063 1fb15a21
Of these two, 3ba63063 is non-trivial.
- 11:32 AM Backport #19466 (In Progress): jewel: mds: log rotation doesn't work if mds has respawned
- 05:06 PM Backport #19620 (Resolved): kraken: MDS server crashes due to inconsistent metadata.
- 05:06 PM Backport #19483 (Resolved): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" co...
- 05:05 PM Backport #19335 (Resolved): kraken: MDS heartbeat timeout during rejoin, when working with large ...
- 05:04 PM Backport #19045 (Resolved): kraken: buffer overflow in test LibCephFS.DirLs
- 05:03 PM Backport #18950 (Resolved): kraken: mds/StrayManager: avoid reusing deleted inode in StrayManager...
- 05:02 PM Backport #18899 (Resolved): kraken: Test failure: test_open_inode
- 05:01 PM Backport #18706 (Resolved): kraken: fragment space check can cause replayed request fail
- 04:59 PM Backport #18700 (Resolved): kraken: client: fix the cross-quota rename boundary check conditions
- 04:58 PM Bug #18306 (Resolved): segfault in handle_client_caps
- 04:58 PM Backport #18616 (Resolved): kraken: segfault in handle_client_caps
- 04:57 PM Bug #18179 (Resolved): MDS crashes on missing metadata object
- 04:57 PM Backport #18566 (Resolved): kraken: MDS crashes on missing metadata object
- 04:56 PM Bug #18396 (Resolved): Test Failure: kcephfs test_client_recovery.TestClientRecovery
- 07:57 AM Bug #18396: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-13_05:20:02-kcephfs-kraken-testing-basic-smithi/1019312/
- 04:56 PM Backport #18562 (Resolved): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- 04:55 PM Bug #18460 (Resolved): ceph-fuse crash during snapshot tests
- 04:55 PM Backport #18552 (Resolved): kraken: ceph-fuse crash during snapshot tests
- 02:53 PM Bug #19707 (Duplicate): Hadoop tests fail due to missing upstream tarball
- Indeed it is -- I hadn't seen that other ticket because it was in the wrong project.
- 02:04 PM Bug #19707: Hadoop tests fail due to missing upstream tarball
- Dup of #19456?
- 09:57 AM Bug #19707 (Duplicate): Hadoop tests fail due to missing upstream tarball
- http://pulpito.ceph.com/teuthology-2017-04-03_03:45:03-hadoop-master---basic-mira/...
- 01:49 PM Bug #19712 (New): some kcephfs tests become very slow
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-16_04:20:02-kcephfs-jewel-testing-basic-smithi/
http://pulpit...
- 01:40 PM Backport #19675 (In Progress): jewel: cephfs: Test failure: test_data_isolated (tasks.cephfs.test...
- 01:24 PM Backport #19673 (In Progress): jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv...
- 01:22 PM Backport #19671 (In Progress): jewel: MDS assert failed when shutting down
- 01:15 PM Backport #19668 (In Progress): jewel: MDS goes readonly writing backtrace for a file whose data p...
- 01:04 PM Bug #18872 (Fix Under Review): write to cephfs mount hangs, ceph-fuse and kernel
- Turns out this is an issue of ceph leaking arch-dependent flags on the wire. See kernel ml [PATCH] ceph: Fix file ope...
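The general shape of such a fix is to translate host open(2) flags, whose numeric values differ across architectures for bits like O_DIRECT, into fixed wire values before sending them. A hedged sketch with made-up wire constants (not the actual Ceph encoding):

    #include <fcntl.h>
    #include <cstdint>

    /* Hypothetical wire encoding: fixed values independent of the host ABI. */
    enum : uint32_t {
        CEPHW_RDONLY = 0x01, CEPHW_WRONLY = 0x02, CEPHW_RDWR = 0x04,
        CEPHW_CREAT  = 0x08, CEPHW_TRUNC  = 0x10, CEPHW_EXCL  = 0x20,
    };

    /* Map host open(2) flags to the fixed wire bits instead of leaking
       the raw arch-dependent values onto the wire. */
    uint32_t flags_to_wire(int flags) {
        uint32_t w = 0;
        switch (flags & O_ACCMODE) {
            case O_RDONLY: w |= CEPHW_RDONLY; break;
            case O_WRONLY: w |= CEPHW_WRONLY; break;
            case O_RDWR:   w |= CEPHW_RDWR;   break;
        }
        if (flags & O_CREAT) w |= CEPHW_CREAT;
        if (flags & O_TRUNC) w |= CEPHW_TRUNC;
        if (flags & O_EXCL)  w |= CEPHW_EXCL;
        return w;
    }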
- 12:13 PM Backport #19710 (Resolved): kraken: Enable MDS to start when session ino info is corrupt
- https://github.com/ceph/ceph/pull/16107
- 12:08 PM Backport #19666 (In Progress): jewel: fs:The mount point break off when mds switch hanppened.
- 11:42 AM Backport #19665 (In Progress): jewel: C_MDSInternalNoop::complete doesn't free itself
- 11:37 AM Backport #19619 (In Progress): jewel: MDS server crashes due to inconsistent metadata.
- 11:34 AM Backport #19482 (In Progress): jewel: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" ...
- 10:57 AM Backport #19334 (In Progress): jewel: MDS heartbeat timeout during rejoin, when working with larg...
- 10:55 AM Backport #19044 (In Progress): jewel: buffer overflow in test LibCephFS.DirLs
- 10:54 AM Backport #18949 (In Progress): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManag...
- 10:49 AM Backport #18900 (In Progress): jewel: Test failure: test_open_inode
- 10:47 AM Backport #18705 (In Progress): jewel: fragment space check can cause replayed request fail
- 10:44 AM Backport #18699 (In Progress): jewel: client: fix the cross-quota rename boundary check conditions
- 10:22 AM Bug #16842: mds: replacement MDS crashes on InoTable release
- Created ticket for the workaround and marked it for backport here: http://tracker.ceph.com/issues/19708
I should h...
- 10:21 AM Fix #19708 (Resolved): Enable MDS to start when session ino info is corrupt
- This was a mitigation for issue #16842, which is itself a mystery.
Creating ticket to backport it.
Fix on mas...
- 09:03 AM Bug #18816 (Fix Under Review): MDS crashes with log disabled
- I'm proposing that we rip out this configuration option; it's a trap for the unwary:
https://github.com/ceph/ceph/pu...
- 08:30 AM Bug #19706 (Can't reproduce): Laggy mon daemons causing MDS failover (symptom: failed to set coun...
- http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-15_03:15:10-fs-master---basic-smithi/1027137/
04/19/2017
- 10:24 PM Feature #17834 (Fix Under Review): MDS Balancer overrides
- https://github.com/ceph/ceph/pull/14598
- 04:02 PM Bug #15467 (Won't Fix): After "mount -l", ceph-fuse does not work
- This appears to have happened on pre-jewel code, so it's unlikely anyone is interested in investigating.
- 10:11 AM Fix #19691 (Fix Under Review): Remove journaler_allow_split_entries option
- https://github.com/ceph/ceph/pull/14636
- 10:06 AM Fix #19691 (Resolved): Remove journaler_allow_split_entries option
- This has been broken in practice since at least when the MDS journal format (JournalStream etc) was changed, as the r...
04/18/2017
- 08:38 PM Feature #17980 (Fix Under Review): MDS should reject connections from OSD-blacklisted clients
- https://github.com/ceph/ceph/pull/14610
- 08:38 PM Feature #9754 (Fix Under Review): A 'fence and evict' client eviction command
- See the #17980 patch.
- 07:39 PM Backport #19680 (Resolved): kraken: MDS: damage reporting by ino number is useless
- https://github.com/ceph/ceph/pull/16106
- 07:39 PM Backport #19679 (Resolved): jewel: MDS: damage reporting by ino number is useless
- https://github.com/ceph/ceph/pull/14699
- 07:38 PM Backport #19678 (Resolved): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
- https://github.com/ceph/ceph/pull/16105
- 07:38 PM Backport #19677 (Resolved): jewel: Jewel ceph-fuse does not recover after lost connection to MDS
- https://github.com/ceph/ceph/pull/14698
- 07:38 PM Backport #19676 (Resolved): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_v...
- https://github.com/ceph/ceph/pull/16104
- 07:38 PM Backport #19675 (Resolved): jewel: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_vo...
- https://github.com/ceph/ceph/pull/14685
- 07:38 PM Backport #19674 (Resolved): kraken: cephfs: mds is crushed, after I set about 400 64KB xattr kv p...
- https://github.com/ceph/ceph/pull/16103
- 07:38 PM Backport #19673 (Resolved): jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pa...
- https://github.com/ceph/ceph/pull/14684
- 07:37 PM Backport #19672 (Resolved): kraken: MDS assert failed when shutting down
- https://github.com/ceph/ceph/pull/16102
- 07:37 PM Backport #19671 (Resolved): jewel: MDS assert failed when shutting down
- https://github.com/ceph/ceph/pull/14683
- 07:37 PM Backport #19669 (Resolved): kraken: MDS goes readonly writing backtrace for a file whose data poo...
- https://github.com/ceph/ceph/pull/16101
- 07:37 PM Backport #19668 (Resolved): jewel: MDS goes readonly writing backtrace for a file whose data pool...
- https://github.com/ceph/ceph/pull/14682
- 07:37 PM Backport #19667 (Resolved): kraken: fs:The mount point break off when mds switch hanppened.
- https://github.com/ceph/ceph/pull/16100
- 07:37 PM Backport #19666 (Resolved): jewel: fs:The mount point break off when mds switch hanppened.
- https://github.com/ceph/ceph/pull/14679
- 07:37 PM Backport #19665 (Resolved): jewel: C_MDSInternalNoop::complete doesn't free itself
- https://github.com/ceph/ceph/pull/14677
- 07:37 PM Backport #19664 (Resolved): kraken: C_MDSInternalNoop::complete doesn't free itself
- https://github.com/ceph/ceph/pull/16099
- 04:20 PM Bug #16842: mds: replacement MDS crashes on InoTable release
- backport?
- 01:29 PM Feature #10792 (New): qa: enable thrasher for MDS cluster size (vary max_mds)
- The thrasher exists; this ticket is now for switching it on and getting the resulting runs green.
- 01:28 PM Feature #15068 (Resolved): fsck: multifs: enable repair tools to read from one filesystem and wri...
- 01:28 PM Feature #15069 (Resolved): MDS: multifs: enable two filesystems to point to same pools if one of ...
- 01:28 PM Fix #15134 (New): multifs: test case exercising mds_thrash for multiple filesystems
- (tweaking ticket to be for creating a test case that uses the new/smarter thrashing code in a multi-fs way)
- 01:26 PM Bug #18579 (Resolved): Fuse client has "opening" session to nonexistent MDS rank after MDS cluste...
- 01:26 PM Bug #18914 (Pending Backport): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume...
- 01:26 PM Feature #19075 (Resolved): Extend 'p' mds auth cap to cover quotas and all layout fields
- 01:24 PM Bug #19501 (Pending Backport): C_MDSInternalNoop::complete doesn't free itself
- 01:23 PM Bug #19566 (Resolved): MDS crash on mgr message during shutdown
- 12:51 PM Bug #19640 (Resolved): ceph-fuse should only return writeback errors once per file handle
- 11:49 AM Feature #18509 (Pending Backport): MDS: damage reporting by ino number is useless
04/17/2017
- 02:08 PM Bug #19426: knfs blogbench hang
- Sorry I didn't see this sooner. Is this still cropping up?
So what might be helpful the next time this happens loc...
- 01:17 PM Bug #19640 (Fix Under Review): ceph-fuse should only return writeback errors once per file handle
- https://github.com/ceph/ceph/pull/14589
- 01:16 PM Bug #19640 (Resolved): ceph-fuse should only return writeback errors once per file handle
- Currently if someone fsyncs and sees the error, we give them the same error again on fclose.
We should match the n...
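The "only once" bookkeeping amounts to a sticky async-write error plus a reported flag per handle; a toy sketch of the intended semantics (hypothetical names, not the actual ceph-fuse Client code):

    #include <cerrno>

    struct FileHandle {
        int  async_err = 0;         /* set by the writeback path, e.g. -EIO */
        bool err_reported = false;

        /* The first fsync/close on this handle returns the stashed error;
           later calls on the same handle see success instead of the same
           error again. */
        int take_error() {
            if (async_err && !err_reported) {
                err_reported = true;
                return async_err;
            }
            return 0;
        }
    };

    int handle_fsync(FileHandle& fh) { return fh.take_error(); }
    int handle_close(FileHandle& fh) { return fh.take_error(); }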
- 11:13 AM Bug #19635 (In Progress): Deadlock on two ceph-fuse clients accessing the same file
- This bug happens in the following sequence of events:
- Request1 (from client1) creates file1 (mds issues caps Asx to cl...
- 12:15 AM Bug #19635: Deadlock on two ceph-fuse clients accessing the same file
- Those requests are getting hung up on the iauth and ixattr locks on the inode for the ".syn" file the test script cre...
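Dan's fsping script linked above is the real reproducer; purely to illustrate the access pattern that tickles this (two client sessions bouncing caps on one file), a rough libcephfs sketch, with the path and setup made up and error handling omitted:

    #include <cephfs/libcephfs.h>
    #include <fcntl.h>
    #include <sys/stat.h>

    static struct ceph_mount_info* make_client() {
        struct ceph_mount_info* cm = nullptr;
        ceph_create(&cm, nullptr);         /* default client id */
        ceph_conf_read_file(cm, nullptr);  /* default ceph.conf search */
        ceph_mount(cm, "/");
        return cm;
    }

    int main() {
        struct ceph_mount_info* a = make_client();
        struct ceph_mount_info* b = make_client();

        /* Client A keeps rewriting the file (wants write caps) while
           client B keeps stat'ing it (wants shared caps), so the mds
           constantly revokes and reissues caps between the sessions. */
        for (int i = 0; i < 1000; i++) {
            int fd = ceph_open(a, "/pingfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);
            ceph_write(a, fd, "ping", 4, 0);
            ceph_close(a, fd);

            struct stat st;
            ceph_stat(b, "/pingfile", &st);
        }
        ceph_unmount(a); ceph_release(a);
        ceph_unmount(b); ceph_release(b);
        return 0;
    }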
04/16/2017
- 11:02 PM Bug #19635: Deadlock on two ceph-fuse clients accessing the same file
- I was wondering if d463107473 ("mds: finish lock waiters in the same order that they were added.") could have been th...
- 09:48 PM Bug #19635 (Resolved): Deadlock on two ceph-fuse clients accessing the same file
- See Dan's reproducer script, and thread "[ceph-users] fsping, why you no work no mo?"
https://raw.githubusercontent....
- 01:19 PM Support #16738 (Closed): mount.ceph: unknown mount options: rbytes and norbytes
04/15/2017
- 06:47 PM Bug #19437 (Pending Backport): fs:The mount point break off when mds switch hanppened.
- 06:46 PM Bug #19033 (Pending Backport): cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs ...
- 06:45 PM Bug #18757 (Pending Backport): Jewel ceph-fuse does not recover after lost connection to MDS
- Let's backport this for the benefit of people running cephfs today
- 06:41 PM Bug #19401 (Pending Backport): MDS goes readonly writing backtrace for a file whose data pool has...
- 03:21 PM Support #16738: mount.ceph: unknown mount options: rbytes and norbytes
- This issue can be closed. I have not experienced it anymore, and the original systems where the issue was found are n...
- 11:15 AM Bug #19022 (Resolved): Crash in Client::queue_cap_snap when thrashing
04/14/2017
- 10:00 PM Backport #19620 (In Progress): kraken: MDS server crashes due to inconsistent metadata.
- 09:59 PM Bug #19406: MDS server crashes due to inconsistent metadata.
- *master PR*: https://github.com/ceph/ceph/pull/14234
- 09:57 PM Backport #19483 (In Progress): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it"...
- 09:55 PM Backport #19335 (In Progress): kraken: MDS heartbeat timeout during rejoin, when working with lar...
- 09:54 PM Backport #19045 (In Progress): kraken: buffer overflow in test LibCephFS.DirLs
- 09:52 PM Backport #18950 (In Progress): kraken: mds/StrayManager: avoid reusing deleted inode in StrayMana...
- 09:49 PM Backport #18899 (In Progress): kraken: Test failure: test_open_inode
- 09:48 PM Backport #18706 (In Progress): kraken: fragment space check can cause replayed request fail
- 09:46 PM Backport #18700 (In Progress): kraken: client: fix the cross-quota rename boundary check conditions
- 09:44 PM Backport #18616 (In Progress): kraken: segfault in handle_client_caps
- 09:43 PM Backport #18566 (In Progress): kraken: MDS crashes on missing metadata object
- 09:42 PM Backport #18562 (In Progress): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
- 09:40 PM Backport #18552 (In Progress): kraken: ceph-fuse crash during snapshot tests
- 09:39 PM Bug #18166 (Resolved): monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE...
- 09:39 PM Bug #18166: monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE_STANDBY)"
- The kraken backport is unnecessary (the fix is already in v11.2.0).
- 09:39 PM Backport #18283 (Resolved): kraken: monitor cannot start because of "FAILED assert(info.state == ...
- Already included in v11.2.0...
- 08:07 PM Support #16738: mount.ceph: unknown mount options: rbytes and norbytes
- Ceph: v10.2.7
Linux Kernel: 4.9.21-040921-generic on Ubuntu 16.04.1
The "rbytes" option in fstab seems to be work... - 01:42 PM Bug #16886 (Can't reproduce): multimds: kclient hang (?) in tests
- 12:50 PM Bug #19630 (Fix Under Review): StrayManager::num_stray is inaccurate
- https://github.com/ceph/ceph/pull/14554
- 09:26 AM Bug #19630 (Resolved): StrayManager::num_stray is inaccurate
- 09:50 AM Bug #19204 (Pending Backport): MDS assert failed when shutting down
- 09:48 AM Feature #19551 (Resolved): CephFS MDS health messages should be logged in the cluster log
- 08:28 AM Bug #18680 (Fix Under Review): multimds: cluster can assign active mds beyond max_mds during fail...
- 08:27 AM Bug #18680: multimds: cluster can assign active mds beyond max_mds during failures
- commit "mon/MDSMonitor: only allow deactivating the mds with max rank" in https://github.com/ceph/ceph/pull/14550 sho...
- 07:59 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
- Fixed by commit a1499bc4 (mds: stop purging strays when mds is being shutdown).
- 07:56 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
- 07:58 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
- Fixed by commit 20d43372 (mds: drop superfluous MMDSOpenInoReply).
- 07:55 AM Bug #19239 (Fix Under Review): mds: stray count remains static after workflows complete
- Should be fixed by the commits in https://github.com/ceph/ceph/pull/14550...
04/13/2017
- 04:23 PM Bug #19395 (Fix Under Review): "Too many inodes in cache" warning can happen even when trimming i...
- The bad state is long gone, so I'm just going to change this ticket to fixing the weird case where we were getting a ...
- 03:39 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Yeah, you're right Darrell, operator error :) I guess the VM did not have the correct storage network attached when I tried t...
- 02:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- FYI, I don't think the module signature stuff is an issue. It's just notifying you that you've loaded a module that d...
- 12:23 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- One difference I noticed between 4.4 and 4.9 kernels
- with 4.4 kernel on cephfs directory sizes (total bytes of al...
- 08:04 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Yeah, did that.
My test server with compiled 4.9 kernel and patched ceph module mounted cephfs just fine.
It might...
- 02:15 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not...
- 03:02 PM Bug #19566 (Fix Under Review): MDS crash on mgr message during shutdown
- https://github.com/ceph/ceph/pull/14505
- 02:24 PM Bug #19566 (In Progress): MDS crash on mgr message during shutdown
- 02:31 PM Backport #19620 (Resolved): kraken: MDS server crashes due to inconsistent metadata.
- https://github.com/ceph/ceph/pull/14574
- 02:31 PM Backport #19619 (Resolved): jewel: MDS server crashes due to inconsistent metadata.
- https://github.com/ceph/ceph/pull/14676
- 12:03 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- Patrick was going to spin up a patch for this, so reassigning to him. Patrick, if I have that wrong, then please just...
- 11:07 AM Bug #19406 (Pending Backport): MDS server crashes due to inconsistent metadata.
- Marking as pending backport for the fix to data-scan which seems likely to be the underlying cause https://github.com...
- 08:13 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Well, currently the last MDS fails over to the old balancer, so it can in fact shift its load back to the others acco...
04/12/2017
- 10:27 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not load unsigned modul...
- 01:36 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Could you try adding commit 2b1ac852 (ceph: try getting buffer capability for readahead/fadvise) to your 4.9.x kernel...
- 07:36 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- I do have mmap disabled in Dovecot conf.
Relevant bits from Dovecot conf:
mmap_disable = yes
mail_nfs_index = ...
- 02:17 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- elder one wrote:
> Got another error with 4.9.21 kernel.
>
> On the cephfs node /sys/kernel/debug/ceph/xxx/:
>
> ...
- 08:29 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- We could make the greedyspill.lua balancer check to see if it is the last MDS. Then just return instead of failing. I...
- 03:17 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- And the cpu use on MDS0 stays at +/- 250%.
- 03:16 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- This error shouldn't be an expected occurrence. I'll create a fix for this.
- 03:15 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- I see this in the logs...
- 03:12 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Nope. Is your load on mds.0? If so, and it gets heavily loaded, and if mds.1 has load = 0, then I expect the balan...
- 03:09 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Did you modify the greedyspill.lua script at all?
- 02:51 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- BTW, you need to set debug_mds_balancer = 2 to see the balancer working.
- 02:49 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Yes. For example, when I have 50 clients untarring the linux kernel into unique directories, the load is moved around.
- 02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Dan, do you see any evidence of actual load balancing?
- 02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- I also see this error. I have 2 active/active MDSes. The first shows no errors; the second shows the errors above. No...
- 11:56 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
- Ahh, it's even documented:...
- 11:55 AM Bug #19589 (Resolved): greedyspill.lua: :18: attempt to index a nil value (field '?')
- The included greedyspill.lua doesn't seem to work in a simple 3-active MDS scenario....
- 01:58 PM Bug #19593 (Resolved): purge queue and standby replay mds
- The current code opens the purge queue when the mds starts standby replay. The purge queue can have changed when the standby repl...
- 12:46 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- John Spray wrote:
> Intuitively I would say that things we expose as xattrs should probably bump ctime when they're ...
- 12:30 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- Intuitively I would say that things we expose as xattrs should probably bump ctime when they're set, if we're being c...
- 11:54 AM Bug #19388 (Closed): mount.ceph does not accept -s option
- 11:43 AM Bug #18578 (Resolved): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
- 11:43 AM Backport #18707 (Resolved): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_non...
- 10:46 AM Bug #18461 (Resolved): failed to reconnect caps during snapshot tests
- 10:46 AM Backport #18678 (Resolved): kraken: failed to reconnect caps during snapshot tests
- 10:24 AM Bug #19205 (Resolved): Invalid error code returned by MDS is causing a kernel client WARNING
- 10:24 AM Backport #19206 (Resolved): jewel: Invalid error code returned by MDS is causing a kernel client ...
04/11/2017
- 09:03 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Also dumped the mds cache (healthy cluster state 8 hours later).
Searched for the inode in the mds error log:...
- 01:54 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
- Got another error with 4.9.21 kernel.
On the cephfs node /sys/kernel/debug/ceph/xxx/:...
- 08:16 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- No, I mean the ctime. You only want to update the mtime if you're changing the directory's contents. I would consider...
- 07:55 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- > Should we be updating the ctime and change_attr in the ceph.dir.layout case as well?
Did you mean mtime?
I th...
- 07:32 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
- I think you're right -- well spotted! In particular, we need to bump it in the case where we update the ctime (ceph.f...
- 07:21 PM Bug #19583 (Resolved): mds: change_attr not inc in Server::handle_set_vxattr
- Noticed this was missing; was this intentional?
- 07:22 PM Bug #19388: mount.ceph does not accept -s option
- Looking at the complete picture, I think it's easiest (and acceptable for me) to wait for Jessie's successor, Stretch...
- 06:53 AM Feature #19578: mds: optimize CDir::_omap_commit() and CDir::_committed() for large directory
- We can track dirty dentries in a dirty list; CDir::_omap_commit() and CDir::_committed() then only need to check dentries...
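A minimal sketch of that bookkeeping with stand-in types (the real change would hang one of Ceph's intrusive lists off CDentry/CDir rather than use std::list):

    #include <list>
    #include <string>
    #include <unordered_map>

    struct Dentry {
        bool dirty = false;
        std::list<Dentry*>::iterator dirty_pos;  /* valid only while dirty */
    };

    struct DirFrag {
        std::unordered_map<std::string, Dentry> items;  /* every dentry */
        std::list<Dentry*> dirty_dentries;              /* dirty ones only */

        void mark_dirty(Dentry& dn) {
            if (!dn.dirty) {
                dn.dirty = true;
                dn.dirty_pos = dirty_dentries.insert(dirty_dentries.end(), &dn);
            }
        }

        /* The commit path walks only dirty_dentries, so its cost scales
           with the number of dirty entries, not with the dirfrag size. */
        void commit() {
            for (Dentry* dn : dirty_dentries)
                dn->dirty = false;  /* real code writes the omap keys here */
            dirty_dentries.clear();
        }
    };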
- 06:51 AM Feature #19578 (Resolved): mds: optimize CDir::_omap_commit() and CDir::_committed() for large di...
- CDir::_omap_commit() and CDir::_committed() need to traverse the whole dirfrag to find dirty dentries. It's not efficienc...
- 03:38 AM Bug #19450 (Fix Under Review): PurgeQueue read journal crash
- https://github.com/ceph/ceph/pull/14447
- 02:18 AM Bug #19450: PurgeQueue read journal crash
- The mds always first reads all entries in the mds log, then writes new entries to it. The partial entry is detected and dropped...
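As a generic illustration of that replay rule (not the actual JournalStream layout), with a toy length-prefixed format: scan from the front, keep the longest prefix of complete entries, and resume appending there:

    #include <cstdint>
    #include <cstring>
    #include <vector>

    /* Toy format: each entry is [u32 length][length bytes of payload]. */
    size_t valid_journal_prefix(const std::vector<uint8_t>& buf) {
        size_t pos = 0;
        while (buf.size() - pos >= sizeof(uint32_t)) {
            uint32_t len;
            std::memcpy(&len, buf.data() + pos, sizeof(len));
            if (buf.size() - pos - sizeof(len) < len)
                break;  /* torn tail: header present, payload truncated */
            pos += sizeof(len) + len;
        }
        return pos;  /* new entries are appended from this offset */
    }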