Activity

From 04/11/2017 to 05/10/2017

05/10/2017

01:25 PM Bug #19903 (Resolved): LibCephFS.ClearSetuid fails
all libcephfs/test.sh failures in http://pulpito.ceph.com/pdonnell-2017-05-10_02:32:15-fs-wip-pdonnell-integration-di... Zheng Yan
08:59 AM Bug #19891 (Fix Under Review): Test failure: test_full_different_file
https://github.com/ceph/ceph/pull/15026/ Zheng Yan
08:12 AM Bug #19891: Test failure: test_full_different_file
The mds didn't get osd op replies for purge queue log prezero operations. It seems the osd dropped these requests... Zheng Yan
02:22 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
Zheng Yan
02:21 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
Seems like a test case issue. The asok command can take several seconds, which is enough time to delete all files. Zheng Yan
01:02 AM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
Patrick Donnelly
12:35 AM Bug #19890 (Fix Under Review): src/test/pybind/test_cephfs.py fails
Zheng Yan
12:34 AM Bug #19890: src/test/pybind/test_cephfs.py fails
https://github.com/ceph/ceph/pull/15018 Zheng Yan

05/09/2017

09:47 PM Bug #19896: client: test failure for O_RDWR file open
This problem also appears to be affecting a few other tests:... Patrick Donnelly
09:42 PM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
We have a test failure in test_cephfs.test_open (and test_cephfs.test_mount_unmount): http://pulpito.ceph.com/pdonnel... Patrick Donnelly
01:37 PM Bug #19450 (Resolved): PurgeQueue read journal crash
Zheng Yan
01:23 PM Bug #19893 (Resolved): test_rebuild_simple_altpool fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:25:02-kcephfs-master-testing-basic-smithi/1113182/ Zheng Yan
01:15 PM Bug #19892 (Resolved): Test failure: test_purge_queue_op_rate fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:25:02-kcephfs-master-testing-basic-smithi/1107011/ Zheng Yan
01:05 PM Bug #19891 (Resolved): Test failure: test_full_different_file
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:15:02-fs-master---basic-smithi/1106624/ Zheng Yan
12:52 PM Bug #19890 (Resolved): src/test/pybind/test_cephfs.py fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:15:05-fs-master---basic-smithi/1113004/teuthology.log
...
Zheng Yan
02:49 AM Bug #19426 (Can't reproduce): knfs blogbench hang
Zheng Yan

05/08/2017

03:24 PM Backport #19846 (In Progress): jewel: write to cephfs mount hangs, ceph-fuse and kernel
Jan Fajerski
02:00 PM Backport #19845 (In Progress): kraken: write to cephfs mount hangs, ceph-fuse and kernel
Jan Fajerski
09:46 AM Bug #19426: knfs blogbench hang
Jeff Layton wrote:
> Sorry I didn't see this sooner. Is this still cropping up?
>
> So what might be helpful the ...
Zheng Yan
09:38 AM Bug #19426: knfs blogbench hang
It seems the crash no longer happens after rebasing the testing branch against the 4.11 kernel. Zheng Yan
09:25 AM Bug #19828 (Fix Under Review): mds: valgrind InvalidRead detected in Locker
https://github.com/ceph/ceph/pull/14991 Zheng Yan
03:27 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
What does 'written in part' mean? The application wrote ~23G and failed to write the rest, or the application wrote 26G but the ... Zheng Yan
02:59 AM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
"ceph: try getting buffer capability for readahead/fadvise" has backported into 4.9.x Zheng Yan

05/04/2017

07:06 PM Feature #19862 (New): mds: add LTTnG tracepoints for each type of MDS operation
It would be nice to know the latency of different file system operations so we can see which operations:
- scale poo...
Michael Sevilla
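For illustration only (this is not Ceph's actual tracing code; the provider name mds_ops and event op_complete are hypothetical), an LTTng-UST tracepoint carrying the operation type and its latency could be declared roughly like this:

    // tp_mds_ops.h -- hypothetical LTTng-UST tracepoint provider header.
    // One translation unit must define TRACEPOINT_DEFINE and
    // TRACEPOINT_CREATE_PROBES before including this header.
    #undef TRACEPOINT_PROVIDER
    #define TRACEPOINT_PROVIDER mds_ops

    #undef TRACEPOINT_INCLUDE
    #define TRACEPOINT_INCLUDE "./tp_mds_ops.h"

    #if !defined(TP_MDS_OPS_H) || defined(TRACEPOINT_HEADER_MULTI_READ)
    #define TP_MDS_OPS_H

    #include <stdint.h>
    #include <lttng/tracepoint.h>

    // One event per completed MDS operation: records the op type and latency.
    TRACEPOINT_EVENT(
        mds_ops,
        op_complete,
        TP_ARGS(const char *, op_name, uint64_t, latency_us),
        TP_FIELDS(
            ctf_string(op, op_name)
            ctf_integer(uint64_t, latency_us, latency_us)
        )
    )

    #endif /* TP_MDS_OPS_H */

    #include <lttng/tracepoint-event.h>

At the end of each operation the server would call tracepoint(mds_ops, op_complete, "create", elapsed_us), and the events can be captured with: lttng enable-event --userspace 'mds_ops:*'.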
09:47 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Uploaded client log file. junming rao
09:41 AM Bug #19854 (Duplicate): ceph-fuse write a big file,The file is only written in part
The application wrote a big file (26GB) to cephfs; the file was only written in part (23GB).
ceph version: 10.2.6 (656b5b6...
junming rao

05/03/2017

08:20 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
https://github.com/ceph/ceph/pull/15000 Nathan Cutler
08:20 PM Backport #19845 (Resolved): kraken: write to cephfs mount hangs, ceph-fuse and kernel
https://github.com/ceph/ceph/pull/14998 Nathan Cutler
05:46 PM Feature #19362: mds: add perf counters for each type of MDS operation
https://github.com/ceph/ceph/pull/14938 Michael Sevilla
03:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Zheng Yan wrote:
> elder one wrote:
> > One difference I noticed between 4.4 and 4.9 kernels
> > - with 4.4 kerne...
elder one
02:27 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> Just reporting that no errors after using patched (commit 2b1ac852) cephfs kernel module.
>
>...
Zheng Yan
02:06 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> One difference I noticed between 4.4 and 4.9 kernels
> - with 4.4 kernel on cephfs directory si...
Zheng Yan
11:04 AM Backport #18699 (Resolved): jewel: client: fix the cross-quota rename boundary check conditions
Nathan Cutler
07:47 AM Bug #18872 (Pending Backport): write to cephfs mount hangs, ceph-fuse and kernel
Zheng Yan
05:25 AM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
This is pretty infeasible right now -- nothing tracks which bits of a file are allocated or exist. The closest it com... Greg Farnum
03:27 AM Bug #19828 (Resolved): mds: valgrind InvalidRead detected in Locker
... Patrick Donnelly
01:55 AM Bug #17819 (Can't reproduce): MDS crashed while performing snapshot creation and deletion in a loop
Zheng Yan
01:53 AM Bug #17819: MDS crashed while performing snapshot creation and deletion in a loop
Ran the test overnight; can't reproduce the crash. Zheng Yan

05/02/2017

09:40 AM Bug #17408 (Can't reproduce): Possible un-needed wait on rstats when listing dir?
Can't reproduce. Probably fixed by ... Zheng Yan
09:21 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Just reporting that there are no errors after using the patched (commit 2b1ac852) cephfs kernel module.
Also upgraded ceph to 1...
elder one
08:29 AM Feature #19820 (Fix Under Review): ceph-fuse: use userspace permission check by default
https://github.com/ceph/ceph/pull/14907 Zheng Yan
07:39 AM Feature #19820 (Resolved): ceph-fuse: use userspace permission check by default
Zheng Yan
07:32 AM Bug #19812: client: not swapping directory caps efficiently leads to very slow create chains
The reason for slow creation/deletion is that ceph-fuse sends a getattr request (to check permission of the test directory) b... Zheng Yan
06:17 AM Feature #19819 (Rejected): Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
The purpose of this is to have a standard interface for examining allocated extents of files. For example, xfs_i... Марк Коренберг
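For reference, the standard interface the reporter refers to is the generic Linux FIEMAP ioctl. A minimal sketch of how an application queries extents with it on any filesystem that supports it (generic Linux API usage, not CephFS-specific code):

    // fiemap_dump.cc -- dump allocated extents of a file via FS_IOC_FIEMAP
    #include <linux/fiemap.h>
    #include <linux/fs.h>
    #include <sys/ioctl.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>
    #include <cstdlib>

    int main(int argc, char **argv) {
        if (argc != 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        const unsigned max_extents = 32;
        size_t sz = sizeof(struct fiemap) + max_extents * sizeof(struct fiemap_extent);
        struct fiemap *fm = (struct fiemap *)calloc(1, sz);
        fm->fm_start = 0;
        fm->fm_length = FIEMAP_MAX_OFFSET;   // map the whole file
        fm->fm_extent_count = max_extents;

        if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("FS_IOC_FIEMAP"); return 1; }

        for (unsigned i = 0; i < fm->fm_mapped_extents; i++) {
            struct fiemap_extent *e = &fm->fm_extents[i];
            printf("extent %u: logical=%llu len=%llu flags=0x%x\n",
                   i, (unsigned long long)e->fe_logical,
                   (unsigned long long)e->fe_length, e->fe_flags);
        }
        free(fm);
        close(fd);
        return 0;
    }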

04/28/2017

10:09 PM Bug #19812 (New): client: not swapping directory caps efficiently leads to very slow create chains
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg34200.html
In short: if you have a ceph-fuse and a kerne...
Greg Farnum

04/27/2017

07:27 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
PR: https://github.com/ceph/ceph/pull/14822 Jan Fajerski
03:21 AM Bug #19789 (New): FAIL: test_evict_client (tasks.cephfs.test_misc.TestMisc)
http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-24_03:15:02-fs-master---basic-smithi/ Zheng Yan

04/25/2017

09:03 AM Bug #19755 (Fix Under Review): MDS became unresponsive when truncating a very large file
https://github.com/ceph/ceph/pull/14769 Zheng Yan
03:39 AM Bug #19755 (Resolved): MDS became unresponsive when truncating a very large file
We were trying to copy a very large file (7TB exactly) between two directories through Samba/CephFS, and cancelled it... Sandy Xu
06:52 AM Backport #19763 (Resolved): kraken: non-local cephfs quota changes not visible until some IO is done
https://github.com/ceph/ceph/pull/16108 Nathan Cutler
06:52 AM Backport #19762 (Resolved): jewel: non-local cephfs quota changes not visible until some IO is done
https://github.com/ceph/ceph/pull/15466 Nathan Cutler

04/24/2017

09:14 PM Bug #19583 (Resolved): mds: change_attr not inc in Server::handle_set_vxattr
John Spray
09:13 PM Fix #19691 (Resolved): Remove journaler_allow_split_entries option
John Spray
09:13 PM Bug #18816 (Resolved): MDS crashes with log disabled
John Spray
09:12 PM Bug #19306: fs: mount NFS to cephfs, and then ls a directory containing a large number of files, ...
The userspace piece (https://github.com/ceph/ceph/pull/14317) has merged.
Zheng: please resolve the ticket when th...
John Spray
09:11 PM Feature #17855 (Resolved): Don't evict a slow client if it's the only client
John Spray
01:43 PM Bug #19706 (In Progress): Laggy mon daemons causing MDS failover (symptom: failed to set counters...
We should disable this check for the misc workunit; it was mainly intended for the workunits that have more sustained lo... John Spray
12:27 PM Feature #18425 (Resolved): mds: add the option to use tcmalloc directly
John Spray
10:13 AM Bug #17939 (Pending Backport): non-local cephfs quota changes not visible until some IO is done
John Spray
07:08 AM Support #16884 (Closed): rename() doesn't work between directories
Zheng Yan
07:02 AM Support #16884: rename() doesn't work between directories
Zheng Yan
03:01 AM Bug #19635 (Fix Under Review): Deadlock on two ceph-fuse clients accessing the same file
https://github.com/ceph/ceph/pull/14743 Zheng Yan

04/22/2017

02:10 AM Bug #19583 (Fix Under Review): mds: change_attr not inc in Server::handle_set_vxattr
https://github.com/ceph/ceph/pull/14726 Patrick Donnelly

04/21/2017

03:41 AM Bug #19734 (Resolved): mds: subsystems like ceph_subsys_mds_balancer do not log correctly
Our logging for subsystems is relying on a removed configuration macro:... Patrick Donnelly
03:36 AM Bug #19589 (Fix Under Review): greedyspill.lua: :18: attempt to index a nil value (field '?')
https://github.com/ceph/ceph/pull/14704 Patrick Donnelly

04/20/2017

09:54 PM Backport #19709 (In Progress): jewel: Enable MDS to start when session ino info is corrupt
Nathan Cutler
12:13 PM Backport #19709 (Resolved): jewel: Enable MDS to start when session ino info is corrupt
https://github.com/ceph/ceph/pull/14700 Nathan Cutler
09:49 PM Backport #19679 (In Progress): jewel: MDS: damage reporting by ino number is useless
Nathan Cutler
09:29 PM Backport #19677 (In Progress): jewel: Jewel ceph-fuse does not recover after lost connection to MDS
Nathan Cutler
05:55 PM Backport #19466 (Need More Info): jewel: mds: log rotation doesn't work if mds has respawned
Needs 3ba63063 and 1fb15a21.
Of these two, 3ba63063 is non-trivial.
Nathan Cutler
11:32 AM Backport #19466 (In Progress): jewel: mds: log rotation doesn't work if mds has respawned
Nathan Cutler
05:06 PM Backport #19620 (Resolved): kraken: MDS server crashes due to inconsistent metadata.
Nathan Cutler
05:06 PM Backport #19483 (Resolved): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" co...
Nathan Cutler
05:05 PM Backport #19335 (Resolved): kraken: MDS heartbeat timeout during rejoin, when working with large ...
Nathan Cutler
05:04 PM Backport #19045 (Resolved): kraken: buffer overflow in test LibCephFS.DirLs
Nathan Cutler
05:03 PM Backport #18950 (Resolved): kraken: mds/StrayManager: avoid reusing deleted inode in StrayManager...
Nathan Cutler
05:02 PM Backport #18899 (Resolved): kraken: Test failure: test_open_inode
Nathan Cutler
05:01 PM Backport #18706 (Resolved): kraken: fragment space check can cause replayed request fail
Nathan Cutler
04:59 PM Backport #18700 (Resolved): kraken: client: fix the cross-quota rename boundary check conditions
Nathan Cutler
04:58 PM Bug #18306 (Resolved): segfault in handle_client_caps
Nathan Cutler
04:58 PM Backport #18616 (Resolved): kraken: segfault in handle_client_caps
Nathan Cutler
04:57 PM Bug #18179 (Resolved): MDS crashes on missing metadata object
Nathan Cutler
04:57 PM Backport #18566 (Resolved): kraken: MDS crashes on missing metadata object
Nathan Cutler
04:56 PM Bug #18396 (Resolved): Test Failure: kcephfs test_client_recovery.TestClientRecovery
Nathan Cutler
07:57 AM Bug #18396: Test Failure: kcephfs test_client_recovery.TestClientRecovery
http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-13_05:20:02-kcephfs-kraken-testing-basic-smithi/1019312/ Zheng Yan
04:56 PM Backport #18562 (Resolved): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
Nathan Cutler
04:55 PM Bug #18460 (Resolved): ceph-fuse crash during snapshot tests
Nathan Cutler
04:55 PM Backport #18552 (Resolved): kraken: ceph-fuse crash during snapshot tests
Nathan Cutler
02:53 PM Bug #19707 (Duplicate): Hadoop tests fail due to missing upstream tarball
Indeed it is -- I hadn't seen that other ticket because it was in the wrong project. John Spray
02:04 PM Bug #19707: Hadoop tests fail due to missing upstream tarball
Dup of #19456? Ken Dreyer
09:57 AM Bug #19707 (Duplicate): Hadoop tests fail due to missing upstream tarball
http://pulpito.ceph.com/teuthology-2017-04-03_03:45:03-hadoop-master---basic-mira/... John Spray
01:49 PM Bug #19712 (New): some kcephfs tests become very slow
http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-16_04:20:02-kcephfs-jewel-testing-basic-smithi/
http://pulpit...
Zheng Yan
01:40 PM Backport #19675 (In Progress): jewel: cephfs: Test failure: test_data_isolated (tasks.cephfs.test...
Nathan Cutler
01:24 PM Backport #19673 (In Progress): jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv...
Nathan Cutler
01:22 PM Backport #19671 (In Progress): jewel: MDS assert failed when shutting down
Nathan Cutler
01:15 PM Backport #19668 (In Progress): jewel: MDS goes readonly writing backtrace for a file whose data p...
Nathan Cutler
01:04 PM Bug #18872 (Fix Under Review): write to cephfs mount hangs, ceph-fuse and kernel
Turns out this is an issue of ceph leaking arch-dependent flags on the wire. See the kernel ml [PATCH] ceph: Fix file ope... Jan Fajerski
12:13 PM Backport #19710 (Resolved): kraken: Enable MDS to start when session ino info is corrupt
https://github.com/ceph/ceph/pull/16107 Nathan Cutler
12:08 PM Backport #19666 (In Progress): jewel: fs:The mount point break off when mds switch hanppened.
Nathan Cutler
11:42 AM Backport #19665 (In Progress): jewel: C_MDSInternalNoop::complete doesn't free itself
Nathan Cutler
11:37 AM Backport #19619 (In Progress): jewel: MDS server crashes due to inconsistent metadata.
Nathan Cutler
11:34 AM Backport #19482 (In Progress): jewel: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" ...
Nathan Cutler
10:57 AM Backport #19334 (In Progress): jewel: MDS heartbeat timeout during rejoin, when working with larg...
Nathan Cutler
10:55 AM Backport #19044 (In Progress): jewel: buffer overflow in test LibCephFS.DirLs
Nathan Cutler
10:54 AM Backport #18949 (In Progress): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManag...
Nathan Cutler
10:49 AM Backport #18900 (In Progress): jewel: Test failure: test_open_inode
Nathan Cutler
10:47 AM Backport #18705 (In Progress): jewel: fragment space check can cause replayed request fail
Nathan Cutler
10:44 AM Backport #18699 (In Progress): jewel: client: fix the cross-quota rename boundary check conditions
Nathan Cutler
10:22 AM Bug #16842: mds: replacement MDS crashes on InoTable release
Created ticket for the workaround and marked it for backport here: http://tracker.ceph.com/issues/19708
I should h...
John Spray
10:21 AM Fix #19708 (Resolved): Enable MDS to start when session ino info is corrupt

This was a mitigation for issue #16842, which is itself a mystery.
Creating ticket to backport it.
Fix on mas...
John Spray
09:03 AM Bug #18816 (Fix Under Review): MDS crashes with log disabled
I'm proposing that we rip out this configuration option; it's a trap for the unwary:
https://github.com/ceph/ceph/pu...
John Spray
08:30 AM Bug #19706 (Can't reproduce): Laggy mon daemons causing MDS failover (symptom: failed to set coun...
http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-15_03:15:10-fs-master---basic-smithi/1027137/ Zheng Yan

04/19/2017

10:24 PM Feature #17834 (Fix Under Review): MDS Balancer overrides
https://github.com/ceph/ceph/pull/14598 Patrick Donnelly
04:02 PM Bug #15467 (Won't Fix): After "mount -l", ceph-fuse does not work
This appears to have happened on pre-jewel code, so it's unlikely anyone is interested in investigating. John Spray
10:11 AM Fix #19691 (Fix Under Review): Remove journaler_allow_split_entries option
https://github.com/ceph/ceph/pull/14636 John Spray
10:06 AM Fix #19691 (Resolved): Remove journaler_allow_split_entries option
This has been broken in practice since at least when the MDS journal format (JournalStream etc) was changed, as the r... John Spray

04/18/2017

08:38 PM Feature #17980 (Fix Under Review): MDS should reject connections from OSD-blacklisted clients
https://github.com/ceph/ceph/pull/14610 John Spray
08:38 PM Feature #9754 (Fix Under Review): A 'fence and evict' client eviction command
See #17980 patch John Spray
07:39 PM Backport #19680 (Resolved): kraken: MDS: damage reporting by ino number is useless
https://github.com/ceph/ceph/pull/16106 Nathan Cutler
07:39 PM Backport #19679 (Resolved): jewel: MDS: damage reporting by ino number is useless
https://github.com/ceph/ceph/pull/14699 Nathan Cutler
07:38 PM Backport #19678 (Resolved): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
https://github.com/ceph/ceph/pull/16105 Nathan Cutler
07:38 PM Backport #19677 (Resolved): jewel: Jewel ceph-fuse does not recover after lost connection to MDS
https://github.com/ceph/ceph/pull/14698 Nathan Cutler
07:38 PM Backport #19676 (Resolved): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_v...
https://github.com/ceph/ceph/pull/16104 Nathan Cutler
07:38 PM Backport #19675 (Resolved): jewel: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_vo...
https://github.com/ceph/ceph/pull/14685 Nathan Cutler
07:38 PM Backport #19674 (Resolved): kraken: cephfs: mds is crushed, after I set about 400 64KB xattr kv p...
https://github.com/ceph/ceph/pull/16103 Nathan Cutler
07:38 PM Backport #19673 (Resolved): jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pa...
https://github.com/ceph/ceph/pull/14684 Nathan Cutler
07:37 PM Backport #19672 (Resolved): kraken: MDS assert failed when shutting down
https://github.com/ceph/ceph/pull/16102 Nathan Cutler
07:37 PM Backport #19671 (Resolved): jewel: MDS assert failed when shutting down
https://github.com/ceph/ceph/pull/14683 Nathan Cutler
07:37 PM Backport #19669 (Resolved): kraken: MDS goes readonly writing backtrace for a file whose data poo...
https://github.com/ceph/ceph/pull/16101 Nathan Cutler
07:37 PM Backport #19668 (Resolved): jewel: MDS goes readonly writing backtrace for a file whose data pool...
https://github.com/ceph/ceph/pull/14682 Nathan Cutler
07:37 PM Backport #19667 (Resolved): kraken: fs:The mount point break off when mds switch hanppened.
https://github.com/ceph/ceph/pull/16100 Nathan Cutler
07:37 PM Backport #19666 (Resolved): jewel: fs:The mount point break off when mds switch hanppened.
https://github.com/ceph/ceph/pull/14679 Nathan Cutler
07:37 PM Backport #19665 (Resolved): jewel: C_MDSInternalNoop::complete doesn't free itself
https://github.com/ceph/ceph/pull/14677 Nathan Cutler
07:37 PM Backport #19664 (Resolved): kraken: C_MDSInternalNoop::complete doesn't free itself
https://github.com/ceph/ceph/pull/16099 Nathan Cutler
04:20 PM Bug #16842: mds: replacement MDS crashes on InoTable release
backport? c sights
01:29 PM Feature #10792 (New): qa: enable thrasher for MDS cluster size (vary max_mds)
The thrasher exists, this ticket is now for switching it on and getting the resulting runs green. John Spray
01:28 PM Feature #15068 (Resolved): fsck: multifs: enable repair tools to read from one filesystem and wri...
John Spray
01:28 PM Feature #15069 (Resolved): MDS: multifs: enable two filesystems to point to same pools if one of ...
John Spray
01:28 PM Fix #15134 (New): multifs: test case exercising mds_thrash for multiple filesystems
(tweaking ticket to be for creating a test case that uses the new/smarter thrashing code in a multi-fs way) John Spray
01:26 PM Bug #18579 (Resolved): Fuse client has "opening" session to nonexistent MDS rank after MDS cluste...
John Spray
01:26 PM Bug #18914 (Pending Backport): cephfs: Test failure: test_data_isolated (tasks.cephfs.test_volume...
John Spray
01:26 PM Feature #19075 (Resolved): Extend 'p' mds auth cap to cover quotas and all layout fields
John Spray
01:24 PM Bug #19501 (Pending Backport): C_MDSInternalNoop::complete doesn't free itself
John Spray
01:23 PM Bug #19566 (Resolved): MDS crash on mgr message during shutdown
John Spray
12:51 PM Bug #19640 (Resolved): ceph-fuse should only return writeback errors once per file handle
John Spray
11:49 AM Feature #18509 (Pending Backport): MDS: damage reporting by ino number is useless
John Spray

04/17/2017

02:08 PM Bug #19426: knfs blogbench hang
Sorry I didn't see this sooner. Is this still cropping up?
So what might be helpful the next time this happens loc...
Jeff Layton
01:17 PM Bug #19640 (Fix Under Review): ceph-fuse should only return writeback errors once per file handle
https://github.com/ceph/ceph/pull/14589 John Spray
01:16 PM Bug #19640 (Resolved): ceph-fuse should only return writeback errors once per file handle
Currently if someone fsyncs and sees the error, we give them the same error again on fclose.
We should match the n...
John Spray
11:13 AM Bug #19635 (In Progress): Deadlock on two ceph-fuse clients accessing the same file
This bug happens in the following sequence of events:
- Request1 (from client1) creates file1 (mds issues caps Asx to cl...
Zheng Yan
12:15 AM Bug #19635: Deadlock on two ceph-fuse clients accessing the same file
Those requests are getting hung up on the iauth and ixattr locks on the inode for the ".syn" file the test script cre... John Spray

04/16/2017

11:02 PM Bug #19635: Deadlock on two ceph-fuse clients accessing the same file
I was wondering if d463107473 ("mds: finish lock waiters in the same order that they were added.") could have been th... John Spray
09:48 PM Bug #19635 (Resolved): Deadlock on two ceph-fuse clients accessing the same file
See Dan's reproducer script, and thread "[ceph-users] fsping, why you no work no mo?"
https://raw.githubusercontent....
John Spray
01:19 PM Support #16738 (Closed): mount.ceph: unknown mount options: rbytes and norbytes
John Spray

04/15/2017

06:47 PM Bug #19437 (Pending Backport): fs:The mount point break off when mds switch hanppened.
John Spray
06:46 PM Bug #19033 (Pending Backport): cephfs: mds is crushed, after I set about 400 64KB xattr kv pairs ...
John Spray
06:45 PM Bug #18757 (Pending Backport): Jewel ceph-fuse does not recover after lost connection to MDS
Let's backport this for the benefit of people running cephfs today John Spray
06:41 PM Bug #19401 (Pending Backport): MDS goes readonly writing backtrace for a file whose data pool has...
John Spray
03:21 PM Support #16738: mount.ceph: unknown mount options: rbytes and norbytes
This issue can be closed. I have not experienced this issue anymore, and the original systems the issue was found on are n... Alexander Trost
11:15 AM Bug #19022 (Resolved): Crash in Client::queue_cap_snap when thrashing
John Spray

04/14/2017

10:00 PM Backport #19620 (In Progress): kraken: MDS server crashes due to inconsistent metadata.
Nathan Cutler
09:59 PM Bug #19406: MDS server crashes due to inconsistent metadata.
*master PR*: https://github.com/ceph/ceph/pull/14234 Nathan Cutler
09:57 PM Backport #19483 (In Progress): kraken: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it"...
Nathan Cutler
09:55 PM Backport #19335 (In Progress): kraken: MDS heartbeat timeout during rejoin, when working with lar...
Nathan Cutler
09:54 PM Backport #19045 (In Progress): kraken: buffer overflow in test LibCephFS.DirLs
Nathan Cutler
09:52 PM Backport #18950 (In Progress): kraken: mds/StrayManager: avoid reusing deleted inode in StrayMana...
Nathan Cutler
09:49 PM Backport #18899 (In Progress): kraken: Test failure: test_open_inode
Nathan Cutler
09:48 PM Backport #18706 (In Progress): kraken: fragment space check can cause replayed request fail
Nathan Cutler
09:46 PM Backport #18700 (In Progress): kraken: client: fix the cross-quota rename boundary check conditions
Nathan Cutler
09:44 PM Backport #18616 (In Progress): kraken: segfault in handle_client_caps
Nathan Cutler
09:43 PM Backport #18566 (In Progress): kraken: MDS crashes on missing metadata object
Nathan Cutler
09:42 PM Backport #18562 (In Progress): kraken: Test Failure: kcephfs test_client_recovery.TestClientRecovery
Nathan Cutler
09:40 PM Backport #18552 (In Progress): kraken: ceph-fuse crash during snapshot tests
Nathan Cutler
09:39 PM Bug #18166 (Resolved): monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE...
Nathan Cutler
09:39 PM Bug #18166: monitor cannot start because of "FAILED assert(info.state == MDSMap::STATE_STANDBY)"
kraken backport is unnecessary (fix already in v11.2.0) Nathan Cutler
09:39 PM Backport #18283 (Resolved): kraken: monitor cannot start because of "FAILED assert(info.state == ...
Already included in v11.2.0... Nathan Cutler
08:07 PM Support #16738: mount.ceph: unknown mount options: rbytes and norbytes
Ceph: v10.2.7
Linux Kernel: 4.9.21-040921-generic on Ubuntu 16.04.1
The "rbytes" option in fstab seems to be work...
Fred Drake
01:42 PM Bug #16886 (Can't reproduce): multimds: kclient hang (?) in tests
Zheng Yan
12:50 PM Bug #19630 (Fix Under Review): StrayManager::num_stray is inaccurate
https://github.com/ceph/ceph/pull/14554 Zheng Yan
09:26 AM Bug #19630 (Resolved): StrayManager::num_stray is inaccurate
Zheng Yan
09:50 AM Bug #19204 (Pending Backport): MDS assert failed when shutting down
John Spray
09:48 AM Feature #19551 (Resolved): CephFS MDS health messages should be logged in the cluster log
John Spray
08:28 AM Bug #18680 (Fix Under Review): multimds: cluster can assign active mds beyond max_mds during fail...
Zheng Yan
08:27 AM Bug #18680: multimds: cluster can assign active mds beyond max_mds during failures
commit "mon/MDSMonitor: only allow deactivating the mds with max rank" in https://github.com/ceph/ceph/pull/14550 sho... Zheng Yan
07:59 AM Bug #18755: multimds: MDCache.cc: 4735: FAILED assert(in)
by commit a1499bc4 (mds: stop purging strays when mds is being shutdown) Zheng Yan
07:56 AM Bug #18755 (Resolved): multimds: MDCache.cc: 4735: FAILED assert(in)
Zheng Yan
07:58 AM Bug #18754 (Resolved): multimds: MDCache.cc: 8569: FAILED assert(!info.ancestors.empty())
commit 20d43372 (mds: drop superfluous MMDSOpenInoReply) Zheng Yan
07:55 AM Bug #19239 (Fix Under Review): mds: stray count remains static after workflows complete
should be fixed by commits in https://github.com/ceph/ceph/pull/14550... Zheng Yan

04/13/2017

04:23 PM Bug #19395 (Fix Under Review): "Too many inodes in cache" warning can happen even when trimming i...
The bad state is long gone, so I'm just going to change this ticket to fixing the weird case where we were getting a ... John Spray
03:39 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Yeah, you're right Darrell, operator error :) I guess the VM did not have the correct storage network attached when I tried t... elder one
02:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
FYI, I don't think the module signature stuff is an issue. It's just notifying you that you've loaded a module that d... Darrell Enns
12:23 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
One difference I noticed between 4.4 and 4.9 kernels
- with 4.4 kernel on cephfs directory sizes (total bytes of al...
elder one
08:04 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Yeah, did that.
My test server with a compiled 4.9 kernel and patched ceph module mounted cephfs just fine.
It might...
elder one
02:15 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not...
Zheng Yan
03:02 PM Bug #19566 (Fix Under Review): MDS crash on mgr message during shutdown
https://github.com/ceph/ceph/pull/14505 John Spray
02:24 PM Bug #19566 (In Progress): MDS crash on mgr message during shutdown
John Spray
02:31 PM Backport #19620 (Resolved): kraken: MDS server crashes due to inconsistent metadata.
https://github.com/ceph/ceph/pull/14574 Nathan Cutler
02:31 PM Backport #19619 (Resolved): jewel: MDS server crashes due to inconsistent metadata.
https://github.com/ceph/ceph/pull/14676 Nathan Cutler
12:03 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
Patrick was going to spin up a patch for this, so reassigning to him. Patrick, if I have that wrong, then please just... Jeff Layton
11:07 AM Bug #19406 (Pending Backport): MDS server crashes due to inconsistent metadata.
Marking as pending backport for the fix to data-scan which seems likely to be the underlying cause https://github.com... John Spray
08:13 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
Well, currently the last MDS fails over to the old balancer, so it can in fact shift its load back to the others acco... Dan van der Ster

04/12/2017

10:27 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Well, managed to patch ceph kernel module with 2b1ac852 commit, but my Ubuntu 4.9 kernel will not load unsigned modul... elder one
01:36 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
could you try adding commit 2b1ac852 (ceph: try getting buffer capability for readahead/fadvise) to your 4.9.x kernel... Zheng Yan
07:36 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
I do have mmap disabled in Dovecot conf.
Relevant bits from Dovecot conf:
mmap_disable = yes
mail_nfs_index = ...
elder one
02:17 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> Got another error with 4.9.21 kernel.
>
> On the cephfs node /sys/kernel/debug/ceph/xxx/:
>
...
Zheng Yan
08:29 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
We could make the greedyspill.lua balancer check to see if it is the last MDS. Then just return instead of failing. I... Michael Sevilla
03:17 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
and the cpu use on MDS0 stays at +/- 250% Mark Guz
03:16 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
This error shouldn't be an expected occurrence. I'll create a fix for this. Patrick Donnelly
03:15 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
I see this in the logs... Mark Guz
03:12 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
Nope. Is your load on mds.0? If yes, and it gets heavily loaded, and if mds.1 has load = 0, then I expect the balan... Dan van der Ster
03:09 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
did you modify the greedyspill.lua script at all? Mark Guz
02:51 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
BTW, you need to set debug_mds_balancer = 2 to see the balancer working. Dan van der Ster
02:49 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
Yes. For example, when I have 50 clients untarring the linux kernel into unique directories, the load is moved around. Dan van der Ster
02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
Dan, do you see any evidence of actual load balancing? Mark Guz
02:39 PM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
I also see this error. I have 2 active/active MDSes. The first shows no errors, the second shows the errors above. No... Mark Guz
11:56 AM Bug #19589: greedyspill.lua: :18: attempt to index a nil value (field '?')
Ahh, it's even documented:... Dan van der Ster
11:55 AM Bug #19589 (Resolved): greedyspill.lua: :18: attempt to index a nil value (field '?')
The included greedyspill.lua doesn't seem to work in a simple 3-active MDS scenario.... Dan van der Ster
01:58 PM Bug #19593 (Resolved): purge queue and standby replay mds
The current code opens the purge queue when the mds starts standby replay. The purge queue can have changed when the standby repl... Zheng Yan
12:46 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
John Spray wrote:
> Intuitively I would say that things we expose as xattrs should probably bump ctime when they're ...
Jeff Layton
12:30 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
Intuitively I would say that things we expose as xattrs should probably bump ctime when they're set, if we're being c... John Spray
11:54 AM Bug #19388 (Closed): mount.ceph does not accept -s option
John Spray
11:43 AM Bug #18578 (Resolved): failed filelock.can_read(-1) assertion in Server::_dir_is_nonempty
Nathan Cutler
11:43 AM Backport #18707 (Resolved): kraken: failed filelock.can_read(-1) assertion in Server::_dir_is_non...
Nathan Cutler
10:46 AM Bug #18461 (Resolved): failed to reconnect caps during snapshot tests
Nathan Cutler
10:46 AM Backport #18678 (Resolved): kraken: failed to reconnect caps during snapshot tests
Nathan Cutler
10:24 AM Bug #19205 (Resolved): Invalid error code returned by MDS is causing a kernel client WARNING
Nathan Cutler
10:24 AM Backport #19206 (Resolved): jewel: Invalid error code returned by MDS is causing a kernel client ...
Nathan Cutler

04/11/2017

09:03 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Also dumped the mds cache (healthy cluster state 8 hours later).
Searched for the inode in the mds error log:...
elder one
01:54 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Got another error with 4.9.21 kernel.
On the cephfs node /sys/kernel/debug/ceph/xxx/:...
elder one
08:16 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
No, I mean the ctime. You only want to update the mtime if you're changing the directory's contents. I would consider... Jeff Layton
07:55 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
> Should we be updating the ctime and change_attr in the ceph.dir.layout case as well?
Did you mean mtime?
I th...
Patrick Donnelly
07:32 PM Bug #19583: mds: change_attr not inc in Server::handle_set_vxattr
I think you're right -- well spotted! In particular, we need to bump it in the case where we update the ctime (ceph.f... Jeff Layton
07:21 PM Bug #19583 (Resolved): mds: change_attr not inc in Server::handle_set_vxattr
Noticed this was missing, was this intentional? Patrick Donnelly
07:22 PM Bug #19388: mount.ceph does not accept -s option
Looking at the complete picture, I think it's easiest (and acceptable for me) to wait for Jessie's successor, Stretch... Michel Roelofs
06:53 AM Feature #19578: mds: optimize CDir::_omap_commit() and CDir::_committed() for large directory
We can track dirty dentries in a dirty list; CDir::_omap_commit() and CDir::_committed() then only need to check dentries... Zheng Yan
06:51 AM Feature #19578 (Resolved): mds: optimize CDir::_omap_commit() and CDir::_committed() for large di...
CDir::_omap_commit() and CDir::_committed() need to traverse the whole dirfrag to find dirty dentries. It's not efficienc... Zheng Yan
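A minimal sketch of the dirty-list idea from the comment above (illustrative C++ only, not the actual CDir implementation): keep a side list of pointers to dirty dentries so that commit never has to scan the full fragment.

    #include <list>
    #include <map>
    #include <string>

    struct Dentry {
        std::string name;
        bool dirty = false;
    };

    class DirFragment {
        std::map<std::string, Dentry> items;   // all dentries in the fragment
        std::list<Dentry*> dirty_list;         // only the dentries that need committing
    public:
        Dentry& add(const std::string &name) { return items[name]; }

        void mark_dirty(Dentry &d) {
            if (!d.dirty) {                    // avoid duplicate list entries
                d.dirty = true;
                dirty_list.push_back(&d);
            }
        }

        // Commit walks only the dirty list, not the whole (possibly huge) fragment.
        template <typename WriteFn>
        void commit(WriteFn &&write_one) {
            for (Dentry *d : dirty_list) {
                write_one(*d);
                d->dirty = false;
            }
            dirty_list.clear();
        }
    };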
03:38 AM Bug #19450 (Fix Under Review): PurgeQueue read journal crash
https://github.com/ceph/ceph/pull/14447 Zheng Yan
02:18 AM Bug #19450: PurgeQueue read journal crash
The mds always first reads all entries in the mds log, then writes new entries to it. The partial entry is detected and dropped... Zheng Yan
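A rough illustration of the recovery behaviour described above, assuming simple length-prefixed entries rather than the real Journaler format: scan up to the last complete entry and ignore the trailing partial write before appending.

    #include <cstdint>
    #include <cstring>
    #include <vector>

    // Scan a journal buffer of [u32 length][payload] records and return the offset
    // just past the last complete entry; anything after it is a partial write.
    size_t last_complete_offset(const std::vector<uint8_t> &buf) {
        size_t off = 0;
        while (off + sizeof(uint32_t) <= buf.size()) {
            uint32_t len;
            std::memcpy(&len, buf.data() + off, sizeof(len));
            if (off + sizeof(len) + len > buf.size())
                break;                        // partial entry: stop here
            off += sizeof(len) + len;         // complete entry: keep scanning
        }
        return off;                           // caller truncates/ignores bytes past this point
    }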
 
