Activity

From 04/25/2017 to 05/24/2017

05/24/2017

03:51 PM Bug #20040 (Resolved): "terminate called after throwing an instance of 'std::out_of_range'" in po...
John Spray
03:49 PM Bug #19946 (Resolved): CInode.cc: 2481: FAILED assert(s == nested_auth_pins)
John Spray
03:42 PM Bug #19892 (Resolved): Test failure: test_purge_queue_op_rate fails
John Spray
03:35 PM Bug #19955: Too many stat ops when MDS trying to probe a large file
Zheng -- should we backport this? What will happen to old clients trying to access MDSs using the new behaviour? John Spray
02:26 PM Bug #20072 (Resolved): TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls results

It's possible this was always the case but we only just happened to see it affect a test?
The CephFS test TestSt...
John Spray
01:41 PM Bug #20060 (Fix Under Review): segmentation fault in _do_cap_update
Yes, seems to be. Douglas Fuller
08:50 AM Bug #20060: segmentation fault in _do_cap_update
Please use gdb to check which code triggered the segfault.
probably fixed by https://github.com/ceph/ceph/pull/15125/commits/...
Zheng Yan

05/23/2017

07:05 PM Bug #20060: segmentation fault in _do_cap_update
Full MDS log: https://www.dropbox.com/s/vfdqlqnyrjo4do6/mds.b.log.bz2 Douglas Fuller
06:58 PM Bug #20060 (Resolved): segmentation fault in _do_cap_update
... Douglas Fuller
07:01 PM Bug #19969 (Resolved): CDir.cc: 909: FAILED assert(get_num_ref() == (state_test(STATE_STICKY) ? 1...
Douglas Fuller
05:14 PM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
The issue is that we're getting nice regular tick() calls even when we're not readable, so the code in ::tick to rese... John Spray
10:53 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
Another one here, easily identifiable from teuthology log:... John Spray
02:00 PM Bug #20055 (Fix Under Review): Journaler may execute on_safe contexts prematurely
https://github.com/ceph/ceph/pull/15240 Zheng Yan
01:45 PM Bug #20055 (Resolved): Journaler may execute on_safe contexts prematurely
Zheng Yan

05/22/2017

09:14 PM Bug #20040 (Fix Under Review): "terminate called after throwing an instance of 'std::out_of_range...
https://github.com/ceph/ceph/pull/15213 John Spray
03:39 PM Bug #20040 (Resolved): "terminate called after throwing an instance of 'std::out_of_range'" in po...
Run: http://pulpito.ceph.com/teuthology-2017-05-21_22:37:21-powercycle-master-testing-basic-mira/
Jobs: '1204386', '...
Yuri Weinstein
03:41 PM Bug #20039 (Fix Under Review): mds: replay of export pinned inode does not result in export
https://github.com/ceph/ceph/pull/15205 Patrick Donnelly
03:33 PM Bug #20039 (Resolved): mds: replay of export pinned inode does not result in export
Found this while thrashing exports. Example log:... Patrick Donnelly
11:40 AM Feature #17835: mds: enable killpoint tests for MDS-MDS subtree export
(NB original PR was https://github.com/ceph/ceph/pull/13308) John Spray
10:29 AM Backport #20028 (Resolved): kraken: Deadlock on two ceph-fuse clients accessing the same file
https://github.com/ceph/ceph/pull/16191 Nathan Cutler
10:29 AM Backport #20027 (Resolved): jewel: Deadlock on two ceph-fuse clients accessing the same file
https://github.com/ceph/ceph/pull/15438 Nathan Cutler
10:29 AM Backport #20026 (Resolved): kraken: cephfs: MDS became unresponsive when truncating a very large ...
https://github.com/ceph/ceph/pull/16190 Nathan Cutler
10:29 AM Backport #20025 (Resolved): jewel: MDS became unresponsive when truncating a very large file
https://github.com/ceph/ceph/pull/15442 Nathan Cutler
02:22 AM Bug #19955: Too many stat ops when MDS trying to probe a large file
It's possible that client_range is not properly cleared in some corner cases. A detailed log is needed to find out. Zheng Yan

05/19/2017

01:44 PM Bug #19854 (Duplicate): ceph-fuse write a big file,The file is only written in part
It's a known issue. If you use ceph-fuse and multiple clients read/modify a file at the same time, you should disable... Zheng Yan
10:23 AM Bug #17468 (Closed): CephFs: IO Pauses for more than a 40 seconds, while running write intensive IOs
Unable to consistently reproduce the bug. Vishal Kanaujia
12:56 AM Feature #17835: mds: enable killpoint tests for MDS-MDS subtree export
Taking this one. I'm working on a test now. Patrick Donnelly

05/18/2017

10:46 PM Bug #17259 (Won't Fix): multimds: ranks >= max_mds may be assigned after reducing max_mds
This affected kcephfs which was fixed in https://github.com/ceph/ceph-client/commit/76201b6354bb3aa31c7ba2bd42b9cbb8d... Patrick Donnelly
10:43 PM Bug #19240: multimds on linode: troubling op throughput scaling from 8 to 16 MDS in kernel bulid ...
To close this we should confirm hypothesis with new op tracking from http://tracker.ceph.com/issues/19362
I'll do ...
Patrick Donnelly
04:07 PM Bug #19934 (Resolved): ceph fs set cephfs standby_count_wanted 0 fails on jewel upgrade
https://github.com/ceph/ceph/pull/15126 Kefu Chai
03:53 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Hi Zheng Yan:
The problem occurs when multiple clients read/write a file at the same time in our environment.
junming rao
01:14 AM Bug #19955: Too many stat ops when MDS trying to probe a large file
John Spray wrote:
> File size recovery should only happen when a client has failed to reconnect properly during the ...
Sandy Xu
12:14 AM Bug #19969 (Fix Under Review): CDir.cc: 909: FAILED assert(get_num_ref() == (state_test(STATE_STI...
Zheng Yan
12:14 AM Bug #19969: CDir.cc: 909: FAILED assert(get_num_ref() == (state_test(STATE_STICKY) ? 1:0))
Fix is already in my wip-multimds-misc branch https://github.com/ceph/ceph/pull/14550/commits/b4e0d12c4d00e4994eff3d8... Zheng Yan

05/17/2017

08:28 PM Bug #19969: CDir.cc: 909: FAILED assert(get_num_ref() == (state_test(STATE_STICKY) ? 1:0))
MDS log: https://www.dropbox.com/s/hd7bk0lwznwxlcy/mds.a.log.bz2 Douglas Fuller
08:20 PM Bug #19969 (Resolved): CDir.cc: 909: FAILED assert(get_num_ref() == (state_test(STATE_STICKY) ? 1...
... Douglas Fuller
03:54 PM Bug #16588 (Resolved): ceph mds dump show incorrect number of metadata pools.
Greg Farnum
11:20 AM Bug #19955 (Fix Under Review): Too many stat ops when MDS trying to probe a large file
https://github.com/ceph/ceph/pull/15131 Zheng Yan
11:12 AM Bug #19955: Too many stat ops when MDS trying to probe a large file
File size recovery should only happen when a client has failed to reconnect properly during the MDS restart. Is that... John Spray
03:29 AM Bug #19955 (Resolved): Too many stat ops when MDS trying to probe a large file
When MDS recovers, it may emit tons of `stat` ops to OSDs trying to probe a large file, which as a result prevents cl... Sandy Xu
10:49 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Hi Zheng Yan:
After disabling the page cache on the client (fuse_disable_pagecache = true), the problem disappears.
junming rao
09:06 AM Bug #19946 (Fix Under Review): CInode.cc: 2481: FAILED assert(s == nested_auth_pins)
https://github.com/ceph/ceph/pull/15130 Zheng Yan
03:40 AM Bug #19946: CInode.cc: 2481: FAILED assert(s == nested_auth_pins)
Happens only when mds_debug_auth_pins is enabled. The new dirfrag hasn't been added to the inode's frag map, which confus... Zheng Yan

05/16/2017

07:51 PM Bug #19946: CInode.cc: 2481: FAILED assert(s == nested_auth_pins)
Full MDS log: https://www.dropbox.com/s/skjrie44ngc27cj/mds.b.log.bz2 Douglas Fuller
07:50 PM Bug #19946 (Resolved): CInode.cc: 2481: FAILED assert(s == nested_auth_pins)
nested_auth_pins == 2
s == 1
2017-05-16 10:15:54.754892 7fea71adb700 -1 /home/dfuller/ceph/src/mds/CInode.cc:
I...
Douglas Fuller
11:00 AM Bug #19934: ceph fs set cephfs standby_count_wanted 0 fails on jewel upgrade
... John Spray
02:19 AM Bug #19934 (Resolved): ceph fs set cephfs standby_count_wanted 0 fails on jewel upgrade
... Sage Weil
10:00 AM Bug #19892 (Fix Under Review): Test failure: test_purge_queue_op_rate fails
John Spray
09:52 AM Bug #19893 (Resolved): test_rebuild_simple_altpool fails
John Spray
09:45 AM Feature #19362 (Resolved): mds: add perf counters for each type of MDS operation
John Spray
09:43 AM Bug #19890 (Resolved): src/test/pybind/test_cephfs.py fails
John Spray
09:24 AM Feature #19820 (Resolved): ceph-fuse: use userspace permission check by default
Zheng Yan
09:14 AM Bug #19755 (Pending Backport): MDS became unresponsive when truncating a very large file
John Spray

05/15/2017

09:31 PM Bug #19734 (Resolved): mds: subsystems like ceph_subsys_mds_balancer do not log correctly
John Spray
08:40 PM Bug #19893 (Fix Under Review): test_rebuild_simple_altpool fails
https://github.com/ceph/ceph/pull/15094 Douglas Fuller
01:46 PM Bug #19892: Test failure: test_purge_queue_op_rate fails
Let's tweak the numbers on this test to get more consistent behaviour (more files), or maybe set the concurrent purge... John Spray
12:25 PM Bug #19891 (Resolved): Test failure: test_full_different_file
John Spray
12:18 PM Bug #19828 (Resolved): mds: valgrind InvalidRead detected in Locker
John Spray
12:18 PM Bug #19903 (Resolved): LibCephFS.ClearSetuid fails
John Spray
08:04 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Zheng Yan wrote:
> can you reproduce this issue? (errors in the client.log are normal, they shouldn't cause this iss...
Zheng Yan
07:24 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Zheng Yan wrote:
> can you reproduce this issue? (errors in the client.log are normal, they shouldn't cause this iss...
Zheng Yan
07:12 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Hi Zheng Yan:
Reproducing this issue is easy, but I don't know what specific logs need to be opened at the client an...
junming rao

05/12/2017

11:02 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...

The blacklisting happened here:...
John Spray
10:04 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
So it turns out this is exposing an underlying issue (in at least some cases):
http://qa-proxy.ceph.com/teuthology/j...
John Spray
09:41 AM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
Please don't close, since the issue is valid, and maybe someone will eventually fix it. Марк Коренберг
09:29 AM Feature #19819 (Rejected): Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
Okay. I'm going to close this, because implementing that ioctl efficiently isn't doable without major changes to how... John Spray
09:12 AM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
This is a feature request. I have neither the skills nor the knowledge of CephFS to implement this feature.
Motivation -- ...
Марк Коренберг
09:01 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
can you reproduce this issue? (errors in the client.log are normal, they shouldn't cause this issue) Zheng Yan
08:48 AM Bug #19912 (Fix Under Review): kcephfs: Test failure: test_trim_caps
https://github.com/ceph/ceph/pull/15062 Zheng Yan
08:17 AM Bug #19912 (Resolved): kcephfs: Test failure: test_trim_caps
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-11_05:20:03-kcephfs-kraken-testing-basic-smithi/1122976/
...
Zheng Yan
12:34 AM Bug #19903 (Fix Under Review): LibCephFS.ClearSetuid fails
https://github.com/ceph/ceph/pull/15039 Zheng Yan

05/11/2017

06:46 PM Feature #17834 (Resolved): MDS Balancer overrides
Patrick Donnelly
06:24 PM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
Марк: is this ticket describing something you want to work on, or is it just a request? In either case, what's the m... John Spray
09:46 AM Bug #19635 (Pending Backport): Deadlock on two ceph-fuse clients accessing the same file
John Spray
09:12 AM Bug #19589 (Resolved): greedyspill.lua: :18: attempt to index a nil value (field '?')
John Spray
07:44 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Zheng Yan wrote:
> what does 'written in part' mean? application wrote ~23G, failed to write the rest, or applicatio...
junming rao

05/10/2017

01:25 PM Bug #19903 (Resolved): LibCephFS.ClearSetuid fails
all libcephfs/test.sh failures in http://pulpito.ceph.com/pdonnell-2017-05-10_02:32:15-fs-wip-pdonnell-integration-di... Zheng Yan
08:59 AM Bug #19891 (Fix Under Review): Test failure: test_full_different_file
https://github.com/ceph/ceph/pull/15026/ Zheng Yan
08:12 AM Bug #19891: Test failure: test_full_different_file
The MDS didn't get OSD op replies for purge queue log prezero operations. It seems the OSD dropped these requests... Zheng Yan
02:22 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
Zheng Yan
02:21 AM Bug #19892: Test failure: test_purge_queue_op_rate fails
Seems like a test case issue. The asok command can take several seconds, which is enough time to delete all the files. Zheng Yan
01:02 AM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
Patrick Donnelly
12:35 AM Bug #19890 (Fix Under Review): src/test/pybind/test_cephfs.py fails
Zheng Yan
12:34 AM Bug #19890: src/test/pybind/test_cephfs.py fails
https://github.com/ceph/ceph/pull/15018 Zheng Yan

05/09/2017

09:47 PM Bug #19896: client: test failure for O_RDWR file open
This problem also appears to be affecting a few other tests:... Patrick Donnelly
09:42 PM Bug #19896 (Duplicate): client: test failure for O_RDWR file open
We have a test failure in test_cephfs.test_open (and test_cephfs.test_mount_unmount): http://pulpito.ceph.com/pdonnel... Patrick Donnelly
01:37 PM Bug #19450 (Resolved): PurgeQueue read journal crash
Zheng Yan
01:23 PM Bug #19893 (Resolved): test_rebuild_simple_altpool fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:25:02-kcephfs-master-testing-basic-smithi/1113182/ Zheng Yan
01:15 PM Bug #19892 (Resolved): Test failure: test_purge_queue_op_rate fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:25:02-kcephfs-master-testing-basic-smithi/1107011/ Zheng Yan
01:05 PM Bug #19891 (Resolved): Test failure: test_full_different_file
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-06_03:15:02-fs-master---basic-smithi/1106624/ Zheng Yan
12:52 PM Bug #19890 (Resolved): src/test/pybind/test_cephfs.py fails
http://qa-proxy.ceph.com/teuthology/teuthology-2017-05-08_03:15:05-fs-master---basic-smithi/1113004/teuthology.log
...
Zheng Yan
02:49 AM Bug #19426 (Can't reproduce): knfs blogbench hang
Zheng Yan

05/08/2017

03:24 PM Backport #19846 (In Progress): jewel: write to cephfs mount hangs, ceph-fuse and kernel
Jan Fajerski
02:00 PM Backport #19845 (In Progress): kraken: write to cephfs mount hangs, ceph-fuse and kernel
Jan Fajerski
09:46 AM Bug #19426: knfs blogbench hang
Jeff Layton wrote:
> Sorry I didn't see this sooner. Is this still cropping up?
>
> So what might be helpful the ...
Zheng Yan
09:38 AM Bug #19426: knfs blogbench hang
It seems the crash no longer happens after rebasing the testing branch against the 4.11 kernel. Zheng Yan
09:25 AM Bug #19828 (Fix Under Review): mds: valgrind InvalidRead detected in Locker
https://github.com/ceph/ceph/pull/14991 Zheng Yan
03:27 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
What does 'written in part' mean? Did the application write ~23G and fail to write the rest, or did the application write 26G but the ... Zheng Yan
02:59 AM Bug #18798 (Resolved): FS activity hung, MDS reports client "failing to respond to capability rel...
"ceph: try getting buffer capability for readahead/fadvise" has been backported to 4.9.x. Zheng Yan

05/04/2017

07:06 PM Feature #19862 (New): mds: add LTTnG tracepoints for each type of MDS operation
It would be nice to know the latency of different file system operations so we can see which operations:
- scale poo...
Michael Sevilla
09:47 AM Bug #19854: ceph-fuse write a big file,The file is only written in part
Uploaded the client log file. junming rao
09:41 AM Bug #19854 (Duplicate): ceph-fuse write a big file,The file is only written in part
The application writes a big file (26GB) to cephfs, but the file is only written in part (23GB);
ceph version: 10.2.6 (656b5b6...
junming rao

05/03/2017

08:20 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
https://github.com/ceph/ceph/pull/15000 Nathan Cutler
08:20 PM Backport #19845 (Resolved): kraken: write to cephfs mount hangs, ceph-fuse and kernel
https://github.com/ceph/ceph/pull/14998 Nathan Cutler
05:46 PM Feature #19362: mds: add perf counters for each type of MDS operation
https://github.com/ceph/ceph/pull/14938 Michael Sevilla
03:22 PM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Zheng Yan wrote:
> elder one wrote:
> > One difference I noticed between 4.4 and 4.9 kernels
> > - with 4.4 kerne...
elder one
02:27 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> Just reporting that no errors after using patched (commit 2b1ac852) cephfs kernel module.
>
>...
Zheng Yan
02:06 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
elder one wrote:
> One difference I noticed between 4.4 and 4.9 kernels
> - with 4.4 kernel on cephfs directory si...
Zheng Yan
11:04 AM Backport #18699 (Resolved): jewel: client: fix the cross-quota rename boundary check conditions
Nathan Cutler
07:47 AM Bug #18872 (Pending Backport): write to cephfs mount hangs, ceph-fuse and kernel
Zheng Yan
05:25 AM Feature #19819: Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
This is pretty infeasible right now -- nothing tracks which bits of a file are allocated or exist. The closest it com... Greg Farnum
03:27 AM Bug #19828 (Resolved): mds: valgrind InvalidRead detected in Locker
... Patrick Donnelly
01:55 AM Bug #17819 (Can't reproduce): MDS crashed while performing snapshot creation and deletion in a loop
Zheng Yan
01:53 AM Bug #17819: MDS crashed while performing snapshot creation and deletion in a loop
Ran the test overnight; can't reproduce the crash. Zheng Yan

05/02/2017

09:40 AM Bug #17408 (Can't reproduce): Possible un-needed wait on rstats when listing dir?
Can't reproduce. Probably fixed by ... Zheng Yan
09:21 AM Bug #18798: FS activity hung, MDS reports client "failing to respond to capability release"
Just reporting that there are no errors after using the patched (commit 2b1ac852) cephfs kernel module.
Also upgraded ceph to 1...
elder one
08:29 AM Feature #19820 (Fix Under Review): ceph-fuse: use userspace permission check by default
https://github.com/ceph/ceph/pull/14907 Zheng Yan
07:39 AM Feature #19820 (Resolved): ceph-fuse: use userspace permission check by default
Zheng Yan
07:32 AM Bug #19812: client: not swapping directory caps efficiently leads to very slow create chains
The reason for the slow creation/deletion is that ceph-fuse sends a getattr request (check permission of test directory) b... Zheng Yan
06:17 AM Feature #19819 (Rejected): Add support of FS_IOC_FIEMAP ioctl on files, accessible through CephFS
The purpose of this is to have a standard interface for examining the allocated extents of files. For example, xfs_i... Марк Коренберг

04/28/2017

10:09 PM Bug #19812 (New): client: not swapping directory caps efficiently leads to very slow create chains
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg34200.html
In short: if you have a ceph-fuse and a kerne...
Greg Farnum

04/27/2017

07:27 AM Bug #18872: write to cephfs mount hangs, ceph-fuse and kernel
PR: https://github.com/ceph/ceph/pull/14822 Jan Fajerski
03:21 AM Bug #19789 (New): FAIL: test_evict_client (tasks.cephfs.test_misc.TestMisc)
http://qa-proxy.ceph.com/teuthology/teuthology-2017-04-24_03:15:02-fs-master---basic-smithi/ Zheng Yan

04/25/2017

09:03 AM Bug #19755 (Fix Under Review): MDS became unresponsive when truncating a very large file
https://github.com/ceph/ceph/pull/14769 Zheng Yan
03:39 AM Bug #19755 (Resolved): MDS became unresponsive when truncating a very large file
We were trying to copy a very large file (7TB exactly) between two directories through Samba/CephFS, and cancelled it... Sandy Xu
06:52 AM Backport #19763 (Resolved): kraken: non-local cephfs quota changes not visible until some IO is done
https://github.com/ceph/ceph/pull/16108 Nathan Cutler
06:52 AM Backport #19762 (Resolved): jewel: non-local cephfs quota changes not visible until some IO is done
https://github.com/ceph/ceph/pull/15466 Nathan Cutler
 
