Activity

From 06/08/2017 to 07/07/2017

07/07/2017

10:18 PM Bug #2494: mds: Cannot remove directory despite it being empty.
I've moved these dirs to ... David Galloway
10:14 PM Bug #20072 (Fix Under Review): TestStrays.test_snapshot_remove doesn't handle head whiteout in pg...
https://github.com/ceph/ceph/pull/16226 Patrick Donnelly
04:52 PM Bug #20072 (In Progress): TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls re...
Patrick Donnelly
09:14 PM Bug #20549 (Resolved): cephfs-journal-tool: segfault during journal reset
... Patrick Donnelly
12:47 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> If inode numbers were removed from inotable when replaying the journal, how did "assert(inode_map.cou...
Ivan Guan
10:53 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
If inode numbers were removed from inotable when replaying the journal, how did "assert(inode_map.count(in->vino()) == 0)... Zheng Yan
09:38 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> You can use cephfs-table-tool to manually remove used inodes from inotable. (use "rados -p data ls",...
Ivan Guan
09:13 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
You can use cephfs-table-tool to manually remove used inodes from inotable. (use "rados -p data ls", "rados -p metadata ... Zheng Yan
07:59 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Zheng Yan wrote:
> I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inco...
Ivan Guan
07:47 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Thank you. I made a test of running "ceph daemon mds.a scrub_path / force recursive repair" after scan_inodes. As you s... Ivan Guan
06:24 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
I fixed a cephfs-data-scan bug. https://github.com/ceph/ceph/pull/16202. It can explain this inconsistent state. Zheng Yan
06:23 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Ivan Guan wrote:
> Ivan Guan wrote:
> > After careful consideration, I try to fix this bug using the first solution.
...
Zheng Yan
03:43 AM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Ivan Guan wrote:
> After careful consideration, I try to fix this bug using the first solution.
> function inject_with...
Ivan Guan
12:41 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks

root@bhs1-mail02-ds05:~# cat /proc/sys/fs/file-nr
3360 0 3273932
root@bhs1-mail02-ds05:~# cat /proc/sys/...
Webert Lima
12:34 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
{
"id": 1029565,
"num_leases": 0,
*"num_caps": 2588906,*
"state": "open",
"replay_requests": 0,
"completed_requ...
Zheng Yan
12:20 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
There is something interesting here. The host "bhs1-mail02-ds05.m9.network" has the highest number of caps (over 2.5M... Webert Lima
11:45 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Running it today on the same cluster that crashed (client usage is about the same by the time):
~# ceph daemon mds...
Webert Lima
03:19 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
I recently found a bug. It can explain the crash.
https://github.com/ceph/ceph/pull/15440...
Zheng Yan
10:15 AM Bug #17435: Crash in ceph-fuse in ObjectCacher::trim while adding an OSD
The ceph-fuse client crashes when the top application is reading; the version is 10.2.3.
bh_lru_rest contains a wrong state...
Jay Lee
06:10 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())

The recover_dentries command of the journal tool only injects links, but never deletes old links. This causes duplicated p...
Zheng Yan
02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Log snippet from: /ceph/teuthology-archive/pdonnell-2017-07-01_01:07:39-fs-wip-pdonnell-20170630-distro-basic-smithi/... Patrick Donnelly
02:06 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Zheng, adding that patch lets the test make progress but there still appears to be a problem:... Patrick Donnelly
05:31 AM Feature #10792: qa: enable thrasher for MDS cluster size (vary max_mds)
https://github.com/ceph/ceph/pull/16200 Patrick Donnelly
05:29 AM Feature #19230: Limit MDS deactivation to one at a time
thrasher now deactivates one at a time: https://github.com/ceph/ceph/pull/15950 Patrick Donnelly
05:19 AM Bug #20212 (Resolved): test_fs_new failure on race between pool creation and appearance in `df`
Patrick Donnelly
04:47 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
Patrick Donnelly
04:42 AM Bug #20376 (Resolved): last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
Patrick Donnelly
04:41 AM Bug #20254 (Resolved): mds: coverity error in Server::_rename_prepare
Patrick Donnelly
04:40 AM Bug #20318 (Resolved): Race in TestExports.test_export_pin
Patrick Donnelly
04:35 AM Bug #16914 (Resolved): multimds: pathologically slow deletions in some tests
Patrick Donnelly

07/06/2017

09:30 PM Bug #20537: mds: MDLog.cc: 276: FAILED assert(!capped)
Zheng, please look at this one. Patrick Donnelly
09:30 PM Bug #20537 (Resolved): mds: MDLog.cc: 276: FAILED assert(!capped)
... Patrick Donnelly
08:48 PM Backport #20140: jewel: Journaler may execute on_safe contexts prematurely
Original backport (the one that went into 10.2.8) was incomplete. See #20536 for the completion. Nathan Cutler
08:46 PM Backport #20536 (In Progress): jewel: Journaler may execute on_safe contexts prematurely (part 2)
Nathan Cutler
08:45 PM Backport #20536 (Rejected): jewel: Journaler may execute on_safe contexts prematurely (part 2)
https://github.com/ceph/ceph/pull/16192 Nathan Cutler
08:20 PM Backport #20028 (In Progress): kraken: Deadlock on two ceph-fuse clients accessing the same file
Nathan Cutler
08:19 PM Bug #20535 (Resolved): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
The Active MDS crashes, all clients freeze and the standby (or standby-replay) daemon takes hours to recover.
cep...
Webert Lima
08:17 PM Backport #20026 (In Progress): kraken: cephfs: MDS became unresponsive when truncating a very lar...
Nathan Cutler
01:46 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
This inconsistent state is created by cephfs-data-scan? I think we can avoid injecting dentry to stray directory (inj... Zheng Yan
01:32 PM Bug #20494: cephfs_data_scan: try_remove_dentries_for_stray assertion failure
After careful consideration, I try to fix this bug using the first solution.
function inject_with_backtrace will traver...
Ivan Guan
01:29 PM Bug #20467: Ceph FS kernel client not consistency
Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > Please try
> >
> > https://github.com/ceph/ceph-client/commit/6c51866...
Zheng Yan
10:04 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> Please try
>
> https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace...
Yunzhi Cheng
04:20 AM Bug #20467: Ceph FS kernel client not consistency
Please try
https://github.com/ceph/ceph-client/commit/6c51866e67dda544b9be524185a526cf4ace865a
Zheng Yan
02:58 AM Backport #20349 (In Progress): jewel: df reports negative disk "used" value when quota exceed
Wei-Chung Cheng
02:53 AM Backport #20403 (In Progress): jewel: cephfs permission denied until second client accesses file
Wei-Chung Cheng

07/05/2017

02:21 PM Feature #16523 (Resolved): Assert directory fragmentation is occuring during stress tests
... John Spray
01:33 PM Bug #17731 (Can't reproduce): MDS stuck in stopping with other rank's strays
This code has all changed a lot since. John Spray
09:02 AM Bug #20467: Ceph FS kernel client not consistency
Yunzhi Cheng wrote:
>
> Seems it's error, should I do echo module ceph +p > /sys/kernel/debug/dynamic_debug ?
>
...
Zheng Yan
08:30 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> What are the ceph-mds version and configuration? Is dirfrag or multimds enabled?
>
> Please describe the workload you ...
Yunzhi Cheng
07:22 AM Bug #20467: Ceph FS kernel client not consistency
What are the ceph-mds version and configuration? Is dirfrag or multimds enabled?
Please describe the workload you put on cephfs, how many ...
Zheng Yan
03:08 AM Bug #20467: Ceph FS kernel client not consistency
Zheng Yan wrote:
> I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could yo...
Yunzhi Cheng

07/04/2017

09:13 PM Backport #20500 (In Progress): kraken: src/test/pybind/test_cephfs.py fails
Nathan Cutler
09:12 PM Backport #20500 (Resolved): kraken: src/test/pybind/test_cephfs.py fails
https://github.com/ceph/ceph/pull/16114 Nathan Cutler
09:10 PM Bug #19890 (Pending Backport): src/test/pybind/test_cephfs.py fails
Nathan Cutler
11:30 AM Backport #19763 (In Progress): kraken: non-local cephfs quota changes not visible until some IO i...
Nathan Cutler
11:26 AM Backport #19710 (In Progress): kraken: Enable MDS to start when session ino info is corrupt
Nathan Cutler
11:17 AM Backport #19680 (In Progress): kraken: MDS: damage reporting by ino number is useless
Nathan Cutler
11:12 AM Backport #19678 (In Progress): kraken: Jewel ceph-fuse does not recover after lost connection to MDS
Nathan Cutler
11:08 AM Backport #19676 (In Progress): kraken: cephfs: Test failure: test_data_isolated (tasks.cephfs.tes...
Nathan Cutler
11:06 AM Backport #19674 (In Progress): kraken: cephfs: mds is crushed, after I set about 400 64KB xattr k...
Nathan Cutler
11:04 AM Backport #19672 (In Progress): kraken: MDS assert failed when shutting down
Nathan Cutler
10:58 AM Backport #19669 (In Progress): kraken: MDS goes readonly writing backtrace for a file whose data ...
Nathan Cutler
10:36 AM Backport #19667 (In Progress): kraken: fs:The mount point break off when mds switch hanppened.
Nathan Cutler
10:28 AM Bug #20494 (Closed): cephfs_data_scan: try_remove_dentries_for_stray assertion failure
Using teuthology run data-scan.yaml and when completed test_data_scan.py:test_parallel_execution test case i want to ... Ivan Guan
10:07 AM Backport #19664 (In Progress): kraken: C_MDSInternalNoop::complete doesn't free itself
Nathan Cutler

07/03/2017

04:12 PM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
Patrick Donnelly
01:48 PM Bug #20424 (Fix Under Review): doc: improve description of `mds deactivate` to better contrast wi...
In the absence of inspiration for better names:
https://github.com/ceph/ceph/pull/16080
It would be really nice...
John Spray
01:38 PM Bug #20424: doc: improve description of `mds deactivate` to better contrast with `mds fail`
It may be useful to change the names of these commands to be more intuitive. Patrick Donnelly
03:45 AM Bug #20469: Ceph Client can't access file and show '???'
Yunzhi Cheng wrote:
> Zheng Yan wrote:
> > this one and http://tracker.ceph.com/issues/20467 may be caused by the s...
Zheng Yan

07/01/2017

02:41 PM Bug #20469: Ceph Client can't access file and show '???'
Zheng Yan wrote:
> this one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. please try the n...
Yunzhi Cheng

06/30/2017

07:43 PM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
Here's another: /a/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/1333703 Patrick Donnelly
02:08 PM Bug #20469: Ceph Client can't access file and show '???'
this one and http://tracker.ceph.com/issues/20467 may be caused by the same bug. please try the newest upstream kernel Zheng Yan
07:19 AM Bug #20469 (Need More Info): Ceph Client can't access file and show '???'
Very strange behavior... Yunzhi Cheng
10:53 AM Bug #20467: Ceph FS kernel client not consistency
I checked the Ubuntu kernel. It does not contain the following commit. I suspect it's the cause. Could you please try newest ... Zheng Yan
03:01 AM Bug #20467: Ceph FS kernel client not consistency
... Yunzhi Cheng
02:55 AM Bug #20467: Ceph FS kernel client not consistency
kernel version is 4.4.0-46 Yunzhi Cheng
02:43 AM Bug #20467: Ceph FS kernel client not consistency
Kernel version? Zheng Yan
02:40 AM Bug #20467 (Resolved): Ceph FS kernel client not consistency
I use 'ls' to list files in one directory; two clients have different result... Yunzhi Cheng

06/29/2017

09:45 AM Bug #20440 (Fix Under Review): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotabl...
caused by https://github.com/ceph/ceph/pull/15844.
need incremental patch https://github.com/ukernel/ceph/commit/47...
Zheng Yan

06/28/2017

07:37 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
Douglas Fuller
05:33 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
https://github.com/ceph/ceph/pull/15982 Nathan Cutler
05:09 PM Bug #20452 (Fix Under Review): Adding pool with id smaller then existing data pool ids breaks MDS...
Jan Fajerski
04:14 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Seems like the implementation of MDSMap::is_data_pool makes a wrong assumption by using binary_search.
data_pools is...
Jan Fajerski
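
The binary_search concern raised in the comment above can be illustrated with a minimal sketch. It assumes, hypothetically, that pool ids are held in insertion order in a plain vector; `PoolList`, `contains_binary`, and `contains_linear` are illustrative names and not Ceph's actual MDSMap code. `std::binary_search` requires a sorted (partitioned) range, so a list such as {2, 1}, where a pool with a smaller id was added later, violates its precondition and a lookup may report a present pool as absent. A linear scan has no ordering requirement.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a list of data pool ids kept in insertion order.
using PoolList = std::vector<std::int64_t>;

// Membership test via binary search.
// Only meaningful when `pools` is sorted in ascending order.
bool contains_binary(const PoolList& pools, std::int64_t id) {
    return std::binary_search(pools.begin(), pools.end(), id);
}

// Membership test via linear scan.
// Correct for any ordering of `pools`.
bool contains_linear(const PoolList& pools, std::int64_t id) {
    return std::find(pools.begin(), pools.end(), id) != pools.end();
}
```

With `PoolList pools = {2, 1};`, `contains_linear(pools, 1)` finds the pool, while the result of `contains_binary(pools, 1)` is not reliable because the range is unsorted. Whether the linked fix keeps the list sorted or switches to a different lookup is not stated in this entry.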
04:09 PM Bug #20452: Adding pool with id smaller then existing data pool ids breaks MDSMap::is_data_pool
Removing pool 1 fixes the issue:
ceph fs rm_data_pool fs 1
Creating files now works again.
Jan Fajerski
04:06 PM Bug #20452 (Resolved): Adding pool with id smaller then existing data pool ids breaks MDSMap::is_...
To reproduce:
Setup ceph cluster with mds but don't create fs yet (nor the necessary pools). Equivalent to the resul...
Jan Fajerski
03:29 PM Bug #20441 (Fix Under Review): mds: failure during data scan
https://github.com/ceph/ceph/pull/15979 Douglas Fuller
03:20 PM Bug #20441: mds: failure during data scan
That error is whitelisted; the relevant error is:
failure_reason: '"2017-06-27 21:09:57.713761 mds.a mds.0 172.21....
Douglas Fuller
01:02 AM Bug #20441: mds: failure during data scan
Doug, please take a look at this one. Patrick Donnelly
01:01 AM Bug #20441 (Resolved): mds: failure during data scan
... Patrick Donnelly
12:34 AM Bug #20440: mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_version())
Zheng, please take a look at this one. Patrick Donnelly
12:34 AM Bug #20440 (Resolved): mds: mds/journal.cc: 1559: FAILED assert(inotablev == mds->inotable->get_v...
From:
http://qa-proxy.ceph.com/teuthology/pdonnell-2017-06-27_19:50:40-fs-wip-pdonnell-20170627---basic-smithi/133...
Patrick Donnelly

06/27/2017

04:11 PM Bug #20072: TestStrays.test_snapshot_remove doesn't handle head whiteout in pgls results
FWIW that is basically what we did with the rados api test cleanup failures (loop waiting for snaptrimmer to do its th... Sage Weil
04:59 AM Bug #20424 (Resolved): doc: improve description of `mds deactivate` to better contrast with `mds ...
Currently the help output is not very useful for `ceph mds deactivate`:... Patrick Donnelly
02:21 AM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
Also: https://github.com/ceph/ceph/pull/15937 Patrick Donnelly
02:07 AM Backport #20412 (Fix Under Review): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) ...
John, that looks like the problem. Here's a PR:
https://github.com/ceph/ceph/pull/15936
Patrick Donnelly

06/26/2017

08:25 PM Backport #20027 (Resolved): jewel: Deadlock on two ceph-fuse clients accessing the same file
John Spray
08:24 PM Backport #19846 (Resolved): jewel: write to cephfs mount hangs, ceph-fuse and kernel
John Spray
08:21 PM Backport #20412: test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in Jewel 10.2...
Aargh, I think this might just be failing because this is a new test that was written for luminous, where client_quot... John Spray
07:49 PM Backport #20412 (In Progress): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails...
I dug into the logs. It looks like the MDS is not sending a quota update to the client. From a brief look at the code... Patrick Donnelly
07:30 PM Bug #20337 (New): test_rebuild_simple_altpool triggers MDS assertion
John Spray
06:54 PM Bug #20337: test_rebuild_simple_altpool triggers MDS assertion
I'm not seeing how Filesystem.are_daemons_healthy is waiting for daemons outside the filesystem: it's inspecting daem... John Spray
06:42 PM Bug #20337 (Need More Info): test_rebuild_simple_altpool triggers MDS assertion
wait_for_daemons should wait for every daemon regardless of filesystem. is there a failure log I can look at? Douglas Fuller
05:26 AM Backport #20140 (Resolved): jewel: Journaler may execute on_safe contexts prematurely
Nathan Cutler

06/25/2017

07:56 AM Backport #20412 (Resolved): test_remote_update_write (tasks.cephfs.test_quota.TestQuota) fails in...
https://github.com/ceph/ceph/pull/15936 Nathan Cutler

06/23/2017

08:14 PM Backport #20404 (Rejected): kraken: cephfs permission denied until second client accesses file
Nathan Cutler
08:14 PM Backport #20403 (Resolved): jewel: cephfs permission denied until second client accesses file
https://github.com/ceph/ceph/pull/16150 Nathan Cutler
12:04 PM Backport #20148 (Resolved): jewel: Too many stat ops when MDS trying to probe a large file
John Spray
08:05 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
try uploading it somewhere else or send it to my email zyan@redhat.com Zheng Yan
07:51 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
yanmei ding wrote:
> yanmei ding wrote:
> > Zheng Yan wrote:
> > > please upload detailed log for the slow case.
...
yanmei ding
07:47 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
yanmei ding wrote:
> Zheng Yan wrote:
> > please upload detailed log for the slow case.
>
> This is a detailed l...
yanmei ding
07:46 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
Zheng Yan wrote:
> please upload detailed log for the slow case.
This is a detailed log.
Thank you!
yanmei ding
06:43 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
please upload detailed log for the slow case. Zheng Yan
02:05 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
John Spray wrote:
> In that case I suggest you wait for 12.1.0 to see if the issue is fixed there.
John Spray: I ...
yanmei ding
03:54 AM Bug #20376 (Fix Under Review): last_epoch_(over|under) in MDBalancer should be updated if mds0 ha...
Patrick Donnelly
12:17 AM Bug #20376: last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
There is a merge request for this bug fix: https://github.com/ceph/ceph/pull/15825, could you have a review? @Patrick Jianyu Li

06/22/2017

10:51 PM Bug #20376: last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
Patrick Donnelly
02:20 AM Bug #20376 (Resolved): last_epoch_(over|under) in MDBalancer should be updated if mds0 has failed
When mds0 has failed and started up again, it will reset beat_epoch to zero. In this case, other MDSes should update ... Jianyu Li
05:28 PM Bug #20122: Ceph MDS crash with assert failure
Are you able to reliably reproduce this? Do you have any MDS logs during the failure? Patrick Donnelly
11:08 AM Bug #20340 (Pending Backport): cephfs permission denied until second client accesses file
John Spray
11:07 AM Bug #20338 (Resolved): mem leak in Journaler::_issue_read() in ceph-mds
John Spray
11:06 AM Bug #20165 (Resolved): Deadlock during shutdown in PurgeQueue::_consume
John Spray
10:57 AM Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemon...
The fix is not working in at least some cases. Here's a smoking gun failure:
http://pulpito.ceph.com/jspray-2017-...
John Spray
10:47 AM Feature #20196: mds: early reintegration of strays on hardlink deletion
Zheng's patch for the special case (both links in cache at time of primary unlink) is merged for luminous -- hopefull... John Spray

06/21/2017

10:06 PM Bug #20212 (Fix Under Review): test_fs_new failure on race between pool creation and appearance i...
https://github.com/ceph/ceph/pull/15822 Patrick Donnelly
08:36 PM Bug #20212 (In Progress): test_fs_new failure on race between pool creation and appearance in `df`
Patrick Donnelly
09:12 PM Bug #20254 (Fix Under Review): mds: coverity error in Server::_rename_prepare
https://github.com/ceph/ceph/pull/15818 Patrick Donnelly
08:34 PM Bug #20318 (Fix Under Review): Race in TestExports.test_export_pin
https://github.com/ceph/ceph/pull/15817 Patrick Donnelly
02:58 PM Bug #20334: I/O become slowly when multi mds which subtree root has replica
In that case I suggest you wait for 12.1.0 to see if the issue is fixed there. John Spray
12:52 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
John Spray wrote:
> yanmei ding: there have been fixes on master since 12.0.3, please could you retest with the tip ...
yanmei ding
10:22 AM Bug #20340: cephfs permission denied until second client accesses file
Thanks for this patch. It seems to fix the problem for our users. Dan van der Ster
07:57 AM Bug #20340 (Fix Under Review): cephfs permission denied until second client accesses file
https://github.com/ceph/ceph/pull/15800 Zheng Yan

06/20/2017

06:25 PM Bug #20170 (Resolved): filelock_interrupt.py fails on multimds
John Spray
06:21 PM Documentation #13311 (Resolved): explain user permission syntax, details
John Spray
06:21 PM Documentation #13311: explain user permission syntax, details
Not sure why this got made a FS ticket, but fortunately I wrote the docs for cephfs client auth caps a while ago so t... John Spray
06:20 PM Feature #8786 (Resolved): ceph kernel module for el7
CephFS kernel module is in RHEL since 7.4. the kmod-* packages are discontinued as per the note at https://github.co... John Spray
06:15 PM Bug #20060 (Resolved): segmentation fault in _do_cap_update
John Spray
06:14 PM Bug #20131 (Resolved): mds/MDBalancer: update MDSRank export_targets according to current balance...
John Spray
06:14 PM Bug #20335 (Resolved): test_migration_on_shutdown, test_grow_shrink failing
John Spray
01:24 PM Bug #20335 (Fix Under Review): test_migration_on_shutdown, test_grow_shrink failing
https://github.com/ceph/ceph/pull/15768 Zheng Yan
06:09 PM Bug #16914 (Fix Under Review): multimds: pathologically slow deletions in some tests
It looks like this case is now working properly with the latest code, so flipping this ticket to need review and remo... John Spray
02:03 PM Bug #18641 (Can't reproduce): mds: stalled clients apparently due to stale sessions
John Spray
02:02 PM Bug #17069 (Closed): multimds: slave rmdir assertion failure
Closing because currently we know that snapshots+multimds is broken. John Spray
02:00 PM Bug #16925 (Can't reproduce): multimds: cfuse (?) hang on fsx.sh workunit
John Spray
01:54 PM Bug #20334: I/O become slowly when multi mds which subtree root has replica
yanmei ding: there have been fixes on master since 12.0.3, please could you retest with the tip of master? John Spray
01:08 PM Bug #20334: I/O become slowly when multi mds which subtree root has replica
Zheng Yan wrote:
> yanmei ding wrote:
> > John Spray wrote:
> > > Yanmei Ding: can you be more specific about how ...
yanmei ding
08:03 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
yanmei ding wrote:
> John Spray wrote:
> > Yanmei Ding: can you be more specific about how to reproduce this or wha...
Zheng Yan
01:20 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
John Spray wrote:
> Yanmei Ding: can you be more specific about how to reproduce this or what is going wrong interna...
yanmei ding
01:49 PM Fix #20246 (In Progress): Make clog message on scrub errors friendlier.
John Spray
01:47 PM Bug #20282 (Closed): qa: missing even trivial tests for many commands
John Spray
01:44 PM Bug #20282: qa: missing even trivial tests for many commands
The script just greps for anything that looks like a COMMAND and then greps for their existence in qa/ and src/tests... Greg Farnum
01:44 PM Bug #20329 (Resolved): Ceph file system hang on Jewel
Resolving, patch will show up in stable release as and when. John Spray
01:42 AM Bug #20329: Ceph file system hang on Jewel
Eric Eastman wrote:
> The number in the first column changes. Here is the output running the command in a while loop...
Zheng Yan
11:05 AM Bug #20338 (Fix Under Review): mem leak in Journaler::_issue_read() in ceph-mds
https://github.com/ceph/ceph/pull/15776
Zheng Yan
09:49 AM Bug #20340: cephfs permission denied until second client accesses file
Yup, in this case the diri is_stray (it looks like this... Dan van der Ster
09:28 AM Bug #20340: cephfs permission denied until second client accesses file
Ahh so it *is* related to path-restricted cap.
I tried as above with client B having the same client caps -- didn't ...
Dan van der Ster
09:18 AM Bug #20340: cephfs permission denied until second client accesses file
I've confirmed that none of these help resolve these EPERM files:
* restart the ceph-fuse on client A
* mount...
Dan van der Ster
09:29 AM Bug #17858: Cannot create deep directories when caps contain "path=/somepath"
Just pinging this to say that there remain some issues with path-restricted caps, as shown in #20340. Dan van der Ster

06/19/2017

08:52 PM Backport #20350 (Rejected): kraken: df reports negative disk "used" value when quota exceed
Nathan Cutler
08:52 PM Backport #20349 (Resolved): jewel: df reports negative disk "used" value when quota exceed
https://github.com/ceph/ceph/pull/16151 Nathan Cutler
03:54 PM Bug #20338: mem leak in Journaler::_issue_read() in ceph-mds
Was there a teuthology run where this was happening? John Spray
10:57 AM Bug #20338 (Resolved): mem leak in Journaler::_issue_read() in ceph-mds
... Kefu Chai
01:47 PM Bug #20341 (Duplicate): test_migration_on_shutdown fails on master
John Spray
01:29 PM Bug #20341 (Duplicate): test_migration_on_shutdown fails on master
http://qa-proxy.ceph.com/teuthology/teuthology-2017-06-19_03:15:05-fs-master-distro-basic-smithi/1300512/teuthology.l... Zheng Yan
01:30 PM Bug #20340: cephfs permission denied until second client accesses file
I should mention that while client A was a user that has a path-restricted mds cap, the client B that "fixes" the EPE... Dan van der Ster
01:25 PM Bug #20340 (Resolved): cephfs permission denied until second client accesses file
Here is a file that client A gets permission denied during stat:... Dan van der Ster
12:49 PM Bug #20329: Ceph file system hang on Jewel
Eric: that's a conversation to have with whoever is providing your kernel -- the kernel bits of Ceph are not part of ... John Spray
12:38 PM Bug #20329: Ceph file system hang on Jewel
The number in the first column changes. Here is the output running the command in a while loop, once a second. Every ... Eric Eastman
09:19 AM Bug #20329: Ceph file system hang on Jewel
... Zheng Yan
12:41 PM Bug #20178 (Pending Backport): df reports negative disk "used" value when quota exceed
John Spray
10:53 AM Bug #20282: qa: missing even trivial tests for many commands
Greg: can you say which script you're looking to cover these commands in? Things like session kill would be pretty a... John Spray
10:50 AM Bug #20272 (Rejected): Ceph OSD & MDS Failure
I don't think there's anything to be done with this right now -- feel free to reopen if there's some other evidence t... John Spray
10:47 AM Bug #20334: I/O become slowly when multi mds which subtree root has replica
Yanmei Ding: can you be more specific about how to reproduce this or what is going wrong internally? John Spray
10:34 AM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
Two things are going wrong here, I think:
* The test code is doing a self.fs.wait_for_daemons() (test_data_scan.py:4...
John Spray
10:18 AM Bug #20335 (Resolved): test_migration_on_shutdown, test_grow_shrink failing
Seems to be happening repeatedly since June 10. Latest fs-master failures:
http://pulpito.ceph.com/teuthology-201...
John Spray
09:35 AM Bug #20313 (Fix Under Review): Assertion in handle_dir_update
https://github.com/ceph/ceph/pull/15510/commits/1a5fd47880229d69a6ea484e662e8b8280ff5158 Zheng Yan

06/18/2017

05:47 PM Bug #20328 (Duplicate): Test failure: test_export_pin (tasks.cephfs.test_exports.TestExports)
John Spray
02:24 PM Bug #20334 (Resolved): I/O become slowly when multi mds which subtree root has replica
yanmei ding

06/16/2017

04:22 PM Bug #20329 (Resolved): Ceph file system hang on Jewel
We are running Ceph 10.2.7 and after adding a new multi-threaded writer application we are seeing hangs accessing met... Eric Eastman
01:24 PM Bug #20328 (Duplicate): Test failure: test_export_pin (tasks.cephfs.test_exports.TestExports)
http://qa-proxy.ceph.com/teuthology/jspray-2017-06-15_02:50:24-multimds-wip-jcsp-testing-20170614-testing-basic-smith... Zheng Yan

06/15/2017

02:51 PM Bug #19706 (Resolved): Laggy mon daemons causing MDS failover (symptom: failed to set counters on...
John Spray
02:28 PM Bug #20318 (Resolved): Race in TestExports.test_export_pin
Seen failure here:
http://pulpito.ceph.com/jspray-2017-06-15_02:50:24-multimds-wip-jcsp-testing-20170614-testing-bas...
John Spray
02:13 PM Bug #20313 (Resolved): Assertion in handle_dir_update
Seen in test branch that had the following PRs in it:
[15125] mds: miscellaneous multimds fixes part2
[15510] mds...
John Spray

06/14/2017

03:30 PM Bug #20282: qa: missing even trivial tests for many commands
The damage and client stuff is all exercised in tasks/cephfs/test_* stuff. Are you talking specifically about unit t... John Spray
02:16 PM Backport #20294 (In Progress): jewel: Populate DamageTable from forward scrub
Nathan Cutler
02:13 PM Backport #20294 (Resolved): jewel: Populate DamageTable from forward scrub
https://github.com/ceph/ceph/pull/14699 Nathan Cutler
02:10 PM Feature #16016 (Pending Backport): Populate DamageTable from forward scrub
Nathan Cutler
02:01 PM Backport #19334 (Resolved): jewel: MDS heartbeat timeout during rejoin, when working with large a...
John Spray
01:43 PM Backport #19665 (Resolved): jewel: C_MDSInternalNoop::complete doesn't free itself
John Spray
01:38 PM Backport #19677 (Resolved): jewel: Jewel ceph-fuse does not recover after lost connection to MDS
John Spray
01:37 PM Backport #19762 (Resolved): jewel: non-local cephfs quota changes not visible until some IO is done
John Spray
01:33 PM Backport #19709 (Resolved): jewel: Enable MDS to start when session ino info is corrupt
John Spray
01:31 PM Backport #19675 (Resolved): jewel: cephfs: Test failure: test_data_isolated (tasks.cephfs.test_vo...
John Spray
01:30 PM Backport #19673 (Resolved): jewel: cephfs: mds is crushed, after I set about 400 64KB xattr kv pa...
John Spray
01:30 PM Backport #19671 (Resolved): jewel: MDS assert failed when shutting down
John Spray
01:29 PM Backport #19668 (Resolved): jewel: MDS goes readonly writing backtrace for a file whose data pool...
John Spray
01:27 PM Backport #19666 (Resolved): jewel: fs:The mount point break off when mds switch hanppened.
John Spray
01:26 PM Backport #19619 (Resolved): jewel: MDS server crashes due to inconsistent metadata.
John Spray
01:24 PM Backport #19482 (Resolved): jewel: No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" com...
John Spray
01:23 PM Backport #19044 (Resolved): jewel: buffer overflow in test LibCephFS.DirLs
John Spray
01:23 PM Backport #18949 (Resolved): jewel: mds/StrayManager: avoid reusing deleted inode in StrayManager:...
John Spray
01:22 PM Backport #18900 (Resolved): jewel: Test failure: test_open_inode
John Spray
01:22 PM Backport #18705 (Resolved): jewel: fragment space check can cause replayed request fail
John Spray

06/13/2017

11:57 PM Bug #20282 (Closed): qa: missing even trivial tests for many commands
I wrote a trivial script to look for missing commands in tests (https://github.com/ceph/ceph/pull/15675/commits/3aad0... Greg Farnum
01:17 PM Bug #20272: Ceph OSD & MDS Failure
The MDS backtrace is just the same as the OSD one. John Spray
02:27 AM Bug #20272: Ceph OSD & MDS Failure
You probably need to bump up the number of allowed thread/process IDs on your box if it's crashing there. But that sh... Greg Farnum
08:08 AM Bug #20129: Client syncfs is slow (waits for next MDS tick)
John Spray wrote:
> dongdong tao -- could you please open a pull request with your code change once it is working fo...
dongdong tao

06/12/2017

10:43 PM Bug #20272 (Rejected): Ceph OSD & MDS Failure
The following error from one of the OSDs in my cluster brought the Ceph MDS server down over the weekend:... Kyle Traff
12:44 PM Bug #20254 (Resolved): mds: coverity error in Server::_rename_prepare
... Patrick Donnelly

06/11/2017

11:47 AM Fix #20246 (Resolved): Make clog message on scrub errors friendlier.
Currently it looks something like this:... John Spray

06/09/2017

04:18 PM Backport #18283 (Closed): kraken: monitor cannot start because of "FAILED assert(info.state == MD...
Nathan Cutler
09:54 AM Bug #19955: Too many stat ops when MDS trying to probe a large file
Later I found two more related problems that may need further discussion:
# Some tools (like fio) will set the size o...
Sandy Xu
 
