Activity

From 08/17/2017 to 09/15/2017

09/15/2017

10:13 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
What resources are actually duplicated across a CephContext, which we don't want duplicated? When I think of duplicat... Greg Farnum
10:07 PM Bug #21412: cephfs: too many cephfs snapshots chokes the system
Can you dump the ops in flight on both the MDS and the client issuing the snap rmdir when this happens? And the perfc... Greg Farnum
09:53 PM Bug #21412 (Closed): cephfs: too many cephfs snapshots chokes the system
We have a cluster with /cephfs/.snap directory with over 4800 entries. Trying to delete older snapshots (some are ove... Wyllys Ingersoll
09:19 PM Bug #21071 (Pending Backport): qa: test_misc creates metadata pool with dummy object resulting in...
Patrick Donnelly
09:17 PM Bug #21275 (Resolved): test hang after mds evicts kclient
Patrick Donnelly
09:17 PM Bug #21381 (Pending Backport): test_filtered_df: assert 0.9 < ratio < 1.1
Patrick Donnelly
08:48 PM Bug #20594 (Resolved): mds: cache limits should be expressed in memory usage, not inode count
Patrick Donnelly
08:48 PM Bug #21378 (Resolved): mds: up:stopping MDS cannot export directories
Patrick Donnelly
08:48 PM Backport #21385 (Resolved): luminous: mds: up:stopping MDS cannot export directories
Patrick Donnelly
08:47 PM Bug #21222 (Resolved): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
08:46 PM Bug #21221 (Resolved): MDCache::try_subtree_merge() may print N^2 lines of debug message
Patrick Donnelly
08:45 PM Backport #21384 (Resolved): luminous: mds: cache limits should be expressed in memory usage, not ...
Patrick Donnelly
08:45 PM Backport #21322 (Resolved): luminous: MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
08:45 PM Backport #21357 (Resolved): luminous: mds: segfault during `rm -rf` of large directory
Patrick Donnelly
08:44 PM Backport #21323 (Resolved): luminous: MDCache::try_subtree_merge() may print N^2 lines of debug m...
Patrick Donnelly
07:15 PM Bug #21406 (Resolved): ceph.in: tell mds does not understand --cluster
... Patrick Donnelly
06:57 PM Bug #21405 (Resolved): qa: add EC data pool to testing
This would end up being another sub-suite in fs/ which adds support for testing an erasure data pool with overwrite... Patrick Donnelly
03:57 PM Bug #21304: mds v12.2.0 crashing
It works fine with that. To be precise I built from the luminous branch from today. No crashes for 8 hours under heav... Andrej Filipcic
01:48 PM Bug #21402 (Resolved): mds: move remaining containers in CDentry/CDir/CInode to mempool
This commit:
https://github.com/ceph/ceph/commit/e035b64fcb0482c3318656e1680d683814f494fe
does only part of the...
Patrick Donnelly
10:27 AM Bug #21383 (In Progress): qa: failures from pjd fstest
Zheng Yan

09/14/2017

09:27 PM Bug #21393 (Resolved): MDSMonitor: inconsistent role/who usage in command help
`ceph rmfailed` refers to its argument as "who" and `ceph repaired` refers to its argument as "rank". We should make ... Patrick Donnelly
07:38 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Note that I think we do need to convert over programs like ganesha and samba to only keep a single CephContext and sh... Jeff Layton
07:37 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Ok, I finally settled on just keeping things more or less limping along like they are now with lockdep, and just ensu... Jeff Layton
04:06 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
> Probably related to: http://tracker.ceph.com/issues/19706
I'll keep an eye on it. I'm suspecting out of sync clock...
Webert Lima
01:34 PM Bug #21304: mds v12.2.0 crashing
I've been running the luminous (head commit is ba746cd14d) ceph-mds for a while and haven't reproduced the issue. Could you try the ... Zheng Yan
11:01 AM Backport #21324 (In Progress): luminous: ceph: tell mds.* results in warning
Nathan Cutler
10:28 AM Bug #20892 (Resolved): qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
10:28 AM Backport #21114 (Resolved): luminous: qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
10:24 AM Bug #21004 (Resolved): fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
10:24 AM Backport #21107 (Resolved): luminous: fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
08:23 AM Bug #21274 (Resolved): Client: if request gets aborted, its reference leaks
the bug was introduced by ... Zheng Yan
01:45 AM Bug #21274 (Pending Backport): Client: if request gets aborted, its reference leaks
Patrick Donnelly
04:32 AM Backport #21357 (In Progress): luminous: mds: segfault during `rm -rf` of large directory
Patrick Donnelly
04:32 AM Backport #21323 (In Progress): luminous: MDCache::try_subtree_merge() may print N^2 lines of debu...
Patrick Donnelly
03:37 AM Backport #21322 (In Progress): luminous: MDS: standby-replay mds should avoid initiating subtree ...
https://github.com/ceph/ceph/pull/17714 Patrick Donnelly
01:47 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Regression link in my last comment is wrong, see: http://tracker.ceph.com/issues/21378 Patrick Donnelly
01:46 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Fix for regression can be merged with this backport: https://github.com/ceph/ceph/pull/17689 Patrick Donnelly
03:37 AM Backport #21385 (In Progress): luminous: mds: up:stopping MDS cannot export directories
Patrick Donnelly
03:37 AM Backport #21385 (Resolved): luminous: mds: up:stopping MDS cannot export directories
https://github.com/ceph/ceph/pull/17714 Patrick Donnelly
03:27 AM Backport #21278 (Resolved): luminous: the standbys are not updated via "ceph tell mds.* command"
Patrick Donnelly
03:27 AM Backport #21267 (Resolved): luminous: Incorrect grammar in FS message "1 filesystem is have a fai...
Patrick Donnelly
03:20 AM Backport #21384 (In Progress): luminous: mds: cache limits should be expressed in memory usage, n...
Patrick Donnelly
03:19 AM Backport #21384 (Resolved): luminous: mds: cache limits should be expressed in memory usage, not ...
https://github.com/ceph/ceph/pull/17711 Patrick Donnelly
03:10 AM Bug #20594 (Pending Backport): mds: cache limits should be expressed in memory usage, not inode c...
Patrick Donnelly
02:01 AM Bug #21383 (Resolved): qa: failures from pjd fstest
... Patrick Donnelly
01:46 AM Bug #21378 (Pending Backport): mds: up:stopping MDS cannot export directories
Patrick Donnelly

09/13/2017

06:53 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
I started a discussion on ceph-devel and I think the consensus is that we can't make CephContext a singleton.
I we...
Jeff Layton
06:17 PM Bug #21381: test_filtered_df: assert 0.9 < ratio < 1.1
I think this was caused by:
commit 365558571c59dd42cf0934e6c31c7b4bf2c65026 365558571c (upstream/pull/17513/head)
...
Patrick Donnelly
06:04 PM Bug #21381 (Fix Under Review): test_filtered_df: assert 0.9 < ratio < 1.1
https://github.com/ceph/ceph/pull/17701 Douglas Fuller
04:26 PM Bug #21381 (Resolved): test_filtered_df: assert 0.9 < ratio < 1.1
... Patrick Donnelly
03:28 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
The entire log using bzip2 compressed down to 4.6G. You can download it from:
ftp://ftp.keepertech.com/outgoing/eri...
Eric Eastman
10:12 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Eric Eastman wrote:
> The log file with *debug_mds=10* from MDS startup to reaching the assert is 110GB. I am attac...
Zheng Yan
02:40 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
One active, one standby-replay, one standby as shown:
mds: cephfs-1/1/1 up {0=ede-c2-mon02=up:active}, 1 up:stand...
Eric Eastman
02:32 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Eric Eastman wrote:
> Replacing the 'assert(in)' with 'continue' did get the Ceph file system working again. Lookin...
Zheng Yan
01:44 PM Bug #21380 (Closed): mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
Thanks Zheng, I merged your commits. Patrick Donnelly
08:36 AM Bug #21380: mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
It's caused by a bug in https://github.com/ceph/ceph/pull/17657; fixed by https://github.com/ukernel/ceph/commits/batri... Zheng Yan
04:27 AM Bug #21380 (Closed): mds: src/mds/MDSCacheObject.h: 171: FAILED assert(ref_map[by] > 0)
... Patrick Donnelly
08:51 AM Bug #21275: test hang after mds evicts kclient
With the kernel fixes, the test case still hangs at umount. http://qa-proxy.ceph.com/teuthology/zyan-2017-09-12_01:10:12-k... Zheng Yan
08:35 AM Bug #21379 (Duplicate): TestJournalRepair.test_reset: src/mds/CDir.cc: 930: FAILED assert(get_num...
dup of #21380 Zheng Yan
04:16 AM Bug #21379 (Duplicate): TestJournalRepair.test_reset: src/mds/CDir.cc: 930: FAILED assert(get_num...
... Patrick Donnelly
05:59 AM Bug #21363 (Closed): ceph-fuse crashing while mounting cephfs
Shinobu Kinjo
05:56 AM Bug #21363: ceph-fuse crashing while mounting cephfs
Issue #20972 has already been fixed in PR 16963.
[1] http://tracker.ceph.com/issues/20972
[2] https://github.c...
Prashant D
03:39 AM Bug #21363: ceph-fuse crashing while mounting cephfs
3242b2b should fix the issue. Shinobu Kinjo
03:53 AM Bug #21378: mds: up:stopping MDS cannot export directories
Looks good! Jianyu Li
03:48 AM Bug #21378 (Fix Under Review): mds: up:stopping MDS cannot export directories
https://github.com/ceph/ceph/pull/17689 Zheng Yan
03:30 AM Bug #21378: mds: up:stopping MDS cannot export directories
It seems the check in export_dir is too strict for the up:stopping state:
if (!mds->is_active()) {
dout(7) << "i'...
Jianyu Li
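
A minimal, self-contained toy of how that gate might be relaxed to also allow exports while the MDS is up:stopping; the merged change is in the PR referenced below (https://github.com/ceph/ceph/pull/17689), and the FakeMDS type and message text here are stand-ins, not the actual MDSRank API.

    #include <iostream>

    struct FakeMDS {
      bool active = false;
      bool stopping = true;     // an up:stopping MDS draining its subtrees
      bool is_active() const   { return active; }
      bool is_stopping() const { return stopping; }
    };

    void export_dir(const FakeMDS *mds) {
      // Old gate: `if (!mds->is_active())` refused exports while stopping.
      if (!mds->is_active() && !mds->is_stopping()) {
        std::cout << "not active or stopping, no exports for now\n";
        return;
      }
      std::cout << "proceeding with export\n";
    }

    int main() {
      FakeMDS mds;
      export_dir(&mds);   // prints "proceeding with export"
      return 0;
    }
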
02:41 AM Bug #21378 (Resolved): mds: up:stopping MDS cannot export directories
... Patrick Donnelly
03:20 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
This caused regression http://tracker.ceph.com/issues/21378 Zheng Yan
02:42 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Fix causes: http://tracker.ceph.com/issues/21378 Patrick Donnelly
03:08 AM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
Description... Nathan Cutler
02:13 AM Backport #21357 (Fix Under Review): luminous: mds: segfault during `rm -rf` of large directory
PR for luminous https://github.com/ceph/ceph/pull/17686 Zheng Yan
02:05 AM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
It's a dup of http://tracker.ceph.com/issues/21070; it has already been fixed in the master branch. Zheng Yan
03:07 AM Bug #21070 (Pending Backport): MDS: MDS is laggy or crashed When deleting a large number of files
Nathan Cutler
02:42 AM Backport #21322: luminous: MDS: standby-replay mds should avoid initiating subtree export
Nathan, fix causes regression: http://tracker.ceph.com/issues/21222 Patrick Donnelly

09/12/2017

10:49 PM Bug #21311 (Rejected): ceph perf dump should report standby MDSes
OK, so I'm going to take the opinionated position that this is a WONTFIX as we have an existing interface that provid... John Spray
05:31 PM Bug #21311: ceph perf dump should report standby MDSes
John, if you have strong opinions about ripping out perf counters, I'll send this one over to you. Feel free to send ... Douglas Fuller
03:25 PM Bug #21311: ceph perf dump should report standby MDSes
So on closer inspection I see that as you say, for the existing stuff it is indeed using perf counters, but it doesn'... John Spray
03:04 PM Bug #21311: ceph perf dump should report standby MDSes
John Spray wrote:
> This is a collectd thing, which isn't to say that we shouldn't care, but... I'm not sure bugs ag...
David Galloway
07:34 PM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
I'm running into the same problem on luminous 12.2.0 - while removing a directory with lots of files, the MDS crashes... Andras Pataki
05:39 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
Replacing the 'assert(in)' with 'continue' did get the Ceph file system working again. Looking at the log, there wer... Eric Eastman
05:26 PM Bug #21071 (Fix Under Review): qa: test_misc creates metadata pool with dummy object resulting in...
https://github.com/ceph/ceph/pull/17676 Douglas Fuller
05:19 PM Bug #21071: qa: test_misc creates metadata pool with dummy object resulting in WRN: POOL_APP_NOT_...
I think we should just whitelist this, then. It's an intentionally pathological case, and this error should not be tr... Douglas Fuller
12:19 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
We could also fix this less invasively too -- try to make g_lockdep_ceph_ctx a refcounted object pointer, and then fi... Jeff Layton
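
For illustration, a minimal sketch of the less-invasive direction Jeff describes: turning a bare global context pointer such as g_lockdep_ceph_ctx into a refcounted one, so the object outlives whichever client shuts down first. The names and structure here are hypothetical, not the actual lockdep code.

    #include <memory>
    #include <mutex>

    struct CephContextStub { /* stand-in for the real CephContext */ };

    static std::mutex g_ctx_lock;
    static std::shared_ptr<CephContextStub> g_lockdep_ctx;   // was a raw pointer

    // Each client registering with lockdep takes a shared reference.
    std::shared_ptr<CephContextStub> lockdep_acquire_ctx() {
      std::lock_guard<std::mutex> l(g_ctx_lock);
      if (!g_lockdep_ctx)
        g_lockdep_ctx = std::make_shared<CephContextStub>();
      return g_lockdep_ctx;
    }

    // Shutdown only drops the global reference; the object stays alive until
    // the last holder releases its copy, so racing shutdowns never see a
    // dangling pointer.
    void lockdep_release_ctx() {
      std::lock_guard<std::mutex> l(g_ctx_lock);
      g_lockdep_ctx.reset();
    }
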
02:19 AM Bug #21363 (Duplicate): ceph-fuse crashing while mounting cephfs
When trying to mount cephfs using ceph-fuse on Ubuntu 16.04.3, the parent or child ceph-fuse process gets a SIGABRT sign... Prashant D
01:52 AM Bug #21362 (Need More Info): cephfs ec data pool + windows fio, ceph cluster degraded several hours...
1. Configuration
version: 12.2.0, ceph professional rpms install, newly installed env.
cephfs: meta pool (ssd 1*3 replica...
Yong Wang
12:28 AM Bug #20594 (Fix Under Review): mds: cache limits should be expressed in memory usage, not inode c...
https://github.com/ceph/ceph/pull/17657 Patrick Donnelly

09/11/2017

10:16 PM Bug #21311: ceph perf dump should report standby MDSes
This is a collectd thing, which isn't to say that we shouldn't care, but... I'm not sure bugs against collectd really... John Spray
05:33 PM Bug #21311: ceph perf dump should report standby MDSes
Doug, please take this one. Patrick Donnelly
08:45 PM Bug #20945 (Resolved): get_quota_root sends lookupname op for every buffered write
Nathan Cutler
08:44 PM Backport #21112 (Resolved): luminous: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
08:02 PM Backport #21359 (Resolved): luminous: racy is_mounted() checks in libcephfs
https://github.com/ceph/ceph/pull/17875 Nathan Cutler
07:33 PM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
The log file with *debug_mds=10* from MDS startup to reaching the assert is 110GB. I am attaching the last 50K lines... Eric Eastman
08:48 AM Bug #21337: luminous: MDS is not getting past up:replay on Luminous cluster
please set debug_mds=10, restart mds and upload the full log. To recover the situation, just replace the 'assert(in)'... Zheng Yan
06:43 AM Bug #21337 (Resolved): luminous: MDS is not getting past up:replay on Luminous cluster
On my Luminous 12.2.0 test cluster, after it has run for the last few days, the MDS process is not getting past up:re... Eric Eastman
05:46 PM Backport #21357: luminous: mds: segfault during `rm -rf` of large directory
Zheng, please take a look. Patrick Donnelly
05:45 PM Backport #21357 (Resolved): luminous: mds: segfault during `rm -rf` of large directory
https://github.com/ceph/ceph/pull/17686 Patrick Donnelly
05:28 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
I spent a couple of hours today crawling over the code in ganesha and ceph that handles the CephContext. We have rout... Jeff Layton
05:22 PM Bug #21025 (Pending Backport): racy is_mounted() checks in libcephfs
Patrick Donnelly
05:07 PM Bug #21025 (Resolved): racy is_mounted() checks in libcephfs
PR is merged. Jeff Layton
01:58 PM Bug #21275 (Fix Under Review): test hang after mds evicts kclient
Patch is on ceph-devel. Patrick Donnelly

09/08/2017

08:20 PM Backport #21324 (Resolved): luminous: ceph: tell mds.* results in warning
https://github.com/ceph/ceph/pull/17729 Nathan Cutler
08:20 PM Backport #21323 (Resolved): luminous: MDCache::try_subtree_merge() may print N^2 lines of debug m...
https://github.com/ceph/ceph/pull/17712 Nathan Cutler
08:20 PM Backport #21322 (Resolved): luminous: MDS: standby-replay mds should avoid initiating subtree export
https://github.com/ceph/ceph/pull/17714 Nathan Cutler
08:20 PM Backport #21321 (Resolved): luminous: mds: asok command error merged with partial Formatter output
https://github.com/ceph/ceph/pull/17870 Nathan Cutler
06:26 PM Bug #21191 (Pending Backport): ceph: tell mds.* results in warning
Patrick Donnelly
06:26 PM Bug #21222 (Pending Backport): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
06:26 PM Bug #21221 (Pending Backport): MDCache::try_subtree_merge() may print N^2 lines of debug message
Patrick Donnelly
06:25 PM Bug #21252 (Pending Backport): mds: asok command error merged with partial Formatter output
Patrick Donnelly
05:14 PM Cleanup #21069 (Resolved): client: missing space in some client debug log messages
Nathan Cutler
05:14 PM Backport #21103 (Resolved): luminous: client: missing space in some client debug log messages
Nathan Cutler
03:51 PM Backport #21103: luminous: client: missing space in some client debug log messages
https://github.com/ceph/ceph/pull/17469 merged Yuri Weinstein
03:10 PM Bug #21311 (Rejected): ceph perf dump should report standby MDSes
This was discovered when observing the cephmetrics dashboard monitoring the Sepia cluster.... David Galloway
01:52 PM Bug #21275: test hang after mds evicts kclient
Got it. I think we've hit problems like that in NFS, and what we had to do is save copies of the fields from utsname(... Jeff Layton
01:43 PM Bug #21275: test hang after mds evicts kclient
... Zheng Yan
06:14 AM Bug #21304 (Can't reproduce): mds v12.2.0 crashing

The luminous mds crashes a few times a day. Heavy activity (e.g. untarring a kernel tarball) causes it to crash in a few minutes...
Andrej Filipcic

09/07/2017

01:52 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
cc'ing Matt on this bug, as it may have implications for the new code that can fetch config info out of RADOS:
Bas...
Jeff Layton
01:10 PM Backport #21267 (In Progress): luminous: Incorrect grammar in FS message "1 filesystem is have a ...
Abhishek Lekshmanan
01:09 PM Backport #21278 (In Progress): luminous: the standbys are not updated via "ceph tell mds.* command"
Abhishek Lekshmanan
07:35 AM Backport #21278 (Resolved): luminous: the standbys are not updated via "ceph tell mds.* command"
https://github.com/ceph/ceph/pull/17565 Nathan Cutler
08:27 AM Bug #21274 (Fix Under Review): Client: if request gets aborted, its reference leaks
https://github.com/ceph/ceph/pull/17545 Zheng Yan
01:48 AM Bug #21274 (Resolved): Client: if request gets aborted, its reference leaks
/a/pdonnell-2017-09-06_15:30:20-fs-wip-pdonnell-testing-20170906-distro-basic-smithi/1601384/teuthology.log
log of...
Zheng Yan
07:47 AM Backport #21113 (Resolved): jewel: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
07:43 AM Bug #18157 (Resolved): ceph-fuse segfaults on daemonize
Nathan Cutler
07:43 AM Backport #20972 (Resolved): jewel ceph-fuse segfaults at mount time, assert in ceph::log::Log::stop
Nathan Cutler
05:44 AM Bug #21275 (Resolved): test hang after mds evicts kclient
http://pulpito.ceph.com/zyan-2017-09-07_03:18:23-kcephfs-master-testing-basic-mira/
http://qa-proxy.ceph.com/teuth...
Zheng Yan
01:22 AM Bug #21230 (Pending Backport): the standbys are not updated via "ceph tell mds.* command"
Kefu Chai

09/06/2017

07:39 PM Backport #21267 (Resolved): luminous: Incorrect grammar in FS message "1 filesystem is have a fai...
https://github.com/ceph/ceph/pull/17566 Nathan Cutler
09:05 AM Bug #21252: mds: asok command error merged with partial Formatter output
Sorry, the bug was introduced by my commit:... Zheng Yan
03:52 AM Bug #21153 (Pending Backport): Incorrect grammar in FS message "1 filesystem is have a failed mds...
Patrick Donnelly
03:51 AM Bug #20337 (Resolved): test_rebuild_simple_altpool triggers MDS assertion
Patrick Donnelly

09/05/2017

09:48 PM Bug #21252: mds: asok command error merged with partial Formatter output
I should note: the error itself is very concerning because the only way for dump_cache to fail is if it's operating o... Patrick Donnelly
09:46 PM Bug #21252 (Fix Under Review): mds: asok command error merged with partial Formatter output
https://github.com/ceph/ceph/pull/17506 Patrick Donnelly
08:20 PM Bug #21252 (Resolved): mds: asok command error merged with partial Formatter output
... Patrick Donnelly
09:35 PM Bug #21222 (Fix Under Review): MDS: standby-replay mds should avoid initiating subtree export
Patrick Donnelly
03:39 PM Bug #16709 (Resolved): No output for "ceph mds rmfailed 0 --yes-i-really-mean-it" command
Nathan Cutler
03:38 PM Bug #18660 (Resolved): fragment space check can cause replayed request fail
Nathan Cutler
03:38 PM Bug #18661 (Resolved): Test failure: test_open_inode
Nathan Cutler
03:38 PM Bug #18877 (Resolved): mds/StrayManager: avoid reusing deleted inode in StrayManager::_purge_stra...
Nathan Cutler
03:38 PM Bug #18941 (Resolved): buffer overflow in test LibCephFS.DirLs
Nathan Cutler
03:38 PM Bug #19118 (Resolved): MDS heartbeat timeout during rejoin, when working with large amount of cap...
Nathan Cutler
03:37 PM Bug #19406 (Resolved): MDS server crashes due to inconsistent metadata.
Nathan Cutler
03:32 PM Bug #19955 (Resolved): Too many stat ops when MDS trying to probe a large file
Nathan Cutler
03:32 PM Backport #20149 (Rejected): kraken: Too many stat ops when MDS trying to probe a large file
Kraken is EOL. Nathan Cutler
03:32 PM Bug #20055 (Resolved): Journaler may execute on_safe contexts prematurely
Nathan Cutler
03:31 PM Backport #20141 (Rejected): kraken: Journaler may execute on_safe contexts prematurely
Kraken is EOL. Nathan Cutler
11:07 AM Bug #20988: client: dual client segfault with racing ceph_shutdown
Hmm. I'm not sure that really helps. Here's the doc comment over ceph_create_with_context:... Jeff Layton
09:46 AM Bug #20988: client: dual client segfault with racing ceph_shutdown
I found a workaround. We can create a single CephContext for multiple ceph_mount.... Zheng Yan
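
For concreteness, one way to end up with a single CephContext shared by several mounts using only public API calls is to build both mounts from one librados cluster handle. This is a sketch of the general idea under that assumption, not necessarily the exact workaround Zheng has in mind (which is based on ceph_create_with_context).

    #include <rados/librados.h>
    #include <cephfs/libcephfs.h>

    int main() {
      rados_t cluster;
      struct ceph_mount_info *m1 = nullptr, *m2 = nullptr;

      // One cluster handle, and therefore one CephContext underneath...
      if (rados_create(&cluster, nullptr) < 0) return 1;
      rados_conf_read_file(cluster, nullptr);   // default ceph.conf search path
      if (rados_connect(cluster) < 0) return 1;

      // ...shared by both mounts.
      if (ceph_create_from_rados(&m1, cluster) < 0) return 1;
      if (ceph_create_from_rados(&m2, cluster) < 0) return 1;
      if (ceph_mount(m1, "/") < 0) return 1;
      if (ceph_mount(m2, "/") < 0) return 1;

      // ... use m1 and m2, possibly from different threads ...

      ceph_unmount(m2); ceph_release(m2);
      ceph_unmount(m1); ceph_release(m1);
      rados_shutdown(cluster);                  // the shared context goes away once, here
      return 0;
    }
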
09:35 AM Backport #21114 (In Progress): luminous: qa: FS_DEGRADED spurious health warnings in some sub-suites
Nathan Cutler
09:33 AM Backport #21112 (In Progress): luminous: get_quota_root sends lookupname op for every buffered write
Nathan Cutler
09:28 AM Backport #21107 (In Progress): luminous: fs: client/mds has wrong check to clear S_ISGID on chown
Nathan Cutler
09:22 AM Backport #21103 (In Progress): luminous: client: missing space in some client debug log messages
Nathan Cutler
08:45 AM Bug #21193 (Duplicate): ceph.in: `ceph tell mds.* injectargs` does not update standbys
http://tracker.ceph.com/issues/21230 Chang Liu
08:40 AM Bug #21230 (Fix Under Review): the standbys are not updated via "ceph tell mds.* command"
https://github.com/ceph/ceph/pull/17463 Kefu Chai
07:59 AM Bug #21230 (Resolved): the standbys are not updated via "ceph tell mds.* command"
Chang Liu

09/04/2017

07:11 PM Bug #20178 (Resolved): df reports negative disk "used" value when quota exceed
Nathan Cutler
07:11 PM Backport #20350 (Rejected): kraken: df reports negative disk "used" value when quota exceed
Kraken is EOL. Nathan Cutler
07:11 PM Backport #20349 (Resolved): jewel: df reports negative disk "used" value when quota exceed
Nathan Cutler
07:10 PM Bug #20340 (Resolved): cephfs permission denied until second client accesses file
Nathan Cutler
07:10 PM Backport #20404 (Rejected): kraken: cephfs permission denied until second client accesses file
Kraken is EOL. Nathan Cutler
07:10 PM Backport #20403 (Resolved): jewel: cephfs permission denied until second client accesses file
Nathan Cutler
06:43 PM Bug #21221 (Fix Under Review): MDCache::try_subtree_merge() may print N^2 lines of debug message
https://github.com/ceph/ceph/pull/17456 Patrick Donnelly
09:11 AM Bug #21221 (Resolved): MDCache::try_subtree_merge() may print N^2 lines of debug message
MDCache::try_subtree_merge(dirfrag) calls MDCache::try_subtree_merge_at() for each subtree in the dirfrag. try_subtre... Zheng Yan
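
A self-contained toy showing why the output is quadratic: if each try_subtree_merge_at() call dumps every subtree, a dirfrag covering N subtrees produces N*N debug lines. Names are simplified; this is not the actual MDCache code.

    #include <iostream>
    #include <vector>

    struct Subtree { int id; };

    static std::vector<Subtree> subtrees;           // all subtrees in the cache

    void show_subtrees() {                          // dumps every subtree: N lines
      for (const auto &s : subtrees)
        std::cout << "subtree " << s.id << "\n";
    }

    void try_subtree_merge_at(const Subtree &) {
      // ... merge logic elided ...
      show_subtrees();                              // debug dump after each attempt
    }

    void try_subtree_merge() {
      for (const auto &s : subtrees)                // one call per subtree in the dirfrag
        try_subtree_merge_at(s);                    // N calls x N lines each = N^2 lines
    }

    int main() {
      for (int i = 0; i < 100; ++i)
        subtrees.push_back({i});
      try_subtree_merge();                          // 100 subtrees -> 10,000 lines of output
      return 0;
    }
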
11:14 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Here is a pull request for this bug fix: https://github.com/ceph/ceph/pull/17452; could you review it? @Patrick Jianyu Li
11:11 AM Bug #21222: MDS: standby-replay mds should avoid initiating subtree export
Although with the latest code in the master branch this issue can be avoided by the destination check in export_dir:
...
Jianyu Li
10:24 AM Bug #21222 (Resolved): MDS: standby-replay mds should avoid initiating subtree export
On the jewel 10.2.7 version, we use two active MDSs and two corresponding standby-replay MDSs.
When standby-replay replays the m...
Jianyu Li

08/31/2017

01:06 PM Feature #18490: client: implement delegation support in userland cephfs
Here's a capture showing the delegation grant and recall (what can I say, I'm a proud parent). The delegation was rev... Jeff Layton
12:47 PM Feature #18490: client: implement delegation support in userland cephfs
I was able to get ganesha to hand out a v4.0 delegation today and recall it properly. So, PoC is successful!
There s...
Jeff Layton
11:12 AM Backport #21113 (In Progress): jewel: get_quota_root sends lookupname op for every buffered write
Nathan Cutler

08/30/2017

11:17 PM Bug #21193 (Duplicate): ceph.in: `ceph tell mds.* injectargs` does not update standbys
... Patrick Donnelly
10:54 PM Bug #21191 (Fix Under Review): ceph: tell mds.* results in warning
https://github.com/ceph/ceph/pull/17384 Patrick Donnelly
10:19 PM Bug #21191: ceph: tell mds.* results in warning
John believes 9753a0065db8bfb03d86a7185bc636c7aa4c7af7 may be the cause. Patrick Donnelly
10:15 PM Bug #21191 (Resolved): ceph: tell mds.* results in warning
... Patrick Donnelly

08/29/2017

03:31 PM Documentation #21172 (Duplicate): doc: Export over NFS
Create a document similar to RGW's NFS support, https://github.com/ceph/ceph/blob/master/doc/radosgw/nfs.rst
to help...
Ramana Raja
10:08 AM Bug #21168 (Fix Under Review): cap import/export message ordering issue
https://github.com/ceph/ceph/pull/17340 Zheng Yan
09:59 AM Bug #21168 (Resolved): cap import/export message ordering issue
There is a cap import/export message ordering issue.
Symptoms are:
kernel prints error "handle_cap_import: mismat...
Zheng Yan

08/28/2017

05:54 PM Feature #18490: client: implement delegation support in userland cephfs
I made some progress today. I got ganesha over ceph to hand out a read delegation. Once I tried to force a recall (by... Jeff Layton
01:51 PM Feature #21156 (Resolved): mds: speed up recovery with many open inodes
Opening inodes during the rejoin stage is slow when clients have a large number of caps.
Currently the mds journals open inode...
Zheng Yan
01:42 PM Bug #21083: client: clean up header to isolate real public methods and entry points for client_lock
Should be reorganized with an eye toward finer grained locks, along with a client_lock audit. -Jeff Patrick Donnelly
01:36 PM Bug #21058: mds: remove UNIX file permissions binary dependency
May not be necessary as bits are defined by POSIX. Should still look for other dependencies which may vary. Patrick Donnelly
12:54 PM Bug #21153 (Fix Under Review): Incorrect grammar in FS message "1 filesystem is have a failed mds...
https://github.com/ceph/ceph/pull/17301 John Spray
12:52 PM Bug #21153 (Resolved): Incorrect grammar in FS message "1 filesystem is have a failed mds daemon"
John Spray
06:53 AM Bug #21149: SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
I think the exception was triggered by writing the debug message before reading ceph config.
*PR* https://github.com...
shangzhong zhu
06:49 AM Bug #21149 (Rejected): SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
When I run the Hadoop write test, the following exception occurs (not 100% of the time):
/clove/vm/renhw/ceph/rpmbuild/BUILD/ce...
shangzhong zhu
03:34 AM Bug #21070 (Fix Under Review): MDS: MDS is laggy or crashed When deleting a large number of files
https://github.com/ceph/ceph/pull/17291 Zheng Yan
03:17 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
For a seeky readdir on a large directory:
int fd = open("/mnt/ceph", O_RDONLY | O_DIRECTORY)
lseek(fd, xxxxxx, SEEK_SE...
Zheng Yan
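
Zheng's truncated snippet above can be fleshed out into a runnable userspace approximation. This sketch uses fdopendir()/seekdir() rather than raw lseek()/getdents(); the path and offset value are placeholders.

    #include <dirent.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main() {
      // Open a large directory on the CephFS mount (path is a placeholder).
      int fd = open("/mnt/ceph", O_RDONLY | O_DIRECTORY);
      if (fd < 0) { perror("open"); return 1; }

      DIR *dir = fdopendir(fd);        // wrap the fd so we can readdir() from it
      if (!dir) { perror("fdopendir"); close(fd); return 1; }

      long offset = 123456;            // placeholder for a previously saved telldir()/d_off
      seekdir(dir, offset);            // jump into the middle of the directory ("seeky" readdir)

      struct dirent *de;
      while ((de = readdir(dir)) != nullptr)
        printf("%s\n", de->d_name);    // resume listing from that offset

      closedir(dir);                   // also closes the underlying fd
      return 0;
    }
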

08/26/2017

02:29 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
Your patch can solve this problem; I tested it twice today. :-)
My modification is just to verify the dirfrag offset cau...
huanwen ren

08/25/2017

09:29 PM Bug #20535 (Resolved): mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert, the backport is merged so I'm marking this as resolved. If you experience this particular issue again, please... Patrick Donnelly
09:27 PM Backport #20564 (Resolved): jewel: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Patrick Donnelly
09:22 PM Bug #20596 (Fix Under Review): MDSMonitor: obsolete `mds dump` and other deprecated mds commands
https://github.com/ceph/ceph/pull/17266 Patrick Donnelly
01:33 PM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
Does my patch alone not work? Your change will make seeky readdir on a directory inefficient. Zheng Yan
08:28 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
In addition, I set "offset_hash" to 0 when offset_str is empty, which can solve this problem.
Modify the code as fol...
huanwen ren
08:20 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
OK, I will test it.
huanwen ren
02:02 AM Bug #21091: StrayManager::truncate is broken
yes Zheng Yan

08/24/2017

09:52 PM Bug #20990 (Resolved): mds,mgr: add 'is_valid=false' when failed to parse caps
Patrick Donnelly
09:52 PM Backport #21047 (Resolved): luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
Patrick Donnelly
04:58 PM Backport #21047: luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
https://github.com/ceph/ceph/pull/17240 Patrick Donnelly
09:52 PM Bug #21064 (Resolved): FSCommands: missing wait for osdmap writeable + propose
Patrick Donnelly
09:52 PM Backport #21101 (Resolved): luminous: FSCommands: missing wait for osdmap writeable + propose
Patrick Donnelly
04:51 PM Backport #21101: luminous: FSCommands: missing wait for osdmap writeable + propose
https://github.com/ceph/ceph/pull/17238 Patrick Donnelly
04:48 PM Backport #21101 (Resolved): luminous: FSCommands: missing wait for osdmap writeable + propose
Patrick Donnelly
09:51 PM Bug #21065 (Resolved): client: UserPerm delete with supp. groups allocated by malloc generates va...
Patrick Donnelly
03:49 AM Bug #21065 (Pending Backport): client: UserPerm delete with supp. groups allocated by malloc gene...
Patrick Donnelly
09:51 PM Backport #21100 (Resolved): luminous: client: UserPerm delete with supp. groups allocated by mall...
Patrick Donnelly
04:51 PM Backport #21100: luminous: client: UserPerm delete with supp. groups allocated by malloc generate...
https://github.com/ceph/ceph/pull/17237 Patrick Donnelly
04:46 PM Backport #21100 (Resolved): luminous: client: UserPerm delete with supp. groups allocated by mall...
Patrick Donnelly
09:51 PM Bug #21078 (Resolved): df hangs in ceph-fuse
Patrick Donnelly
03:49 AM Bug #21078 (Pending Backport): df hangs in ceph-fuse
Patrick Donnelly
09:51 PM Backport #21099 (Resolved): luminous: client: df hangs in ceph-fuse
Patrick Donnelly
04:45 PM Backport #21099: luminous: client: df hangs in ceph-fuse
https://github.com/ceph/ceph/pull/17236 Patrick Donnelly
04:40 PM Backport #21099 (Resolved): luminous: client: df hangs in ceph-fuse
Patrick Donnelly
09:51 PM Bug #21082 (Resolved): client: the client_lock is not taken for Client::getcwd
Patrick Donnelly
03:50 AM Bug #21082 (Pending Backport): client: the client_lock is not taken for Client::getcwd
Patrick Donnelly
09:50 PM Backport #21098 (Resolved): luminous: client: the client_lock is not taken for Client::getcwd
Patrick Donnelly
04:43 PM Backport #21098: luminous: client: the client_lock is not taken for Client::getcwd
https://github.com/ceph/ceph/pull/17235 Patrick Donnelly
04:37 PM Backport #21098 (Resolved): luminous: client: the client_lock is not taken for Client::getcwd
Patrick Donnelly
07:05 PM Bug #21091: StrayManager::truncate is broken
This only affects deletions of snapshotted files right? Patrick Donnelly
09:10 AM Bug #21091 (Fix Under Review): StrayManager::truncate is broken
https://github.com/ceph/ceph/pull/17219 Zheng Yan
08:56 AM Bug #21091 (Resolved): StrayManager::truncate is broken
Zheng Yan
05:23 PM Backport #21114 (Resolved): luminous: qa: FS_DEGRADED spurious health warnings in some sub-suites
https://github.com/ceph/ceph/pull/17474 Nathan Cutler
05:23 PM Backport #21113 (Resolved): jewel: get_quota_root sends lookupname op for every buffered write
https://github.com/ceph/ceph/pull/17396 Nathan Cutler
05:23 PM Backport #21112 (Resolved): luminous: get_quota_root sends lookupname op for every buffered write
https://github.com/ceph/ceph/pull/17473 Nathan Cutler
05:23 PM Backport #21107 (Resolved): luminous: fs: client/mds has wrong check to clear S_ISGID on chown
https://github.com/ceph/ceph/pull/17471 Nathan Cutler
05:22 PM Backport #21103 (Resolved): luminous: client: missing space in some client debug log messages
https://github.com/ceph/ceph/pull/17469 Nathan Cutler
10:49 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
please try the attached patch Zheng Yan
08:45 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
The address of *dn (mds.0.server dn1-10x600000000000099) has overflowed, but I have not found the reason.
huanwen ren
08:42 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
I added some debug print statements, and the issue can be reproduced:
1.The right to print...
huanwen ren

08/23/2017

08:52 PM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Webert Lima wrote:
> My active MDS had committed suicide due to "dne in mds map" (this is happening a lot but I don'...
Patrick Donnelly
08:18 PM Bug #21065 (Fix Under Review): client: UserPerm delete with supp. groups allocated by malloc gene...
https://github.com/ceph/ceph/pull/17204 Patrick Donnelly
07:43 PM Bug #21082 (Fix Under Review): client: the client_lock is not taken for Client::getcwd
https://github.com/ceph/ceph/pull/17205 Patrick Donnelly
03:57 PM Bug #21082 (Resolved): client: the client_lock is not taken for Client::getcwd
https://github.com/ceph/ceph/blob/db16d50cc56f5221d7bcdb28a29d5e0a456cba94/src/client/Client.cc#L9387-L9425
We als...
Patrick Donnelly
06:13 PM Feature #16016 (Resolved): Populate DamageTable from forward scrub
Nathan Cutler
06:12 PM Backport #20294 (Resolved): jewel: Populate DamageTable from forward scrub
Nathan Cutler
06:12 PM Feature #18509 (Resolved): MDS: damage reporting by ino number is useless
Nathan Cutler
06:12 PM Backport #19679 (Resolved): jewel: MDS: damage reporting by ino number is useless
Nathan Cutler
06:10 PM Bug #19291 (Resolved): mds: log rotation doesn't work if mds has respawned
Nathan Cutler
06:10 PM Backport #19466 (Resolved): jewel: mds: log rotation doesn't work if mds has respawned
Nathan Cutler
05:57 PM Cleanup #21069 (Pending Backport): client: missing space in some client debug log messages
Patrick Donnelly
02:51 AM Cleanup #21069: client: missing space in some client debug log messages
*PR*: https://github.com/ceph/ceph/pull/17175 shangzhong zhu
02:45 AM Cleanup #21069 (Resolved): client: missing space in some client debug log messages
2017-08-11 19:05:17.344361 7fb87b1eb700 20 client.15557 may_delete0x10000000522.head(faked_ino=0 ref=3 ll_ref=0 cap_r... shangzhong zhu
04:52 PM Bug #21064 (Pending Backport): FSCommands: missing wait for osdmap writeable + propose
Patrick Donnelly
04:06 PM Bug #21083 (New): client: clean up header to isolate real public methods and entry points for cli...
With the recent revelation that the client_lock was not locked for Client::getcwd [1] and other history of missing lo... Patrick Donnelly
02:10 PM Bug #21081 (Duplicate): mon: get writeable osdmap for added data pool
Patrick Donnelly
02:08 PM Bug #21081 (Duplicate): mon: get writeable osdmap for added data pool
https://github.com/ceph/ceph/pull/17163 Abhishek Lekshmanan
02:08 PM Bug #20945 (Pending Backport): get_quota_root sends lookupname op for every buffered write
Patrick Donnelly
02:06 PM Bug #21004 (Pending Backport): fs: client/mds has wrong check to clear S_ISGID on chown
Patrick Donnelly
01:55 PM Bug #21078 (Fix Under Review): df hangs in ceph-fuse
https://github.com/ceph/ceph/pull/17199 John Spray
01:49 PM Bug #21078: df hangs in ceph-fuse
yep. mon says:... John Spray
01:48 PM Bug #21078: df hangs in ceph-fuse
Loops like this:... John Spray
01:42 PM Bug #21078 (Resolved): df hangs in ceph-fuse
See "[ceph-users] ceph-fuse hanging on df with ceph luminous >= 12.1.3".
The filesystem works normally, except for...
John Spray
01:51 PM Bug #20892 (Pending Backport): qa: FS_DEGRADED spurious health warnings in some sub-suites
Patrick Donnelly
11:12 AM Backport #21067: jewel: MDS integer overflow fix
OK, backport staged (see description) Nathan Cutler
11:11 AM Backport #21067 (In Progress): jewel: MDS integer overflow fix
Nathan Cutler
11:09 AM Backport #21067: jewel: MDS integer overflow fix
Description:
Please backport commit 0d74334332fb70212fc71f1130e886952920038d (mds: use client_t instead of int ...
Nathan Cutler
06:47 AM Bug #19755 (Resolved): MDS became unresponsive when truncating a very large file
Nathan Cutler
06:42 AM Backport #20025 (Resolved): jewel: MDS became unresponsive when truncating a very large file
Nathan Cutler
04:21 AM Bug #21071: qa: test_misc creates metadata pool with dummy object resulting in WRN: POOL_APP_NOT_...
Doug, please take this one. Patrick Donnelly
04:20 AM Bug #21071 (Resolved): qa: test_misc creates metadata pool with dummy object resulting in WRN: PO...
... Patrick Donnelly
03:32 AM Bug #21070: MDS: MDS is laggy or crashed When deleting a large number of files
I opened the log to try to see where the problem is; the log information is as follows:... huanwen ren
03:06 AM Bug #21070 (Resolved): MDS: MDS is laggy or crashed When deleting a large number of files
We plan to use mdtest to create on the order of 1 million (100w) files in a ceph-fuse mounted directory; the command is as follo... huanwen ren

08/22/2017

10:02 PM Backport #21067 (Resolved): jewel: MDS integer overflow fix
https://github.com/ceph/ceph/pull/17188 Thorvald Natvig
09:44 PM Bug #21066 (New): qa: racy test_export_pin check for export_targets
... Patrick Donnelly
08:59 PM Bug #21065: client: UserPerm delete with supp. groups allocated by malloc generates valgrind error
We'll need to convert the UserPerm constructor and such to use malloc/free. ceph_userperm_new can be called from C co... Jeff Layton
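
A toy reproduction of the kind of mismatch Jeff describes, with hypothetical types rather than the actual UserPerm code: an array obtained with malloc() from a C-callable constructor is later released with delete[], which valgrind reports as "Mismatched free() / delete / delete []".

    #include <cstdlib>

    struct Perms {
      int ngids = 0;
      int *gids = nullptr;
      ~Perms() { delete[] gids; }   // destructor assumes the array came from new[]
    };

    int main() {
      Perms p;
      p.ngids = 4;
      // C callers (e.g. through a *_new()-style entry point) naturally use malloc.
      p.gids = static_cast<int *>(malloc(p.ngids * sizeof(int)));
      return 0;  // ~Perms runs delete[] on malloc'd memory -> valgrind complains
    }
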
08:30 PM Bug #21065 (Resolved): client: UserPerm delete with supp. groups allocated by malloc generates va...
... Patrick Donnelly
07:13 PM Bug #21064 (Fix Under Review): FSCommands: missing wait for osdmap writeable + propose
https://github.com/ceph/ceph/pull/17163 Patrick Donnelly
07:09 PM Bug #21064 (Resolved): FSCommands: missing wait for osdmap writeable + propose
... Patrick Donnelly
04:04 PM Backport #21047: luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
Nathan Cutler wrote:
> Patrick, do you mean that the following three PRs should be backported in a single PR targeti...
Patrick Donnelly
07:09 AM Backport #21047: luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
Patrick, do you mean that the following three PRs should be backported in a single PR targeting luminous?
* https:...
Nathan Cutler
12:36 PM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
Note: the failure is transient (occurred in 2 out of 5 runs so far). Nathan Cutler
11:37 AM Bug #16881: RuntimeError: Files in flight high water is unexpectedly low (0 / 6)
Hello CephFS developers, I am reproducing this bug in the latest jewel integration branch. Here are the prime suspect... Nathan Cutler

08/21/2017

09:41 PM Backport #21047: luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
Nathan, the fix for http://tracker.ceph.com/issues/21027 should also make it into Luminous with this backport. I'm go... Patrick Donnelly
04:13 PM Backport #21047 (Resolved): luminous: mds,mgr: add 'is_valid=false' when failed to parse caps
Nathan Cutler
08:53 PM Bug #21058 (New): mds: remove UNIX file permissions binary dependency
The MDS has various file permission/type bits pulled from UNIX headers. These could be different depending on what sy... Patrick Donnelly
07:46 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Moving this to main "Ceph" project as it looks more like a problem in the AdminSocket code. The thing seems to mainly... Jeff Layton
06:13 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Here's a testcase that seems to trigger it fairly reliably. You may have to run it a few times to get it to crash but... Jeff Layton
05:03 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Correct. I'll see if I can roll up a testcase for this when I get a few mins. Jeff Layton
04:50 PM Bug #20988: client: dual client segfault with racing ceph_shutdown
Jeff, just confirming this bug is with two client instances and not one instance with two threads? Patrick Donnelly
02:59 PM Bug #21007 (Resolved): The ceph fs set mds_max command must be updated
Patch merged: https://github.com/ceph/ceph/pull/17044 Bara Ancincova
01:49 PM Feature #18490: client: implement delegation support in userland cephfs
The latest set has timeout support that basically does a client->unmount() on the thing. With the patches for this bu... Jeff Layton
11:01 AM Feature #18490: client: implement delegation support in userland cephfs
For the clean-ish shutdown case, it would be neat to have a common code path with the -EBLACKLISTED handling (see Cli... John Spray
01:43 PM Bug #21025: racy is_mounted() checks in libcephfs
PR is here:
https://github.com/ceph/ceph/pull/17095
Jeff Layton
01:40 PM Bug #21004 (Fix Under Review): fs: client/mds has wrong check to clear S_ISGID on chown
Patrick Donnelly
11:41 AM Bug #20535: mds segmentation fault ceph_lock_state_t::get_overlapping_locks
Reporting in, I've had the first incident after the version upgrade.
My active MDS had committed suicide due to "d...
Webert Lima
09:03 AM Bug #20892: qa: FS_DEGRADED spurious health warnings in some sub-suites
The kcephfs suite has a similar issue:
http://pulpito.ceph.com/teuthology-2017-08-19_05:20:01-kcephfs-luminous-testing-bas...
Zheng Yan

08/17/2017

06:41 PM Feature #19109 (Resolved): Use data pool's 'df' for statfs instead of global stats, if there is o...
Oh, oops. I forgot I merged this into luminous. Thanks Doug. Patrick Donnelly
06:22 PM Feature #19109: Use data pool's 'df' for statfs instead of global stats, if there is only one dat...
There's no need to wait for the kernel client since the message encoding is versioned. This has already been merged i... Douglas Fuller
06:14 PM Feature #19109 (Pending Backport): Use data pool's 'df' for statfs instead of global stats, if th...
Waiting for
https://github.com/ceph/ceph-client/commit/b7f94d6a95dfe2399476de1e0d0a7c15c01611d0
to be merged up...
Patrick Donnelly
03:15 PM Bug #21025 (Resolved): racy is_mounted() checks in libcephfs
libcephfs.cc has a bunch of is_mounted checks like this in it:... Jeff Layton
03:02 PM Feature #18490: client: implement delegation support in userland cephfs
Patrick Donnelly wrote:
> here "client" means Ganesha. What about how does Ganesha handle its client not releasing...
Jeff Layton