Activity

From 09/13/2020 to 10/12/2020

10/12/2020

07:42 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
That patch looks correct. Would you like to post the PR, Dan? Patrick Donnelly
06:36 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
I have the coredump so we can debug further. In the hit_session frame, we see the session clearly:... Dan van der Ster
05:53 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
Patrick Donnelly wrote:
> Dan van der Ster wrote:
> > Indeed, I evicted the weird clients spinning on Stale file ha...
Dan van der Ster
05:24 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
Dan van der Ster wrote:
> Indeed, I evicted the weird clients spinning on Stale file handles, and then the mds stopp...
Patrick Donnelly
02:50 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
Indeed, I evicted the weird clients spinning on Stale file handles, and then the mds stopping procedure finished with... Dan van der Ster
02:35 PM Bug #47833: mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_session(Sessi...
Here is a log with debug_mds=10. ceph-post-file: f4f87969-d492-4e1d-8e8e-5c9e81e45d2f
From what I can gather, (...
Dan van der Ster
02:10 PM Bug #47833 (Resolved): mds FAILED ceph_assert(sessions != 0) in function 'void SessionMap::hit_se...
We are not able to decrease from max_mds=2 to 1 on our cephfs cluster.
As soon as we decrease max_mds, the mds goe...
Dan van der Ster
06:39 PM Backport #47608: octopus: mds: OpenFileTable::prefetch_inodes during rejoin can cause out-of-memory
Zheng Yan wrote:
> https://github.com/ceph/ceph/pull/37383
merged
Yuri Weinstein
06:39 PM Backport #47604: octopus: mds: purge_queue's _calculate_ops is inaccurate
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37372
merged
Yuri Weinstein
06:38 PM Backport #47601: octopus: mgr/nfs: Cluster creation throws 'NoneType' object has no attribute 're...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37371
merged
Yuri Weinstein
06:38 PM Backport #47260: octopus: client: FAILED assert(dir->readdir_cache[dirp->cache_index] == dn)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37370
merged
Yuri Weinstein
06:37 PM Backport #47623: octopus: various quota failures
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37369
merged
Yuri Weinstein
06:37 PM Backport #47255: octopus: client: Client::open() pass wrong cap mask to path_walk
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37369
merged
Yuri Weinstein
06:37 PM Backport #47253: octopus: mds: fix possible crash when the MDS is stopping
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37368
merged
Yuri Weinstein
02:43 PM Cleanup #47160 (Resolved): qa/tasks/cephfs: Break up test_volumes.py
I don't think this will be feasible to backport without significant effort. Ramana, do you think it's worth it? Patrick Donnelly
01:39 PM Bug #47798 (Triaged): pybind/mgr/volumes: TypeError: bad operand type for unary -: 'str' for errn...
Patrick Donnelly
09:02 AM Bug #46883: kclient: ghost kernel mount
Patrick Donnelly wrote:
> So there are two issues here:
[...]
>
> * Use separate auth credentials for each mount...
Xiubo Li
06:53 AM Bug #46883: kclient: ghost kernel mount
Will work on it. Xiubo Li
02:46 AM Bug #47565 (Fix Under Review): qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x20...
Xiubo Li

10/10/2020

07:05 PM Bug #36389: untar encounters unexpected EPERM on kclient/multimds cluster with thrashing
Patrick Donnelly wrote:
> I think this might be a dup of #47723
Yes, it probably is. There is no evidence in /ce...
Ilya Dryomov
09:03 AM Bug #40864 (Resolved): cephfs-shell: rmdir doesn't complain when directory is not empty
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
08:58 AM Backport #47824 (Resolved): octopus: pybind/mgr/volumes: Make number of cloner threads configurable
https://github.com/ceph/ceph/pull/37671 Nathan Cutler
08:58 AM Backport #47823 (Resolved): nautilus: pybind/mgr/volumes: Make number of cloner threads configurable
https://github.com/ceph/ceph/pull/37936 Nathan Cutler
08:43 AM Backport #47259: nautilus: client: FAILED assert(dir->readdir_cache[dirp->cache_index] == dn)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37232
m...
Nathan Cutler
06:22 AM Backport #47259 (Resolved): nautilus: client: FAILED assert(dir->readdir_cache[dirp->cache_index]...
Wei-Chung Cheng
08:43 AM Backport #47252: nautilus: mds: fix possible crash when the MDS is stopping
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37229
m...
Nathan Cutler
06:22 AM Backport #47252 (Resolved): nautilus: mds: fix possible crash when the MDS is stopping
Wei-Chung Cheng
08:43 AM Backport #47246: nautilus: qa: Replacing daemon mds.a as rank 0 with standby daemon mds.b" in clu...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37228
m...
Nathan Cutler
06:21 AM Backport #47246 (Resolved): nautilus: qa: Replacing daemon mds.a as rank 0 with standby daemon md...
Wei-Chung Cheng
08:43 AM Backport #47088 (Resolved): nautilus: mds: recover files after normal session close
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37178
m...
Nathan Cutler
08:42 AM Backport #47605 (Resolved): nautilus: mds: purge_queue's _calculate_ops is inaccurate
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37481
m...
Nathan Cutler

10/09/2020

06:59 PM Bug #47678: mgr: include/interval_set.h: 466: ceph_abort_msg("abort() called")
Putting this in the fs project for now. Patrick Donnelly
06:55 PM Bug #36389: untar encounters unexpected EPERM on kclient/multimds cluster with thrashing
I think this might be a dup of #47723 Patrick Donnelly
06:51 PM Bug #47563 (Fix Under Review): qa: kernel client closes session improperly causing eviction due t...
Patrick Donnelly
06:49 PM Bug #46883 (Triaged): kclient: ghost kernel mount
Patrick Donnelly
06:47 PM Bug #46648 (In Progress): mds: cannot handle hundreds+ of subtrees
Zheng is currently working on this. Patrick Donnelly
06:47 PM Bug #46507 (Triaged): qa: test_data_scan: "show inode" returns ENOENT
Patrick Donnelly
06:46 PM Bug #47787 (Triaged): mgr/nfs: exercise host-level HA of NFS-Ganesha by killing the process
Patrick Donnelly
05:57 PM Bug #47806 (Fix Under Review): mon/MDSMonitor: divide mds identifier and mds real name with dot
Patrick Donnelly
03:14 AM Bug #47806 (Resolved): mon/MDSMonitor: divide mds identifier and mds real name with dot
Current health detail outputs mds slow request as below.... Zhi Zhang
05:36 PM Backport #42157 (Rejected): nautilus: cephfs-shell: rmdir doesn't complain when directory is not ...
Patrick Donnelly
04:02 PM Backport #47259: nautilus: client: FAILED assert(dir->readdir_cache[dirp->cache_index] == dn)
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37232
merged
Yuri Weinstein
04:01 PM Backport #47252: nautilus: mds: fix possible crash when the MDS is stopping
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37229
merged
Yuri Weinstein
04:01 PM Backport #47246: nautilus: qa: Replacing daemon mds.a as rank 0 with standby daemon mds.b" in clu...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37228
merged
Yuri Weinstein
04:00 PM Backport #47088: nautilus: mds: recover files after normal session close
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37178
merged
Yuri Weinstein
03:08 PM Backport #47605: nautilus: mds: purge_queue's _calculate_ops is inaccurate
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37481
merged
Yuri Weinstein
03:03 PM Backport #46960 (Resolved): nautilus: cephfs-journal-tool: incorrect read_offset after finding mi...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37479
m...
Nathan Cutler
03:03 PM Backport #47622: nautilus: various quota failures
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37231
m...
Nathan Cutler
02:37 AM Backport #47622 (Resolved): nautilus: various quota failures
Wei-Chung Cheng
03:03 PM Backport #47254: nautilus: client: Client::open() pass wrong cap mask to path_walk
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37231
m...
Nathan Cutler
02:36 AM Backport #47254 (Resolved): nautilus: client: Client::open() pass wrong cap mask to path_walk
Wei-Chung Cheng
03:02 PM Backport #47090 (Resolved): nautilus: After restarting an mds, its standy-replay mds remained in ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37179
m...
Nathan Cutler
03:02 PM Backport #46784 (Resolved): nautilus: mds/CInode: Optimize only pinned by subtrees check
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36965
m...
Nathan Cutler
11:53 AM Bug #45344 (Fix Under Review): doc: Table Of Contents doesn't work
Zac Dover
09:53 AM Bug #45344: doc: Table Of Contents doesn't work
Zac Dover wrote:
> There's strange behavior here.
>
> The top-level menu items link nowhere, but the second-order...
Jos Collin
09:32 AM Bug #45344: doc: Table Of Contents doesn't work
There's strange behavior here.
The top-level menu items link nowhere, but the second-order menu items link to targ...
Zac Dover

10/08/2020

08:25 PM Feature #46892 (Pending Backport): pybind/mgr/volumes: Make number of cloner threads configurable
Patrick Donnelly
08:23 PM Feature #42451 (Resolved): mds: add root_squash
Patrick Donnelly
08:16 PM Bug #47786: mds: log [ERR] : failed to commit dir 0x100000005f1.1010* object, errno -2
/ceph/teuthology-archive/pdonnell-2020-10-08_01:40:56-multimds-wip-pdonnell-testing-20201007.214100-distro-basic-smit... Patrick Donnelly
03:38 PM Bug #47798 (Duplicate): pybind/mgr/volumes: TypeError: bad operand type for unary -: 'str' for er...
A stack trace when facing ETIMEDOUT errno during subvolume operations is presented below,... Shyamsundar Ranganathan
03:23 PM Backport #46960: nautilus: cephfs-journal-tool: incorrect read_offset after finding missing objects
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37479
merged
Yuri Weinstein
03:23 PM Backport #47622: nautilus: various quota failures
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37231
merged
Yuri Weinstein
03:23 PM Backport #47254: nautilus: client: Client::open() pass wrong cap mask to path_walk
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37231
merged
Yuri Weinstein
03:22 PM Backport #47090: nautilus: After restarting an mds, its standy-replay mds remained in the "resolv...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37179
merged
Yuri Weinstein
03:22 PM Backport #46784: nautilus: mds/CInode: Optimize only pinned by subtrees check
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36965
merged
Yuri Weinstein
01:35 PM Bug #46273 (Resolved): mds: deleting a large number of files in a directory causes the file syste...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:35 PM Bug #46355 (Resolved): client: directory inode can not call release_callback
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:35 PM Bug #46597 (Resolved): qa: Fs cleanup fails with a traceback
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:34 PM Bug #47015 (Resolved): mds: decoding of enum types on big-endian systems broken
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:34 PM Bug #47201 (Resolved): mds: CDir::_omap_commit(int): Assertion `committed_version == 0' failed.
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
10:53 AM Bug #46434: osdc: FAILED ceph_assert(bh->waitfor_read.empty())
Saw this issue again in a recent nautilus test run,
https://pulpito.ceph.com/yuriw-2020-10-05_22:19:52-multimds-wip-...
Ramana Raja
10:26 AM Backport #47316 (Resolved): octopus: mds: CDir::_omap_commit(int): Assertion `committed_version =...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37034
m...
Nathan Cutler
10:26 AM Backport #46520 (Resolved): octopus: mds: deleting a large number of files in a directory causes ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37034
m...
Nathan Cutler
10:26 AM Backport #46522: octopus: mds: fix hang issue when accessing a file under a lost parent directory
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37020
m...
Nathan Cutler
10:25 AM Backport #46516 (Resolved): octopus: client: directory inode can not call release_callback
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37017
m...
Nathan Cutler
10:25 AM Backport #47080 (Resolved): octopus: mds: decoding of enum types on big-endian systems broken
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36813
m...
Nathan Cutler
10:25 AM Backport #46947 (Resolved): octopus: qa: Fs cleanup fails with a traceback
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36713
m...
Nathan Cutler

10/07/2020

08:34 PM Bug #47787 (Triaged): mgr/nfs: exercise host-level HA of NFS-Ganesha by killing the process
In my own testing, the process is not respawned and the NFS client hangs. I suspect there are some changes necessary to... Patrick Donnelly
08:24 PM Bug #47786 (Resolved): mds: log [ERR] : failed to commit dir 0x100000005f1.1010* object, errno -2
... Patrick Donnelly
07:38 PM Bug #47591 (Resolved): TestNFS: test_exports_on_mgr_restart: command failed with status 32: 'sudo...
Patrick Donnelly
05:41 PM Documentation #47784 (In Progress): nfs: Remove doc on creating cephfs exports using rook
Varsha Rao
05:37 PM Documentation #47784 (Resolved): nfs: Remove doc on creating cephfs exports using rook
The doc[1] on creating cephfs exports using dashboard with rook is outdated and using dashboard backend script is bug... Varsha Rao
05:12 PM Bug #47783 (Fix Under Review): mgr/nfs: Pseudo path prints wrong error message
Varsha Rao
05:09 PM Bug #47783 (Resolved): mgr/nfs: Pseudo path prints wrong error message
Pseudo path must be an absolute path. But the error message printed is "It should not be absolute path". Varsha Rao

10/06/2020

05:37 PM Bug #47591 (Fix Under Review): TestNFS: test_exports_on_mgr_restart: command failed with status 3...
I am still not able to reproduce the issue with the latest master branch. Looking at the failure logs, I suspect ganesha daem... Varsha Rao
05:26 PM Bug #46129 (Resolved): mds: fix hang issue when accessing a file under a lost parent directory
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
05:05 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
Thanks. scan_links fixed these dups (and some Bad nlink). For future reference, it took 2 hours to complete on a clus... Dan van der Ster
06:42 AM Backport #45853: octopus: cephfs-journal-tool: NetHandler create_socket couldn't create socket
Zheng Yan, do you intend to work on this one? Shyukri Shyukriev
05:39 AM Bug #47515 (Fix Under Review): pybind/snap_schedule: deactivating a schedule is ineffective
Venky Shankar
01:49 AM Backport #46522 (Resolved): octopus: mds: fix hang issue when accessing a file under a lost paren...
Wei-Chung Cheng

10/05/2020

11:33 PM Backport #47316: octopus: mds: CDir::_omap_commit(int): Assertion `committed_version == 0' failed.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37034
merged
Yuri Weinstein
11:32 PM Backport #46520: octopus: mds: deleting a large number of files in a directory causes the file sy...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37034
merged
Yuri Weinstein
11:32 PM Backport #46522: octopus: mds: fix hang issue when accessing a file under a lost parent directory
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37020
merged
Yuri Weinstein
11:31 PM Backport #46516: octopus: client: directory inode can not call release_callback
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37017
merged
Yuri Weinstein
11:31 PM Backport #47080: octopus: mds: decoding of enum types on big-endian systems broken
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36813
merged
Yuri Weinstein
11:30 PM Backport #46947: octopus: qa: Fs cleanup fails with a traceback
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36713
merged
Yuri Weinstein
08:20 PM Bug #44638 (Resolved): test_scrub_pause_and_resume (tasks.cephfs.test_scrub_checks.TestScrubContr...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
07:53 PM Backport #46524: octopus: non-head batch requests may hold authpins and locks
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37022
m...
Nathan Cutler
02:15 AM Backport #46524 (Resolved): octopus: non-head batch requests may hold authpins and locks
Wei-Chung Cheng
07:52 PM Backport #46473: octopus: mds: make threshold for MDS_TRIM warning configurable
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36970
m...
Nathan Cutler
02:15 AM Backport #46473 (Resolved): octopus: mds: make threshold for MDS_TRIM warning configurable
Wei-Chung Cheng
07:50 PM Backport #47017 (Resolved): nautilus: mds: kcephfs parse dirfrag's ndist is always 0
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37177
m...
Nathan Cutler
03:41 PM Backport #47017: nautilus: mds: kcephfs parse dirfrag's ndist is always 0
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37177
merged
Yuri Weinstein
07:50 PM Backport #47317 (Resolved): nautilus: mds: CDir::_omap_commit(int): Assertion `committed_version ...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/37035
m...
Nathan Cutler
03:41 PM Backport #47317: nautilus: mds: CDir::_omap_commit(int): Assertion `committed_version == 0' failed.
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37035
merged
Yuri Weinstein
07:50 PM Backport #46941: nautilus: mds: memory leak during cache drop
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36967
m...
Nathan Cutler
03:48 PM Backport #46941 (Resolved): nautilus: mds: memory leak during cache drop
Wei-Chung Cheng
03:40 PM Backport #46941: nautilus: mds: memory leak during cache drop
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36967
merged
Yuri Weinstein
07:50 PM Backport #46787: nautilus: client: in _open() the open ref maybe decreased twice, but only increa...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36966
m...
Nathan Cutler
03:48 PM Backport #46787 (Resolved): nautilus: client: in _open() the open ref maybe decreased twice, but ...
Wei-Chung Cheng
03:39 PM Backport #46787: nautilus: client: in _open() the open ref maybe decreased twice, but only increa...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36966
merged
Yuri Weinstein
07:49 PM Backport #47081 (Resolved): nautilus: mds: decoding of enum types on big-endian systems broken
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36814
m...
Nathan Cutler
03:39 PM Backport #47081: nautilus: mds: decoding of enum types on big-endian systems broken
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36814
merged
Yuri Weinstein
07:49 PM Backport #46151 (Resolved): nautilus: test_scrub_pause_and_resume (tasks.cephfs.test_scrub_checks...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36168
m...
Nathan Cutler
03:38 PM Backport #46151: nautilus: test_scrub_pause_and_resume (tasks.cephfs.test_scrub_checks.TestScrubC...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36168
merged
Yuri Weinstein
07:49 PM Backport #46943: nautilus: mds: segv in MDCache::wait_for_uncommitted_fragments
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36968
m...
Nathan Cutler
03:48 PM Backport #46943 (Resolved): nautilus: mds: segv in MDCache::wait_for_uncommitted_fragments
Wei-Chung Cheng
03:36 PM Backport #46943: nautilus: mds: segv in MDCache::wait_for_uncommitted_fragments
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36968
merged
Yuri Weinstein
07:48 PM Backport #46633: nautilus: mds forwarding request 'no_available_op_found'
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36963
m...
Nathan Cutler
03:36 PM Backport #46633 (Resolved): nautilus: mds forwarding request 'no_available_op_found'
Wei-Chung Cheng
03:35 PM Backport #46633: nautilus: mds forwarding request 'no_available_op_found'
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36963
merged
Yuri Weinstein
01:16 PM Bug #44785 (Resolved): non-head batch requests may hold authpins and locks
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
01:15 PM Feature #45906 (Resolved): mds: make threshold for MDS_TRIM warning configurable
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
06:30 AM Bug #47744 (Duplicate): nautilus: CDir::_omap_commit(int): Assertion `committed_version == 0' fai...
Duplicate of https://tracker.ceph.com/issues/47201 Ramana Raja
06:27 AM Bug #47744 (Duplicate): nautilus: CDir::_omap_commit(int): Assertion `committed_version == 0' fai...
... Ramana Raja

10/04/2020

05:32 AM Feature #46059 (Resolved): vstart_runner.py: optionally rotate logs between tests
Kefu Chai

10/03/2020

04:05 PM Backport #46524: octopus: non-head batch requests may hold authpins and locks
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/37022
merged
Yuri Weinstein
03:58 PM Backport #46473: octopus: mds: make threshold for MDS_TRIM warning configurable
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36970
merged
Yuri Weinstein
12:15 AM Bug #47734 (Fix Under Review): client: hang after statfs
Patrick Donnelly
12:00 AM Bug #47734 (Resolved): client: hang after statfs
... Patrick Donnelly

10/02/2020

06:07 PM Bug #47689 (Fix Under Review): rados/upgrade/nautilus-x-singleton fails due to cluster [WRN] evic...
Patrick Donnelly
06:04 PM Bug #47689 (In Progress): rados/upgrade/nautilus-x-singleton fails due to cluster [WRN] evicting ...
This appears to be a fairly old failure. Here are a few instances:
https://pulpito.ceph.com/teuthology-2020-07-22_07...
Patrick Donnelly
04:25 PM Bug #47591 (New): TestNFS: test_exports_on_mgr_restart: command failed with status 32: 'sudo moun...
... Neha Ojha
01:07 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
Dan van der Ster wrote:
> Now I'm trying to repair the metadata on this fs so it is fully consistent.
> When I run 'sc...
Zheng Yan
11:35 AM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
Now I'm trying to repair the metadata on this fs so it is fully consistent.
When I run 'scrub start / force recursive ...
Dan van der Ster

10/01/2020

08:08 PM Bug #47642 (Resolved): nautilus: qa/suites/{kcephfs, multimds}: client kernel "testing" builds fo...
Patrick Donnelly
05:05 PM Bug #47689: rados/upgrade/nautilus-x-singleton fails due to cluster [WRN] evicting unresponsive c...
/a/teuthology-2020-10-01_07:01:02-rados-master-distro-basic-smithi/5485885 Neha Ojha
03:31 PM Bug #43762: pybind/mgr/volumes: create fails with TypeError
Jos Collin wrote:
> Victoria Martinez de la Cruz wrote:
> > Adding more context to this
> >
> > This happened af...
Victoria Martinez de la Cruz
06:36 AM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
Patrick Donnelly wrote:
> Sounds good. Please write up a PR for this, Xiubo.
Sure, will do.
Xiubo Li
02:17 AM Bug #43902: qa: mon_thrash: timeout "ceph quorum_status"
/ceph/teuthology-archive/pdonnell-2020-09-29_05:23:34-fs-wip-pdonnell-testing-20200929.022151-distro-basic-smithi/547... Patrick Donnelly

09/30/2020

09:18 PM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
Sounds good. Please write up a PR for this, Xiubo. Patrick Donnelly
01:42 AM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
Patrick Donnelly wrote:
> Xiubo Li wrote:
> > @Patrick,
> >
> > Maybe the MDS shouldn't report the WRN to monito...
Xiubo Li
09:10 PM Bug #47307: mds: throttle workloads which acquire caps faster than the client can release
Dan van der Ster wrote:
> Are you sure that the defaults for recalling aren't overly conservative?
Yes, the proba...
Patrick Donnelly
05:54 PM Bug #47689: rados/upgrade/nautilus-x-singleton fails due to cluster [WRN] evicting unresponsive c...
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483508/ Neha Ojha
03:41 PM Fix #46645 (Resolved): librados|libcephfs: use latest MonMap when creating from CephContext
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
03:02 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
b1 was no longer there after we followed the recover_dentries procedure, so it is gone.
Dan van der Ster
02:43 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
Try deleting 'd1' using 'rados rmomapkey'. If you have debug_mds=10, it should be easy to get d1's parent dirfrag (co... Zheng Yan
01:21 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
Here is the `b1` dir at the start of this issue:... Dan van der Ster
01:19 PM Bug #47698: mds crashed in try_remove_dentries_for_stray after touching file in strange directory
After finishing the following, the MDS started:... Dan van der Ster
01:03 PM Bug #47698 (New): mds crashed in try_remove_dentries_for_stray after touching file in strange dir...
We had a directory "b1" which appeared empty but could not be rmdir'd.
The directory also had a very large size, als...
Dan van der Ster
07:27 AM Bug #47693 (In Progress): qa: snap replicator tests
Milind Changire
07:24 AM Bug #47693 (Rejected): qa: snap replicator tests
Add tests for the snap replicator component.
Requires PR#36276.
Milind Changire
07:11 AM Backport #46479 (Resolved): octopus: mds: send scrub status to ceph-mgr only when scrub is runnin...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36047
m...
Nathan Cutler

09/29/2020

09:46 PM Backport #46479: octopus: mds: send scrub status to ceph-mgr only when scrub is running (or pause...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36047
merged
Yuri Weinstein
08:08 PM Bug #47689 (Resolved): rados/upgrade/nautilus-x-singleton fails due to cluster [WRN] evicting unr...
... Neha Ojha
07:20 PM Backport #47605 (In Progress): nautilus: mds: purge_queue's _calculate_ops is inaccurate
Nathan Cutler
05:05 PM Backport #47020 (In Progress): nautilus: client: shutdown race fails with status 141
Nathan Cutler
04:56 PM Backport #46960 (In Progress): nautilus: cephfs-journal-tool: incorrect read_offset after finding...
Nathan Cutler
02:17 PM Bug #47307: mds: throttle workloads which acquire caps faster than the client can release
Are you sure that the defaults for recalling aren't overly conservative?
Today debugging a situation with 2 heavy ...
Dan van der Ster
02:08 PM Bug #47307 (In Progress): mds: throttle workloads which acquire caps faster than the client can r...
Patrick Donnelly
02:10 PM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
Xiubo Li wrote:
> @Patrick,
>
> Maybe the MDS shouldn't report the WRN to the monitor when revoking the "Fwbl" caps?...
Patrick Donnelly
08:58 AM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
@Patrick,
Maybe the MDS shouldn't report the WRN to the monitor when revoking the "Fwbl" caps? Since it may need to f...
Xiubo Li
08:21 AM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
While the 0x200000007d5 inode was being flushed, many other inodes were also being flushed on the same osd.6 at the same... Xiubo Li
03:21 AM Bug #47565: qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d5 pending p...
From 5451587/remote/smithi110/log/ceph-client.1.30354.log.gz:
We can see that the client.4606 has received the rev...
Xiubo Li
02:45 AM Bug #47565 (In Progress): qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x2000000...
Xiubo Li
02:07 PM Bug #47682: MDS can't release caps faster than clients taking caps
Dan, see: #47307 Patrick Donnelly
01:51 PM Bug #47682 (Rejected): MDS can't release caps faster than clients taking caps
With more effective tuning I think we can manage. Cancelling this ticket. Dan van der Ster
10:23 AM Bug #47682: MDS can't release caps faster than clients taking caps
Our current config is:
mds_recall_global_max_decay_threshold 200000
mds_recall_max_decay_threshold 100000
mds_re...
Dan van der Ster
10:10 AM Bug #47682: MDS can't release caps faster than clients taking caps
Update:
* the central cache freelist eventually decreases after an hour or so.
* I suppose the bigger issue is tha...
Dan van der Ster
08:06 AM Bug #47682 (Rejected): MDS can't release caps faster than clients taking caps
We have a workload in which a kernel client is stat'ing all files in an FS. This workload triggered a few issues:
...
Dan van der Ster

09/28/2020

07:30 PM Backport #47014 (Resolved): octopus: librados|libcephfs: use latest MonMap when creating from Cep...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36705
m...
Nathan Cutler
07:30 PM Backport #47013 (Resolved): nautilus: librados|libcephfs: use latest MonMap when creating from Ce...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36704
m...
Nathan Cutler
02:51 PM Backport #47013: nautilus: librados|libcephfs: use latest MonMap when creating from CephContext
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36704
merged
Yuri Weinstein
07:29 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
I have a patch I'm testing now that seems to also anecdotally fix some of the umount hangs I've seen lately during xf... Jeff Layton
05:50 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
Patrick Donnelly wrote:
>
> I _think_ the concern is that the client could conceivably dirty the cap the MDS just ...
Jeff Layton
05:39 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
Jeff Layton wrote:
> Doesn't look like libcephfs does anything saner:
>
> [...]
>
> ...and it looks like the t...
Patrick Donnelly
05:07 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
Doesn't look like libcephfs does anything saner:... Jeff Layton
04:51 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
Jeff Layton wrote:
> Hmm, ok. This may be related to another bug I've been chasing where umount hangs waiting for th...
Patrick Donnelly
04:39 PM Bug #47563 (In Progress): qa: kernel client closes session improperly causing eviction due to tim...
Jeff Layton
04:34 PM Bug #47563: qa: kernel client closes session improperly causing eviction due to timeout
Hmm, ok. This may be related to another bug I've been chasing where umount hangs waiting for the session to close. I ... Jeff Layton
06:50 PM Bug #47006 (Resolved): mon: required client features adding/removing
Patrick Donnelly
06:49 PM Feature #47148 (Resolved): mds: get rid of the mds_lock when storing the inode backtrace to meta ...
Patrick Donnelly
06:47 PM Tasks #47047 (Resolved): client: release the client_lock before copying data in all the reads
Patrick Donnelly
06:47 PM Bug #47039 (Resolved): client: mutex lock FAILED ceph_assert(nlock > 0)
Patrick Donnelly
06:42 PM Bug #47679 (New): kceph: kernel does not open session with MDS importing subtree
... Patrick Donnelly
06:24 PM Bug #47678: mgr: include/interval_set.h: 466: ceph_abort_msg("abort() called")
https://pulpito.ceph.com/teuthology-2020-09-21_04:15:02-multimds-master-distro-basic-smithi/5454314/
Seems to be a...
Patrick Donnelly
06:17 PM Bug #47678 (New): mgr: include/interval_set.h: 466: ceph_abort_msg("abort() called")
... Patrick Donnelly
06:11 PM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
Another: /ceph/teuthology-archive/pdonnell-2020-09-26_05:47:56-fs-wip-pdonnell-testing-20200926.000836-distro-basic-s... Patrick Donnelly
04:33 PM Feature #47034: mds: readdir for snapshot diff
Hey Zheng,
CephFS snapshot mirror would make use of rctime approach. That needs PR https://github.com/ceph/ceph/pu...
Venky Shankar
03:03 PM Bug #47642 (Fix Under Review): nautilus: qa/suites/{kcephfs, multimds}: client kernel "testing" b...
Ramana Raja
01:40 PM Bug #47662 (Fix Under Review): mds: try to replicate hot dir to restarted MDS
Patrick Donnelly

09/27/2020

10:59 PM Backport #47014: octopus: librados|libcephfs: use latest MonMap when creating from CephContext
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36705
merged
Yuri Weinstein
10:41 AM Bug #47662 (Resolved): mds: try to replicate hot dir to restarted MDS
A hot dir would be replicated to other active MDSes, but if a replica MDS is restarted, the auth MDS won't replicate this dir ag... Zhi Zhang

09/25/2020

04:40 PM Feature #15070 (Resolved): mon: client: multifs: auth caps on client->mon connections to limit th...
Patrick Donnelly
02:59 PM Bug #47652: teuthology's misc.sudo_write_file is incompatible with vstart_runner
> The compatibility was broken by this teuthology PR, since it makes
"this teuthology PR": https://github.com/cep...
Rishabh Dave
02:58 PM Bug #47652 (Fix Under Review): teuthology's misc.sudo_write_file is incompatible with vstart_runner
Rishabh Dave
02:41 PM Bug #47652 (Resolved): teuthology's misc.sudo_write_file is incompatible with vstart_runner
Here's the traceback -... Rishabh Dave
02:32 PM Feature #46059: vstart_runner.py: optionally rotate logs between tests
Got some time to work on this finally. Fixed the PR after some scrutiny, ceph API tests pass for this PR now. Rishabh Dave
08:57 AM Backport #47622 (In Progress): nautilus: various quota failures
Wei-Chung Cheng
08:43 AM Bug #47643: mds: Segmentation fault in thread 7fcff3078700 thread_name:md_log_replay
Patrick Donnelly wrote:
> > #x 0x5628d800
>
> I'm not sure this double-deref is indicating anything. Are you sure...
Jan Fajerski
12:22 AM Cleanup #47325 (Resolved): client: remove unneccessary client_lock for objector->write()
Patrick Donnelly
12:20 AM Bug #40613 (New): kclient: .handle_message_footer got old message 1 <= 648 0x558ceadeaac0 client_...
This one is back:... Patrick Donnelly

09/24/2020

07:33 PM Bug #46823 (Resolved): nautilus: kceph w/ testing branch: mdsc_handle_session corrupt message mds...
Fixed upstream. Jeff Layton
07:17 PM Backport #47622 (Need More Info): nautilus: various quota failures
Nathan Cutler
07:16 PM Backport #47623 (In Progress): octopus: various quota failures
Nathan Cutler
05:29 PM Bug #47643 (Need More Info): mds: Segmentation fault in thread 7fcff3078700 thread_name:md_log_re...
> #x 0x5628d800
I'm not sure this double-deref is indicating anything. Are you sure that's a pointer? Would you no...
Patrick Donnelly
04:43 PM Bug #47643 (Need More Info): mds: Segmentation fault in thread 7fcff3078700 thread_name:md_log_re...
In ceph-14.2.11.394+g9cbbc473c0 (downstream build but mds sources are the same as v14.2.11) we got a report about the... Jan Fajerski
04:33 PM Bug #47642 (Resolved): nautilus: qa/suites/{kcephfs, multimds}: client kernel "testing" builds fo...
As described in https://tracker.ceph.com/issues/47540, kernel "testing" builds for CentOS 7 are unavailable. This is ... Ramana Raja
11:31 AM Bug #47591 (Can't reproduce): TestNFS: test_exports_on_mgr_restart: command failed with status 32...
The mount command does not fail with latest builds: http://pulpito.front.sepia.ceph.com/varsha-2020-09-24_10:49:55-ra... Varsha Rao
07:29 AM Bug #46769: qa: Refactor cephfs creation/removal code.
Based on comment https://github.com/ceph/ceph/pull/36368#pullrequestreview-458486627, retaining the behavior of clean... Kotresh Hiremath Ravishankar
03:44 AM Backport #47608 (In Progress): octopus: mds: OpenFileTable::prefetch_inodes during rejoin can cau...
https://github.com/ceph/ceph/pull/37383 Zheng Yan
03:43 AM Backport #47609 (In Progress): nautilus: mds: OpenFileTable::prefetch_inodes during rejoin can ca...
https://github.com/ceph/ceph/pull/37382 Zheng Yan

09/23/2020

07:06 PM Bug #45835: mds: OpenFileTable::prefetch_inodes during rejoin can cause out-of-memory
Dan van der Ster wrote:
> The fix was merged. Something needed to start the backports process?
@Dan, the "backpor...
Nathan Cutler
07:05 PM Bug #46583 (Resolved): mds slave request 'no_available_op_found'
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
07:04 PM Backport #47623 (Resolved): octopus: various quota failures
https://github.com/ceph/ceph/pull/37369 Nathan Cutler
07:04 PM Backport #47622 (Resolved): nautilus: various quota failures
https://github.com/ceph/ceph/pull/37231 Nathan Cutler
03:56 PM Backport #46790 (Rejected): nautilus: mds slave request 'no_available_op_found'
This isn't really necessary for backport. Patrick Donnelly
01:31 PM Backport #46790 (Need More Info): nautilus: mds slave request 'no_available_op_found'
non-trivial conflicts
@Patrick, could you help find the right assignee for this?
Nathan Cutler
03:56 PM Backport #46789 (Rejected): octopus: mds slave request 'no_available_op_found'
This isn't really necessary for backport. Patrick Donnelly
01:30 PM Backport #46789 (Need More Info): octopus: mds slave request 'no_available_op_found'
non-trivial conflicts
@Patrick, could you help find the right assignee for this?
Nathan Cutler
03:06 PM Bug #47224 (Pending Backport): various quota failures
Patrick Donnelly
01:22 PM Backport #47608 (Need More Info): octopus: mds: OpenFileTable::prefetch_inodes during rejoin can ...
extensive changeset with non-trivial conflicts Nathan Cutler
11:17 AM Backport #47608 (Resolved): octopus: mds: OpenFileTable::prefetch_inodes during rejoin can cause ...
https://github.com/ceph/ceph/pull/37383 Nathan Cutler
01:19 PM Backport #47604 (In Progress): octopus: mds: purge_queue's _calculate_ops is inaccurate
Nathan Cutler
11:15 AM Backport #47604 (Resolved): octopus: mds: purge_queue's _calculate_ops is inaccurate
https://github.com/ceph/ceph/pull/37372 Nathan Cutler
01:12 PM Backport #47601 (In Progress): octopus: mgr/nfs: Cluster creation throws 'NoneType' object has no...
Nathan Cutler
11:14 AM Backport #47601 (Resolved): octopus: mgr/nfs: Cluster creation throws 'NoneType' object has no at...
https://github.com/ceph/ceph/pull/37371 Nathan Cutler
01:09 PM Backport #47260 (In Progress): octopus: client: FAILED assert(dir->readdir_cache[dirp->cache_inde...
Nathan Cutler
01:08 PM Backport #47255 (In Progress): octopus: client: Client::open() pass wrong cap mask to path_walk
Nathan Cutler
01:02 PM Backport #47253 (In Progress): octopus: mds: fix possible crash when the MDS is stopping
Nathan Cutler
01:02 PM Backport #47247 (In Progress): octopus: qa: Replacing daemon mds.a as rank 0 with standby daemon ...
Nathan Cutler
12:53 PM Backport #47151 (In Progress): octopus: pybind/mgr/volumes: add debugging for global lock
Nathan Cutler
12:52 PM Backport #47147 (In Progress): octopus: pybind/mgr/nfs: Test mounting of exports created with nfs...
Nathan Cutler
12:51 PM Backport #47095 (Need More Info): octopus: mds: provide altrenatives to increase the total cephfs...
non-trivial feature Nathan Cutler
12:50 PM Backport #47089 (In Progress): octopus: After restarting an mds, its standy-replay mds remained i...
Nathan Cutler
12:49 PM Backport #47085 (In Progress): octopus: common: validate type CephBool cause 'invalid command json'
Nathan Cutler
12:30 PM Backport #47083 (In Progress): octopus: mds: 'forward loop' when forward_all_requests_to_auth is set
Nathan Cutler
12:25 PM Feature #47266 (Closed): add a subcommand to change caps in a simpler and clear way
Rishabh Dave
12:13 PM Bug #47006 (Fix Under Review): mon: required client features adding/removing
Jos Collin
12:07 PM Backport #47021 (In Progress): octopus: client: shutdown race fails with status 141
Nathan Cutler
12:06 PM Backport #47018 (In Progress): octopus: mds: kcephfs parse dirfrag's ndist is always 0
Nathan Cutler
12:06 PM Backport #47016 (In Progress): octopus: mds: fix the decode version
Nathan Cutler
12:05 PM Backport #46942 (In Progress): octopus: mds: segv in MDCache::wait_for_uncommitted_fragments
Nathan Cutler
12:05 PM Backport #46940 (In Progress): octopus: mds: memory leak during cache drop
Nathan Cutler
12:02 PM Backport #46859 (In Progress): octopus: mds: do not raise "client failing to respond to cap relea...
Nathan Cutler
12:01 PM Backport #46857 (In Progress): octopus: qa: add debugging for volumes plugin use of libcephfs
Nathan Cutler
12:01 PM Backport #46855 (In Progress): octopus: client: static dirent for readdir is not thread-safe
Nathan Cutler
11:59 AM Backport #46463 (In Progress): octopus: mgr/volumes: fs subvolume clones stuck in progress when l...
Nathan Cutler
11:54 AM Backport #46094 (Need More Info): octopus: cephfs-shell: set proper return value for the tool
non-trivial conflicts Nathan Cutler
11:18 AM Bug #44408 (Resolved): qa: after the cephfs qa test case quit the mountpoints still exist
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:17 AM Backport #47609 (Rejected): nautilus: mds: OpenFileTable::prefetch_inodes during rejoin can cause...
https://github.com/ceph/ceph/pull/37382 Nathan Cutler
11:17 AM Bug #46269 (Resolved): ceph-fuse: ceph-fuse process is terminated by the logratote task and what ...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Nathan Cutler
11:15 AM Backport #47605 (Resolved): nautilus: mds: purge_queue's _calculate_ops is inaccurate
https://github.com/ceph/ceph/pull/37481 Nathan Cutler
11:10 AM Backport #47087 (In Progress): octopus: mds: recover files after normal session close
Nathan Cutler
11:02 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Hi Jeff,
I have finished the code in the MDS, and for now I didn't handle the lookup version case. All the version related ...
Xiubo Li
01:24 AM Feature #47162 (Fix Under Review): mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li
08:21 AM Backport #47178 (Resolved): nautilus: qa: after the cephfs qa test case quit the mountpoints stil...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36863
m...
Nathan Cutler
08:19 AM Backport #47152 (Resolved): nautilus: pybind/mgr/volumes: add debugging for global lock
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36828
m...
Nathan Cutler
08:19 AM Backport #46948 (Resolved): nautilus: qa: Fs cleanup fails with a traceback
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36714
m...
Nathan Cutler
08:19 AM Backport #46592 (Resolved): nautilus: ceph-fuse: ceph-fuse process is terminated by the logratote...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/36181
m...
Nathan Cutler

09/22/2020

10:27 PM Bug #47591 (Resolved): TestNFS: test_exports_on_mgr_restart: command failed with status 32: 'sudo...
a/mgfritch-2020-09-21_20:24:35-rados:cephadm-wip-mgfritch-testing-2020-09-21-1034-distro-basic-smithi/5457554/teuthol... Michael Fritch
08:09 PM Backport #47095: octopus: mds: provide altrenatives to increase the total cephfs subvolume snapsh...
https://tracker.ceph.com/issues/47158 depends on the backport for this issue.
A simple cherry pick is throwing con...
Shyamsundar Ranganathan
08:05 PM Backport #47158: octopus: mgr/volumes: Mark subvolumes with ceph.dir.subvolume vxattr, to improve...
Also depends on the backport of https://tracker.ceph.com/issues/47095 Shyamsundar Ranganathan
07:49 PM Backport #47178: nautilus: qa: after the cephfs qa test case quit the mountpoints still exist
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36863
merged
Yuri Weinstein
06:15 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Thank you for the provided information.
I will test the MDS failover in a day. Quick question regarding "mds_log_m...
Heilig IOS
04:14 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Heilig IOS wrote:
> Still no changes. The "mds_log_max_segments" didn't help. The MDS failover is running for 30 min...
Patrick Donnelly
04:13 PM Bug #47582 (Rejected): MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
(This discussion should move to ceph-users.) Patrick Donnelly
02:40 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Still no changes. The "mds_log_max_segments" didn't help. The MDS failover is running for 30 minutes already. What el... Heilig IOS
02:04 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
I decreased it with these commands:... Heilig IOS
01:52 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Heilig IOS wrote:
> Current value: mds_log_max_segments = 100000
That's the root cause. The value should be small...
Zheng Yan
01:47 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Current value: mds_log_max_segments = 100000 Heilig IOS
01:34 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
What is the value of the "mds log max segments" config? Zheng Yan
01:24 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
I have this issue right now. No, there is no "mds behind on trim" warning. Heilig IOS
01:11 PM Bug #47582: MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
Were there any "mds behind on trim" warnings? Zheng Yan
12:16 PM Bug #47582 (Rejected): MDS failover takes 10-15 hours: Ceph MDS stays in "up:replay" state for hours
We have a 9-node Ceph cluster. Ceph version is 15.2.5. The cluster has 175 OSD (HDD) + 3 NVMe for cache tier for "ceph... Heilig IOS
04:07 PM Backport #47254: nautilus: client: Client::open() pass wrong cap mask to path_walk
regression: https://tracker.ceph.com/issues/47224 Patrick Donnelly
04:07 PM Backport #47255: octopus: client: Client::open() pass wrong cap mask to path_walk
regression: https://tracker.ceph.com/issues/47224 Patrick Donnelly
03:21 PM Feature #47490: Integration of dashboard with volume/nfs module
Volume/nfs module doc: https://docs.ceph.com/docs/master/cephfs/fs-nfs-exports Varsha Rao
03:02 PM Feature #47490 (In Progress): Integration of dashboard with volume/nfs module
Patrick Donnelly
09:35 AM Feature #47490: Integration of dashboard with volume/nfs module
Exports and nfs clusters cannot be managed by dashboard and volumes/nfs interface at the same time. Xattrs can be use... Varsha Rao
02:55 PM Feature #47587 (In Progress): pybind/mgr/nfs: add Rook support
Patrick Donnelly
02:10 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Xiubo Li wrote:
> > Hi Jeff,
> >
> > There is another case for lookup:
> >
> > If the MD...
Xiubo Li
12:02 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
> Hi Jeff,
>
> There is another case for lookup:
>
> If the MDS is old version, such as all th...
Jeff Layton
11:52 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
I think the MDS should treat these names as opaque. The client should never need to look up a dentry by the binary cr... Jeff Layton
02:35 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Hi Jeff,
There is another case for lookup:
If the MDS is old version, such as all the dentries is under `ceph_f...
Xiubo Li
12:34 PM Bug #47224 (Fix Under Review): various quota failures
Zheng Yan

09/21/2020

09:53 PM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
All right, I'm going to shove some more debug information in Objecter and Monitor. Adam Emerson
12:40 AM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
Xiubo Li wrote:
> Patrick Donnelly wrote:
> > Xiubo Li wrote:
> > > Hi Patrick,
> > >
> > > For this let's add ...
Patrick Donnelly
09:04 PM Bug #47526 (Resolved): qa: RuntimeError: FSCID 2 not in map
Patrick Donnelly
09:02 PM Bug #36389: untar encounters unexpected EPERM on kclient/multimds cluster with thrashing
... Patrick Donnelly
08:32 PM Bug #47565 (Resolved): qa: "client.4606 isn't responding to mclientcaps(revoke), ino 0x200000007d...
... Patrick Donnelly
07:47 PM Bug #47563 (Resolved): qa: kernel client closes session improperly causing eviction due to timeout
... Patrick Donnelly
05:08 PM Bug #45835 (Pending Backport): mds: OpenFileTable::prefetch_inodes during rejoin can cause out-of...
Patrick Donnelly
03:47 PM Bug #45835: mds: OpenFileTable::prefetch_inodes during rejoin can cause out-of-memory
The fix was merged. Something needed to start the backports process? Dan van der Ster
03:21 PM Backport #47152: nautilus: pybind/mgr/volumes: add debugging for global lock
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/36828
merged
Yuri Weinstein
03:20 PM Backport #46948: nautilus: qa: Fs cleanup fails with a traceback
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36714
merged
Yuri Weinstein
03:20 PM Backport #46592: nautilus: ceph-fuse: ceph-fuse process is terminated by the logratote task and w...
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/36181
merged
Yuri Weinstein
01:51 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Xiubo Li wrote:
> > Ceph has its own base64 encode/decode logic already in src/common/armor.c,...
Xiubo Li
01:42 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
> Ceph has its own base64 encode/decode logic already in src/common/armor.c, which is the same as ...
Jeff Layton
01:39 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt

I am planning to append a `fscrypt.alternate_name : ${raw_ciphertext}` pair to the xattr map when doing the create d...
Xiubo Li
04:07 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Ceph already has its own base64 encode/decode logic in src/common/armor.c, which is the same as what the kernel does. Xiubo Li

09/20/2020

11:01 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Xiubo Li wrote:
> >
> > Yeah, this looks good.
> >
> > BTW, what the alternat_name will s...
Xiubo Li

09/19/2020

05:56 PM Bug #47389: ceph fs volume create fails to create pool
Hi Joshua,
I tried on the master and it works for me. The HEAD was at 240c46a75a44cb9363cf994cb264e9d7048c98a1 dat...
Kotresh Hiremath Ravishankar
12:29 AM Bug #47512 (Pending Backport): mgr/nfs: Cluster creation throws 'NoneType' object has no attribut...
Patrick Donnelly
12:27 AM Bug #47423 (Resolved): volume rm throws Permissioned denied error
Patrick Donnelly
12:24 AM Bug #47353 (Pending Backport): mds: purge_queue's _calculate_ops is inaccurate
Patrick Donnelly

09/18/2020

11:26 PM Bug #47518 (Resolved): qa: spawn MDS daemons before creating file system
Patrick Donnelly
11:22 PM Backport #47249 (In Progress): octopus: mon: deleting a CephFS and its pools causes MONs to crash
Patrick Donnelly
11:19 PM Backport #47248 (In Progress): nautilus: mon: deleting a CephFS and its pools causes MONs to crash
Patrick Donnelly
04:43 PM Bug #47499: Simultaneous MDS and OSD crashes when answering to client
It just happened again on a different MDS with a different client and I found something in common. In all the crashes... Dan van der Ster
04:12 PM Bug #47526 (Fix Under Review): qa: RuntimeError: FSCID 2 not in map
Patrick Donnelly
02:02 AM Bug #47526 (Resolved): qa: RuntimeError: FSCID 2 not in map
... Patrick Donnelly
03:42 PM Backport #46786 (In Progress): octopus: client: in _open() the open ref maybe decreased twice, bu...
Wei-Chung Cheng
03:41 PM Backport #46783 (In Progress): octopus: mds/CInode: Optimize only pinned by subtrees check
Wei-Chung Cheng
03:29 PM Backport #46637 (In Progress): octopus: mds: optimize ephemeral rand pin
Wei-Chung Cheng
02:56 PM Backport #46636 (In Progress): octopus: mds: null pointer dereference in MDCache::finish_rollback
Wei-Chung Cheng
02:53 PM Backport #46634 (In Progress): octopus: mds forwarding request 'no_available_op_found'
Wei-Chung Cheng
11:55 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
>
> Yeah, this looks good.
>
> BTW, what the alternat_name will store ? The full ciphertext bin...
Jeff Layton
11:04 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Zheng Yan wrote:
> Jeff Layton wrote:
[...]
> > I think that approach will give us the most flexibility going forw...
Xiubo Li
10:39 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Xiubo Li wrote:
> >
> > Yeah, right.
> >
> > If the master key is absent, for the ->looku...
Xiubo Li
11:04 AM Backport #47259 (In Progress): nautilus: client: FAILED assert(dir->readdir_cache[dirp->cache_ind...
Wei-Chung Cheng
10:59 AM Backport #47254 (In Progress): nautilus: client: Client::open() pass wrong cap mask to path_walk
Wei-Chung Cheng
10:52 AM Backport #47252 (In Progress): nautilus: mds: fix possible crash when the MDS is stopping
Wei-Chung Cheng
10:49 AM Backport #47246 (In Progress): nautilus: qa: Replacing daemon mds.a as rank 0 with standby daemon...
Wei-Chung Cheng
01:29 AM Bug #47444 (Resolved): crash in FSMap::parse_role
Patrick Donnelly

09/17/2020

04:27 PM Bug #47518 (Fix Under Review): qa: spawn MDS daemons before creating file system
Patrick Donnelly
04:01 PM Bug #47518 (Resolved): qa: spawn MDS daemons before creating file system
... Patrick Donnelly
03:52 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Probably something like the last one. I think we're best off avoiding any logic that requires t...
Zheng Yan
03:36 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
>
> Yeah, right.
>
> If the master key is absent, for the ->lookup() the client will tell MDS t...
Jeff Layton
05:19 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
> Jeff Layton wrote:
> > Xiubo Li wrote:
[...]
> Yeah, since the long name case is rare, and ...
Xiubo Li
05:09 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Jeff Layton wrote:
> Xiubo Li wrote:
> > Hi Jeff,
> >
> > One question:
> >
> > Currently the ext4 will just ...
Xiubo Li
12:44 PM Bug #47515: pybind/snap_schedule: deactivating a schedule is ineffective
(formatting fix)... Venky Shankar
12:43 PM Bug #47515 (Resolved): pybind/snap_schedule: deactivating a schedule is ineffective
Deactivating a snap schedule does not have any effect on the schedule. Scheduled snapshots still get created by the s... Venky Shankar
10:51 AM Bug #47512 (Fix Under Review): mgr/nfs: Cluster creation throws 'NoneType' object has no attribut...
Varsha Rao
10:45 AM Bug #47512 (Resolved): mgr/nfs: Cluster creation throws 'NoneType' object has no attribute 'repla...
... Varsha Rao

09/16/2020

02:02 PM Bug #47499 (New): Simultaneous MDS and OSD crashes when answering to client
We observed 4 MDSes and 2 OSDs segfaulting simultaneously when answering to one client. All the six tracebacks report... Enrico Bocchi
12:27 PM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:... Jeff Layton
11:25 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
>
> With this approach there seems to be no need to convert the ciphertext to base64-encoded text when s...
Jeff Layton
11:16 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Xiubo Li wrote:
> Hi Jeff,
>
> One question:
>
> Currently the ext4 will just store the ciphertext as the fina...
Jeff Layton
08:10 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
For the 2nd approach, suggested by Zheng, the details I have in mind are:
If we will store both the "based64-encoded-pla...
Xiubo Li
06:13 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
Hi Jeff,
One question:
Currently the ext4 will just store the ciphertext as the final filename to the disk, and...
Xiubo Li
02:22 AM Feature #47162: mds: handle encrypted filenames in the MDS for fscrypt
From the source code, the encoded filename length will increase to roughly 4/3 of the original filename length.... Xiubo Li
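The 4/3 growth mentioned above can be checked with a minimal sketch. It assumes standard padded base64 from the Python standard library, not Ceph's src/common/armor.c or the kernel's fscrypt encoding (which may differ, for example in padding), and the helper name `b64_len` is hypothetical.

```python
import base64


def b64_len(n: int) -> int:
    """Length in bytes of padded base64 output for n input bytes."""
    # Every group of up to 3 input bytes becomes 4 output characters.
    return 4 * ((n + 2) // 3)


# Compare the formula against the real encoder for a few name lengths.
for n in (1, 16, 64, 255):
    encoded = base64.b64encode(bytes(n))
    assert len(encoded) == b64_len(n)
    print(f"{n} bytes -> {len(encoded)} base64 characters")
```

Under this assumption a 255-byte name encodes to 340 characters, which is why long encrypted names need special handling when the encoded form must fit a filename length limit.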
12:05 PM Bug #47423 (Fix Under Review): volume rm throws Permissioned denied error
Rishabh Dave
07:05 AM Feature #47490 (Pending Backport): Integration of dashboard with volume/nfs module
Currently, there are two ways to create exports: the mgr/volume/nfs module and
the dashboard. Both use the same code[1]...
Varsha Rao
03:33 AM Backport #47090 (In Progress): nautilus: After restarting an mds, its standy-replay mds remained ...
Patrick Donnelly
03:30 AM Backport #47088 (In Progress): nautilus: mds: recover files after normal session close
Patrick Donnelly
03:27 AM Backport #47084 (Need More Info): nautilus: mds: 'forward loop' when forward_all_requests_to_auth...
Zheng, the backport for this is non-trivial. Can you take a look? Patrick Donnelly
03:25 AM Backport #47017 (In Progress): nautilus: mds: kcephfs parse dirfrag's ndist is always 0
Patrick Donnelly
02:46 AM Bug #47488: Apparent deadlock in tasks.mgr.dashboard.test_cephfs.CephfsTest.test_snapshots
To progress this further we really need more/better logs. Created https://github.com/ceph/ceph/pull/37176 to assist i... Brad Hubbard
02:40 AM Bug #47488 (New): Apparent deadlock in tasks.mgr.dashboard.test_cephfs.CephfsTest.test_snapshots
/a/yuriw-2020-09-02_17:33:04-rados-wip-yuri-master-baseline-9.2.2020-distro-basic-smithi/5400010... Brad Hubbard
01:49 AM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
Patrick Donnelly wrote:
> Xiubo Li wrote:
> > Hi Patrick,
> >
> > For this let's add more debug logs to check wh...
Xiubo Li

09/15/2020

07:32 PM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
Xiubo Li wrote:
> Hi Patrick,
>
> For this, let's add more debug logs to check where it is stuck.
>
> I w...
Patrick Donnelly
01:09 PM Bug #47294: client: thread hang in Client::_setxattr_maybe_wait_for_osdmap
Hi Patrick,
For this, let's add more debug logs to check where it is stuck.
I went through the client_loc...
Xiubo Li
12:54 PM Bug #47423 (In Progress): volume rm throws Permissioned denied error
Rishabh Dave
11:22 AM Feature #47277: implement new mount "device" syntax for kcephfs
Patrick Donnelly wrote:
> Venky Shankar wrote:
> > Patrick Donnelly wrote:
> > > Venky Shankar wrote:
> > > > Pat...
Venky Shankar

09/14/2020

10:00 PM Feature #47277: implement new mount "device" syntax for kcephfs
There are other alternates too, fwiw (e.g.):
name@fs#/path
...or maybe just omit the ':' or anything to rep...
Jeff Layton
08:52 PM Feature #47277: implement new mount "device" syntax for kcephfs
Venky Shankar wrote:
> Patrick Donnelly wrote:
> > Venky Shankar wrote:
> > > Patrick Donnelly wrote:
> > > > Jef...
Patrick Donnelly
12:33 PM Feature #47277: implement new mount "device" syntax for kcephfs
Patrick Donnelly wrote:
> Venky Shankar wrote:
> > Patrick Donnelly wrote:
> > > Jeff Layton wrote:
> > > > Propo...
Venky Shankar
09:00 PM Bug #47423: volume rm throws Permissioned denied error
Rishabh Dave wrote:
> Unlike @volume rm@, @fs fail@ does not fail -
>
> [...]
>
> @volume rm@ too runs @fs fai...
Patrick Donnelly
03:12 PM Bug #47423: volume rm throws Permissioned denied error
The issue with the ticket assignee occurred because my page wasn't refreshed before I hit the submit button. Rishabh Dave
03:11 PM Bug #47423: volume rm throws Permissioned denied error
Unlike @volume rm@, @fs fail@ does not fail -... Rishabh Dave
02:39 PM Bug #47423: volume rm throws Permissioned denied error
... Sebastian Wagner
12:41 PM Bug #47423: volume rm throws Permissioned denied error
From what I see on master in my local repo, this issue (getting @Permissioned denied@ on @volume rm@) is not just lim... Rishabh Dave
08:51 AM Bug #47423: volume rm throws Permissioned denied error
Kefu Chai wrote:
> I suspect that it is https://github.com/ceph/ceph/pull/32581 which broke `test_cluster_set_reset_...
Varsha Rao
12:42 AM Bug #47423: volume rm throws Permissioned denied error
I suspect that it is https://github.com/ceph/ceph/pull/32581 which broke `test_cluster_set_reset_user_config` in `tas... Kefu Chai
08:30 PM Documentation #47449 (New): doc: complete ec pool configuration section with an example
https://docs.ceph.com/docs/master/cephfs/createfs/#using-erasure-coded-pools-with-cephfs
The section should provid...
Patrick Donnelly
07:41 PM Bug #47444 (Fix Under Review): crash in FSMap::parse_role
Neha Ojha
07:19 PM Bug #47444 (In Progress): crash in FSMap::parse_role
Patrick Donnelly
05:19 PM Bug #47444 (Resolved): crash in FSMap::parse_role
... Neha Ojha
01:34 PM Backport #47200 (In Progress): octopus: scheduled cephfs snapshots (via ceph manager)
Jan Fajerski

09/13/2020

06:02 PM Bug #47423 (Triaged): volume rm throws Permissioned denied error
Patrick Donnelly
05:59 PM Bug #47389 (Triaged): ceph fs volume create fails to create pool
Patrick Donnelly
05:55 PM Feature #47277: implement new mount "device" syntax for kcephfs
Venky Shankar wrote:
> Patrick Donnelly wrote:
> > Jeff Layton wrote:
> > > Proposed syntax looks wrong in the des...
Patrick Donnelly
05:48 PM Bug #47379 (Rejected): mds: mark no warn on killed request
PR was rejected Patrick Donnelly
05:47 PM Bug #47353 (Fix Under Review): mds: purge_queue's _calculate_ops is inaccurate
Patrick Donnelly
 
