Activity

From 12/06/2021 to 01/04/2022

01/04/2022

04:41 PM Bug #53753 (Duplicate): mds: crash (assert hit) when merging dirfrags
Venky Shankar
04:39 PM Bug #53753: mds: crash (assert hit) when merging dirfrags
> mds_oft_prefetch_dirfrags is by default true which means we were already testing it in the past
Thanks. This is ...
Patrick Donnelly
04:28 PM Bug #53765 (Fix Under Review): mount helper mangles the new syntax device string by qualifying th...
Jeff Layton
04:20 PM Bug #53765 (Resolved): mount helper mangles the new syntax device string by qualifying the name
The new mount syntax allows you to add mounts with a syntax like this:
admin@.test=/
...but then reports th...
Jeff Layton
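For illustration only, a mount command using the new-style device string might look like the following; the fsid is left empty (as in the report), while the monitor address and secret file are placeholders, not taken from this tracker entry:

    # new-style device string: <client-name>@<fsid>.<fs-name>=<path>
    mount -t ceph admin@.test=/ /mnt/test -o mon_addr=192.168.1.10:6789,secretfile=/etc/ceph/admin.secret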
09:50 AM Backport #53761 (Resolved): pacific: mds: mds_oft_prefetch_dirfrags = false is not qa tested
https://github.com/ceph/ceph/pull/44504 Backport Bot
09:45 AM Fix #52591 (Pending Backport): mds: mds_oft_prefetch_dirfrags = false is not qa tested
Venky Shankar
09:45 AM Bug #44916 (Resolved): client: syncfs flush is only fast with a single MDS
Venky Shankar
09:15 AM Backport #53760 (Resolved): pacific: snap scheduler: cephfs snapshot schedule status doesn't list...
https://github.com/ceph/ceph/pull/45906 Backport Bot
09:15 AM Backport #53759 (Resolved): pacific: mds: heartbeat timeout by _prefetch_dirfrags during up:rejoin
https://github.com/ceph/ceph/pull/44551 Backport Bot
09:10 AM Bug #52642 (Pending Backport): snap scheduler: cephfs snapshot schedule status doesn't list the s...
Venky Shankar
09:10 AM Bug #53521 (Pending Backport): mds: heartbeat timeout by _prefetch_dirfrags during up:rejoin
Venky Shankar
09:09 AM Feature #46166 (Resolved): mds: store symlink target as xattr in data pool inode for disaster rec...
Venky Shankar
03:24 AM Bug #53750 (Fix Under Review): mds: FAILED ceph_assert(mut->is_wrlocked(&pin->filelock))
Xiubo Li

01/03/2022

05:15 PM Backport #53444: octopus: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-...
I don't think this is applicable to Octopus (unless the relevant tests also get backported). Jeff Layton
05:14 PM Backport #53443: pacific: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-...
I don't think this is applicable to Pacific (unless the relevant tests also get backported). Jeff Layton
05:14 PM Bug #53214: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-f42229a869be.c...
Yes. Please do close them out. No need to backport this to those. Jeff Layton
09:12 AM Bug #53753 (Duplicate): mds: crash (assert hit) when merging dirfrags
This run: https://pulpito.ceph.com/vshankar-2021-12-22_07:37:44-fs-wip-vshankar-testing-20211216-114012-testing-defau... Venky Shankar

12/31/2021

01:21 PM Bug #53750: mds: FAILED ceph_assert(mut->is_wrlocked(&pin->filelock))
This is reproducible very easily by using the following scripts in two different terminals:... Xiubo Li
01:14 PM Bug #53750 (In Progress): mds: FAILED ceph_assert(mut->is_wrlocked(&pin->filelock))
Xiubo Li
01:12 PM Bug #53750 (Resolved): mds: FAILED ceph_assert(mut->is_wrlocked(&pin->filelock))
... Xiubo Li
09:51 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Sheng Xie wrote:
> I will modify it gracefully as soon as possible after fully understanding this logic but it may t...
Xiubo Li
09:11 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
I will modify it gracefully as soon as possible after fully understanding this logic but it may take some time. Sheng Xie

12/30/2021

05:32 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Sheng Xie wrote:
> Xiubo Li wrote:
> [...]
> Yes, I know this.
>
> This is the issue_Caps() code snippet before...
Xiubo Li
03:45 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Xiubo Li wrote:... Sheng Xie
02:40 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Sheng Xie wrote:
> Thank you for your prompt.
> I spent some time learning the logic of MDS caps, although these co...
Xiubo Li
02:25 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Thank you for your prompt.
I spent some time learning the logic of MDS caps, although these contents are complex and...
Sheng Xie
03:48 AM Bug #51705: qa: tasks.cephfs.fuse_mount:mount command failed
Not very sure whether the above bug could lead to the auth failure issue:... Xiubo Li

12/29/2021

08:32 AM Bug #53741 (Resolved): crash just after MDS become active
FAILED ceph_assert(lock->get_state() == LOCK_PRE_SCAN) at mds/Locker.cc:5682... 玮文 胡
04:38 AM Bug #51705 (Fix Under Review): qa: tasks.cephfs.fuse_mount:mount command failed
Xiubo Li
03:10 AM Bug #51705 (In Progress): qa: tasks.cephfs.fuse_mount:mount command failed
... Xiubo Li
01:05 AM Bug #53724: mds: stray directories are not purged when all past parents are clear
Sorry, didn't carefully read the description, already renewed it :-) Xiubo Li
01:03 AM Bug #53724 (New): mds: stray directories are not purged when all past parents are clear
Xiubo Li
01:02 AM Bug #53724 (In Progress): mds: stray directories are not purged when all past parents are clear
Xiubo Li

12/28/2021

04:03 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Sheng Xie wrote:
> I tested case C. as you guessed, A-node can still see file0 after B-node deletes file0. but when ...
Xiubo Li
02:51 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
I tested case C. As you guessed, A-node can still see file0 after B-node deletes file0, but when 'll /cephfs' is exec... Sheng Xie

12/27/2021

06:00 PM Backport #53736 (Resolved): pacific: mds: recursive scrub does not trigger stray reintegration
https://github.com/ceph/ceph/pull/44514 Backport Bot
06:00 PM Backport #53735 (Rejected): octopus: mds: recursive scrub does not trigger stray reintegration
https://github.com/ceph/ceph/pull/44657 Backport Bot
05:57 PM Bug #53641 (Pending Backport): mds: recursive scrub does not trigger stray reintegration
Patrick Donnelly
12:33 PM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Could you try:
case C:
A-node: 'touch /cephfs/file0; ll /cephfs'
B-node: 'ln /cephfs/file0 /cephfs/file1; rm /ceph...
Xiubo Li
07:54 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Xiubo Li wrote:
> Why do you think the default value is 1s ? Should it be 0 for ceph-fuse. More detail please see ...
Sheng Xie
04:13 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
xie sheng wrote:
> Sorry, I didn't test kclient I just compared the performance of ceph-fuse mount point before and ...
Xiubo Li
03:42 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
Sorry, I didn't test kclient; I just compared the performance of the ceph-fuse mount point before and after setting 'timeo... Sheng Xie
03:07 AM Feature #53730: ceph-fuse: support "entry_timeout" and "attr_timeout" options to improve performance
BTW, have you tested fuse vs kclient at the same time by setting the `entry_timeout` and `attr_timeout` ?... Xiubo Li
02:12 AM Feature #53730 (Fix Under Review): ceph-fuse: support "entry_timeout" and "attr_timeout" options f...
I noticed that using ceph-fuse to mount a directory has lower performance than using the kernel mode to mount a dir... Sheng Xie
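A rough sketch of the comparison described in this entry, assuming a filesystem named "cephfs" and default client configuration under /etc/ceph; the mount points and test directory are illustrative, not taken from the report:

    # mount the same filesystem once with the FUSE client and once with the kernel client
    ceph-fuse /mnt/fuse
    mount -t ceph admin@.cephfs=/ /mnt/kernel -o secretfile=/etc/ceph/admin.secret
    # compare a metadata-heavy operation; the kernel client benefits from dentry/attribute caching
    time ls -lR /mnt/fuse/testdir > /dev/null
    time ls -lR /mnt/kernel/testdir > /dev/null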

12/24/2021

06:52 AM Bug #53726 (Fix Under Review): mds: crash when `ceph tell mds.0 dump tree ''`
Xiubo Li
04:40 AM Bug #53726 (Resolved): mds: crash when `ceph tell mds.0 dump tree ''`
... Xiubo Li
01:58 AM Bug #53724 (Pending Backport): mds: stray directories are not purged when all past parents are clear
(Note to experienced CephFS devs: let's save this for a newer dev!)
If a directory is not purged because it has pa...
Patrick Donnelly

12/23/2021

01:55 PM Backport #53715 (Resolved): octopus: mds: fails to reintegrate strays if destdn's directory is fu...
https://github.com/ceph/ceph/pull/44668 Backport Bot
01:55 PM Backport #53714 (Resolved): pacific: mds: fails to reintegrate strays if destdn's directory is fu...
https://github.com/ceph/ceph/pull/44513 Backport Bot
01:52 PM Bug #53619 (Pending Backport): mds: fails to reintegrate strays if destdn's directory is full (EN...
Patrick Donnelly

12/22/2021

07:08 PM Bug #53615 (Resolved): qa: upgrade test fails with "timeout expired in wait_until_healthy"
Patrick Donnelly

12/21/2021

10:48 AM Bug #53509: quota support for subvolumegroup
Introducing subvolumegroup quotas hits this known issue [1]. The issue is hit while removing subvolume
under the sub...
Kotresh Hiremath Ravishankar
10:46 AM Bug #53509 (In Progress): quota support for subvolumegroup
Kotresh Hiremath Ravishankar
06:49 AM Bug #53597 (Fix Under Review): mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_v...
Xiubo Li

12/16/2021

07:32 PM Bug #53641 (Fix Under Review): mds: recursive scrub does not trigger stray reintegration
Patrick Donnelly
03:44 PM Bug #53641 (Resolved): mds: recursive scrub does not trigger stray reintegration
One might think using recursive scrub would load a dentry and trigger reintegration, but it does not. Presently the c... Patrick Donnelly
06:34 PM Bug #53649 (New): allow teuthology to create more than one named filesystem
Varsha had asked that I create a test for this PR:
https://github.com/ceph/ceph/pull/44279
...but it wasn't...
Jeff Layton
04:48 PM Bug #53645 (New): MDCache::shutdown_pass: ceph_assert(!migrator->is_importing())
I'm running a pinning/multimds thrash test (see stressfs.sh attached) on a 3 node test cluster and occasionally seein... Dan van der Ster
01:42 PM Bug #53623 (Fix Under Review): mds: LogSegment will only save one ESubtreeMap event if the ESubtr...
Xiubo Li
02:24 AM Bug #53623 (Fix Under Review): mds: LogSegment will only save one ESubtreeMap event if the ESubtr...
... Xiubo Li
05:35 AM Bug #53459 (Won't Fix): mds: start a new MDLog segment if new coming event possibly exceeds the e...
Will fix it in another tracker https://tracker.ceph.com/issues/53623. Closing this one. Xiubo Li
02:44 AM Bug #53542: Ceph Metadata Pool disk throughput usage increasing
I have figured out one case that could cause this; please see the tracker https://tracker.ceph.com/issues/53623.
Ju...
Xiubo Li
02:20 AM Bug #40002: mds: not trim log under heavy load

More logs:...
Xiubo Li

12/15/2021

05:58 PM Bug #53615 (Fix Under Review): qa: upgrade test fails with "timeout expired in wait_until_healthy"
Patrick Donnelly
05:51 PM Bug #53615: qa: upgrade test fails with "timeout expired in wait_until_healthy"
Regression caused by the fix for #51984. Patrick Donnelly
10:11 AM Bug #53615 (Resolved): qa: upgrade test fails with "timeout expired in wait_until_healthy"
https://pulpito.ceph.com/vshankar-2021-12-15_07:13:38-fs-master-testing-default-smithi/6563822/
The test reached a...
Venky Shankar
04:05 PM Bug #53619 (Fix Under Review): mds: fails to reintegrate strays if destdn's directory is full (EN...
Patrick Donnelly
03:00 PM Bug #53619 (Resolved): mds: fails to reintegrate strays if destdn's directory is full (ENOSPC)
This should work because no stray needs to be created and the directory's size will not increase. Patrick Donnelly
03:00 PM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
> Do you have the logs when the last inode disappeared ?
I got the log for inode 0x20006fdf4cf being purged. log l...
玮文 胡
05:02 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
玮文 胡 wrote:
> Xiubo Li wrote:
> > Do you have the logs when the last inode disappeared ?
>
> No, I only have deb...
Xiubo Li
04:56 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Xiubo Li wrote:
> Do you have the logs when the last inode disappeared ?
No, I only have debug_mds set to 1/5. I ...
玮文 胡
04:48 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
玮文 胡 wrote:
> Xiubo Li wrote:
> > BTW, could your lasted of `ceph fs status` ?
>
> I don't quite understand thi...
Xiubo Li
04:37 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Xiubo Li wrote:
> BTW, could your lasted of `ceph fs status` ?
I don't quite understand this. The output doesn't c...
玮文 胡
02:34 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
玮文 胡 wrote:
> BTW, is there any way to traverse all the inodes in the stray dir, so that I can find out all such sta...
Xiubo Li
01:00 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
玮文 胡 wrote:
> The inode 0x200065b309d has gone, I don't know how. But I got another inode that crashes the rank 1. I...
Xiubo Li
10:39 AM Bug #40002: mds: not trim log under heavy load
Xiubo Li wrote:
> There has one case that could lead the journal logs to fill the metadata pool full, such as in cas...
Xiubo Li
03:39 AM Bug #53611 (Triaged): mds,client: can not identify pool id if pool name is positive integer when ...
... xinyu wang

12/14/2021

05:29 PM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Though almost identical, here are the logs before the crash in the new case.... 玮文 胡
05:25 PM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
BTW, is there any way to traverse all the inodes in the stray dir, so that I can find out all such stale caps in one ... 玮文 胡
05:18 PM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
The inode 0x200065b309d has gone, I don't know how. But I got another inode that crashes the rank 1. It is very simil... 玮文 胡
07:39 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
There is one thing that looks strange to me. It is rank 1 that wants to export the inode to rank 0. But when I issue ... 玮文 胡
07:19 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Xiubo Li wrote:
> 玮文 胡 wrote:
> > This dir should have been deleted about one month ago. Just found that one client...
玮文 胡
06:51 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
玮文 胡 wrote:
> This dir should have been deleted about one month ago. Just found that one client is still holding a c...
Xiubo Li
05:49 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Xiubo Li wrote:
> Are you using the fuse client or kclient ?
Both. But I believe only kclient have ever accessed ...
玮文 胡
05:37 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Are you using the fuse client or kclient ? Xiubo Li
03:42 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
This dir should have been deleted about one month ago. Just found that one client is still holding a cap on it. And I... 玮文 胡
02:57 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
The dir "200065b309d/" is already located in the stray/, I think it's queued for being purging, will it be possible t... Xiubo Li
02:29 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
If I didn't miss something important the system dirs shouldn't be migrated in theory. Xiubo Li
01:58 AM Bug #53597: mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
Attached the logs after setting debug_mds to 1/20. This may be the most interesting part:... 玮文 胡
10:47 AM Bug #53601 (Fix Under Review): vstart_runner: Running test_data_scan test locally fails with trac...
Kotresh Hiremath Ravishankar
10:35 AM Bug #53601 (Resolved): vstart_runner: Running test_data_scan test locally fails with tracebacks
The following tracebacks are seen:
1....
Kotresh Hiremath Ravishankar
10:08 AM Bug #40002: mds: not trim log under heavy load
There is one case that could lead the journal logs to fill up the metadata pool, such as in the case of tracker #53597... Xiubo Li
01:06 AM Bug #44988 (Duplicate): client: track dirty inodes in a per-session list for effective cap flushing
Xiubo Li

12/13/2021

06:55 PM Bug #44100: cephfs rsync kworker high load.
We've recently started using cephfs snapshots and are running into a similar issue with the kernel client. It seems ... Andras Pataki
05:25 PM Bug #53597 (Resolved): mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version())
... 玮文 胡
03:37 PM Backport #53445 (In Progress): pacific: mds: opening connection to up:replay/up:creating daemon c...
Patrick Donnelly
01:38 PM Bug #53521 (Fix Under Review): mds: heartbeat timeout by _prefetch_dirfrags during up:rejoin
Venky Shankar
01:38 PM Bug #53542 (Triaged): Ceph Metadata Pool disk throughput usage increasing
Venky Shankar
01:15 PM Fix #52824 (Closed): qa: skip internal metadata directory when scanning ceph debugfs directory
Not relevant anymore. Venky Shankar
01:12 PM Feature #49942 (Resolved): cephfs-mirror: enable running in HA
Venky Shankar
01:12 PM Feature #50372 (Resolved): test: Implement cephfs-mirror thrasher test for HA active/active
Venky Shankar
08:30 AM Bug #53509: quota support for subvolumegroup
Elaborating a bit more on the restriction mentioned by Ramana in the first comment.
If the quota is set on the sub...
Kotresh Hiremath Ravishankar

12/10/2021

02:49 PM Feature #50235 (Fix Under Review): allow cephfs-shell to mount named filesystems
Venky Shankar
11:13 AM Bug #53542: Ceph Metadata Pool disk throughput usage increasing
We are considering increasing the "activity"-based thresholds to see if we get less metadata IO.
We were actually...
Andras Sali
10:06 AM Bug #53542: Ceph Metadata Pool disk throughput usage increasing
Thanks for the reply, we tried decreasing the mds_log_max_segments option, but didn't really notice a difference.
...
Andras Sali
09:18 AM Bug #53542: Ceph Metadata Pool disk throughput usage increasing
In the heavy load case, the MDLog could accumulate many journal log events, which could be submitted in batches to the metadata pool ... Xiubo Li
08:55 AM Bug #53542: Ceph Metadata Pool disk throughput usage increasing
We also did a dump of the objecter_requests and it seems there are some large objects written by the mds-es?
ceph ...
Andras Sali
10:08 AM Backport #53332 (In Progress): pacific: ceph-fuse seems to need root permissions to mount (ceph-f...
Nikhilkumar Shelke
10:01 AM Backport #53331 (In Progress): octopus: ceph-fuse seems to need root permissions to mount (ceph-f...
Nikhilkumar Shelke
09:12 AM Backport #53444 (In Progress): octopus: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6...
Venky Shankar
08:58 AM Backport #52951 (Rejected): octopus: qa: skip internal metadata directory when scanning ceph debu...
Not required since the relevant files are under sysfs. Venky Shankar
08:58 AM Backport #52950 (Rejected): pacific: qa: skip internal metadata directory when scanning ceph debu...
Not required since the relevant files are under sysfs. Venky Shankar
04:46 AM Backport #52952 (Resolved): pacific: mds: crash when journaling during replay
Venky Shankar

12/09/2021

09:03 PM Bug #53574 (New): qa: downgrade testing of MDS/mons in minor releases
Verify that an older mon/MDS can be brought up and can decode on-disk structures normally. Patrick Donnelly
08:59 PM Bug #53573 (Resolved): qa: test new clients against older Ceph clusters
Confirm that e.g. a Quincy client can still mount/use a Pacific CephFS cluster. Patrick Donnelly
04:34 PM Bug #53487 (Resolved): qa: mount error 22 = Invalid argument
Patrick Donnelly
09:43 AM Documentation #53558 (New): Document cephfs recursive accounting
I cannot find any user documentation for cephfs recursive accounting.
We should add something similar to https://b...
Dan van der Ster
04:44 AM Bug #44916 (Fix Under Review): client: syncfs flush is only fast with a single MDS
Xiubo Li
01:52 AM Bug #51956 (Resolved): mds: switch to use ceph_assert() instead of assert()
Xiubo Li
01:46 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
Dan van der Ster wrote:
> Xiubo Li wrote:
> > Dan van der Ster wrote:
> > > more of the client log is attached. (f...
Xiubo Li

12/08/2021

05:05 PM Bug #53542 (Fix Under Review): Ceph Metadata Pool disk throughput usage increasing
Hi All,
We have been observing that if we let our MDS run for some time, the bandwidth usage of the disks in the m...
Andras Sali
02:22 PM Bug #53509: quota support for subvolumegroup
Thanks, Ramana for the follow-up and validation of the use case. Sébastien Han
08:10 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
Xiubo Li wrote:
> Dan van der Ster wrote:
> > more of the client log is attached. (from yesterday)
> >
> > do yo...
Dan van der Ster
08:04 AM Bug #53521 (Resolved): mds: heartbeat timeout by _prefetch_dirfrags during up:rejoin
This timeout issue happens with v14.2.19. It may also be reproduced in the latest version.
2021-12-05 20:42:13.472...
Yongseok Oh
06:33 AM Bug #53520 (New): mds: put both fair mutex MDLog::submit_mutex and mds_lock to test under heavy load
The related trackers:
MDLog::submit_mutex: https://tracker.ceph.com/issues/40002
mds_lock: https://tracker.ceph.c...
Xiubo Li
01:58 AM Bug #53459: mds: start a new MDLog segment if new coming event possibly exceeds the expected segm...
Yeah, by creating a number of directories and setting the distributed pin on each of them, the ESubtreeMap event can reac... Xiubo Li

12/07/2021

10:02 PM Bug #53509: quota support for subvolumegroup
As per Venky, we need to keep in mind the following limitation with CephFS quotas:
"Quotas must be configured carefu...
Ramana Raja
01:31 PM Bug #53509 (Resolved): quota support for subvolumegroup
Today, we can apply a quota to an individual subvolume. However, when working in a multi-tenant environment, the storage ad... Sébastien Han
01:25 PM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
Dan van der Ster wrote:
> more of the client log is attached. (from yesterday)
>
> do you still need mds? which d...
Xiubo Li
08:16 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
more of the client log is attached. (from yesterday)
do you still need mds? which debug level?
Dan van der Ster
06:26 AM Bug #53504 (Fix Under Review): client: infinite loop "got ESTALE" after mds recovery
Xiubo Li
02:28 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
Dan van der Ster wrote:
> Cluster had max_mds 3 at the time of those logs. It's running 14.2.22 -- we didn't upgrade...
Xiubo Li
02:20 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
Cluster had max_mds 3 at the time of those logs. It's running 14.2.22 -- we didn't upgrade; the latest recovery was f... Dan van der Ster
12:37 AM Bug #53504: client: infinite loop "got ESTALE" after mds recovery
BTW, what's the `max_mds` in your setups ? And how many up MDSes after you upgraded ? It seems there is only one. Xiubo Li

12/06/2021

02:19 PM Bug #53504 (Resolved): client: infinite loop "got ESTALE" after mds recovery
After an MDS recovery we inevitably see a few clients hammering the MDSs in a loop, doing getattr on a stale fh.
On ...
Dan van der Ster
12:18 PM Bug #53487 (Fix Under Review): qa: mount error 22 = Invalid argument
Venky Shankar
11:13 AM Bug #53487: qa: mount error 22 = Invalid argument
http://pulpito.front.sepia.ceph.com/yuriw-2021-12-03_15:27:18-rados-wip-yuri11-testing-2021-12-02-1451-distro-default... Sridhar Seshasayee
09:23 AM Bug #52406 (Need More Info): cephfs_metadata pool got full after upgrade from Nautilus to Pacific...
Did you see any suspicious logs in the mds logs ? Such as no mdlog->trim() getting called, etc.
There are two similar t...
Xiubo Li
07:59 AM Bug #52280: Mds crash and fails with assert on prepare_new_inode
Hi Xiubo Li,
Thanks for the information.
The problem happened to us on nautilus 14.2.7.

Should the fixes be included...
Yael Azulay
01:18 AM Bug #52280: Mds crash and fails with assert on prepare_new_inode
... Xiubo Li
03:18 AM Bug #40002 (Fix Under Review): mds: not trim log under heavy load
Xiubo Li
01:21 AM Bug #40002: mds: not trim log under heavy load
The implementations of the Mutex (e.g. std::mutex in C++) do not guarantee fairness; they do not guarantee that the l... Xiubo Li
 
