Activity

From 11/02/2021 to 12/01/2021

12/01/2021

07:35 PM Bug #53214: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-f42229a869be.c...
This patch doesn't appear to be applicable to Pacific or Octopus since the get_op_read_count method doesn't exist in ... Cory Snyder
04:09 AM Bug #53214 (Pending Backport): qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-...
Venky Shankar
10:54 AM Bug #53216 (Resolved): qa: "RuntimeError: value of attributes should be either str or None. clien...
Venky Shankar
09:50 AM Bug #53360: pacific: client: "handle_auth_bad_method server allowed_methods [2] but i only suppor...
`volume_client` script threw a traceback with "ModuleNotFoundError: No module named 'ceph_volume_client'"::... Venky Shankar
09:26 AM Bug #53360: pacific: client: "handle_auth_bad_method server allowed_methods [2] but i only suppor...
From monitor logs::... Venky Shankar
07:27 AM Bug #50223: qa: "client.4737 isn't responding to mclientcaps(revoke)"
Recent instances:
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-08_15:19:37-fs-wip-yuri2-testing-2021-11-06-1322...
Kotresh Hiremath Ravishankar
06:31 AM Bug #40002 (In Progress): mds: not trim log under heavy load
Xiubo Li
06:29 AM Feature #10764 (In Progress): optimize memory usage of MDSCacheObject
Xiubo Li
06:21 AM Bug #52397 (Resolved): pacific: qa: test_acls (tasks.cephfs.test_acls.TestACLs) failed
This has been fixed by the following commit in git://git.ceph.com/xfstests-dev.git:... Xiubo Li
04:18 AM Bug #53436 (Duplicate): mds, mon: mds beacon messages get dropped? (mds never reaches up:active s...
This is a long-known bug. Xiubo Li
02:26 AM Bug #53436: mds, mon: mds beacon messages get dropped? (mds never reaches up:active state)
From remote/smithi154/log/ceph-mds.d.log.gz, we can see that the mon.1 connection was broken:... Xiubo Li
04:10 AM Backport #53446 (Rejected): octopus: mds: opening connection to up:replay/up:creating daemon caus...
Backport Bot
04:10 AM Backport #53445 (Resolved): pacific: mds: opening connection to up:replay/up:creating daemon caus...
https://github.com/ceph/ceph/pull/44296 Backport Bot
04:10 AM Backport #53444 (Resolved): octopus: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731...
https://github.com/ceph/ceph/pull/44270 Backport Bot
04:10 AM Backport #53443 (New): pacific: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052...
Backport Bot
04:09 AM Bug #53082 (Resolved): ceph-fuse: segmenetation fault in Client::handle_mds_map
Venky Shankar
04:07 AM Bug #53194 (Pending Backport): mds: opening connection to up:replay/up:creating daemon causes mes...
Venky Shankar
04:06 AM Feature #52725 (Resolved): qa: mds_dir_max_entries workunit test case
Venky Shankar
04:05 AM Feature #47277 (Resolved): implement new mount "device" syntax for kcephfs
Venky Shankar

11/30/2021

01:48 PM Bug #53436: mds, mon: mds beacon messages get dropped? (mds never reaches up:active state)
Seems to be the same issue as https://tracker.ceph.com/issues/51705. Xiubo Li
01:39 PM Bug #53436 (Triaged): mds, mon: mds beacon messages get dropped? (mds never reaches up:active state)
Venky Shankar
12:46 PM Bug #53436 (Duplicate): mds, mon: mds beacon messages get dropped? (mds never reaches up:active s...
Seen in this run - https://pulpito.ceph.com/vshankar-2021-11-24_07:14:27-fs-wip-vshankar-testing-20211124-094330-test... Venky Shankar
11:37 AM Feature #40633 (In Progress): mds: dump recent log events for extraordinary events
Jos Collin
09:20 AM Bug #48711 (Closed): mds: standby-replay mds abort when replay metablob
No updates from haitao yet, closing this. Jos Collin
01:31 AM Bug #16739 (Fix Under Review): Client::setxattr always sends setxattr request to MDS
Xiubo Li
01:17 AM Feature #18514 (Resolved): qa: don't use a node for each kclient
I pushed several patches that add support for unsharing the network namespace to fix this. For more detail, please see https:... Xiubo Li

11/29/2021

11:21 AM Bug #52625 (Resolved): qa: test_kill_mdstable (tasks.cephfs.test_snapshots.TestSnapshots)
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
11:20 AM Bug #52949 (Resolved): RuntimeError: The following counters failed to be set on mds daemons: {'md...
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
11:20 AM Bug #52975 (Resolved): MDSMonitor: no active MDS after cluster deployment
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
11:20 AM Bug #52994 (Resolved): client: do not defer releasing caps when revoking
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
11:19 AM Bug #53155 (Resolved): MDSMonitor: assertion during upgrade to v16.2.5+
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
11:17 AM Backport #53121: pacific: mds: collect I/O sizes from client for cephfs-top
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43784
m...
Loïc Dachary
11:15 AM Backport #53217 (Resolved): pacific: test: Implement cephfs-mirror trasher test for HA active/active
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43924
m...
Loïc Dachary
11:15 AM Backport #53164: pacific: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43815
m...
Loïc Dachary
11:15 AM Backport #52678 (Resolved): pacific: qa: test_kill_mdstable (tasks.cephfs.test_snapshots.TestSnap...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43702
m...
Loïc Dachary
11:14 AM Backport #53231 (Resolved): pacific: MDSMonitor: assertion during upgrade to v16.2.5+
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43890
m...
Loïc Dachary
11:14 AM Backport #53006: pacific: RuntimeError: The following counters failed to be set on mds daemons: {...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43828
m...
Loïc Dachary
05:40 AM Bug #39634 (Fix Under Review): qa: test_full_same_file timeout
Xiubo Li
05:34 AM Bug #39634: qa: test_full_same_file timeout
In /ceph/teuthology-archive/yuriw-2021-11-08_20:21:11-fs-wip-yuri5-testing-2021-11-08-1003-pacific-distro-basic-smith... Xiubo Li

11/26/2021

09:50 AM Bug #48673: High memory usage on standby replay MDS
Patrick Donnelly wrote:
> I've been able to reproduce this. Will try to track down the cause...
The same situatio...
Yongseok Oh

11/25/2021

05:41 AM Bug #48812 (New): qa: test_scrub_pause_and_resume_with_abort failure
This has started to show up again: https://pulpito.ceph.com/vshankar-2021-11-24_07:14:27-fs-wip-vshankar-testing-2021... Venky Shankar
05:34 AM Bug #39634: qa: test_full_same_file timeout
Checked all the other OSDs; none of them reached the "mon osd full ratio: 0.7". Only osd.4 did.
That means the ...
Xiubo Li
05:30 AM Bug #39634: qa: test_full_same_file timeout
When the test_full test case was deleting the "large_file_b" and "large_file_a", from /ceph/teuthology-archive/yuriw-... Xiubo Li

11/24/2021

11:40 AM Feature #53310: Add admin socket command to trim caps
Patrick mentioned these config options:
- mds_session_cache_liveness_decay_rate
- mds_session_cache_livenes...
Venky Shankar
11:22 AM Bug #53360: pacific: client: "handle_auth_bad_method server allowed_methods [2] but i only suppor...
ceph-fuse fails well before `install.upgrade` in the run. Looks like the failure occurs when everything is nautilus::... Venky Shankar

11/23/2021

06:18 PM Fix #52591 (Fix Under Review): mds: mds_oft_prefetch_dirfrags = false is not qa tested
Patrick Donnelly
04:08 PM Bug #52094 (Duplicate): Tried out Quincy: All MDS Standby
Patrick Donnelly
01:54 PM Feature #53310: Add admin socket command to trim caps
Brief background - this request came up from some community members. They run a file system scanning job every day (?... Venky Shankar
01:40 PM Bug #53360 (Triaged): pacific: client: "handle_auth_bad_method server allowed_methods [2] but i o...
Venky Shankar
01:22 PM Bug #52487 (Fix Under Review): qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.Test...
Venky Shankar
01:20 PM Bug #53300 (Duplicate): qa: cluster [WRN] Scrub error on inode
Duplicate of https://tracker.ceph.com/issues/50250 Kotresh Hiremath Ravishankar
09:24 AM Bug #50946 (Duplicate): mgr/stats: exception ValueError in perf stats
Jos Collin
09:19 AM Bug #50946: mgr/stats: exception ValueError in perf stats
This issue seems to be a duplicate of https://tracker.ceph.com/issues/48473
It will automatically get resolved once https://...
Nikhilkumar Shelke
04:17 AM Feature #49811 (Resolved): mds: collect I/O sizes from client for cephfs-top
Xiubo Li
04:15 AM Backport #53121 (Resolved): pacific: mds: collect I/O sizes from client for cephfs-top
Xiubo Li

11/22/2021

05:37 PM Documentation #53236 (Resolved): doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
05:36 PM Backport #53245 (Resolved): pacific: doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
05:33 PM Bug #53360 (Duplicate): pacific: client: "handle_auth_bad_method server allowed_methods [2] but i...
Nautilus ceph-fuse client fails to start for Pacific upgrade tests:... Patrick Donnelly
02:58 PM Backport #53121: pacific: mds: collect I/O sizes from client for cephfs-top
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43784
merged
Yuri Weinstein
02:27 PM Bug #53293 (Resolved): qa: v16.2.4 mds crash caused by centos stream kernel
Patrick Donnelly
02:41 AM Bug #53082 (Fix Under Review): ceph-fuse: segmenetation fault in Client::handle_mds_map
Xiubo Li
01:24 AM Backport #53164 (Resolved): pacific: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
Xiubo Li

11/20/2021

05:03 PM Backport #53347 (Resolved): pacific: qa: v16.2.4 mds crash caused by centos stream kernel
Patrick Donnelly
12:47 AM Backport #53347 (In Progress): pacific: qa: v16.2.4 mds crash caused by centos stream kernel
Patrick Donnelly

11/19/2021

11:45 PM Backport #53347 (Resolved): pacific: qa: v16.2.4 mds crash caused by centos stream kernel
https://github.com/ceph/ceph/pull/44034 Backport Bot
11:44 PM Bug #53293 (Pending Backport): qa: v16.2.4 mds crash caused by centos stream kernel
Patrick Donnelly
02:17 PM Backport #53217: pacific: test: Implement cephfs-mirror trasher test for HA active/active
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43924
merged
Yuri Weinstein
02:14 PM Backport #53164: pacific: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43815
merged
Yuri Weinstein
04:45 AM Backport #53332 (Resolved): pacific: ceph-fuse seems to need root permissions to mount (ceph-fuse...
https://github.com/ceph/ceph/pull/44272 Backport Bot
04:45 AM Backport #53331 (Resolved): octopus: ceph-fuse seems to need root permissions to mount (ceph-fuse...
https://github.com/ceph/ceph/pull/44271 Backport Bot
04:41 AM Documentation #53054 (Pending Backport): ceph-fuse seems to need root permissions to mount (ceph-...
Venky Shankar

11/18/2021

03:23 PM Backport #53120: pacific: client: do not defer releasing caps when revoking
https://github.com/ceph/ceph/pull/43782 merged Yuri Weinstein
03:21 PM Backport #52678: pacific: qa: test_kill_mdstable (tasks.cephfs.test_snapshots.TestSnapshots)
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43702
merged
Yuri Weinstein
02:51 PM Bug #48473 (Fix Under Review): fs perf stats command crashes
Venky Shankar
02:31 PM Bug #53314 (Duplicate): qa: fs/upgrade/mds_upgrade_sequence test timeout
Patrick Donnelly
09:12 AM Bug #53314: qa: fs/upgrade/mds_upgrade_sequence test timeout
@Xiubo, I think the PR https://github.com/ceph/ceph/pull/43784 is causing this. Kotresh Hiremath Ravishankar
09:09 AM Bug #53314 (Duplicate): qa: fs/upgrade/mds_upgrade_sequence test timeout
The qa suite mds_upgrade_sequence goes dead with a job timeout because of an MDS crash.
-------------
ceph versi...
Kotresh Hiremath Ravishankar
12:31 PM Bug #48773: qa: scrub does not complete
Another Instance
http://qa-proxy.ceph.com/teuthology/yuriw-2021-11-17_19:02:43-fs-wip-yuri10-testing-2021-11-17-08...
Kotresh Hiremath Ravishankar
08:50 AM Bug #39634: qa: test_full_same_file timeout
Will work on it. Xiubo Li
08:47 AM Bug #52396 (Duplicate): pacific: qa: ERROR: test_perf_counters (tasks.cephfs.test_openfiletable.O...
This is a duplicate of https://tracker.ceph.com/issues/53218. Xiubo Li
03:55 AM Cleanup #51406 (Fix Under Review): mgr/volumes/fs/operations/versions/op_sm.py: fix various flake...
Varsha Rao

11/17/2021

09:29 PM Feature #53310 (New): Add admin socket command to trim caps
Add an admin socket command to cause the MDS to reclaim state from a client. This would simply involve calling reclai... Douglas Fuller
04:30 PM Backport #53232 (Resolved): pacific: MDSMonitor: no active MDS after cluster deployment
Patrick Donnelly
03:59 PM Backport #53232: pacific: MDSMonitor: no active MDS after cluster deployment
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43891
merged
Yuri Weinstein
04:14 PM Backport #53231: pacific: MDSMonitor: assertion during upgrade to v16.2.5+
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43890
merged
Yuri Weinstein
03:49 PM Backport #53006 (Resolved): pacific: RuntimeError: The following counters failed to be set on mds...
Patrick Donnelly
03:16 PM Backport #53006: pacific: RuntimeError: The following counters failed to be set on mds daemons: {...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43828
merged
Yuri Weinstein
03:48 PM Backport #53120 (Resolved): pacific: client: do not defer releasing caps when revoking
Patrick Donnelly
01:41 PM Bug #50946: mgr/stats: exception ValueError in perf stats
Nikhil, please take this one. Venky Shankar
11:00 AM Backport #53304 (New): octopus: Improve API documentation for struct ceph_client_callback_args
Backport Bot
11:00 AM Backport #53303 (New): pacific: Improve API documentation for struct ceph_client_callback_args
Backport Bot
10:57 AM Documentation #53004 (Pending Backport): Improve API documentation for struct ceph_client_callbac...
Venky Shankar
10:56 AM Bug #51964: qa: test_cephfs_mirror_restart_sync_on_blocklist failure
I'll take a look at the failure soon. Venky Shankar
10:35 AM Bug #51964: qa: test_cephfs_mirror_restart_sync_on_blocklist failure
Also seen in this run
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-08_15:19:37-fs-wip-yuri2-testing-2021-11-0...
Kotresh Hiremath Ravishankar
08:46 AM Bug #53082 (In Progress): ceph-fuse: segmenetation fault in Client::handle_mds_map
Xiubo Li
07:35 AM Bug #53300 (Duplicate): qa: cluster [WRN] Scrub error on inode
"2021-11-09T18:59:42.703093+0000 mds.l (mds.0) 19 : cluster [WRN] Scrub error on inode 0x100000012cc (/client.0/tmp/b... Kotresh Hiremath Ravishankar
07:18 AM Bug #39634: qa: test_full_same_file timeout
Seen in the below pacific run
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-08_20:21:11-fs-wip-yuri5-testing-2...
Kotresh Hiremath Ravishankar
06:42 AM Bug #51705: qa: tasks.cephfs.fuse_mount:mount command failed
New instance
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-12_00:33:28-fs-wip-yuri7-testing-2021-11-11-1339-pa...
Kotresh Hiremath Ravishankar
06:34 AM Backport #52875: pacific: qa: test_dirfrag_limit
Seen in this pacific test run as well. Should go away once the fix is backported
http://pulpito.front.sepia.ceph.c...
Kotresh Hiremath Ravishankar
06:25 AM Bug #52396: pacific: qa: ERROR: test_perf_counters (tasks.cephfs.test_openfiletable.OpenFileTable)
Seen in this pacific run as well
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-12_00:33:28-fs-wip-yuri7-testin...
Kotresh Hiremath Ravishankar
06:00 AM Bug #53216 (Fix Under Review): qa: "RuntimeError: value of attributes should be either str or Non...
The parameter order is incorrect and the 'client_config' argument is missing. Xiubo Li
04:38 AM Backport #53218 (In Progress): pacific: qa: Test failure: test_perf_counters (tasks.cephfs.test_o...
Xiubo Li
01:04 AM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
Andras Pataki wrote:
> Thanks Patrick! How safe do you feel this patch is? Does it need a lot of testing or is it ...
Patrick Donnelly

11/16/2021

08:13 PM Bug #53293 (Fix Under Review): qa: v16.2.4 mds crash caused by centos stream kernel
Patrick Donnelly
07:59 PM Bug #53293 (Resolved): qa: v16.2.4 mds crash caused by centos stream kernel
breaks fs:upgrade:mds_upgrade_sequence tests:
http://pulpito.front.sepia.ceph.com/yuriw-2021-11-13_15:31:06-rados-...
Patrick Donnelly
07:23 AM Backport #52633 (Resolved): pacific: mds,client: add flag to MClientSession for reject reason
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43251
m...
Loïc Dachary
07:23 AM Backport #52628 (Resolved): pacific: pybind/mgr/volumes: first subvolume permissions set perms on...
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43223
m...
Loïc Dachary
04:37 AM Bug #48473: fs perf stats command crashes
Nikhil, please take this one. Venky Shankar
04:34 AM Documentation #53054 (Fix Under Review): ceph-fuse seems to need root permissions to mount (ceph-...
Venky Shankar

11/15/2021

05:42 PM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
Thanks Patrick! How safe do you feel this patch is? Does it need a lot of testing or is it safe to deploy? Andras Pataki
01:52 PM Bug #53179 (Triaged): Crash when unlink in corrupted cephfs
Most likely the same as: https://tracker.ceph.com/issues/41147.
I guess this should have been fixed a while back. ...
Venky Shankar
01:50 PM Bug #53216 (Triaged): qa: "RuntimeError: value of attributes should be either str or None. client...
Venky Shankar
01:46 PM Backport #53245 (In Progress): pacific: doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
01:38 PM Bug #53246 (Triaged): rhel 8.4 and centos stream unable to install cephfs-java
Venky Shankar
01:06 PM Backport #53217 (In Progress): pacific: test: Implement cephfs-mirror trasher test for HA active/...
Venky Shankar

11/12/2021

06:25 PM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
Andras, I think there are other inefficiencies I've not yet identified but this fix addresses your specific problem. Patrick Donnelly
06:09 PM Bug #53192 (Fix Under Review): High cephfs MDS latency and CPU load with snapshots and unlink ope...
Patrick Donnelly
10:28 AM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
That's great news! I'm also trying to set up a test environment to reproduce the issue outside our main cluster (eye... Andras Pataki
12:19 AM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
Just want to share an update that I was able to reproduce the problem. I'll test out a fix soon... Patrick Donnelly
03:51 PM Bug #53246 (Triaged): rhel 8.4 and centos stream unable to install cephfs-java
... Deepika Upadhyay
03:25 PM Backport #53245 (Resolved): pacific: doc: ephemeral pinning with subvolumegroups
https://github.com/ceph/ceph/pull/43925 Backport Bot
03:20 PM Documentation #53236 (Pending Backport): doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
06:09 AM Fix #48683: mds/MDSMap: print each flag value in MDSMap::dump
Sven Kieske wrote:
> any chance to backport this, at least to pacific or octopus?
This is a feature (classified w...
Jos Collin

11/11/2021

07:25 PM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
Thanks for opening a ticket and doing some digging, Andras,
Andras Pataki wrote:
> I've done some further debuggi...
Patrick Donnelly
05:31 PM Documentation #53236 (Fix Under Review): doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
05:04 PM Documentation #53236 (Resolved): doc: ephemeral pinning with subvolumegroups
Patrick Donnelly
05:00 PM Fix #48683: mds/MDSMap: print each flag value in MDSMap::dump
any chance to backport this, at least to pacific or octopus? Anonymous
03:39 PM Backport #53232 (In Progress): pacific: MDSMonitor: no active MDS after cluster deployment
Patrick Donnelly
03:20 PM Backport #53232 (Resolved): pacific: MDSMonitor: no active MDS after cluster deployment
https://github.com/ceph/ceph/pull/43891 Backport Bot
03:38 PM Backport #53231 (In Progress): pacific: MDSMonitor: assertion during upgrade to v16.2.5+
Patrick Donnelly
03:20 PM Backport #53231 (Resolved): pacific: MDSMonitor: assertion during upgrade to v16.2.5+
https://github.com/ceph/ceph/pull/43890 Backport Bot
03:18 PM Bug #52975 (Pending Backport): MDSMonitor: no active MDS after cluster deployment
Patrick Donnelly
03:17 PM Bug #53155 (Pending Backport): MDSMonitor: assertion during upgrade to v16.2.5+
Patrick Donnelly
03:17 PM Bug #53150 (Resolved): pybind/mgr/cephadm/upgrade: tolerate MDS failures during upgrade straddlin...
backport will be tracked by #53155 Patrick Donnelly
11:07 AM Feature #46166 (Fix Under Review): mds: store symlink target as xattr in data pool inode for disa...
Kotresh Hiremath Ravishankar
07:01 AM Feature #53228 (New): cephfs/quota: Set a limit on minimum quota setting
As of now there is no minimum limit on the quota setting. It can be set lower than the
CEPH_BLOCK size (i.e. 4M) a...
Kotresh Hiremath Ravishankar

11/10/2021

10:41 PM Bug #48805: mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/blogbench-1.0/s...
Came across a failure that looks related to this one in a recent Pacific run: http://pulpito.front.sepia.ceph.com/yur... Laura Flores
10:38 PM Backport #50253: pacific: mds: "cluster [WRN] Scrub error on inode 0x1000000039d (/client.0/tmp/b...
Saw another failure related to this issue in a recent Pacific run: http://pulpito.front.sepia.ceph.com/yuriw-2021-11-... Laura Flores
10:18 PM Bug #23797: qa: cluster [WRN] Health check failed: 1 osds down (OSD_DOWN)
Related failure found in a Pacific run: http://pulpito.front.sepia.ceph.com/yuriw-2021-11-02_19:49:55-rados-wip-yuri8... Laura Flores
07:05 PM Backport #53218 (Resolved): pacific: qa: Test failure: test_perf_counters (tasks.cephfs.test_open...
https://github.com/ceph/ceph/pull/43979 Backport Bot
07:01 PM Backport #53217 (Resolved): pacific: test: Implement cephfs-mirror trasher test for HA active/active
https://github.com/ceph/ceph/pull/43924 Backport Bot
07:00 PM Bug #52887 (Pending Backport): qa: Test failure: test_perf_counters (tasks.cephfs.test_openfileta...
Patrick Donnelly
06:59 PM Feature #50372 (Pending Backport): test: Implement cephfs-mirror trasher test for HA active/active
Patrick Donnelly
06:55 PM Bug #53216 (Resolved): qa: "RuntimeError: value of attributes should be either str or None. clien...
... Patrick Donnelly
06:18 PM Backport #52633: pacific: mds,client: add flag to MClientSession for reject reason
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43251
merged
Yuri Weinstein
06:18 PM Backport #52628: pacific: pybind/mgr/volumes: first subvolume permissions set perms on /volumes a...
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43223
merged
Yuri Weinstein
06:17 PM Bug #53214 (Fix Under Review): qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-...
Jeff Layton
05:59 PM Bug #53214: qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-f42229a869be.c...
Ahh, that's because it _is_ a directory now, since this patch:
https://lore.kernel.org/ceph-devel/19b30242-15ed-87...
Jeff Layton
05:32 PM Bug #53214 (Resolved): qa: "dd: error reading '/sys/kernel/debug/ceph/2a934501-6731-4052-a836-f42...
... Patrick Donnelly
01:23 PM Bug #53045: stat->fsid is not unique among filesystems exported by the ceph server
I have patches for the userland client that mimic what the kernel does, but building ceph under f35 is currently brok... Jeff Layton

11/09/2021

04:35 PM Bug #53192: High cephfs MDS latency and CPU load with snapshots and unlink operations
I've done some further debugging to understand the MDS performance problem that has been impacting us more. The fini... Andras Pataki
01:14 AM Bug #52975 (Fix Under Review): MDSMonitor: no active MDS after cluster deployment
Patrick Donnelly

11/08/2021

10:07 PM Bug #53194 (Fix Under Review): mds: opening connection to up:replay/up:creating daemon causes mes...
Patrick Donnelly
05:49 PM Bug #53194 (Resolved): mds: opening connection to up:replay/up:creating daemon causes message drop
Found a QA run where MDS was stuck in up:resolve:
https://pulpito.ceph.com/pdonnell-2021-11-05_19:13:39-fs:upgrade...
Patrick Donnelly
04:52 PM Bug #53192 (Fix Under Review): High cephfs MDS latency and CPU load with snapshots and unlink ope...
We have recently enabled snapshots on our large Nautilus cluster (running 14.2.20) and our fairly smooth running ceph... Andras Pataki
02:35 PM Backport #52953 (In Progress): octopus: mds: crash when journaling during replay
Venky Shankar
02:26 PM Backport #52952 (In Progress): pacific: mds: crash when journaling during replay
Venky Shankar
02:23 PM Bug #52975 (In Progress): MDSMonitor: no active MDS after cluster deployment
Patrick Donnelly
04:21 AM Bug #52975: MDSMonitor: no active MDS after cluster deployment
Thanks for the reproducer Igor.
commit cbd9a7b354abb06cd395753f93564bdc687cdb04 ("mon,mds: use per-MDS compat to i...
Venky Shankar
01:07 AM Bug #49132: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
Andras Pataki wrote:
> I haven't had luck keeping the MDS running well with higher log levels unfortunately. Howeve...
Xiubo Li

11/06/2021

02:54 AM Bug #49922: MDS slow request lookupino #0x100 on rank 1 block forever on dispatched
We are triggering the new warning: https://tracker.ceph.com/issues/53180 玮文 胡
12:54 AM Bug #53179 (Duplicate): Crash when unlink in corrupted cephfs
We have a corrupted cephfs that breaks every time after the repair when files are removed.... Daniel Poelzleithner

11/05/2021

08:14 PM Backport #53006 (In Progress): pacific: RuntimeError: The following counters failed to be set on ...
Patrick Donnelly
08:02 PM Backport #53006 (Need More Info): pacific: RuntimeError: The following counters failed to be set ...
I will need to work on this because it pulls in some commits that don't have a tracker assigned. Patrick Donnelly
03:17 AM Backport #53163 (In Progress): octopus: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)...
Xiubo Li
02:56 AM Backport #53164 (In Progress): pacific: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)...
Xiubo Li

11/04/2021

09:00 PM Backport #53165 (New): pacific: qa/vstart_runner: tests crashes due incompatiblity
Backport Bot
08:56 PM Backport #53164 (Resolved): pacific: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
https://github.com/ceph/ceph/pull/43815 Backport Bot
08:56 PM Backport #53163 (Resolved): octopus: mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
https://github.com/ceph/ceph/pull/43816 Backport Bot
08:55 PM Backport #53162 (In Progress): pacific: qa: test_standby_count_wanted failure
https://github.com/ceph/ceph/pull/50760 Backport Bot
08:55 PM Bug #53043 (Pending Backport): qa/vstart_runner: tests crashes due incompatiblity
Patrick Donnelly
08:52 PM Bug #52995 (Pending Backport): qa: test_standby_count_wanted failure
Patrick Donnelly
08:51 PM Bug #51023 (Pending Backport): mds: tcmalloc::allocate_full_cpp_throw_oom(unsigned long)+0xf3)
Patrick Donnelly
08:29 PM Bug #51964: qa: test_cephfs_mirror_restart_sync_on_blocklist failure
/ceph/teuthology-archive/pdonnell-2021-11-04_15:43:53-fs-wip-pdonnell-testing-20211103.023355-distro-basic-smithi/648... Patrick Donnelly
02:27 PM Bug #53155 (Fix Under Review): MDSMonitor: assertion during upgrade to v16.2.5+
Patrick Donnelly
02:21 PM Bug #53155 (Resolved): MDSMonitor: assertion during upgrade to v16.2.5+
... Patrick Donnelly

11/03/2021

11:44 PM Bug #53150 (Fix Under Review): pybind/mgr/cephadm/upgrade: tolerate MDS failures during upgrade s...
Patrick Donnelly
08:38 PM Bug #53150 (Resolved): pybind/mgr/cephadm/upgrade: tolerate MDS failures during upgrade straddlin...
If a v16.2.4 or older MDS fails and rejoins, the compat set assigned to it is the empty set (because it sends no comp... Patrick Donnelly
11:49 AM Backport #52823 (In Progress): pacific: mgr/nfs: add more log messages
Alfonso Martínez
11:43 AM Bug #53074 (Resolved): pybind/mgr/cephadm: upgrade sequence does not continue if no MDS are active
Sebastian Wagner
10:42 AM Bug #51824: pacific scrub ~mds_dir causes stray related ceph_assert, abort and OOM
Main Name wrote:
> Same issue with roughly 1.6M folders.
>
> * Generated a Folder tree with 1611111 Folders
> * ...
Main Name
09:52 AM Bug #51824: pacific scrub ~mds_dir causes stray related ceph_assert, abort and OOM
Same issue with roughly 1.6M folders.
* Generated a Folder tree with 1611111 Folders
* Make snapshot
* Delete Fo...
Main Name
09:19 AM Bug #49132: mds crashed "assert_condition": "state == LOCK_XLOCK || state == LOCK_XLOCKDONE",
I haven't had luck keeping the MDS running well with higher log levels unfortunately. However, I do have one more da... Andras Pataki
06:57 AM Bug #53126 (Triaged): In the 5.4.0 kernel, the mount of ceph-fuse fails
Venky Shankar
06:54 AM Bug #52487 (In Progress): qa: Test failure: test_deep_split (tasks.cephfs.test_fragment.TestFragm...
The check here[0] results in `num_strays` being zero _right after_ the journal was flushed::... Venky Shankar
06:36 AM Feature #50372 (Fix Under Review): test: Implement cephfs-mirror trasher test for HA active/active
Venky Shankar
06:35 AM Backport #51415 (In Progress): octopus: mds: "FAILED ceph_assert(r == 0 || r == -2)"
Xiubo Li
02:55 AM Backport #53121 (In Progress): pacific: mds: collect I/O sizes from client for cephfs-top
Xiubo Li
02:53 AM Backport #53120 (In Progress): pacific: client: do not defer releasing caps when revoking
Xiubo Li

11/02/2021

10:15 PM Bug #50622 (Resolved): msg: active_connections regression
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
10:09 PM Bug #52572 (Resolved): "cluster [WRN] 1 slow requests" in smoke pacific
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
10:08 PM Bug #52820 (Resolved): Ceph monitor crash after upgrade from ceph 15.2.14 to 16.2.6
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
10:07 PM Bug #52874 (Resolved): Monitor might crash after upgrade from ceph to 16.2.6
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are ... Loïc Dachary
10:05 PM Backport #52999 (Resolved): pacific: Ceph monitor crash after upgrade from ceph 15.2.14 to 16.2.6
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43615
m...
Loïc Dachary
01:34 PM Backport #52999: pacific: Ceph monitor crash after upgrade from ceph 15.2.14 to 16.2.6
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43615
merged
Yuri Weinstein
10:04 PM Backport #52998 (Resolved): pacific: Monitor might crash after upgrade from ceph to 16.2.6
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43614
m...
Loïc Dachary
01:33 PM Backport #52998: pacific: Monitor might crash after upgrade from ceph to 16.2.6
Backport Bot wrote:
> https://github.com/ceph/ceph/pull/43614
merged
Yuri Weinstein
10:03 PM Backport #52679: pacific: "cluster [WRN] 1 slow requests" in smoke pacific
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43562
m...
Loïc Dachary
09:54 PM Backport #51199 (Resolved): octopus: msg: active_connections regression
This update was made using the script "backport-resolve-issue".
backport PR https://github.com/ceph/ceph/pull/43310
m...
Loïc Dachary
03:25 PM Bug #53126: In the 5.4.0 kernel, the mount of ceph-fuse fails
Might be related to #53082 Venky Shankar
06:41 AM Bug #53126 (Closed): In the 5.4.0 kernel, the mount of ceph-fuse fails
Hello everyone,
I use Ubuntu 18.04.5 server and the Ceph version is 14.2.22.
After upgrading the kernel to 5.4.0, th...
Jiang Yu
01:41 PM Bug #53082: ceph-fuse: segmenetation fault in Client::handle_mds_map
Venky, I will take it. Xiubo Li
03:19 AM Bug #52887 (Fix Under Review): qa: Test failure: test_perf_counters (tasks.cephfs.test_openfileta...
Xiubo Li
03:02 AM Bug #52887: qa: Test failure: test_perf_counters (tasks.cephfs.test_openfiletable.OpenFileTable)
The `self.wait_until_true(lambda: self._check_oft_counter('omap_total_removes', 1), timeout=30)` last check was at `2... Xiubo Li
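For readers unfamiliar with the polling pattern behind the `wait_until_true(..., timeout=30)` call quoted above, here is a minimal standalone sketch. It is not the test framework's actual implementation: the function body, the `period` parameter, and the use of `TimeoutError` are assumptions made for illustration. Only the idea of repeatedly evaluating a condition until a timeout expires comes from the comment.

```python
import time


def wait_until_true(condition, timeout, period=1.0):
    """Poll condition() until it is truthy or `timeout` seconds have elapsed.

    Hypothetical helper for illustration; `period` (seconds between polls)
    and the TimeoutError on expiry are assumptions, not the real API.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(period)


if __name__ == "__main__":
    # Simulated counter standing in for a perf counter such as
    # 'omap_total_removes'; it reaches the expected value on the third poll.
    polls = {"count": 0}

    def counter_reached():
        polls["count"] += 1
        return polls["count"] >= 3

    wait_until_true(counter_reached, timeout=5, period=0.01)
    print("condition met after", polls["count"], "polls")
```

One consequence of this pattern: if the condition only becomes true after the deadline passes, the check fails even though the underlying event did eventually happen, so the timeout has to allow for the slowest legitimate case.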
 
