Activity

From 02/14/2019 to 03/15/2019

03/15/2019

04:37 PM Documentation #38728 (Resolved): doc: scrub administration docs need updated
Patrick Donnelly
04:11 PM Bug #23262: kclient: nofail option not supported
I've attempted a fix here: https://github.com/ceph/ceph/pull/26992 Kenneth Waegeman
10:16 AM Documentation #38729 (In Progress): doc: add LAZYIO
Zheng Yan
03:34 AM Backport #38735 (In Progress): luminous: qa: tolerate longer heartbeat timeouts when using valgrind
Ashish Singh
03:31 AM Backport #38734 (In Progress): mimic: qa: tolerate longer heartbeat timeouts when using valgrind
Ashish Singh
02:00 AM Backport #38665 (In Progress): luminous: qa: powercycle suite reports MDS_SLOW_METADATA_IO
https://github.com/ceph/ceph/pull/26962 Prashant D
01:54 AM Backport #38666 (In Progress): mimic: qa: powercycle suite reports MDS_SLOW_METADATA_IO
https://github.com/ceph/ceph/pull/26961 Prashant D

03/14/2019

07:57 PM Bug #38739: cephfs-shell: python traceback with mkdir inside inexistant directory
To be clear, you're saying that it should just return some kind of error message like "ENOENT"? Patrick Donnelly
12:28 PM Bug #38739 (Resolved): cephfs-shell: python traceback with mkdir inside inexistant directory
CephFS:~/>>> mkdir jack/and/jill/went/up/the/hill
[Errno 2] error in mkdir 'b'/jack/and/jill/went/up/the/hill''
Tra...
Milind Changire
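
The behaviour asked about in the comment above (reporting something like "ENOENT" instead of a Python traceback) could be sketched roughly as follows. This is only an illustration under stated assumptions: `safe_mkdir` is a hypothetical helper, and it uses the local filesystem through `os.mkdir` as a stand-in for whatever call cephfs-shell actually makes; it is not the fix that was merged.

```python
import errno
import os
import tempfile


def safe_mkdir(path, mode=0o755):
    """Create a single directory; return 0 on success or an errno value.

    Hypothetical helper: os.mkdir on the local filesystem stands in for
    the CephFS call, which is not shown in the ticket.
    """
    try:
        os.mkdir(path, mode)
    except OSError as e:
        # Print a one-line error instead of letting the traceback escape.
        print("mkdir: cannot create directory '{}': {}".format(
            path, os.strerror(e.errno)))
        return e.errno
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        # The parent directories do not exist, so this fails cleanly.
        rc = safe_mkdir(os.path.join(root, "jack/and/jill/went/up/the/hill"))
        print(errno.errorcode.get(rc, rc))
```

Returning the errno value, rather than raising, lets an interactive shell keep running and decide how to phrase the message.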
12:58 PM Bug #38743 (Resolved): cephfs-shell: mkdir creates directory with invalid octal mode
... Milind Changire
12:46 PM Bug #38742 (Resolved): cephfs-shell: entering unrecognized command does not print newline after m...
... Milind Changire
12:34 PM Bug #38741 (Resolved): cephfs-shell: python traceback with mkdir when reattempt of mkdir
... Milind Changire
12:30 PM Feature #38740 (Resolved): cephfs-shell: support mkdir with non-octal mode
... Milind Changire
09:06 AM Backport #38737 (Rejected): luminous: qa: "[WRN] Health check failed: 1/3 mons down, quorum b,c (...
Nathan Cutler
09:06 AM Backport #38736 (Resolved): mimic: qa: "[WRN] Health check failed: 1/3 mons down, quorum b,c (MON...
https://github.com/ceph/ceph/pull/27906 Nathan Cutler
09:06 AM Backport #38735 (Resolved): luminous: qa: tolerate longer heartbeat timeouts when using valgrind
https://github.com/ceph/ceph/pull/26964 Nathan Cutler
09:06 AM Backport #38734 (Resolved): mimic: qa: tolerate longer heartbeat timeouts when using valgrind
https://github.com/ceph/ceph/pull/26963 Nathan Cutler
03:51 AM Bug #38723 (Pending Backport): qa: tolerate longer heartbeat timeouts when using valgrind
Patrick Donnelly
03:44 AM Documentation #38728 (In Progress): doc: scrub administration docs need updated
Venky Shankar
03:32 AM Bug #38704 (Pending Backport): qa: "[WRN] Health check failed: 1/3 mons down, quorum b,c (MON_DOW...
Patrick Donnelly

03/13/2019

06:47 PM Documentation #38729 (Resolved): doc: add LAZYIO
to doc/cephfs/{experimental-features.rst,posix.rst}. A new documentation page could also be appropriate. Patrick Donnelly
06:29 PM Documentation #38728 (Resolved): doc: scrub administration docs need updated
For changes from #12282.
Mostly, we should document (with examples!) how to view, abort, and pause scrubs.
Patrick Donnelly
04:20 PM Bug #38723 (Fix Under Review): qa: tolerate longer heartbeat timeouts when using valgrind
Patrick Donnelly
04:11 PM Bug #38723 (Resolved): qa: tolerate longer heartbeat timeouts when using valgrind
... Patrick Donnelly
11:56 AM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> Greg Farnum wrote:
> >
> > Well, as someone noted, you also need to make sure you aren't cre...
Zheng Yan
11:48 AM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> Zheng Yan wrote:
> > For directory inode, current mds only issues Fsx (at most) caps to client...
Zheng Yan
09:53 AM Backport #38710 (Rejected): luminous: qa: kclient unmount hangs after file system goes down
Nathan Cutler
09:53 AM Backport #38709 (Resolved): mimic: qa: kclient unmount hangs after file system goes down
https://github.com/ceph/ceph/pull/29218 Nathan Cutler

03/12/2019

09:52 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Greg Farnum wrote:
>
> Well, as someone noted, you also need to make sure you aren't creating a dentry that alread...
Jeff Layton
09:21 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> That's sort of my point here (though I didn't put it quite as succinctly). I don't think adding...
Greg Farnum
08:25 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
That's sort of my point here (though I didn't put it quite as succinctly). I don't think adding more cap flags really... Jeff Layton
06:23 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Given that we already use non-cap flags, and directories are special anyway, I'm not sure extending the cap language ... Greg Farnum
12:49 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Patrick Donnelly wrote:
> Okay, here's how I think it should work but it may not be how it actually works:
>
> Fs...
Jeff Layton
12:10 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Patrick Donnelly wrote:
>
> Okay, here's how I think it should work but it may not be how it actually works:
>
...
Jeff Layton
11:10 AM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Zheng Yan wrote:
> For directory inode, current mds only issues Fsx (at most) caps to client. It never issues Frwcb ...
Jeff Layton
03:49 AM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> Patrick Donnelly wrote:
>
> > Fx would indicate the directory contents can be cached by the ...
Zheng Yan
05:48 PM Bug #38677 (Pending Backport): qa: kclient unmount hangs after file system goes down
Patrick Donnelly
05:40 PM Bug #38704 (Fix Under Review): qa: "[WRN] Health check failed: 1/3 mons down, quorum b,c (MON_DOW...
Patrick Donnelly
05:28 PM Bug #38704 (Resolved): qa: "[WRN] Health check failed: 1/3 mons down, quorum b,c (MON_DOWN) in cl...
... Patrick Donnelly
05:26 PM Bug #38676 (Resolved): qa: src/common/Thread.cc: 157: FAILED ceph_assert(ret == 0)
Patrick Donnelly
11:58 AM Backport #38689 (Resolved): mimic: mds: inode filtering on 'dump cache' asok
https://github.com/ceph/ceph/pull/27058 Nathan Cutler
11:57 AM Backport #38688 (Rejected): luminous: mds: inode filtering on 'dump cache' asok
Nathan Cutler
11:56 AM Backport #38687 (Resolved): mimic: kcephfs TestClientLimits.test_client_pin fails with "client ca...
https://github.com/ceph/ceph/pull/29211 Nathan Cutler
11:56 AM Backport #38686 (Resolved): luminous: kcephfs TestClientLimits.test_client_pin fails with "client...
https://github.com/ceph/ceph/pull/27040 Nathan Cutler
03:44 AM Support #38640 (Closed): fs: garbage in cephfs pool and tier pool after delete all files
Fyodor Ustinov wrote:
> Patrick Donnelly wrote:
> > The garbage is just inodes left behind, probably by some MDS bu...
Patrick Donnelly
03:27 AM Bug #38681 (Resolved): cephfs-shell: add commands to manipulate snapshots
Idea being to make this a trivial one-off command to manipulate quotas/snapshots on CephFS directories without mounting. Patrick Donnelly

03/11/2019

11:11 PM Support #38640: fs: garbage in cephfs pool and tier pool after delete all files
Patrick Donnelly wrote:
> The garbage is just inodes left behind, probably by some MDS bug in the past. Presumably y...
Fyodor Ustinov
09:06 PM Support #38640: fs: garbage in cephfs pool and tier pool after delete all files
The garbage is just inodes left behind, probably by some MDS bug in the past. Presumably you don't need these files a... Patrick Donnelly
09:52 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> Patrick Donnelly wrote:
>
> > Fx would indicate the directory contents can be cached by the ...
Patrick Donnelly
05:29 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Patrick Donnelly wrote:
> Fx would indicate the directory contents can be cached by the client, yes? Perhaps the c...
Jeff Layton
05:02 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> > EXCL and WR. I don't think the MDS ever considers handing out WR to clients.
>
> Just so I...
Patrick Donnelly
04:27 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
> EXCL and WR. I don't think the MDS ever considers handing out WR to clients.
Just so I'm clear...why do we care ...
Jeff Layton
09:24 PM Bug #11314 (New): qa: MDS crashed and the runs hung without ever timing out
Patrick Donnelly
09:22 PM Feature #12107 (Fix Under Review): mds: use versioned wire protocol; obviate CEPH_MDS_PROTOCOL
Patrick Donnelly
09:20 PM Feature #11172 (Pending Backport): mds: inode filtering on 'dump cache' asok
Patrick Donnelly
05:25 PM Bug #38679 (Resolved): mds: behind on trimming and "[dentry] was purgeable but no longer is!"
... Patrick Donnelly
04:22 PM Bug #38270 (Pending Backport): kcephfs TestClientLimits.test_client_pin fails with "client caps f...
Patrick Donnelly
04:14 PM Bug #38677 (Resolved): qa: kclient unmount hangs after file system goes down
... Patrick Donnelly
04:08 PM Bug #38676 (Fix Under Review): qa: src/common/Thread.cc: 157: FAILED ceph_assert(ret == 0)
Patrick Donnelly
04:00 PM Bug #38676: qa: src/common/Thread.cc: 157: FAILED ceph_assert(ret == 0)
We should just use the async messenger instead. According to Sage, other YAML bits are necessary to use v1 protocol. Patrick Donnelly
03:59 PM Bug #38676 (Resolved): qa: src/common/Thread.cc: 157: FAILED ceph_assert(ret == 0)
... Patrick Donnelly
02:39 PM Backport #38670 (Resolved): mimic: "log [WRN] : Health check failed: 1 clients failing to respond...
https://github.com/ceph/ceph/pull/27023 Nathan Cutler
02:39 PM Backport #38669 (Resolved): luminous: "log [WRN] : Health check failed: 1 clients failing to resp...
https://github.com/ceph/ceph/pull/27024 Nathan Cutler
02:38 PM Backport #38666 (Resolved): mimic: qa: powercycle suite reports MDS_SLOW_METADATA_IO
https://github.com/ceph/ceph/pull/26961 Nathan Cutler
02:38 PM Backport #38665 (Resolved): luminous: qa: powercycle suite reports MDS_SLOW_METADATA_IO
https://github.com/ceph/ceph/pull/26962 Nathan Cutler
01:52 PM Bug #38597: fs: "log [WRN] : failed to reconnect caps for missing inodes"
luminous does not need a backport Zheng Yan
12:31 PM Bug #38652 (Fix Under Review): mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug ...
https://github.com/ceph/ceph/pull/26881 Zheng Yan
12:28 PM Bug #38636: Inline data compatibly check in Locker::issue_caps is buggy
https://github.com/ceph/ceph/pull/26881 Zheng Yan

03/10/2019

05:40 PM Bug #38651 (Pending Backport): qa: powercycle suite reports MDS_SLOW_METADATA_IO
Patrick Donnelly

03/09/2019

06:14 PM Bug #38491 (Pending Backport): "log [WRN] : Health check failed: 1 clients failing to respond to ...
Patrick Donnelly
06:13 PM Bug #38471 (Duplicate): mds does not wait for cap revoke when lock is in LOCK_LOCK_XLOCK state.
Reversing the duplicate since the commit cites the other tracker. Patrick Donnelly
12:24 AM Bug #14608 (Can't reproduce): snaptests.yaml failure: [WRN] open_snap_parents has:" in cluster log
Patrick Donnelly
12:19 AM Bug #1114 (Rejected): NFS export extreme slowdown
NFS over the kernel client is not a recommended configuration. Patrick Donnelly
12:19 AM Bug #10740 (Rejected): teuthology: nfs test getting EBUSY on umount
knfs is dead. Patrick Donnelly
12:18 AM Bug #3424 (Rejected): java: Add the correct JUnit package dependencies on supported platforms and...
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:18 AM Bug #19456 (Rejected): Hadoop suite failing on missing upstream tarball
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:18 AM Bug #21149 (Rejected): SubsystemMap.h: 62: FAILED assert(sub < m_subsys.size())
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:18 AM Bug #17621 (Rejected): Hadoop does a bad job closing files and we end up holding too many caps
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:18 AM Feature #3422 (Rejected): Only compile java tests if JUnit is present
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:17 AM Feature #3399 (Rejected): java: add accessor to Ceph version numbers
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:16 AM Bug #17184 (Rejected): "Segmentation fault" in samba-jewel---basic-mira
Patrick Donnelly
12:15 AM Feature #4208 (Rejected): Add more replication pool tests for Hadoop / Ceph bindings
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:15 AM Bug #3318 (Rejected): java: lock access to CephStat, CephStatVFS from native
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:15 AM Bug #3337 (Rejected): java: unit tests don't remove container directory after a failure
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:15 AM Feature #4576 (Rejected): java: support ByteBuffer interface for NIO and NIO.2 high-perf I/O
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:14 AM Fix #12296 (Rejected): cephfs-hadoop: do not stash libcephfs.jar in git repo
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:14 AM Feature #9287 (Rejected): qa: hadoop: add big top tests to suite
Java/Hadoop testing is no longer a priority. Patrick Donnelly
12:14 AM Feature #9286 (Rejected): qa: hadoop: test 2.x with teuthology
Java/Hadoop testing is no longer a priority. Patrick Donnelly

03/08/2019

08:42 PM Bug #38452: mds: assert crash loop while unlinking file
I had to repeat the procedure a couple of times before everything cleaned up, but now it is back to normal.
I included...
Jérôme Poulin
07:58 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> To communicate this range to the client, we could use a MClientSession message. Maybe version (...
Patrick Donnelly
06:44 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Some notes about the preallocation piece:
The prealloc_inos interval set is tracked per-session in session_info_t....
Jeff Layton
02:07 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Patrick Donnelly wrote:
>
> EXCL and WR. I don't think the MDS ever considers handing out WR to clients.
>
...
Jeff Layton
07:13 PM Bug #38652: mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5
See also Zheng's analysis in the ticket he opened: #38636 Patrick Donnelly
07:03 PM Bug #38652 (Resolved): mds|kclient: MDS_CLIENT_LATE_RELEASE warning caused by inline bug on RHEL 7.5
... Patrick Donnelly
07:12 PM Bug #38636 (Duplicate): Inline data compatibly check in Locker::issue_caps is buggy
Patrick Donnelly
08:00 AM Bug #38636 (Duplicate): Inline data compatibly check in Locker::issue_caps is buggy
... Zheng Yan
06:51 PM Bug #38651 (Fix Under Review): qa: powercycle suite reports MDS_SLOW_METADATA_IO
Patrick Donnelly
06:48 PM Bug #38651 (Resolved): qa: powercycle suite reports MDS_SLOW_METADATA_IO
http://pulpito.ceph.com/yuriw-2019-03-07_00:07:42-powercycle-wip_yuri_nautilus_3.6.19-distro-basic-smithi/
We shou...
Patrick Donnelly
04:24 PM Feature #15070 (Fix Under Review): mon: client: multifs: auth caps on client->mon connections to ...
Douglas Fuller
02:46 PM Backport #38643 (Resolved): mimic: fs: "log [WRN] : failed to reconnect caps for missing inodes"
https://github.com/ceph/ceph/pull/31282 Nathan Cutler
01:07 PM Support #38640 (Closed): fs: garbage in cephfs pool and tier pool after delete all files
Hi!
# ceph version
ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable)
After remov...
Fyodor Ustinov

03/07/2019

11:56 PM Bug #38270 (Fix Under Review): kcephfs TestClientLimits.test_client_pin fails with "client caps f...
Patrick Donnelly
11:23 PM Bug #36730 (In Progress): mds: should apply policy to throttle client messages
Patrick Donnelly
11:23 PM Bug #36171 (In Progress): mds: ctime should not use client provided ctime/mtime
Patrick Donnelly
11:04 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
Jeff Layton wrote:
> I think this is not really a single project, but a set of them. At a high level:
> ensure ...
Patrick Donnelly
01:48 PM Feature #24461: cephfs: improve file create performance buffering file unlink/create operations
I think this is not really a single project, but a set of them. At a high level:
* ensure that the MDS can hand ou...
Jeff Layton
10:40 PM Bug #38597 (Pending Backport): fs: "log [WRN] : failed to reconnect caps for missing inodes"
Patrick Donnelly
02:47 PM Feature #36663 (In Progress): mds: adjust cache memory limit automatically via target that tracks...
Patrick Donnelly
10:18 AM Feature #17309 (In Progress): qa: mon_thrash test for CephFS
Jos Collin
03:50 AM Backport #38545 (In Progress): luminous: qa: "Loading libcephfs-jni: Failure!"
https://github.com/ceph/ceph/pull/26808 Prashant D
03:49 AM Backport #38544 (In Progress): mimic: qa: "Loading libcephfs-jni: Failure!"
https://github.com/ceph/ceph/pull/26807 Prashant D
01:18 AM Backport #38543 (In Progress): luminous: qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs be...
https://github.com/ceph/ceph/pull/26805 Prashant D
01:15 AM Backport #38542 (In Progress): mimic: qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs becau...
https://github.com/ceph/ceph/pull/26804 Prashant D

03/06/2019

04:44 PM Backport #38132 (Resolved): luminous: mds: stopping MDS with a large cache (40+GB) causes it to m...
Patrick Donnelly
03:52 PM Backport #38132: luminous: mds: stopping MDS with a large cache (40+GB) causes it to miss heartbeats
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/26232
merged
Yuri Weinstein
04:44 PM Backport #38130 (Resolved): luminous: mds: provide a limit for the maximum number of caps a clien...
Patrick Donnelly
03:52 PM Backport #38130: luminous: mds: provide a limit for the maximum number of caps a client may have
Patrick Donnelly wrote:
> https://github.com/ceph/ceph/pull/26232
merged
Yuri Weinstein
04:43 PM Bug #38488 (Resolved): luminous: mds: message invalid access
Patrick Donnelly
03:52 PM Bug #38488: luminous: mds: message invalid access
https://github.com/ceph/ceph/pull/26661 merged Yuri Weinstein
08:25 AM Bug #38597 (Fix Under Review): fs: "log [WRN] : failed to reconnect caps for missing inodes"
https://github.com/ceph/ceph/pull/26781 Zheng Yan
12:54 AM Bug #38597: fs: "log [WRN] : failed to reconnect caps for missing inodes"
Adding tag to #18461 in case they're related. Feel free to disconnect if they're not, Zheng. Patrick Donnelly
12:53 AM Bug #38597 (Resolved): fs: "log [WRN] : failed to reconnect caps for missing inodes"
... Patrick Donnelly
03:03 AM Bug #38452: mds: assert crash loop while unlinking file
There is no clue in the log. If you want to fix the issue, you can:... Zheng Yan
02:19 AM Backport #38541 (In Progress): luminous: qa: fsstress with valgrind may timeout
https://github.com/ceph/ceph/pull/26776 Prashant D

03/05/2019

11:16 PM Bug #23429 (Can't reproduce): File corrupt after writing to cephfs
Patrick Donnelly
07:35 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
> Why would there be more than one uid/gid pair for a client request?
There wouldn't be, but a client like ganesha...
Jeff Layton
06:51 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Jeff Layton wrote:
> No. This problem is not specific to rename -- I just used that as an example.
Sorry can you ...
Patrick Donnelly
06:47 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Jeff Layton wrote:
> To elaborate, today the rule looks something like this:
>
> "If there is a uid= qualifier in...
Patrick Donnelly
03:38 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
No. This problem is not specific to rename -- I just used that as an example. The upshot here is that you cannot sepa... Jeff Layton
03:25 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Jeff Layton wrote:
> No, the order has already been established, because each operation _really_ starts with the per...
Patrick Donnelly
12:31 PM Backport #37906 (New): mimic: make cephfs-data-scan reconstruct snaptable
first attempted backport PR https://github.com/ceph/ceph/pull/26298 was closed Nathan Cutler
09:23 AM Bug #38452: mds: assert crash loop while unlinking file
... Zheng Yan

03/04/2019

06:17 PM Bug #21884 (Resolved): client: populate f_fsid in statfs output
The kernel FUSE fs hasn't gained the capability to present a f_fsid and adding it would open a big can of worms (sinc... Jeff Layton
06:02 PM Feature #18537 (Rejected): libcephfs cache invalidation upcalls
I looked at this, but I think the real solution to this problem is to just prevent ganesha from caching these objects... Jeff Layton
05:02 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
To elaborate, today the rule looks something like this:
"If there is a uid= qualifier in the cap, and the call is ...
Jeff Layton
03:23 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Thanks, Zheng! That worked. For the record, based on my testcase, I did:... Jeff Layton
02:05 PM Bug #17594: cephfs: permission checking not working (MDS should enforce POSIX permissions)
Dusting this bug off after quite some time. I'm trying to understand what it will take to get the MDS to enforce perm... Jeff Layton
03:40 PM Bug #38376 (Rejected): client:Repeated release of put_request() in make_request()
Turns out this isn't a bug. Thanks for making the issue anyway. Patrick Donnelly
03:03 PM Feature #12334 (Rejected): nfs-ganesha: handle client cache pressure in NFS Ganesha FSAL
I've not heard of anyone hitting this that has set up ganesha to use the CACHEINODE and EXPORT parameters recommended... Jeff Layton
02:54 PM Bug #38452: mds: assert crash loop while unlinking file
Here you go. Jérôme Poulin
02:43 PM Bug #38452: mds: assert crash loop while unlinking file
removing replay/rejoin/reconnect should be good Zheng Yan
02:15 PM Bug #38452: mds: assert crash loop while unlinking file
Sorry, my mistake; the new file with debug-mds=20 is 520 MB now.
Since the resulting compressed file has 10 MB, d...
Jérôme Poulin
07:51 AM Bug #38452: mds: assert crash loop while unlinking file
looks like you set debug_ms to 20, not debug_mds Zheng Yan
07:16 AM Bug #38452: mds: assert crash loop while unlinking file
Here is the whole log attached (8.4MB) Jérôme Poulin
07:09 AM Bug #38452: mds: assert crash loop while unlinking file
please add 'debug_mds = 20' to ceph.conf, then trigger the assertion again Zheng Yan
02:31 AM Bug #38452: mds: assert crash loop while unlinking file
Here is the last assert I got; do you think they're caused by the same source?... Jérôme Poulin
02:28 AM Bug #38452: mds: assert crash loop while unlinking file
I can reproduce the problem by deleting a specific file if you need more information. This is a development environme... Jérôme Poulin
02:14 PM Bug #38488 (Fix Under Review): luminous: mds: message invalid access
Patrick Donnelly
01:03 PM Bug #38326 (Fix Under Review): mds: evict stale client when one of its write caps are stolen
Zheng Yan
07:10 AM Bug #38498 (Closed): common/Thread.cc: 160: FAILED assert(ret == 0)--10.2.10
Zheng Yan
02:03 AM Bug #38498: common/Thread.cc: 160: FAILED assert(ret == 0)--10.2.10
Sorry, my system's threads-max limit had been reached.
lin zhou

03/01/2019

03:39 PM Backport #38545 (Resolved): luminous: qa: "Loading libcephfs-jni: Failure!"
https://github.com/ceph/ceph/pull/26820 Nathan Cutler
03:39 PM Backport #38544 (Resolved): mimic: qa: "Loading libcephfs-jni: Failure!"
https://github.com/ceph/ceph/pull/26807 Nathan Cutler
03:39 PM Backport #38543 (Resolved): luminous: qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs becau...
https://github.com/ceph/ceph/pull/26805 Nathan Cutler
03:39 PM Backport #38542 (Resolved): mimic: qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs because ...
https://github.com/ceph/ceph/pull/26804 Nathan Cutler
03:38 PM Backport #38541 (Resolved): luminous: qa: fsstress with valgrind may timeout
https://github.com/ceph/ceph/pull/26776 Nathan Cutler
03:38 PM Backport #38540 (Resolved): mimic: qa: fsstress with valgrind may timeout
https://github.com/ceph/ceph/pull/27432 Nathan Cutler
02:52 PM Bug #38487 (Pending Backport): qa: "Loading libcephfs-jni: Failure!"
Patrick Donnelly
02:46 PM Bug #38518 (Pending Backport): qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs because clie...
Patrick Donnelly
02:43 PM Bug #38520 (Pending Backport): qa: fsstress with valgrind may timeout
Patrick Donnelly
02:03 PM Bug #37929 (Resolved): MDSMonitor: missing osdmon writeable check
Nathan Cutler
02:03 PM Backport #37989 (Resolved): luminous: MDSMonitor: missing osdmon writeable check
Nathan Cutler
02:03 PM Bug #37639 (Resolved): mds: output client IP of blacklisted/evicted clients to cluster log
Nathan Cutler
02:01 PM Backport #37823 (Resolved): luminous: mds: output client IP of blacklisted/evicted clients to clu...
Nathan Cutler
02:01 PM Bug #36367 (Resolved): mds: wait shorter intervals to send beacon if laggy
Nathan Cutler
02:00 PM Backport #37908 (Resolved): luminous: mds: wait shorter intervals to send beacon if laggy
Nathan Cutler
02:00 PM Bug #38010 (Resolved): mds: cache drop should trim cache before flushing journal
Nathan Cutler
02:00 PM Backport #38102 (Resolved): luminous: mds: cache drop should trim cache before flushing journal
Nathan Cutler
01:59 PM Bug #38054 (Resolved): mds: broadcast quota message to client when disable quota
Nathan Cutler
01:59 PM Backport #38190 (Resolved): luminous: mds: broadcast quota message to client when disable quota
Nathan Cutler
01:58 PM Bug #38263 (Resolved): mds: fix potential re-evaluate stray dentry in _unlink_local_finish
Nathan Cutler
01:57 PM Backport #38336 (Resolved): luminous: mds: fix potential re-evaluate stray dentry in _unlink_loca...
Nathan Cutler
07:41 AM Bug #38471 (Fix Under Review): mds does not wait for cap revoke when lock is in LOCK_LOCK_XLOCK s...
Zheng Yan
04:26 AM Bug #38471: mds does not wait for cap revoke when lock is in LOCK_LOCK_XLOCK state.
Patrick Donnelly
04:25 AM Bug #38491 (Duplicate): "log [WRN] : Health check failed: 1 clients failing to respond to capabil...
Patrick Donnelly

02/28/2019

05:37 PM Bug #38520 (Fix Under Review): qa: fsstress with valgrind may timeout
Patrick Donnelly
05:35 PM Bug #38520 (Resolved): qa: fsstress with valgrind may timeout
/ceph/teuthology-archive/pdonnell-2019-02-28_08:44:44-fs-wip-pdonnell-testing-20190228.054239-distro-basic-smithi/364... Patrick Donnelly
05:04 PM Bug #38518 (Fix Under Review): qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs because clie...
Patrick Donnelly
05:02 PM Bug #38518 (Resolved): qa: tasks.cephfs.test_misc.TestMisc.test_fs_new hangs because clients cann...
When the file system is deleted, the client hangs during unmount.
http://pulpito.ceph.com/pdonnell-2019-02-28_08:4...
Patrick Donnelly

02/27/2019

04:01 PM Bug #38490: mds: multimds stuck
Zheng Yan wrote:
> Patrick Donnelly wrote:
> > Similar: /ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multi...
Patrick Donnelly
02:45 PM Bug #38490: mds: multimds stuck
For /ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multimds-wip-pdonnell-testing-20190226.051327-distro-basic-... Zheng Yan
02:37 PM Bug #38490: mds: multimds stuck
Patrick Donnelly wrote:
> Similar: /ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multimds-wip-pdonnell-testi...
Zheng Yan
01:13 PM Bug #38491: "log [WRN] : Health check failed: 1 clients failing to respond to capability release ...
Dup of http://tracker.ceph.com/issues/38471. It's a real bug (I don't know which change revealed it). Zheng Yan
10:30 AM Bug #38498 (Closed): common/Thread.cc: 160: FAILED assert(ret == 0)--10.2.10
Hi, guys
So far, 10 OSD services have exited because of this error.
The error messages are all the same.
...
lin zhou

02/26/2019

10:44 PM Bug #38491 (Resolved): "log [WRN] : Health check failed: 1 clients failing to respond to capabili...
... Patrick Donnelly
10:18 PM Backport #37989: luminous: MDSMonitor: missing osdmon writeable check
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26065
merged
Yuri Weinstein
10:14 PM Backport #37823: luminous: mds: output client IP of blacklisted/evicted clients to cluster log
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25858
merged
Yuri Weinstein
10:13 PM Backport #37908: luminous: mds: wait shorter intervals to send beacon if laggy
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25979
merged
Yuri Weinstein
10:11 PM Backport #38102: luminous: mds: cache drop should trim cache before flushing journal
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26215
merged
Yuri Weinstein
10:11 PM Backport #38190: luminous: mds: broadcast quota message to client when disable quota
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26293
merged
Yuri Weinstein
10:10 PM Backport #38336: luminous: mds: fix potential re-evaluate stray dentry in _unlink_local_finish
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26473
merged
Yuri Weinstein
10:04 PM Bug #38490: mds: multimds stuck
-Similar: /ceph/teuthology-archive/pdonnell-2019-02-26_07:49:50-multimds-wip-pdonnell-testing-20190226.051327-distro-... Patrick Donnelly
09:30 PM Bug #38490 (New): mds: multimds stuck
Sorry for vague $subject, not sure what's wrong yet.... Patrick Donnelly
08:26 PM Backport #38084: luminous: mds: log new client sessions with various metadata
Backport caused #38488. Patrick Donnelly
08:26 PM Bug #38488 (Resolved): luminous: mds: message invalid access
... Patrick Donnelly
07:31 PM Bug #38487 (Fix Under Review): qa: "Loading libcephfs-jni: Failure!"
Patrick Donnelly
07:27 PM Bug #38487 (Resolved): qa: "Loading libcephfs-jni: Failure!"
... Patrick Donnelly
07:28 PM Bug #24517 (Duplicate): "Loading libcephfs-jni: Failure!" in fs suite
Patrick Donnelly
07:27 PM Bug #16640 (Won't Fix): libcephfs: Java bindings failing to load on CentOS
Patrick Donnelly
03:24 AM Backport #38448 (In Progress): mimic: src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
https://github.com/ceph/ceph/pull/26643 Prashant D
01:18 AM Backport #38449 (In Progress): luminous: src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
https://github.com/ceph/ceph/pull/26642 Prashant D

02/25/2019

08:27 AM Bug #38471: mds does not wait for cap revoke when lock is in LOCK_LOCK_XLOCK state.
This can cause an incorrect warning (pending == issued)
2019-02-25 16:24:15.384411 mds.b [WRN] client.31684173 isn't...
Zheng Yan
08:25 AM Bug #38471 (Duplicate): mds does not wait for cap revoke when lock is in LOCK_LOCK_XLOCK state.
shared cap gets revoked during 'LOCK_XLOCKDONE -> LOCK_LOCK_XLOCK' transition. Zheng Yan

02/23/2019

06:01 AM Bug #36507: client: connection failure during reconnect causes client to hang
The above error was solved in this PR:
https://github.com/ceph/ceph/pull/25343
huanwen ren

02/22/2019

07:37 PM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
I restored the backup of the meta and data pool for the CephFS on another cluster but I'm not able to attach a filesy... Jérôme Poulin
05:07 PM Bug #38452 (Need More Info): mds: assert crash loop while unlinking file
Here is the stack trace; it was caused by PostgreSQL trying to unlink a file in the log archive.... Jérôme Poulin
03:16 PM Bug #37644 (Resolved): extend reconnect period when mds is busy
Nathan Cutler
03:16 PM Backport #37740 (Resolved): mimic: extend reconnect period when mds is busy
Nathan Cutler
03:16 PM Bug #36035 (Resolved): mds: MDCache.cc: 11673: abort()
Nathan Cutler
03:15 PM Backport #37480 (Resolved): mimic: mds: MDCache.cc: 11673: abort()
Nathan Cutler
03:15 PM Backport #38189 (Resolved): mimic: mds: broadcast quota message to client when disable quota
Nathan Cutler
02:35 PM Backport #38449 (Resolved): luminous: src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
https://github.com/ceph/ceph/pull/26642 Nathan Cutler
02:35 PM Backport #38448 (Resolved): mimic: src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
https://github.com/ceph/ceph/pull/26643 Nathan Cutler
02:34 PM Backport #38445 (Resolved): luminous: mds: drop cache does not timeout as expected
https://github.com/ceph/ceph/pull/27342 Nathan Cutler
02:34 PM Backport #38444 (Rejected): mimic: mds: drop cache does not timeout as expected
Nathan Cutler
07:14 AM Bug #36507: client: connection failure during reconnect causes client to hang
When I use the "evict" feature, you can see this output, but in the end it is restored automatically (after about 10s).
Ceph...
huanwen ren

02/21/2019

07:53 PM Bug #15783 (In Progress): client: enable acls by default
Patrick Donnelly
06:43 PM Bug #36384 (Pending Backport): src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
Patrick Donnelly
06:27 PM Bug #38348 (Pending Backport): mds: drop cache does not timeout as expected
Patrick Donnelly
06:10 PM Bug #38347 (Resolved): ceph.in: cephfs status line may not show active ranks when standby-replay ...
Patrick Donnelly
12:02 AM Bug #38285 (Resolved): MDCache::finish_snaprealm_reconnect() create and drop MClientSnap message
Patrick Donnelly

02/20/2019

09:36 PM Bug #38009 (Resolved): client: session flush does not cause cap release message flush
Patrick Donnelly
09:36 PM Backport #38103 (Resolved): mimic: client: session flush does not cause cap release message flush
Patrick Donnelly
09:29 PM Backport #38103: mimic: client: session flush does not cause cap release message flush
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26424
merged
Yuri Weinstein
09:36 PM Backport #38335 (Resolved): mimic: mds: fix potential re-evaluate stray dentry in _unlink_local_f...
Patrick Donnelly
09:28 PM Backport #38335: mimic: mds: fix potential re-evaluate stray dentry in _unlink_local_finish
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26474
merged
Yuri Weinstein
09:35 PM Backport #38334 (Resolved): mimic: MDCache::finish_snaprealm_reconnect() create and drop MClientS...
Patrick Donnelly
09:28 PM Backport #38334: mimic: MDCache::finish_snaprealm_reconnect() create and drop MClientSnap message
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26472
merged
Yuri Weinstein
03:24 AM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
Restart the MDS with debug_ms=1. I am curious what error the OSD returned. Zheng Yan

02/19/2019

10:19 PM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
All PGs were active+clean; the only indication was that the filesystem was failed and all MDS daemons were missing. Jérôme Poulin
08:46 PM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
The assert indicates that the file could not be truncated, probably due to RADOS errors. The current status... Patrick Donnelly
06:09 AM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
I was able to mount the volume after removing a couple of asserts, and to back up and restore the data into a new filesystem. All pro... Jérôme Poulin
03:38 AM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
# cephfs-journal-tool --rank=cephfs:0 journal inspect
Overall journal integrity: OK
# Extracts from the journal (...
Jérôme Poulin
02:35 AM Support #38374: Crash when using cephfs as /var/lib/docker in devicemapper mode
If I remove the assert, the MDS switches to damaged. Jérôme Poulin
02:34 AM Support #38374 (New): Crash when using cephfs as /var/lib/docker in devicemapper mode
It seems that when Docker creates the first device with dmsetup, the MDS crashes and all further MDS replaying the log cra... Jérôme Poulin
07:38 PM Tasks #38386 (Closed): qa: write kernel fscache tests
The testing kernel is already being built with CONFIG_FSCACHE and CONFIG_CEPH_FSCACHE. We now need tests which exerci... Patrick Donnelly
06:28 AM Bug #38376 (Rejected): client:Repeated release of put_request() in make_request()
code repetition:
Client::make_request()
{
...
if (!request->reply) {
ceph_assert(request->aborted())...
huanwen ren
04:27 AM Backport #35975: mimic: mds: configurable timeout for client eviction
Follow-up: https://github.com/ceph/ceph/pull/26496
merged
Patrick Donnelly

02/18/2019

10:01 PM Backport #37740: mimic: extend reconnect period when mds is busy
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/25785
merged
Yuri Weinstein
10:00 PM Backport #37480: mimic: mds: MDCache.cc: 11673: abort()
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26252
merged
Yuri Weinstein
09:59 PM Backport #38189: mimic: mds: broadcast quota message to client when disable quota
Nathan Cutler wrote:
> https://github.com/ceph/ceph/pull/26292
merged
Yuri Weinstein
03:46 AM Backport #38335 (In Progress): mimic: mds: fix potential re-evaluate stray dentry in _unlink_loca...
https://github.com/ceph/ceph/pull/26474 Prashant D
01:46 AM Backport #38336 (In Progress): luminous: mds: fix potential re-evaluate stray dentry in _unlink_l...
https://github.com/ceph/ceph/pull/26473 Prashant D
12:54 AM Backport #38334 (In Progress): mimic: MDCache::finish_snaprealm_reconnect() create and drop MClie...
https://github.com/ceph/ceph/pull/26472 Prashant D

02/16/2019

10:35 PM Bug #38348 (Fix Under Review): mds: drop cache does not timeout as expected
Patrick Donnelly
01:42 AM Bug #38348 (Resolved): mds: drop cache does not timeout as expected
... Patrick Donnelly
10:54 AM Backport #38350 (Rejected): luminous: mds: decoded LogEvent may leak during shutdown
Nathan Cutler
10:54 AM Backport #38349 (Rejected): mimic: mds: decoded LogEvent may leak during shutdown
Nathan Cutler
01:09 AM Bug #38347 (Fix Under Review): ceph.in: cephfs status line may not show active ranks when standby...
Patrick Donnelly

02/15/2019

11:17 PM Bug #38347 (Resolved): ceph.in: cephfs status line may not show active ranks when standby-replay ...
e.g.... Patrick Donnelly
08:11 PM Bug #36384 (Fix Under Review): src/osdc/Journaler.cc: 420: FAILED ceph_assert(!r)
Patrick Donnelly
06:36 PM Bug #38297 (Resolved): ceph-mgr/volumes: fs subvolumes not created within specified fs volumes
Patrick Donnelly
06:29 PM Bug #38324 (Pending Backport): mds: decoded LogEvent may leak during shutdown
Patrick Donnelly
04:06 PM Backport #38340 (Resolved): luminous: mds: may leak gather during cache drop
https://github.com/ceph/ceph/pull/27342 Nathan Cutler
04:06 PM Backport #38339 (Rejected): mimic: mds: may leak gather during cache drop
Nathan Cutler
04:06 PM Backport #38336 (Resolved): luminous: mds: fix potential re-evaluate stray dentry in _unlink_loca...
https://github.com/ceph/ceph/pull/26473 Nathan Cutler
04:05 PM Backport #38335 (Resolved): mimic: mds: fix potential re-evaluate stray dentry in _unlink_local_f...
https://github.com/ceph/ceph/pull/26474 Nathan Cutler
04:05 PM Backport #38334 (Resolved): mimic: MDCache::finish_snaprealm_reconnect() create and drop MClientS...
https://github.com/ceph/ceph/pull/26472 Nathan Cutler

02/14/2019

11:04 PM Bug #38326 (Resolved): mds: evict stale client when one of its write caps are stolen
IIUC: After mdsmap.session_time, the current behavior is that a stale session's caps' issued set is revoked and chang... Patrick Donnelly
08:41 PM Bug #38297 (Fix Under Review): ceph-mgr/volumes: fs subvolumes not created within specified fs vo...
Patrick Donnelly
06:51 PM Bug #38324 (Fix Under Review): mds: decoded LogEvent may leak during shutdown
Patrick Donnelly
06:13 PM Bug #38324 (Resolved): mds: decoded LogEvent may leak during shutdown
... Patrick Donnelly
06:22 PM Bug #38137 (Pending Backport): mds: may leak gather during cache drop
Patrick Donnelly
06:21 PM Bug #38263 (Pending Backport): mds: fix potential re-evaluate stray dentry in _unlink_local_finish
Patrick Donnelly
06:21 PM Bug #38285 (Pending Backport): MDCache::finish_snaprealm_reconnect() create and drop MClientSnap ...
Patrick Donnelly
11:00 AM Backport #38085 (Need More Info): mimic: mds: log new client sessions with various metadata
non-trivial backport. Looks like it could be attempted by an ambitious backporter. Nathan Cutler
10:50 AM Backport #38097 (Need More Info): mimic: mds: optimize revoking stale caps
backport looks non-trivial Nathan Cutler
 
