Activity
From 03/09/2017 to 04/07/2017
04/07/2017
- 01:38 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- I suspect simple msgr also has this problem, maybe we can try this
04/06/2017
- 06:39 PM Feature #17204: Implement new-style ENOSPC handling in kclient
- Ilya seems to be OK with v7 of the set, and I merged that into the kernel testing branch yesterday. How do we turn on...
04/04/2017
- 12:50 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> Yes, we also don't hold a reference to CEPH_CAP_FILE_SHARED in this code. I suppose that's the ...
- 12:31 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> So, my thinking was to call ceph_try_get_caps in ceph_readdir to grab CEPH_CAP_FILE_RD and CEPH...
- 10:20 AM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- I read a little of kernel/net/ceph/messenger.cc; it looks like it does not actually respect the reconnect seq. Maybe we could disabl...
04/03/2017
- 02:28 PM Bug #19127: NULL pointer dereference in ceph_readdir
- So, my thinking was to call ceph_try_get_caps in ceph_readdir to grab CEPH_CAP_FILE_RD and CEPH_CAP_FILE_SHARED. ceph...
- 01:51 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Zheng Yan wrote:
> Jeff Layton wrote:
> > Zheng Yan wrote:
> >
> > > no vmcore
> >
> > Bummer. Do you have th...
- 01:45 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> Zheng Yan wrote:
>
> > no vmcore
>
> Bummer. Do you have the text in the log from before ...
- 01:12 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Zheng Yan wrote:
> no vmcore
Bummer. Do you have the text in the log from before the Oops line? It'd be goo...
04/01/2017
- 03:24 PM Bug #11555 (Resolved): lock inversion related to memory reclaim
- We were allocating (with GFP_KERNEL) and destroying the cipher context on each encrypt/decrypt operation:
https://...
- 03:04 PM Bug #18543 (Closed): rbd map lun02 -p hdd2 rbd: sysfs write failed rbd: map failed: (5) Input/ou...
- 10:33 AM Bug #18130 (In Progress): soft lockups in ceph.ko
- No, not quite resolved yet. I have a couple of patches for this in the testing branch, but they're marked DNM for now...
- 02:44 AM Bug #18130 (Resolved): soft lockups in ceph.ko
- relevant code has been removed. (now vfs helpers are used)
- 02:50 AM Bug #15432: kcephfs: umount -f can fail after mds reconnect failure
- Based on Jeff's ENOSPC work, it should be easy to implement a function that aborts pending OSD requests for 'umount -f'.
- 02:18 AM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> Zheng Yan wrote:
> >
> > we call ceph_dir_clear_ordered() before splice_dentry(). But only f...
- 01:11 AM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> Another question too... why are we using CEPH_CAP_FILE_SHARED in this code? Shouldn't we require ...
03/31/2017
- 07:17 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- Hi Ilya,
There were Samba client tests occurring on the platform that were timing out with oplocks. The RBD was be...
- 02:52 PM Bug #19419 (Need More Info): XFS filesystem on RBD image was corrupt after remount
- There is no way to tell at this point. It's a possibility, although if that were the case it would have manifested s...
- 03:29 PM Feature #17524 (In Progress): krbd: support disabling auto-exclusive lock transition logic
- 12:02 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Zheng Yan wrote:
>
> we call ceph_dir_clear_ordered() before splice_dentry(). But only for dentry's current parent...
- 07:41 AM Bug #19127: NULL pointer dereference in ceph_readdir
- Jeff Layton wrote:
> We clear the parent's complete bit whenever we call splice_dentry, AFAICT. The only exception i...
03/30/2017
- 07:39 PM Bug #19127: NULL pointer dereference in ceph_readdir
- We clear the parent's complete bit whenever we call splice_dentry, AFAICT. The only exception is in the readdir code ...
- 06:20 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- The RBD client system was gracefully rebooted at Mar 17 10:22:36.
- 06:13 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- Bryan Apperson wrote:
> Hi Ilya,
>
> The log starts before the reboot - the corruption appeared before the reboot...
- 06:13 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- Hi Ilya,
The log starts before the reboot - the corruption appeared before the reboot in the logs. The server had ...
- 04:19 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- BTW not only firefly is EOL, but that RHEL 7.0 kernel client is also very old. I'd recommend upgrading to the 7.3 ke...
- 04:09 PM Bug #19419: XFS filesystem on RBD image was corrupt after remount
- Hi Bryan,
> Then on March 16th the image became unresponsive. After attempting a reboot of the server and a remoun...
- 06:09 PM Feature #17204: Implement new-style ENOSPC handling in kclient
- Up to v6 of the set now, just re-posted it today. The main difference from v5 is some changes to address Ilya's comme...
- 04:01 PM Bug #19385 (Need More Info): rbd stuck on read/write op to ceph_osd
- Let me know if you can reproduce this on 4.9.z.
- 10:07 AM Bug #19122 (Resolved): pre-jewel "osd rm" incrementals are misinterpreted (kernel client)
- In 4.4.58, 4.9.19, 4.10.7.
03/29/2017
- 05:32 PM Bug #19419 (Need More Info): XFS filesystem on RBD image was corrupt after remount
- An RBD image was being used in production to back a Samba and NFS export. The 100TB image was formatted with XFS and ...
- 01:07 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- Still seeing this here:
http://pulpito.ceph.com/jspray-2017-03-29_01:19:13-multimds-wip-jcsp-testing-20170328-test...
03/28/2017
- 05:03 PM Bug #19385: rbd stuck on read/write op to ceph_osd
- Ilya, thanks.
I will try it.
- 02:33 PM Bug #19385: rbd stuck on read/write op to ceph_osd
- There are known issues with resending code in older kernels and manually restarting OSDs is indeed one of the workaro...
- 11:18 AM Bug #19385: rbd stuck on read/write op to ceph_osd
- Ilya, Hi, it's 3.18.43-40
- 01:41 PM Bug #19127: NULL pointer dereference in ceph_readdir
- I suspect the first check does not properly handle the d_splice_alias() case.
- 01:39 PM Bug #19127: NULL pointer dereference in ceph_readdir
- The idea is borrowed from ncpfs. When receiving reply from server, the readdir code fills dentry pointers in page cac...
03/27/2017
- 01:12 PM Bug #19127: NULL pointer dereference in ceph_readdir
- Ouch...the ceph readdir code seems to keep pointers to dentries in the ceph_readdir_cache_control, but doesn't hold r...
- 01:10 PM Bug #19385: rbd stuck on read/write op to ceph_osd
- Which kernel is this on?
- 09:24 AM Bug #19309 (Pending Backport): IO Hang on raw rbd device using libceph kernel module
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=633ee407b9d15a75ac9740ba9d3338815e1fcb9...
03/24/2017
- 10:08 PM Bug #19385 (Closed): rbd stuck on read/write op to ceph_osd
- Sometimes, after a massive OSD restart (server with shelf), an rbd image (mapped and mounted xfs) gets stuck on read or write ope...
03/22/2017
- 11:13 AM Bug #19309 (Fix Under Review): IO Hang on raw rbd device using libceph kernel module
- [PATCH] libceph: force GFP_NOIO for socket allocations
- 02:12 AM Bug #19309 (In Progress): IO Hang on raw rbd device using libceph kernel module
03/20/2017
- 05:13 PM Bug #19309: IO Hang on raw rbd device using libceph kernel module
- sorry for "raw" mistake in subject
- 07:35 AM Bug #19309: IO Hang on raw rbd device using libceph kernel module
- Ilya, Hi.
I have three rbd devices mapped on this host. They are formatted with xfs and mounted.
It's stuck on "rm ...
03/19/2017
- 11:16 PM Bug #19309: IO Hang on raw rbd device using libceph kernel module
- Hi Sergey,
The title says "raw rbd device", but the traces suggest that there is an xfs filesystem on top of the d...
- 10:00 PM Bug #19309: IO Hang on raw rbd device using libceph kernel module
- Interesting is:
Mar 19 22:18:00 kikimr0056 kernel: [288420.742789] Workqueue: events handle_osds_timeout [libceph]...
- 09:55 PM Bug #19309 (Resolved): IO Hang on raw rbd device using libceph kernel module
- Ilya, Hi!
We have a periodic problem with the ceph rbd kernel module.
IO gets stuck, and our working process is put in D stat...
03/14/2017
- 10:46 AM Bug #19275 (Resolved): stable-writes flag gets reset underneath rbd since 4.4
- ...
- 09:25 AM Bug #19095 (Resolved): handle image feature mismatches
- 08:07 AM Bug #19272 (New): kcephfs: got forward for unsafe request
- ...
03/13/2017
- 03:42 AM Bug #19189 (Resolved): cephfs kernel 4.9.13 file read hangs
- the fix is merged into stable-4.9 tree
03/12/2017
- 09:44 AM Bug #19122 (Pending Backport): pre-jewel "osd rm" incrementals are misinterpreted (kernel client)
- https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b581a5854eee4b7851dedb0f8c2ceb54fb902c0...
03/10/2017
- 10:15 PM Bug #19095 (Fix Under Review): handle image feature mismatches
- https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=8767b293a4ab6632f9288f34bcf2ab9ba20dca3a
...
03/09/2017
- 03:08 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- BTW, it looks like this crash only happens with multimds? Kernel client + multimds is a clue.
- 03:08 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- Ah, I checked Patrick Donnelly's log before. But I really have no experience with the ceph kernel code; I don't have any...
- 02:56 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
- Lots more of these on latest run:
http://pulpito.ceph.com/jspray-2017-03-08_14:08:01-multimds-master-testing-basic-s...