Activity

From 01/19/2017 to 02/17/2017

02/17/2017

04:36 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Haomai, here's a test run:
http://pulpito.ceph.com/pdonnell-2017-02-17_15:35:17-multimds:thrash-master-testing-bas...
Patrick Donnelly

02/13/2017

04:07 PM Bug #17153: kernel hung task warnings on teuthology.front kernel
My guess at this point is that this may be a different manifestation of bug 18130. This is also occurring in the ceph... Jeff Layton
04:00 PM Bug #18686 (Resolved): too many on the wire revalidations from ceph_d_revalidate
Moving to Resolved under the assumption that we'll be merging this set into v4.11. Jeff Layton
03:59 PM Bug #18474 (Resolved): oops in __unregister_request
Jeff Layton

02/09/2017

12:16 PM Bug #18474: oops in __unregister_request
On second thought, I don't really like that since __register_request doesn't put the thing on the list. I'm going to ... Jeff Layton
06:50 AM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
yes.. Haomai Wang

02/08/2017

03:59 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
It looks like teuthology sets "ms die on old message = true" in its ceph.conf.template file.
Haomai, we don't *exp...
Greg Farnum
02:05 AM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Oh, I guess the kernel client handles reconnect seq inconsistently with the async msgr. If the fuse client helps, it would be great. Haomai Wang
02:25 PM Bug #18474: oops in __unregister_request
Jeff Layton wrote:
> Is there ever a time you'd want to remove it from the request tree but leave it on the s_unsafe...
Zheng Yan
12:05 PM Bug #18474: oops in __unregister_request
Zheng Yan wrote:
> I think wait_requests() should remove request from unsafe list before calling __unregister_reques...
Jeff Layton
10:20 AM Bug #18474: oops in __unregister_request
I think wait_requests() should remove request from unsafe list before calling __unregister_request() Zheng Yan

02/07/2017

11:11 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Client logs are missing because it's the kernel client. I will need to rerun the test suite to see if I can coax a fa... Patrick Donnelly
05:29 AM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Needs the client log too. http://qa-proxy.ceph.com/teuthology/pdonnell-2017-02-06_19:24:21-multimds:thrash-master-testin... Haomai Wang
02:58 PM Bug #18697: Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
I think we hit this again, but this time it is hanging there.
[980732.927323] BUG: unable to handle kernel NULL po...
Xiaoxi Chen
12:06 PM Bug #18474: oops in __unregister_request
I added this just before the kfree(req) in ceph_mdsc_release_request:... Jeff Layton
09:08 AM Bug #18807 (Duplicate): I/O error on rbd device after adding new OSD to Crush map
Ilya Dryomov
08:41 AM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
Hi Ilya.
I have upgraded the kernel and it looks like the bug was fixed; I can't reproduce it.
Thank you.
Nikita Shalnov
09:05 AM Bug #14901: misdirected requests on 4.2 during rebalancing
In 3.16.39. Ilya Dryomov

02/06/2017

08:23 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Here you go: /ceph/teuthology-archive/pdonnell-2017-02-06_19:24:21-multimds:thrash-master-testing-basic-smithi/791580... Patrick Donnelly
03:29 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
It has shown up many times in the multimds thrasher runs. I'll raise debug_ms for future runs. Patrick Donnelly
07:01 PM Bug #18686: too many on the wire revalidations from ceph_d_revalidate
Patchset posted last week:
http://marc.info/?l=ceph-devel&m=148614505021697&w=2
Jeff Layton
06:56 PM Feature #17204: Implement new-style ENOSPC handling in kclient
v2 of the series:
http://marc.info/?l=ceph-devel&m=148638777409413&w=2
Jeff Layton
03:22 PM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
Yes, I can. I will try it.
Thank you.
Nikita Shalnov
01:42 PM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
I wanted the host kernel -- looks like it's 3.16.36. This should be fixed in 3.16.39, which is in jessie AFAICT. Co... Ilya Dryomov
01:18 PM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
Hi Ilya,
I don't completely understand what you need: the kernel version of the VM or of the host, ...
Nikita Shalnov
12:49 PM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
Hi Nikita,
Which kernel are you running on test-hoster-kvm-buffer-01a?
Ilya Dryomov

02/04/2017

04:24 AM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Client <-> MDS with stateful_server policy, then the MDS crashed because of an old msg seq.
Can this teuthology job re...
Haomai Wang

02/03/2017

11:58 AM Bug #18807: I/O error on rbd device after adding new OSD to Crush map
Distributor ID: Debian
Description: Debian GNU/Linux 8.6 (jessie)
Release: 8.6
Codename: jessie
Nikita Shalnov
11:53 AM Bug #18807 (Duplicate): I/O error on rbd device after adding new OSD to Crush map
Hello.
I run Ceph Jewel and KVM hypervisor....
Nikita Shalnov

02/01/2017

11:41 AM Bug #18474: oops in __unregister_request
Yeah, could be a use-after-free, or maybe a refcounting imbalance in the session handling? I kicked off another xfste... Jeff Layton
01:25 AM Bug #18474: oops in __unregister_request
If op == CEPH_SESSION_CLOSE, handle_session() unregisters the session. Maybe it's a use-after-free bug. (I didn't check... Zheng Yan

01/31/2017

11:36 PM Bug #18474: oops in __unregister_request
Hit a very similar crash in testing today. This time it crashed while doing the list_del_init in cleanup_session_requ... Jeff Layton
01:26 PM Bug #18161: kernel client failing to look up mds_namespace gives ENOENT (but it exists)
Re-enable test: https://github.com/ceph/ceph/pull/13200 Nathan Cutler

01/30/2017

06:34 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Pinging Haomai. Sounds like the MDS is triggering cases we didn't hit in the OSD? Greg Farnum
04:35 PM Bug #18697: Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
@Jeff, can the ceph_setxattr be introduced by a customized location setting? We set this subdir to a dedicated pool.
C...
Xiaoxi Chen
04:28 PM Bug #18697: Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
Jeff Layton wrote:
> Yeah, that looks like the problem to me too. If I had to guess then I'd say that setting the AC...
Xiaoxi Chen
02:08 PM Bug #18697: Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
Yeah, that looks like the problem to me too. If I had to guess then I'd say that setting the ACL raced with an unlink... Jeff Layton
05:33 AM Bug #18697: Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
ceph_set_acl() code of 4.4 stable kernel ... Zheng Yan

01/29/2017

02:32 PM Bug #18671: kernel 4.8.15: BUG: soft lockup
Zheng Yan
02:32 PM Bug #18671: kernel 4.8.15: BUG: soft lockup
Fixed by https://github.com/ceph/ceph-client/commit/80723652311a202dc8aa253c28883e733737f4a9 Zheng Yan

01/27/2017

04:19 PM Bug #18697 (Closed): Kernel panic on cephfs kernel client (4.4.0-57-generic #78-Ubuntu SMP)
Jan 26 00:53:07 drdd-plcy-srv-1004515 kernel: [12409.020572] BUG: unable to handle kernel NULL pointer dereference at... Xiaoxi Chen
12:21 AM Bug #18671: kernel 4.8.15: BUG: soft lockup
I think it's an infinite loop of ceph_renew_caps, caused by the __cap_is_valid check in __ceph_caps_mds_wanted. Zheng Yan

01/26/2017

11:22 PM Bug #18690: kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
Happened in this run too: http://pulpito.ceph.com/pdonnell-2017-01-26_18:37:20-multimds:thrash-wip-multimds-tests-tes... Patrick Donnelly
11:20 PM Bug #18690 (Resolved): kclient: FAILED assert(0 == "old msgs despite reconnect_seq feature")
... Patrick Donnelly
04:00 PM Bug #18686 (Resolved): too many on the wire revalidations from ceph_d_revalidate
Zheng says:
> A user reported he saw ceph_d_revalidate() sends large volume getattr requests when running vdbench....
Jeff Layton

01/25/2017

03:46 PM Bug #18671: kernel 4.8.15: BUG: soft lockup
We have a similar problem on another machine, in this case the host itself is accessible:
Kernel 4.9.2
[Wed Jan...
Burkhard Linke
03:13 PM Bug #18671 (Resolved): kernel 4.8.15: BUG: soft lockup
Running kernel 4.8.15 from Ubuntu mainline PPA, a machine is stuck in a kernel bug:
[Wed Jan 25 15:32:46 2017] NMI...
Burkhard Linke

01/20/2017

03:24 PM Feature #17204: Implement new-style ENOSPC handling in kclient
v1 of the patch series:
http://marc.info/?l=ceph-devel&m=148492546411549&w=2
Jeff Layton
11:36 AM Bug #18161 (Resolved): kernel client failing to look up mds_namespace gives ENOENT (but it exists)
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=cc8e8342930129aa2c9b629e1653e4681f0896ea i... Ilya Dryomov
10:41 AM Feature #4690: krbd: support arbitrary length responses to class operations
No, just some groundwork. Ilya Dryomov
10:39 AM Bug #18543: rbd map lun02 -p hdd2 rbd: sysfs write failed rbd: map failed: (5) Input/output error
It looks like you have header CRCs disabled ("ms crc header = false" in ceph.conf). This is not supported by the ker... Ilya Dryomov

01/19/2017

11:58 PM Feature #4690: krbd: support arbitrary length responses to class operations
@Ilya: was this already completed? Jason Dillaman
 
