Bug #8378
closed
krbd: Kernel oops in rbd_img_obj_callback
% Done: 0%
Source: Community (user)
Severity: 3 - minor
Description
Assertion failure in rbd_img_obj_callback() at line 2143:

rbd_assert(more ^ (which == img_request->obj_request_count));

------------[ cut here ]------------
kernel BUG at /build/buildd/linux-3.11.0/drivers/block/rbd.c:2143!
invalid opcode: 0000 [#1] SMP
Modules linked in: vxlan ip_tunnel cls_u32 ebt_mark ebtable_filter cls_fw ebt_arp ebt_ip vhost_net vhost macvtap macvlan rbd libceph sch_cbq ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables nbd 8021q garp mrp vesafb coretemp kvm_intel bridge stp llc kvm joydev hid_generic gpio_ich dcdbas dm_multipath scsi_dh microcode psmouse serio_raw usbhid hid lpc_ich i7core_edac edac_core wmi mac_hid acpi_power_meter bonding lp parport btrfs xor zlib_deflate raid6_pq libcrc32c ses enclosure megaraid_sas bnx2
CPU: 6 PID: 6340 Comm: kworker/6:4 Tainted: G I 3.11.0-20-generic #35-Ubuntu
Hardware name: Dell Inc. PowerEdge R610/0YF3T8, BIOS 6.4.0 07/23/2013
Workqueue: ceph-msgr con_work [libceph]
task: ffff88104b6b5dc0 ti: ffff880f72d52000 task.ti: ffff880f72d52000
RIP: 0010:[<ffffffffa03b66d0>] [<ffffffffa03b66d0>] rbd_img_obj_callback+0x530/0x540 [rbd]
RSP: 0018:ffff880f72d53be8 EFLAGS: 00010082
RAX: 000000000000007b RBX: ffff8807892cd7b0 RCX: 0000000000000000
RDX: ffff88107fc70000 RSI: ffff88107fc6e428 RDI: 0000000000000046
RBP: ffff880f72d53c30 R08: 0000000000000000 R09: 0000000000001400
R10: 0000000000000050 R11: ffffc90017a2bff8 R12: 0000000000000001
R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88107fc60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002b628abb3000 CR3: 0000000001c0e000 CR4: 00000000000027e0
Stack:
 ffff8807892cd7bc ffffffffa036bb18 ffff8807892cd7e0 ffff8807892cd780
 ffff880793cdbe40 ffff88078a91e8a0 0000000000000000 0000000000000004
 ffff8806ad06a000 ffff880f72d53c48 ffffffffa03b48b7 ffff880793cdbe40
Call Trace:
 [<ffffffffa036bb18>] ? ceph_tcp_recvmsg+0x48/0x60 [libceph]
 [<ffffffffa03b48b7>] rbd_obj_request_complete+0x27/0x70 [rbd]
 [<ffffffffa03b7d6f>] rbd_osd_req_callback+0xdf/0x4e0 [rbd]
 [<ffffffffa0378049>] dispatch+0x489/0x8e0 [libceph]
 [<ffffffffa036e98b>] try_read+0x4ab/0x10a0 [libceph]
 [<ffffffffa0370829>] con_work+0xb9/0x630 [libceph]
 [<ffffffff8107d10c>] process_one_work+0x17c/0x430
 [<ffffffff8107e0cc>] worker_thread+0x11c/0x3c0
 [<ffffffff8107dfb0>] ? manage_workers.isra.25+0x2a0/0x2a0
 [<ffffffff810848f0>] kthread+0xc0/0xd0
 [<ffffffff816edea9>] ? schedule+0x29/0x70
 [<ffffffff81084830>] ? kthread_create_on_node+0x120/0x120
 [<ffffffff816f8a6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff81084830>] ? kthread_create_on_node+0x120/0x120
Code: a0 31 c0 e8 3f cb 32 e1 0f 0b 48 c7 c1 10 cf 3b a0 ba 5f 08 00 00 48 c7 c6 50 e8 3b a0 48 c7 c7 d0 ca 3b a0 31 c0 e8 1c cb 32 e1 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
RIP [<ffffffffa03b66d0>] rbd_img_obj_callback+0x530/0x540 [rbd]
 RSP <ffff880f72d53be8>
---[ end trace 260af37d80202466 ]---
The Ubuntu Linux kernel 3.11.0-20 is rebased on upstream kernel 3.11.10. I read Bug #5647, and the kernel call trace there looks almost the same as this one, but kernel 3.11.10 should already include the patch for Bug #5647.
Our Ceph cluster runs v0.72.2.
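For readers unfamiliar with the check quoted in the description, the XOR in the failed assertion requires that exactly one of two conditions holds when the callback finishes. The user-space sketch below only illustrates that logic and is not the code in rbd.c: `completion_state_consistent` is a hypothetical helper, and reading `more` as "the image request still has outstanding work" is an assumption based on the variable name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space helper mirroring the shape of the failed check:
 *     rbd_assert(more ^ (which == img_request->obj_request_count));
 * Assumption: `more` means "the image request still has outstanding work".
 */
static bool completion_state_consistent(bool more, unsigned int which,
                                        unsigned int obj_request_count)
{
	/* True once every object request index has been walked. */
	bool all_done = (which == obj_request_count);

	/* Exactly one may be true; both operands are bool, so ^ acts logically. */
	return more ^ all_done;
}
```

Under that reading, the assertion trips in two cases: work is still reported as pending although every object request has been walked, or nothing is pending although some object requests have not been walked yet.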
Updated by Ilya Dryomov almost 10 years ago
Hi Rain,
We have a patch pending that I hope will fix this, but it needs more testing.
I'll notify you when it's ready.
Updated by Rain Li almost 10 years ago
Hi Ilya,
I would like to know some details about this patch. Will it be applied to the RBD kernel module or to the OSD server? Could you show me some of the code for this patch?
Updated by Ilya Dryomov almost 10 years ago
- Status changed from New to Resolved
Should be fixed by commit 0f2d5be792b0 ("rbd: use reference counts for image requests"), which went into 3.16-rc1.
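The commit title names reference counting as the approach. As a general illustration of that technique (a user-space sketch, not the code from commit 0f2d5be792b0), the example below keeps a shared request alive until every holder has dropped its reference, so one completion path cannot free the request while another is still using it. All names here (`img_req`, `img_req_create`, `img_req_get`, `img_req_put`) are hypothetical, and C11 atomics stand in for the kernel's own reference-counting helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shared image request. */
struct img_req {
	atomic_int refs;   /* number of holders still using the request */
	int *freed_flag;   /* demo only: set to 1 when the request is freed */
};

/* Allocate a request that starts with one reference, owned by the caller. */
static struct img_req *img_req_create(int *freed_flag)
{
	struct img_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	atomic_init(&req->refs, 1);
	req->freed_flag = freed_flag;
	*freed_flag = 0;
	return req;
}

/* Take an extra reference before handing the request to another path. */
static void img_req_get(struct img_req *req)
{
	atomic_fetch_add(&req->refs, 1);
}

/* Drop a reference; only the last holder frees. Returns true if freed. */
static bool img_req_put(struct img_req *req)
{
	/* atomic_fetch_sub returns the value before the decrement. */
	if (atomic_fetch_sub(&req->refs, 1) != 1)
		return false;
	*req->freed_flag = 1;
	free(req);
	return true;
}
```

In this sketch a callback would call `img_req_get()` before it starts working on the request and `img_req_put()` when it is done, so the memory is released only after the last user has finished with it.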