Bug #20998

RHEL74 GA kernel panicked on client node running smallfile tests with 3 active MDS

Added by Barry Marson about 2 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version:
Start date: 08/14/2017
Due date:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

While running Ben England's small file test with 2 clients and 3 active MDS servers, one of the clients went down with:

[339141.117253] BUG: unable to handle kernel NULL pointer dereference at 0000000000000530
[339141.117345] IP: [<ffffffff816abb3c>] _raw_spin_lock+0xc/0x30
[339141.117406] PGD 0
[339141.117430] Oops: 0002 [#1] SMP
[339141.117467] Modules linked in: ceph libceph dns_resolver ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr sg joydev iTCO_wdt iTCO_vendor_support ipmi_ssif dcdbas acpi_power_meter wmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad mei_me mei lpc_ich
[339141.118223] shpchp ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci ixgbe libata crct10dif_pclmul crct10dif_common tg3 crc32c_intel megaraid_sas i2c_core mdio dca ptp pps_core dm_mirror dm_region_hash dm_log dm_mod
[339141.118538] CPU: 14 PID: 10021 Comm: kworker/14:1 Not tainted 3.10.0-693.el7.x86_64 #1
[339141.118605] Hardware name: Dell Inc. PowerEdge R620/0KCKR5, BIOS 1.3.6 09/11/2012
[339141.118687] Workqueue: events delayed_work [ceph]
[339141.118731] task: ffff88081db36eb0 ti: ffff8808163dc000 task.ti: ffff8808163dc000
[339141.118793] RIP: 0010:[<ffffffff816abb3c>] [<ffffffff816abb3c>] _raw_spin_lock+0xc/0x30
[339141.118866] RSP: 0018:ffff8808163dfbf8 EFLAGS: 00010246
[339141.118913] RAX: 0000000000000000 RBX: ffff880258dc7f78 RCX: 0000000000000000
[339141.118973] RDX: 0000000000000001 RSI: ffff8808163dfd4c RDI: 0000000000000530
[339141.119033] RBP: ffff8808163dfc20 R08: 0000000000000000 R09: 0000000000000000
[339141.119092] R10: dfbf0dc47ab2e8f8 R11: 7fffffffffffffff R12: ffff8808163dfd4c
[339141.119152] R13: 0000000000000000 R14: ffff88080cf126f0 R15: ffff880258dc7f80
[339141.119212] FS: 0000000000000000(0000) GS:ffff88081fbc0000(0000) knlGS:0000000000000000
[339141.119280] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[339141.119329] CR2: 0000000000000530 CR3: 0000000816e0e000 CR4: 00000000000407e0
[339141.119390] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[339141.119450] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[339141.119508] Stack:
[339141.119529] ffffffffc0672c15 0000000000000000 ffff880258dc7f78 ffff8808163dfd4c
[339141.119600] 0000000000000000 ffff8808163dfc60 ffffffffc067416c ffff88080cf12a30
[339141.119670] 0000000000000000 ffff88101e92e990 ffff88080cf126f0 ffff88080cf126f0
[339141.119741] Call Trace:
[339141.119779] [<ffffffffc0672c15>] ? __cap_is_valid+0x25/0xb0 [ceph]
[339141.119845] [<ffffffffc067416c>] __ceph_caps_issued+0x5c/0xe0 [ceph]
[339141.119911] [<ffffffffc067620f>] ceph_check_caps+0x12f/0xba0 [ceph]
[339141.119977] [<ffffffffc067a3d6>] ceph_check_delayed_caps+0x86/0xf0 [ceph]
[339141.120047] [<ffffffffc0681705>] delayed_work+0x35/0x260 [ceph]
[339141.120103] [<ffffffff810a881a>] process_one_work+0x17a/0x440
[339141.120156] [<ffffffff810a94e6>] worker_thread+0x126/0x3c0
[339141.120207] [<ffffffff810a93c0>] ? manage_workers.isra.24+0x2a0/0x2a0
[339141.120264] [<ffffffff810b098f>] kthread+0xcf/0xe0
[339141.120309] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[339141.122548] [<ffffffff816b4f18>] ret_from_fork+0x58/0x90
[339141.124773] [<ffffffff810b08c0>] ? insert_kthread_work+0x40/0x40
[339141.126999] Code: 5d c3 0f 1f 44 00 00 85 d2 74 e4 0f 1f 40 00 eb ed 66 0f 1f 44 00 00 b8 01 00 00 00 5d c3 90 66 66 66 66 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 a4 2a ff ff 5d
[339141.131737] RIP [<ffffffff816abb3c>] _raw_spin_lock+0xc/0x30
[339141.134001] RSP <ffff8808163dfbf8>
[339141.136236] CR2: 0000000000000530

I do have a vmcore file. I just need to know if someone wants it and where to place it.

We are running luminous ceph-*-12.1.2-0.el7 bits.

Barry

History

#1 Updated by Zheng Yan about 2 years ago

  • Status changed from New to Verified

It's likely fixed by:

commit 4b9f2042fd2a9da7e6c7b4dd49eff19dc3754e4f
Author: Yan, Zheng <zyan@redhat.com>
Date:   Tue Jun 27 17:17:24 2017 +0800

    ceph: avoid accessing freeing inode in ceph_check_delayed_caps()

    Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
    Signed-off-by: Ilya Dryomov <idryomov@gmail.com>

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index f555245..7007ae2 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3809,6 +3809,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
  */
 void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
 {
+       struct inode *inode;
        struct ceph_inode_info *ci;
        int flags = CHECK_CAPS_NODELAY;

@@ -3824,9 +3825,15 @@ void ceph_check_delayed_caps(struct ceph_mds_client *mdsc)
                    time_before(jiffies, ci->i_hold_caps_max))
                        break;
                list_del_init(&ci->i_cap_delay_list);
+
+               inode = igrab(&ci->vfs_inode);
                spin_unlock(&mdsc->cap_delay_lock);
-               dout("check_delayed_caps on %p\n", &ci->vfs_inode);
-               ceph_check_caps(ci, flags, NULL);
+
+               if (inode) {
+                       dout("check_delayed_caps on %p\n", inode);
+                       ceph_check_caps(ci, flags, NULL);
+                       iput(inode);
+               }
        }
        spin_unlock(&mdsc->cap_delay_lock);
 }

But we haven't backported it to RHEL.
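
The pattern in the patch above (take a reference on the object while the list lock is still held, and skip the work if the reference cannot be taken) can be illustrated in user space. This is only a minimal sketch, not the kernel code: `struct obj`, `obj_grab()`, `obj_put()` and `process_delayed()` are hypothetical stand-ins for the inode, `igrab()`, `iput()` and the loop body of `ceph_check_delayed_caps()`, and pthread mutexes stand in for kernel spinlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode: a refcount plus a "being freed" flag. */
struct obj {
    int refcount;   /* protected by obj_lock */
    bool freeing;   /* set once teardown has begun */
};

static pthread_mutex_t obj_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Analogue of igrab(): pin the object, or return NULL if it is being freed. */
static struct obj *obj_grab(struct obj *o)
{
    struct obj *ret = NULL;

    pthread_mutex_lock(&obj_lock);
    if (!o->freeing) {
        o->refcount++;
        ret = o;
    }
    pthread_mutex_unlock(&obj_lock);
    return ret;
}

/* Analogue of iput(): drop the reference taken by obj_grab(). */
static void obj_put(struct obj *o)
{
    pthread_mutex_lock(&obj_lock);
    assert(o->refcount > 0);
    o->refcount--;
    pthread_mutex_unlock(&obj_lock);
}

/*
 * Called with list_lock held; returns with list_lock held.
 * Returns true if the object could be pinned and was processed.
 */
static bool process_delayed(struct obj *o)
{
    /* Pin the object BEFORE dropping the list lock. */
    struct obj *pinned = obj_grab(o);
    bool did_work = false;

    pthread_mutex_unlock(&list_lock);

    if (pinned) {
        /* The object cannot be freed here; real work would go here. */
        did_work = true;
        obj_put(pinned);
    }

    pthread_mutex_lock(&list_lock);
    return did_work;
}
```

In this sketch an object that is already being torn down is simply skipped, which mirrors the `if (inode)` guard in the patch.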

#2 Updated by Patrick Donnelly about 2 years ago

  • Project changed from fs to Linux kernel client

Zheng, what's the path forward on getting this resolved? How do we get that backported?

#3 Updated by Barry Marson about 2 years ago

I ran into this problem again with my much larger testbed in the scale lab. Bits used were RHEL74 GA and RHCEPH-3.0-RHEL-7-20170817.ci.0.

Any idea when this will be backported? If given the specific upstream kernel that has the fix, I can check with Jarod Wilson about trying one of his unsupported upstream kernels.

Barry

#4 Updated by Patrick Donnelly about 2 years ago

Barry Marson wrote:

I ran into this problem again with my much larger testbed in the scale lab. Bits used were RHEL74 GA and RHCEPH-3.0-RHEL-7-20170817.ci.0.

Any idea when this will be backported? If given the specific upstream kernel that has the fix, I can check with Jarod Wilson about trying one of his unsupported upstream kernels.

It's merged in Torvalds' master but not yet released.

#5 Updated by Barry Marson about 2 years ago

I've been running with an upstream kernel built for easy installation on RHEL. The kernel, 4.13.0-0.rc7.git0.1.el7_UNSUPPORTED.x86_64, has been running for 2 weeks and has not crashed.

So how do we expedite getting the patch to the RHEL75 and RHEL74z (and potentially older) code bases?

Barry

#6 Updated by Zheng Yan almost 2 years ago

  • Status changed from Verified to Resolved
