Bug #43272: unable to handle kernel NULL pointer at __ceph_remove_cap+0x2a/0x220 - Linux kernel client - Ceph

Bug #43272

closed

unable to handle kernel NULL pointer at __ceph_remove_cap+0x2a/0x220

Added by Lei Liu over 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: Jeff Layton
Category: fs/ceph
Target version:
% Done:
Source:
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

First, I'm not sure whether this problem has already been fixed in newer kernels; I'm just reporting what I see.

system info

# uname -r
3.10.0-862.14.4.el7.x86_64

# ceph -v
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)

vmcore-dmesg.txt

[6140799.924743] ceph: mds2 reconnect start
[6140800.528283] ceph: mds2 reconnect success
[6140912.699148] ceph: mds2 recovery completed
[6140942.057748] ceph: mds2 reconnect start
[6140942.114748] ceph: mds2 reconnect success
[6141019.065594] ceph: mds2 recovery completed
[6141060.573951] ceph: mds2 reconnect start
[6141060.625504] ceph: mds2 reconnect success
[6141127.939777] ceph: mds2 recovery completed
[6141170.655019] ceph: mds2 caps stale
[6141172.869985] ceph: mds2 caps renewed
[6141179.047206] ceph: mds1 reconnect start
[6141179.074907] ceph: mds1 reconnect success
[6141236.642484] ceph: mds1 recovery completed
[6180295.864627] BUG: unable to handle kernel NULL pointer dereference at 0000000000000358
[6180295.866361] IP: [<ffffffffc0d60dea>] __ceph_remove_cap+0x2a/0x220 [ceph]
[6180295.868009] PGD 0 
[6180295.869592] Oops: 0000 [#1] SMP 
[6180295.871210] Modules linked in: nfsv3 nfs_acl nfs lockd grace fscache nvidia_uvm(POE) nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ceph libceph dns_resolver sunrpc ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ucm nvidia_drm(POE) nvidia_modeset(POE) svcrdma(OE) xprtrdma(OE) ib_uverbs mlx_compat(OE) libiscsi snd_hda_codec_hdmi scsi_transport_iscsi sb_edac nvidia(POE) intel_powerclamp coretemp intel_rapl ib_umad iosf_mbi ib_ipoib kvm_intel ib_cm iTCO_wdt
[6180295.881025]  kvm iTCO_vendor_support irqbypass crc32_pclmul snd_hda_codec_realtek ghash_clmulni_intel snd_hda_codec_generic aesni_intel mlx4_ib lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel ib_core snd_hda_codec dm_service_time snd_hda_core snd_hwdep snd_seq pcspkr snd_seq_device joydev snd_pcm megaraid_sas i2c_i801 sg mei_me snd_timer snd ioatdma mei lpc_ich shpchp soundcore ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad dm_multipath ip_tables xfs libcrc32c mlx4_en sd_mod crc_t10dif crct10dif_generic ast drm_kms_helper mxm_wmi syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm igb mlx4_core libahci libata crct10dif_pclmul ptp crct10dif_common pps_core crc32c_intel dca i2c_algo_bit i2c_core devlink wmi dm_mirror dm_region_hash dm_log dm_mod overlay
[6180295.892444] CPU: 7 PID: 19585 Comm: kworker/7:0 Kdump: loaded Tainted: P           OE  ------------   3.10.0-862.14.4.el7.x86_64 #1
[6180295.895617] Hardware name: Supermicro SYS-7048GR-TR/X10DRG-Q, BIOS 2.0a 08/29/2016
[6180295.897255] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[6180295.898796] task: ffff9b1d4c4b0fd0 ti: ffff9b1c4921c000 task.ti: ffff9b1c4921c000
[6180295.900387] RIP: 0010:[<ffffffffc0d60dea>]  [<ffffffffc0d60dea>] __ceph_remove_cap+0x2a/0x220 [ceph]
[6180295.901937] RSP: 0000:ffff9b1c4921fb68  EFLAGS: 00010246
[6180295.903525] RAX: 00000000fffff2aa RBX: ffff9b2d20be1258 RCX: 0000000000000004
[6180295.905139] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff9b2d20be1258
[6180295.906666] RBP: ffff9b1c4921fba0 R08: 0000000000000000 R09: 0000000000000000
[6180295.908168] R10: ffff9b39a7f2d800 R11: 0000000000000d55 R12: 0000000000000000
[6180295.909666] R13: ffff9b1c6f263180 R14: ffff9b39a7f2d800 R15: 0000000000000000
[6180295.911268] FS:  0000000000000000(0000) GS:ffff9b2c7fdc0000(0000) knlGS:0000000000000000
[6180295.912686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[6180295.914091] CR2: 0000000000000358 CR3: 000000117620e000 CR4: 00000000003607e0
[6180295.915496] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[6180295.916859] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[6180295.918323] Call Trace:
[6180295.919739]  [<ffffffffc0d6aca1>] trim_caps_cb+0xd1/0x240 [ceph]
[6180295.921157]  [<ffffffffb703c42b>] ? destroy_inode+0x3b/0x60
[6180295.922490]  [<ffffffffb703c565>] ? evict+0x115/0x180
[6180295.923782]  [<ffffffffc0d6a7bd>] iterate_session_caps+0xbd/0x240 [ceph]
[6180295.925088]  [<ffffffffc0d6abd0>] ? wake_up_session_cb+0x60/0x60 [ceph]
[6180295.926366]  [<ffffffffc0d72439>] dispatch+0x519/0xb90 [ceph]
[6180295.927608]  [<ffffffffb73d155a>] ? kernel_recvmsg+0x3a/0x50
[6180295.928825]  [<ffffffffc0d03ff4>] try_read+0x4e4/0x1210 [libceph]
[6180295.930080]  [<ffffffffb6edc0ce>] ? dequeue_task_fair+0x41e/0x660
[6180295.931313]  [<ffffffffb6e2a59e>] ? __switch_to+0xce/0x580
[6180295.932517]  [<ffffffffc0d04dd9>] ceph_con_workfn+0xb9/0x670 [libceph]
[6180295.933659]  [<ffffffffb6eb613f>] process_one_work+0x17f/0x440
[6180295.934789]  [<ffffffffb6eb71d6>] worker_thread+0x126/0x3c0
[6180295.935914]  [<ffffffffb6eb70b0>] ? manage_workers.isra.24+0x2a0/0x2a0
[6180295.937020]  [<ffffffffb6ebdf21>] kthread+0xd1/0xe0
[6180295.938123]  [<ffffffffb6ebde50>] ? insert_kthread_work+0x40/0x40
[6180295.939264]  [<ffffffffb75255f7>] ret_from_fork_nospec_begin+0x21/0x21
[6180295.940351]  [<ffffffffb6ebde50>] ? insert_kthread_work+0x40/0x40
[6180295.941452] Code: 00 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 10 4c 8b 27 f6 05 27 c8 02 00 04 89 75 d4 4c 8b 77 20 <49> 8b 84 24 58 03 00 00 48 8b 80 50 03 00 00 48 8b 40 28 48 89 
[6180295.943637] RIP  [<ffffffffc0d60dea>] __ceph_remove_cap+0x2a/0x220 [ceph]
[6180295.944695]  RSP <ffff9b1c4921fb68>
[6180295.945721] CR2: 0000000000000358

Files

0001-ceph-fix-race-in-concurrent-__ceph_remove_cap-invoca.patch (1.7 KB)

Luis Henriques, 11/12/2020 10:50 AM

Updated by Ilya Dryomov over 4 years ago

Assignee set to Jeff Layton

Looks like a NULL cap->ci:

mov    (%rdi),%r12
...
mov    0x358(%r12),%rax

struct ceph_inode_info *ci = cap->ci;
struct ceph_mds_client *mdsc = ceph_sb_to_client(ci->vfs_inode.i_sb)->mdsc;

(gdb) p &((struct ceph_cap *)0)->ci
$5 = (struct ceph_inode_info **) 0x0
(gdb) p &((struct ceph_inode_info *)0)->vfs_inode.i_sb
$6 = (struct super_block **) 0x358 <ceph_statfs+56>
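
The arithmetic behind this diagnosis can be reproduced in user space. The sketch below is only an illustration: `mock_cap` and `mock_inode_info` are hypothetical stand-ins, padded so that their members land at the offsets gdb printed (0x0 and 0x358); they are not the real kernel structures. It also reads the 32-bit displacement out of the faulting instruction bytes from the oops `Code:` line (`<49> 8b 84 24 58 03 00 00`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the real kernel layouts. The padding is chosen
 * only so the members sit at the offsets gdb reported in this bug. */
struct mock_inode_info {
    unsigned char pad[0x358];   /* everything that precedes vfs_inode.i_sb */
    void *i_sb;                 /* stands in for vfs_inode.i_sb */
};

struct mock_cap {
    struct mock_inode_info *ci; /* first member, as gdb showed for cap->ci */
};

static_assert(offsetof(struct mock_cap, ci) == 0, "ci must be the first member");

/* Address the CPU touches when loading ci->i_sb, given the value of ci. */
static uintptr_t fault_address(uintptr_t ci_value)
{
    return ci_value + offsetof(struct mock_inode_info, i_sb);
}

/* Little-endian disp32 of "49 8b 84 24 <disp32>": REX prefix, opcode,
 * ModRM and SIB bytes, followed by four displacement bytes. */
static uint32_t disp32_of_faulting_insn(void)
{
    static const uint8_t insn[] = { 0x49, 0x8b, 0x84, 0x24,
                                    0x58, 0x03, 0x00, 0x00 };

    return (uint32_t)insn[4] | ((uint32_t)insn[5] << 8) |
           ((uint32_t)insn[6] << 16) | ((uint32_t)insn[7] << 24);
}
```

With `ci` equal to NULL (R12 is 0 in the register dump), `fault_address(0)` evaluates to 0x358, which matches the CR2 value in the oops, and the decoded displacement is the same 0x358.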

Updated by Jeff Layton over 4 years ago

Thanks Ilya. This doesn't look familiar to me at first glance, and I don't see anything that's gone in since -862 that looks like it would address a problem like this.

cap->ci does get zeroed out in __ceph_remove_cap, and there is some code to handle that in iterate_session_caps, but it's rather hard to follow with a lot of complex locking and special cases.

I'm trying to determine whether it's possible that you could have had racing calls to __ceph_remove_cap. When the s_cap_lock is dropped in iterate_session_caps() prior to calling the callback, I don't see what prevents another task from racing in with a call to __ceph_remove_cap and taking care of it beforehand.

I wonder whether we ought to check whether cap->ci might already be NULL in __ceph_remove_cap once we take the s_cap_lock, and not do anything if it is.
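
A user-space sketch of that idea, assuming a pthread mutex in place of the kernel spinlock. All names here (`mock_cap`, `mock_session`, `mock_session_init`, `mock_remove_cap`) are hypothetical; this is neither a posted patch nor the real `__ceph_remove_cap`. It only shows the shape of the check: take the lock, do nothing if `ci` has already been cleared by another caller, otherwise detach the cap.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct mock_inode_info { int ino; };

struct mock_session {
    pthread_mutex_t cap_lock;      /* stands in for s_cap_lock */
    int nr_caps;
};

struct mock_cap {
    struct mock_inode_info *ci;    /* NULL once the cap has been removed */
    struct mock_session *session;
};

static void mock_session_init(struct mock_session *s, int nr_caps)
{
    pthread_mutex_init(&s->cap_lock, NULL);
    s->nr_caps = nr_caps;
}

/* Returns true if this call removed the cap, false if an earlier
 * (possibly racing) call had already done so. */
static bool mock_remove_cap(struct mock_cap *cap)
{
    struct mock_session *s;
    bool removed = false;

    assert(cap != NULL && cap->session != NULL);
    s = cap->session;

    pthread_mutex_lock(&s->cap_lock);
    if (cap->ci != NULL) {         /* the guard: skip if already removed */
        cap->ci = NULL;
        s->nr_caps--;
        removed = true;
    }
    pthread_mutex_unlock(&s->cap_lock);

    return removed;
}
```

In this model a second call on the same cap is a harmless no-op instead of a NULL dereference. Whether such a guard is sufficient in the real code depends on the locking around the callers, which this sketch does not model.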

Updated by Jeff Layton over 4 years ago

RFC patch posted here:

https://marc.info/?l=ceph-devel&m=157617192514897&w=2

I'm not sure this is correct, but it seems like it ought to be safe enough.

Updated by Jeff Layton over 4 years ago

I think after going over this a few times, this problem has already been fixed upstream with commit d6e47819721ae2d. We'll probably need to pull that into RHEL7. Lei Liu, do you have the ability to open a bug at bugzilla.redhat.com? If so, could you report it there, and we'll see about getting this fixed in RHEL7.

Updated by Lei Liu over 4 years ago

Jeff Layton wrote:

I think after going over this a few times, this problem has already been fixed upstream with commit d6e47819721ae2d. We'll probably need to pull that into RHEL7. Lei Liu, do you have the ability to open a bug at bugzilla.redhat.com? If so, could you report it there, and we'll see about getting this fixed in RHEL7.

I am glad to do it.

Bug link: https://bugzilla.redhat.com/show_bug.cgi?id=1784016

Updated by Lei Liu over 4 years ago

Adding a link to the upstream commit: https://github.com/ceph/ceph-client/commit/d6e47819721ae2d9d090058ad5570a66f3c42e39

Updated by Ilya Dryomov over 4 years ago

Looks like it needs to be forwarded to stable as well.

Updated by Jeff Layton over 4 years ago

Ilya Dryomov wrote:

Looks like it needs to be forwarded to stable as well.

We probably should have explicitly marked it for stable, but it looks like the stable maintainers picked that one up anyway. It seems to be in all the stable series kernels that I checked.

Updated by Jeff Layton over 4 years ago

Status changed from New to Closed

Closing this out since upstream kernels seem to have the requisite patch, and this bug was opened against a downstream (RHEL) kernel.

Updated by Luis Henriques over 3 years ago

File 0001-ceph-fix-race-in-concurrent-__ceph_remove_cap-invoca.patch added

I've seen this same issue showing up on a 5.3-based kernel that definitely has commit d6e47819721a ("ceph: hold i_ceph_lock when removing caps for freeing inode"). I'm attaching a patch that I've just submitted for review. Maybe it's worth opening this bug again.
