Bug #45563

__list_add_valid kernel NULL pointer dereference in __ceph_remove_cap

Added by joe h about 2 months ago. Updated about 2 months ago.

Status: New
Priority: Normal
Assignee: -
Category: fs/ceph
Target version:
% Done: 50%
Source:
Tags:
Backport:
Regression: No
Severity: 1 - critical
Reviewed: 05/15/2020
Affected Versions:
ceph-qa-suite:
Crash signature:

Description

[Description]
Recently I have repeatedly hit the same bug (unable to handle kernel NULL pointer dereference in __list_add_valid) after the cluster had been running a data-deletion workload for about 12 hours.
After a long period of analysis and debugging, I have not found the cause or any faulty code. Because I ran into a similar problem before, we suspect that the session was closed or rejected and then destroyed, while __ceph_remove_cap was still using it.

My cluster's kernel version is 4.14.0. The backtrace is as follows:

[Backtrace]
[145278.236178] BUG: unable to handle kernel NULL pointer dereference at (null)
[145278.236947] IP: __list_add_valid+0x10/0x80
[145278.237669] PGD 0 P4D 0
[145278.238340] Oops: 0000 [#1] SMP
[145278.238996] Modules linked in: rpcsec_gss_krb5(OE) iptable_filter tcp_diag inet_diag rpcrdma(OE) nfsd(OE) auth_rpcgss(OE) nfs_acl(OE) lockd(OE) grace(OE) fscache sunrpc(OE) ceph(OE) libceph(OE) dns_resolver dev_pmc_scsi(OE) flashcache(OE) rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_core(OE) mlxfw(OE) mlx4_en(OE) mlx4_ib(OE) ib_core(OE) mlx4_core(OE) devlink mlx_compat(OE) ip_vs nf_conntrack sr_mod vfat fat cdrom dm_mirror dm_region_hash dm_log dm_mod intel_rapl x86_pkg_temp_thermal ext4 intel_powerclamp mbcache jbd2 coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel uas ses crypto_simd enclosure usb_storage glue_helper cryptd sg intel_cstate iTCO_wdt iTCO_vendor_support intel_uncore
[145278.244064] intel_rapl_perf pcspkr joydev ioatdma mei_me i2c_i801 mei lpc_ich shpchp ipmi_si wmi ipmi_devintf ipmi_msghandler nfit acpi_power_meter libnvdimm acpi_pad ip_tables xfs libcrc32c sd_mod ast drm_kms_helper syscopyarea sysfillrect crc32c_intel sysimgblt fb_sys_fops ixgbe ttm igb ahci mdio smartpqi drm libahci ptp scsi_transport_sas i2c_algo_bit dca libata pps_core i2c_core [last unloaded: mlxfw]
[145278.247449] CPU: 32 PID: 291 Comm: kswapd1 Kdump: loaded Tainted: G W OEL ------------ 4.14.0-xxx #1
[145278.249190] task: ffff9ff33beb1e80 task.stack: ffffb2cf0eb44000
[145278.250057] RIP: 0010:__list_add_valid+0x10/0x80
[145278.250910] RSP: 0018:ffffb2cf0eb47a90 EFLAGS: 00010246
[145278.251758] RAX: ffffa00379383560 RBX: ffffa0036b2cd860 RCX: ffffa0036b2cd978
[145278.252625] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffa0036b2cd888
[145278.253497] RBP: ffffb2cf0eb47a90 R08: 0000000000000000 R09: ffffa00379383560
[145278.254381] R10: ffffffffffffff9c R11: ffffffffffffff83 R12: ffff9fd7c5092798
[145278.255275] R13: ffffa00379383000 R14: 0000000000000000 R15: ffffa0036b2cd888
[145278.256182] FS: 0000000000000000(0000) GS:ffffa0037d700000(0000) knlGS:0000000000000000
[145278.257094] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[145278.257990] CR2: 0000000000000000 CR3: 0000002379a09006 CR4: 00000000007606e0
[145278.258881] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[145278.259754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[145278.260632] PKRU: 55555554
[145278.261430] Call Trace:
[145278.262238] __ceph_remove_cap+0xe6/0x250 [ceph]
[145278.263048] ceph_queue_caps_release+0x50/0x70 [ceph]
[145278.263864] ceph_destroy_inode+0x2d/0x1c0 [ceph]
[145278.264686] destroy_inode+0x3b/0x60
[145278.265506] evict+0x142/0x1a0
[145278.266331] iput+0x17d/0x1d0
[145278.267149] dentry_unlink_inode+0xb9/0xf0
[145278.267953] __dentry_kill+0xc7/0x170
[145278.268742] shrink_dentry_list+0x122/0x280
[145278.269515] prune_dcache_sb+0x5a/0x80
[145278.270275] super_cache_scan+0x107/0x190
[145278.271027] shrink_slab+0x26b/0x480
[145278.271769] shrink_node+0x2f7/0x310
[145278.272510] kswapd+0x2cf/0x730
[145278.273257] kthread+0x109/0x140
[145278.274008] ? mem_cgroup_shrink_node+0x180/0x180
[145278.274753] ? kthread_park+0x60/0x60
[145278.275487] ret_from_fork+0x2a/0x40
[145278.276220] Code: b8 f4 ff ff ff e9 3b ff ff ff b8 f4 ff ff ff e9 31 ff ff ff 90 90 90 90 90 90 90 55 48 89 d0 48 8b 52 08 48 89 e5 48 39 f2 75 19 <48> 8b 32 48 39 f0 75 42 48 39 c7 74 23 48 39 fa 74 1e b8 01 00
[145278.277774] RIP: __list_add_valid+0x10/0x80 RSP: ffffb2cf0eb47a90
[145278.278518] CR2: 0000000000000000
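
For readers following the trace, the faulting check can be modeled in userspace. The sketch below is not the kernel source: checked_list_add_tail, add_to_zeroed_head, and enum add_result are hypothetical names. It assumes the list head passed to list_add_tail had a NULL prev pointer (for example, memory that was zeroed), which is one possible reading of RSI and RDX both being zero in the register dump. It follows the order of checks in the kernel's __list_add_valid(new, prev, next): next->prev is compared with prev first, then prev->next is read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical result codes, used only in this sketch. */
enum add_result { ADD_OK, ADD_CORRUPT, ADD_NULL_DEREF };

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/*
 * Models list_add_tail(entry, head) with the debug validity check.
 * Instead of faulting, it reports the point where the kernel would
 * dereference a NULL pointer.
 */
static enum add_result checked_list_add_tail(struct list_head *entry,
                                             struct list_head *head)
{
    assert(entry != NULL && head != NULL);

    /* list_add_tail inserts between head->prev and head. */
    struct list_head *prev = head->prev;
    struct list_head *next = head;

    if (next->prev != prev)       /* first check: trivially passes here */
        return ADD_CORRUPT;
    if (prev == NULL)             /* kernel would now read prev->next: Oops */
        return ADD_NULL_DEREF;
    if (prev->next != next)
        return ADD_CORRUPT;
    if (entry == prev || entry == next)
        return ADD_CORRUPT;

    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    next->prev = entry;
    return ADD_OK;
}

/* Assumed scenario: the list head lives in zeroed memory. */
static enum add_result add_to_zeroed_head(void)
{
    struct list_head head, node;

    memset(&head, 0, sizeof(head));
    return checked_list_add_tail(&node, &head);
}
```

In this model, add_to_zeroed_head() returns ADD_NULL_DEREF, the case where the kernel would fault reading prev->next, while a head prepared with init_list_head() accepts the new entry and returns ADD_OK.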

[Same issue]
A similar issue is reported on the Ceph tracker at tracker.ceph.com/issues/37769, where Zheng Yan provided a patch that explains this Oops (commit: 0a07fc8cd01b6838d999a5eacaa99fe90b8f768b).
However, my code already includes the changes from this commit.

History

#1 Updated by joe h about 2 months ago

While reviewing the code, I think something may be wrong, as follows:
Each cap is stored in two data structures when ceph_add_cap executes: the cap rbtree and the cap list in the session. When __ceph_remove_cap executes, it first removes the cap from the session list and then removes it from the cap rbtree.
However, when __unregister_session executes, nothing checks or handles the caps that belonged to that session.
So if the session has been unregistered but the caps that belong to it do not know this, and __ceph_remove_cap then executes without any check of whether the session still exists, will a kernel panic (kernel NULL pointer dereference) happen?

#2 Updated by Zheng Yan about 2 months ago

joe h wrote:

While reviewing the code, I think something may be wrong, as follows:
Each cap is stored in two data structures when ceph_add_cap executes: the cap rbtree and the cap list in the session. When __ceph_remove_cap executes, it first removes the cap from the session list and then removes it from the cap rbtree.
However, when __unregister_session executes, nothing checks or handles the caps that belonged to that session.
So if the session has been unregistered but the caps that belong to it do not know this, and __ceph_remove_cap then executes without any check of whether the session still exists, will a kernel panic (kernel NULL pointer dereference) happen?

remove_session_caps() is always called in the __unregister_session() case.
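
The ordering described here can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel implementation: the structs are simplified stand-ins, an array replaces the session's cap list, a flag replaces membership in the cap rbtree, and close_session and demo_close_leaves_no_dangling_caps are hypothetical names. It shows that if every cap is removed before the session is unregistered, no cap is left pointing at the session.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CAPS 8

struct session;

/* Simplified stand-in for a cap. */
struct cap {
    struct session *session;  /* owning session, NULL once detached */
    bool in_inode_tree;       /* stand-in for membership in the cap rbtree */
};

/* Simplified stand-in for an MDS session. */
struct session {
    struct cap *caps[MAX_CAPS];  /* stand-in for the session's cap list */
    int ncaps;
    bool registered;
};

/* Link a cap into both structures, as the report describes for ceph_add_cap. */
static bool add_cap(struct session *s, struct cap *c)
{
    if (!s->registered || s->ncaps == MAX_CAPS)
        return false;
    c->session = s;
    c->in_inode_tree = true;
    s->caps[s->ncaps++] = c;
    return true;
}

/* Detach from the session list first, then from the inode's tree. */
static void remove_cap(struct cap *c)
{
    struct session *s = c->session;

    if (s != NULL) {
        for (int i = 0; i < s->ncaps; i++) {
            if (s->caps[i] == c) {
                s->caps[i] = s->caps[--s->ncaps];  /* swap-remove */
                break;
            }
        }
        c->session = NULL;
    }
    c->in_inode_tree = false;
}

/* Drop every cap that still belongs to the session. */
static void remove_session_caps(struct session *s)
{
    while (s->ncaps > 0)
        remove_cap(s->caps[s->ncaps - 1]);
}

/* Hypothetical wrapper: caps are drained before the session is unregistered. */
static void close_session(struct session *s)
{
    remove_session_caps(s);
    assert(s->ncaps == 0);
    s->registered = false;
}

/* Returns true if no cap still references the closed session. */
static bool demo_close_leaves_no_dangling_caps(void)
{
    struct session s = { .ncaps = 0, .registered = true };
    struct cap a = { 0 }, b = { 0 };

    add_cap(&s, &a);
    add_cap(&s, &b);
    close_session(&s);
    return a.session == NULL && b.session == NULL &&
           !a.in_inode_tree && !b.in_inode_tree;
}
```

In this model a later remove_cap() on either cap finds session == NULL and never touches the closed session. The crash in the report would need a path where that ordering does not hold, or where the session memory is reused; the model does not show whether such a path exists.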

#3 Updated by joe h about 2 months ago

Thanks, Zheng Yan. I have another question: have you tested this scenario (deleting files when memory is full)? Will a kernel panic (kernel NULL pointer dereference, or a soft lockup stuck for 22s) happen?
