Bug #52283


kclient: cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256

Added by Xiubo Li over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: kcephfs
Crash signature (v1):
Crash signature (v2):

Description

<5>[  838.476519] Key type dns_resolver registered
<5>[  838.547115] Key type ceph registered
<6>[  838.547837] libceph: loaded (mon/osd proto 15/24)
<6>[  838.615632] ceph: loaded (mds proto 32)
<6>[  838.635485] libceph: mon1 10.72.47.117:40380 session established
<6>[  838.636471] libceph: mon0 10.72.47.117:40378 session established
<6>[  838.638967] libceph: client4273 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  838.639185] libceph: client4284 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  838.642044] libceph: mon2 10.72.47.117:40382 session established
<6>[  838.643905] libceph: client4274 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<4>[  842.670677] ceph:  dropping dirty Fw state for 00000000944e98a9 1099511627776
<4>[  842.670679] ceph:  dropping dirty+flushing - state for 00000000944e98a9 1099511627776
<3>[  842.670682] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  842.670984] WARNING: CPU: 27 PID: 6160 at mm/slab.h:451 kmem_cache_free+0x11e/0x1b0
<4>[  842.670985] Modules linked in: ceph libceph dns_resolver tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag uinput nf_tables_set nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nft_chain_route_ipv6 nft_chain_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nft_chain_route_ipv4 nf_conntrack ip6_tables nft_compat ip_set nf_tables nfnetlink sunrpc squashfs loop nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr sg joydev i2c_piix4 ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net ata_piix drm libata net_failover crc32c_intel virtio_console
<4>[  842.671056]  serio_raw virtio_scsi failover dm_mirror dm_region_hash dm_log dm_mod
<4>[  842.671080] CPU: 27 PID: 6160 Comm: umount Kdump: loaded Not tainted 4.18.0-147.el8.x86_64 #1
<4>[  842.671081] Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014
<4>[  842.671084] RIP: 0010:kmem_cache_free+0x11e/0x1b0
<4>[  842.671086] Code: 0f 84 24 ff ff ff 48 3b a8 10 01 00 00 74 6a 48 8b 48 58 48 8b 55 58 48 c7 c6 60 68 e3 93 48 c7 c7 40 5b 0a 94 e8 a0 e0 e8 ff <0f> 0b e9 f9 fe ff ff 65 8b 05 ec 64 d8 6c 89 c0 48 0f a3 05 1a c1
<4>[  842.671087] RSP: 0018:ffffc1a4c65b7d30 EFLAGS: 00010246
<4>[  842.671089] RAX: 000000000000004f RBX: ffff9c6eb2ebc318 RCX: 0000000000000006
<4>[  842.671094] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff9c749fad6a00
<4>[  842.671095] RBP: ffff9c747f685000 R08: 00000000000002ca R09: 0000000000000004
<4>[  842.671096] R10: 0000000000000000 R11: ffffffff94a39b2d R12: ffff9c74168af1d0
<4>[  842.671097] R13: dead000000000200 R14: dead000000000100 R15: ffff9c6eb2ebc338
<4>[  842.671098] FS:  000014feaae6f080(0000) GS:ffff9c749fac0000(0000) knlGS:0000000000000000
<4>[  842.671099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  842.671100] CR2: 000055c29fafe018 CR3: 0000000797ade001 CR4: 00000000007606e0
<4>[  842.671105] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  842.671106] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  842.671107] PKRU: 55555554
<4>[  842.671107] Call Trace:
<4>[  842.671196]  remove_session_caps_cb+0xd8/0x4a0 [ceph]
<4>[  842.671207]  iterate_session_caps+0x9b/0x240 [ceph]
<4>[  842.671216]  ? remove_session_caps+0x190/0x190 [ceph]
<4>[  842.671224]  remove_session_caps+0x50/0x190 [ceph]
<4>[  842.671233]  ? cleanup_session_requests+0x92/0x100 [ceph]
<4>[  842.671241]  ceph_mdsc_force_umount+0xbc/0x100 [ceph]
<4>[  842.671291]  ksys_umount+0x1b7/0x470
<4>[  842.671410]  ? syscall_trace_enter+0x1d3/0x2c0
<4>[  842.671413]  __x64_sys_umount+0x12/0x20
<4>[  842.671415]  do_syscall_64+0x5b/0x1b0
<4>[  842.671538]  entry_SYSCALL_64_after_hwframe+0x65/0xca
<4>[  842.671559] RIP: 0033:0x14fea9eb316b
<4>[  842.671568] Code: 0d 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed 0c 2c 00 f7 d8 64 89 01 48
<4>[  842.671570] RSP: 002b:00007fff783acba8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
<4>[  842.671572] RAX: ffffffffffffffda RBX: 000055c29faee5d0 RCX: 000014fea9eb316b
<4>[  842.671573] RDX: 0000000000000003 RSI: 0000000000000003 RDI: 000055c29faf3e00
<4>[  842.671575] RBP: 0000000000000003 R08: 000014feaa179710 R09: 000055c29fae9010
<4>[  842.671576] R10: 0000000000000000 R11: 0000000000000206 R12: 000055c29faf3e00
<4>[  842.671577] R13: 000014feaac60184 R14: 000055c29fafd840 R15: 00000000ffffffff
<4>[  842.671581] ---[ end trace 69dcee496d59f4ca ]---
<4>[  842.672602] ceph:  dropping dirty Fw state for 0000000091a689ee 1099511627776
<4>[  842.672606] ceph:  dropping dirty+flushing - state for 0000000091a689ee 1099511627776
<3>[  842.672610] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  842.672635] ceph:  dropping dirty Fw state for 000000004986f10f 1099511627776
<4>[  842.672640] ceph:  dropping dirty+flushing - state for 000000004986f10f 1099511627776
<3>[  842.672644] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  842.692718] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<4>[  842.692764] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<4>[  842.692778] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<6>[  842.785863] libceph: mon1 10.72.47.117:40380 session established
<6>[  842.786090] libceph: mon1 10.72.47.117:40380 session established
<6>[  842.787500] libceph: mon1 10.72.47.117:40380 session established
<6>[  842.787509] libceph: client4276 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  842.787903] libceph: client4279 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  842.788680] libceph: client4282 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<4>[  846.488243] ceph:  dropping dirty Fw state for 0000000024f5b9b7 1099511627777
<4>[  846.488246] ceph:  dropping dirty+flushing - state for 0000000024f5b9b7 1099511627777
<3>[  846.488249] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  846.496218] ceph:  dropping dirty Fw state for 00000000a7425f84 1099511627777
<4>[  846.496221] ceph:  dropping dirty+flushing - state for 00000000a7425f84 1099511627777
<3>[  846.496223] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  846.498911] ceph:  dropping dirty Fw state for 00000000ceb0b810 1099511627777
<4>[  846.498913] ceph:  dropping dirty+flushing - state for 00000000ceb0b810 1099511627777
<3>[  846.498917] cache_from_obj: Wrong slab cache. ceph_cap_flush but object is from kmalloc-256
<4>[  846.507864] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<4>[  846.511785] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<4>[  846.512985] VFS: Busy inodes after unmount of ceph. Self-destruct in 5 seconds.  Have a nice day...
<6>[  846.600654] libceph: mon2 10.72.47.117:40382 session established
<6>[  846.601438] libceph: mon0 10.72.47.117:40378 session established
<6>[  846.602253] libceph: client4277 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  846.602948] libceph: client4287 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa
<6>[  846.605304] libceph: mon2 10.72.47.117:40382 session established
<6>[  846.606563] libceph: client4280 fsid 09ace2a9-dd56-4941-8b12-c7bc9e03cbfa

Actions #1

Updated by Xiubo Li over 2 years ago

This should be caused by:

struct ceph_cap_snap {
        refcount_t nref;
        struct list_head ci_item;

        struct ceph_cap_flush cap_flush;
        ...

The `cap_flush` object is embedded in `struct ceph_cap_snap`, so its memory comes from the `kzalloc` of the enclosing struct (the kmalloc-256 cache), not from the dedicated `ceph_cap_flush` slab cache. It is nevertheless added to `ci->i_cap_flush_list`, and when the list is torn down `ceph_free_cap_flush()` hands it to `kmem_cache_free()` for the `ceph_cap_flush` cache, triggering the `cache_from_obj: Wrong slab cache` warning.

Will send a patch to fix this later.

Actions #2

Updated by Xiubo Li over 2 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Xiubo Li over 2 years ago

  • Status changed from Fix Under Review to Resolved