Bug #52295

kclient: force umount causes "Objects remaining in ceph_inode_info on __kmem_cache_shutdown()"

Added by Xiubo Li about 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: kcephfs
Crash signature (v1):
Crash signature (v2):

Description

<3>[ 1639.496019] =============================================================================
<3>[ 1639.496022] BUG ceph_inode_info (Tainted: G B E --------- - ): Objects remaining in ceph_inode_info on __kmem_cache_shutdown()
<3>[ 1639.496034] ----------------------------------------------------------------------------

<3>[ 1639.496034]
<3>[ 1639.496038] INFO: Slab 0x00000000024a4b14 objects=20 used=1 fp=0x000000004eca2f79 flags=0x17ffffc0008100
<4>[ 1639.496042] CPU: 27 PID: 15054 Comm: rmmod Kdump: loaded Tainted: G B E --------- - - 4.18.0+ #10
<4>[ 1639.496043] Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014
<4>[ 1639.496044] Call Trace:
<4>[ 1639.496047] dump_stack+0x5c/0x80
<4>[ 1639.496050] slab_err+0xb0/0xd4
<4>[ 1639.496053] ? printk+0x58/0x6f
<4>[ 1639.496056] ? __kmalloc+0x16f/0x210
<4>[ 1639.496059] ? __kmem_cache_shutdown+0x238/0x290
<4>[ 1639.496062] __kmem_cache_shutdown.cold.102+0x1c/0x10d
<4>[ 1639.496066] shutdown_cache+0x15/0x200
<4>[ 1639.496068] kmem_cache_destroy+0x21f/0x250
<4>[ 1639.496084] destroy_caches+0x16/0x52 [ceph]
<4>[ 1639.496088] __x64_sys_delete_module+0x139/0x270
<4>[ 1639.496093] do_syscall_64+0x5b/0x1b0
<4>[ 1639.496098] entry_SYSCALL_64_after_hwframe+0x65/0xca
<4>[ 1639.496099] RIP: 0033:0x15224df5ba8b
<4>[ 1639.496101] Code: 73 01 c3 48 8b 0d fd 03 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cd 03 2c 00 f7 d8 64 89 01 48
<4>[ 1639.496103] RSP: 002b:00007ffe11373cb8 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
<4>[ 1639.496105] RAX: ffffffffffffffda RBX: 00005585ef0aa750 RCX: 000015224df5ba8b
<4>[ 1639.496107] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005585ef0aa7b8
<4>[ 1639.496108] RBP: 0000000000000000 R08: 00007ffe11372c31 R09: 0000000000000000
<4>[ 1639.496110] R10: 000015224dfcdf60 R11: 0000000000000206 R12: 00007ffe11373ee0
<4>[ 1639.496111] R13: 00007ffe11374338 R14: 00005585ef0aa260 R15: 00005585ef0aa750
<3>[ 1639.496119] INFO: Object 0x00000000b8e7df67 @offset=23520
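
For triage, the slab report above can be picked out of a saved kernel log mechanically. The following is a minimal sketch and not part of the original report: the function name `find_slab_leaks` and both regular expressions are illustrative assumptions derived only from the log lines quoted here, so other kernel versions may format these lines differently.

```python
import re

# Header of the slab report, e.g.
# "BUG ceph_inode_info (Tainted: ...): Objects remaining in ceph_inode_info on __kmem_cache_shutdown()"
LEAK_RE = re.compile(
    r"BUG (?P<cache>\S+) \(.*?\): Objects remaining in \S+ on __kmem_cache_shutdown\(\)"
)
# Per-slab summary, e.g. "INFO: Slab 0x... objects=20 used=1 ..."
SLAB_RE = re.compile(r"INFO: Slab \S+ objects=(?P<objects>\d+) used=(?P<used>\d+)")


def find_slab_leaks(log_text):
    """Return (cache_name, used_objects) for each leak report in log_text.

    used_objects sums the used= counts of the "INFO: Slab" lines that
    follow a report header (assumed layout, taken from the log above).
    """
    leaks = []
    for line in log_text.splitlines():
        header = LEAK_RE.search(line)
        if header:
            leaks.append([header.group("cache"), 0])
            continue
        slab = SLAB_RE.search(line)
        if slab and leaks:
            leaks[-1][1] += int(slab.group("used"))
    return [tuple(item) for item in leaks]


# Two lines copied from the description above.
SAMPLE = """\
<3>[ 1639.496022] BUG ceph_inode_info (Tainted: G B E --------- - ): Objects remaining in ceph_inode_info on __kmem_cache_shutdown()
<3>[ 1639.496038] INFO: Slab 0x00000000024a4b14 objects=20 used=1 fp=0x000000004eca2f79 flags=0x17ffffc0008100
"""
print(find_slab_leaks(SAMPLE))  # [('ceph_inode_info', 1)]
```

Here `used=1` means one `ceph_inode_info` object was still allocated when `rmmod` destroyed the cache.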

History

#1 Updated by Xiubo Li about 1 year ago

  • Status changed from New to In Progress
  • Priority changed from Normal to High

I can reproduce this easily with both the upstream and downstream ceph-client kernels.

#2 Updated by Xiubo Li about 1 year ago

Another warning:

<4>[20482.387776] ------------[ cut here ]------------
<4>[20482.387782] WARNING: CPU: 26 PID: 170360 at fs/ceph/caps.c:1125 __ceph_remove_cap+0x227/0x2b0 [ceph]
<4>[20482.387840] Modules linked in: ceph(E) libceph dns_resolver netfs tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag uinput nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6_tables nft_compat ip_set rfkill nf_tables nfnetlink sunrpc squashfs loop intel_rapl_msr intel_rapl_common nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev i2c_piix4 pcspkr ip_tables xfs libcrc32c sr_mod cdrom ata_generic qxl drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops sd_mod cec t10_pi ata_piix sg virtio_net drm net_failover virtio_scsi virtio_console failover libata crc32c_intel serio_raw dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libceph]
<4>[20482.388021] CPU: 26 PID: 170360 Comm: umount Tainted: G        W   E     5.14.0-rc4+ #72
<4>[20482.388026] Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014
<4>[20482.388037] RIP: 0010:__ceph_remove_cap+0x227/0x2b0 [ceph]
<4>[20482.388074] Code: 83 c4 08 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 8b 95 80 01 00 00 48 8d 85 80 01 00 00 48 39 c2 74 0b 49 8b 06 80 78 2c 00 75 02 <0f> 0b 48 c7 85 70 01 00 00 00 00 00 00 e9 40 fe ff ff 48 8b 4b 20
<4>[20482.388079] RSP: 0018:ffff9aa480767cd0 EFLAGS: 00010246
<4>[20482.388083] RAX: ffff88ce40efd5c0 RBX: ffff88ce90bbb258 RCX: 0000000000000000
<4>[20482.388094] RDX: ffff88cfcd2827a8 RSI: ffff88cfadd2d6b8 RDI: ffff88ce90bbb260
<4>[20482.388097] RBP: ffff88cfadd2d550 R08: 0000000000000000 R09: 000000000000038c
<4>[20482.388100] R10: 000000000000001d R11: 0000000000000000 R12: ffff88cfcd282000
<4>[20482.388103] R13: 0000000000000000 R14: ffff88ce48405000 R15: ffff88cfcd282000
<4>[20482.388106] FS:  00001528c266c080(0000) GS:ffff88d55fc80000(0000) knlGS:0000000000000000
<4>[20482.388114] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[20482.388118] CR2: 000055da8ba19088 CR3: 0000000107d54004 CR4: 00000000007706e0
<4>[20482.388120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[20482.388123] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[20482.388126] PKRU: 55555554
<4>[20482.388128] Call Trace:
<4>[20482.388197]  remove_session_caps_cb+0x71/0x640 [ceph]
<4>[20482.388247]  ceph_iterate_session_caps+0x9f/0x220 [ceph]
<4>[20482.388282]  ? __schedule+0x392/0x8b0
<4>[20482.388347]  ? wake_up_session_cb+0xb0/0xb0 [ceph]
<4>[20482.388384]  remove_session_caps+0x56/0x1f0 [ceph]
<4>[20482.388458]  ? preempt_schedule_common+0xa/0x20
<4>[20482.388464]  ? cleanup_session_requests+0xc6/0x130 [ceph]
<4>[20482.388501]  ceph_mdsc_force_umount+0xf4/0x130 [ceph]
<4>[20482.388539]  ceph_umount_begin+0x39/0x60 [ceph]
<4>[20482.388567]  path_umount+0x145/0x4b0
<4>[20482.388640]  ksys_umount+0x51/0x80
<4>[20482.388647]  __x64_sys_umount+0x12/0x20
<4>[20482.388653]  do_syscall_64+0x3a/0x80
<4>[20482.388687]  entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[20482.388715] RIP: 0033:0x1528c16b016b
<4>[20482.388760] Code: 0d 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed 0c 2c 00 f7 d8 64 89 01 48
<4>[20482.388763] RSP: 002b:00007ffdd94ea668 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
<4>[20482.388768] RAX: ffffffffffffffda RBX: 000055da8ba095d0 RCX: 00001528c16b016b
<4>[20482.388770] RDX: 0000000000000003 RSI: 0000000000000003 RDI: 000055da8ba0ee30
<4>[20482.388773] RBP: 0000000000000003 R08: 00001528c1976710 R09: 000055da8ba04010
<4>[20482.388775] R10: 0000000000000000 R11: 0000000000000206 R12: 000055da8ba0ee30
<4>[20482.388777] R13: 00001528c245d184 R14: 000055da8ba18d10 R15: 00000000ffffffff
<4>[20482.388789] ---[ end trace a42e1ede1fac4a5e ]---

#3 Updated by Xiubo Li about 1 year ago

One more:

<4>[17012.990201] ------------[ cut here ]------------
<4>[17012.990202] WARNING: CPU: 20 PID: 85642 at fs/ceph/mds_client.c:4570 check_session_state+0x55/0x60 [ceph]
<4>[17012.990230] Modules linked in: ceph(E) libceph dns_resolver netfs tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag uinput nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6_tables nft_compat ip_set rfkill nf_tables nfnetlink sunrpc squashfs loop intel_rapl_msr intel_rapl_common nfit libnvdimm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev i2c_piix4 pcspkr ip_tables xfs libcrc32c sr_mod cdrom ata_generic qxl drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops sd_mod cec t10_pi ata_piix sg virtio_net drm net_failover virtio_scsi virtio_console failover libata crc32c_intel serio_raw dm_mirror dm_region_hash dm_log dm_mod [last unloaded: libceph]
<4>[17012.990320] CPU: 20 PID: 85642 Comm: umount Tainted: G        W   E     5.14.0-rc4+ #72
<4>[17012.990323] Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014
<4>[17012.990325] RIP: 0010:check_session_state+0x55/0x60 [ceph]
<4>[17012.990349] Code: 00 00 a8 08 74 19 48 8b 47 10 48 85 c0 74 10 48 8b 15 af be 12 eb 48 39 d0 0f 88 29 6b 00 00 89 d8 5b c3 48 83 7f 10 00 74 f5 <0f> 0b eb f1 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 57 41 56 49 89
<4>[17012.990352] RSP: 0018:ffff9aa48198fe10 EFLAGS: 00010202
<4>[17012.990354] RAX: 0000000000000080 RBX: 0000000000000000 RCX: 0000000000000007
<4>[17012.990356] RDX: 0000000000000003 RSI: ffffffffc0ad8e10 RDI: ffff88cea0fa7000
<4>[17012.990358] RBP: ffff88cee01b9000 R08: ffff88ce40402678 R09: ffff88ce40402480
<4>[17012.990359] R10: 0000000000000000 R11: ffff88ce40402480 R12: ffff88cee01b9008
<4>[17012.990361] R13: 0000000000000001 R14: ffffffffc0ad8e10 R15: ffff88cea0fa7000
<4>[17012.990363] FS:  0000154d024ea080(0000) GS:ffff88d55fb00000(0000) knlGS:0000000000000000
<4>[17012.990369] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[17012.990370] CR2: 0000559c65540088 CR3: 0000000313f4a003 CR4: 00000000007706e0
<4>[17012.990372] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[17012.990393] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[17012.990395] PKRU: 55555554
<4>[17012.990396] Call Trace:
<4>[17012.990398]  ceph_mdsc_iterate_sessions+0x5d/0xb0 [ceph]
<4>[17012.990423]  ceph_mdsc_pre_umount+0x39/0x1c0 [ceph]
<4>[17012.990447]  ceph_kill_sb+0x1c/0x70 [ceph]
<4>[17012.990464]  deactivate_locked_super+0x34/0x70
<4>[17012.990469]  cleanup_mnt+0xb8/0x140
<4>[17012.990474]  task_work_run+0x70/0xb0
<4>[17012.990478]  exit_to_user_mode_prepare+0x1f0/0x200
<4>[17012.990482]  syscall_exit_to_user_mode+0x12/0x30
<4>[17012.990487]  do_syscall_64+0x46/0x80
<4>[17012.990490]  entry_SYSCALL_64_after_hwframe+0x44/0xae
<4>[17012.990494] RIP: 0033:0x154d0152e16b
<4>[17012.990497] Code: 0d 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed 0c 2c 00 f7 d8 64 89 01 48
<4>[17012.990499] RSP: 002b:00007ffc7742cd68 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
<4>[17012.990502] RAX: 0000000000000000 RBX: 0000559c655305d0 RCX: 0000154d0152e16b
<4>[17012.990504] RDX: 0000000000000003 RSI: 0000000000000003 RDI: 0000559c65535e30
<4>[17012.990505] RBP: 0000000000000003 R08: 0000154d017f4710 R09: 0000559c6552b010
<4>[17012.990507] R10: 0000000000000000 R11: 0000000000000206 R12: 0000559c65535e30
<4>[17012.990508] R13: 0000154d022db184 R14: 0000559c6553fd10 R15: 00000000ffffffff
<4>[17012.990511] ---[ end trace a42e1ede1fac4a5b ]---
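
The two WARNING splats in comments #2 and #3 fire at different source locations. As a rough aid for comparing such logs, the sketch below extracts the file, line, and function from each `WARNING:` line. It is not part of the original report; `find_warn_sites` and the regular expression are hypothetical and assume the line format shown in the logs quoted here.

```python
import re

# Matches e.g.
# "WARNING: CPU: 26 PID: 170360 at fs/ceph/caps.c:1125 __ceph_remove_cap+0x227/0x2b0 [ceph]"
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[^\s+]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def find_warn_sites(log_text):
    """Return (source_file, line_number, function) for each WARNING line."""
    sites = []
    for match in WARN_RE.finditer(log_text):
        sites.append(
            (match.group("file"), int(match.group("line")), match.group("func"))
        )
    return sites


# The two WARNING lines copied from comments #2 and #3.
WARN_SAMPLE = """\
<4>[20482.387782] WARNING: CPU: 26 PID: 170360 at fs/ceph/caps.c:1125 __ceph_remove_cap+0x227/0x2b0 [ceph]
<4>[17012.990202] WARNING: CPU: 20 PID: 85642 at fs/ceph/mds_client.c:4570 check_session_state+0x55/0x60 [ceph]
"""
for site in find_warn_sites(WARN_SAMPLE):
    print(site)
```

On the sample, this yields `fs/ceph/caps.c:1125` in `__ceph_remove_cap` (reached from `ceph_mdsc_force_umount`) and `fs/ceph/mds_client.c:4570` in `check_session_state` (reached from `ceph_kill_sb`).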

#4 Updated by Xiubo Li about 1 year ago

  • Status changed from In Progress to Fix Under Review

#5 Updated by Xiubo Li about 1 year ago

  • Status changed from Fix Under Review to Resolved
