Bug #56531
CephFS Mounts via Linux kernel not releasing locks
Description
Hello,
We are using Ubuntu 20.04.4 LTS and have several client-mounted CephFS filesystems. The Ceph cluster is configured with 3 active OSD/MON/MDS nodes and 1 MON/MGR node.
Linux lms-prod-01 5.13.0-1031-azure #37~20.04.1-Ubuntu x86_64 GNU/Linux
There is a bug somewhere in the kernel CephFS driver that, under some circumstances (possibly network related), fails to release a lock when the MDS requests it. This creates a chain reaction that makes the FS unusable by every other system that has it mounted. The issue is isolated to that particular FS.
Here is our SOP for getting things going again:
Ceph unhealthy - 1 clients failing to respond to capability release
To find the system mount responsible, go to the current FS controller:
> $ ceph health detail
> HEALTH_WARN 1 clients failing to respond to capability release
> [WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability release
> mds.fs_lms_prod.cephfs-cluster-01.yxptxa(mds.0): Client lms-prod-01:lms_prod failing to respond to capability release client_id: 404125

The rogue client machine is listed along with the FS which is getting held up. At this point, all that really needs to happen is to unmount the FS, which will release the lock.
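When this has to be done under pressure, the lookup above can be scripted. The following is a minimal Python sketch, assuming the detail line always has the shape shown in the output above; the function name and the field labels (`mds`, `rank`, `host`, `label`, `client_id`) are names chosen for this illustration, not official Ceph terminology.

```python
import re
from typing import Dict, List

# Matches the per-client detail line printed under MDS_CLIENT_LATE_RELEASE.
# Assumption: the line format is exactly as in the report above.
LATE_RELEASE = re.compile(
    r"(?P<mds>mds\.\S+?)\(mds\.(?P<rank>\d+)\):\s+"
    r"Client\s+(?P<host>[^:\s]+):(?P<label>\S+)\s+"
    r"failing to respond to capability release\s+"
    r"client_id:\s+(?P<client_id>\d+)"
)


def find_late_release_clients(health_detail: str) -> List[Dict]:
    """Return one dict per client reported as late in releasing capabilities."""
    clients = []
    for line in health_detail.splitlines():
        match = LATE_RELEASE.search(line)
        if match:
            entry = match.groupdict()
            entry["rank"] = int(entry["rank"])
            entry["client_id"] = int(entry["client_id"])
            clients.append(entry)
    return clients


# Sample text copied from the `ceph health detail` output in this report.
SAMPLE = """\
HEALTH_WARN 1 clients failing to respond to capability release
[WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability release
    mds.fs_lms_prod.cephfs-cluster-01.yxptxa(mds.0): Client lms-prod-01:lms_prod failing to respond to capability release client_id: 404125
"""

if __name__ == "__main__":
    for client in find_late_release_clients(SAMPLE):
        print(client)
```

In practice the input would be the captured output of `ceph health detail` rather than the embedded sample.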
In theory, we should stop php7.X-fpm and apache2, unmount and remount the FS, and then start the same services again.
However, in most cases the umount fails and reports busy. Under pressure to get things working again, I have resorted to restarting the target (on the CLI and then in Azure) and putting it in MAINT mode at the HAProxys. Once the system is offline, the FS and Ceph report healthy again within minutes.
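For reference, the intended (non-reboot) recovery sequence can be written down as a dry-run plan. This is only a sketch in Python under stated assumptions: the mount point `/mnt/cephfs` is hypothetical, the unit name `php7.X-fpm` is the placeholder exactly as written above and must be replaced with the real unit, and the plain `mount <dir>` form assumes the filesystem has an `/etc/fstab` entry.

```python
import shlex
import subprocess
from typing import List, Sequence


def recovery_plan(services: Sequence[str], mount_point: str) -> List[List[str]]:
    """Build the command sequence: stop services, remount, start services."""
    plan = [["systemctl", "stop", name] for name in services]
    plan.append(["umount", mount_point])
    # Assumption: an /etc/fstab entry exists, so `mount <dir>` is sufficient.
    plan.append(["mount", mount_point])
    plan += [["systemctl", "start", name] for name in services]
    return plan


def run_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in plan:
        print(shlex.join(command))
        if not dry_run:
            # check=True aborts at the first failing step, such as a busy umount.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder unit name and hypothetical mount point, for illustration only.
    run_plan(recovery_plan(["php7.X-fpm", "apache2"], "/mnt/cephfs"))
```

Running it as-is only prints the commands; stopping at the first failure keeps the services from being restarted on top of a mount that never came back.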
I have attached the kernel logs which occur at the same time.
We can go weeks without this happening, and then it does. There have been incidents on each of the last two days.
We are going to try switching to userspace FUSE mounts, which might bring the libceph version in use up a couple of notches and possibly bring some stability, but I don't know if that's the real answer.
Is there anyone that can shed some light on what the problem is?
History
#1 Updated by Ramana Raja over 1 year ago
- Tracker changed from Support to Bug
- Category deleted (NFS (Linux Kernel))
- Regression set to No
- Severity set to 3 - minor
- Component(FS) Client added
#2 Updated by Xiubo Li over 1 year ago
Paste the kernel logs here:
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.822919] BUG: kernel NULL pointer dereference, address: 0000000000000402
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.828988] #PF: supervisor read access in kernel mode
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.832840] #PF: error_code(0x0000) - not-present page
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.836707] PGD 0 P4D 0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.838884] Oops: 0000 [#1] SMP PTI
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.841777] CPU: 4 PID: 2159765 Comm: php-fpm7.1 Not tainted 5.13.0-1031-azure #37~20.04.1-Ubuntu
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.848266] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.855101] RIP: 0010:netfs_rreq_assess+0x3db/0x730 [netfs]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.859309] Code: 6f 03 00 00 49 8d 44 24 48 4c 8b 45 88 45 31 c9 45 31 ff 48 89 45 98 4c 89 e0 45 89 cc 4c 89 ad 78 ff ff ff 49 89 c1 4d 89 c5 <48> 8b 17 48 8b 47 20 48 2b 45 90 48 c1 ea 10 c1 e0 0c 83 e2 01 80
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.875617] RSP: 0018:ffffafb1c84a38e8 EFLAGS: 00010246
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.879652] RAX: ffff90f7e34a9f00 RBX: ffff90f839991ae0 RCX: 0000000000000002
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.884411] RDX: 0000000000000000 RSI: ffff90f80146eb58 RDI: 0000000000000402
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.889093] RBP: ffffafb1c84a3970 R08: 0000000000000000 R09: ffff90f7e34a9f00
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.894026] R10: ffffffffb0606a58 R11: 0000000000000000 R12: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.899006] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.904044] FS: 00007f8d81e815c0(0000) GS:ffff90fe23d00000(0000) knlGS:0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.910629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.914539] CR2: 0000000000000402 CR3: 00000002eb4d4006 CR4: 00000000003706e0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.919517] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.924972] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.929894] Call Trace:
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.931879] <TASK>
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.933481] netfs_readpage+0x180/0x390 [netfs]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.936685] ? init_wait_var_entry+0x50/0x50
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.940225] ceph_readpage+0x12c/0x180 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.943921] filemap_read_page.isra.0+0x4e/0x380
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.947771] ? scan_shadow_nodes+0x30/0x30
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.950753] ? lru_cache_add+0x47/0x70
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.955260] filemap_get_pages+0x36a/0x570
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.957623] filemap_read+0xbb/0x3c0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.959718] ? ceph_get_caps+0xf6/0x5d0 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.962763] ? schedule_timeout+0x202/0x290
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.965309] generic_file_read_iter+0xe2/0x140
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.968650] ceph_read_iter+0x182/0x6a0 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.971317] ? _copy_to_user+0x20/0x30
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.973859] new_sync_read+0x110/0x1a0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.976228] ? ceph_direct_read_write+0x9c0/0x9c0 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.979500] ? new_sync_read+0x110/0x1a0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.985191] vfs_read+0xfe/0x190
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.987329] ksys_pread64+0x6d/0xa0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.990210] __x64_sys_pread64+0x1e/0x20
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.992778] do_syscall_64+0x61/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.995094] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141483.997379] ? syscall_exit_to_user_mode+0x17/0x40
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.001021] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.004328] ? syscall_exit_to_user_mode+0x17/0x40
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.008376] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.011563] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.014579] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.017361] ? sysvec_hyperv_stimer0+0x4e/0x90
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.020667] ? asm_sysvec_hyperv_stimer0+0xa/0x20
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.024946] entry_SYSCALL_64_after_hwframe+0x44/0xae
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.028821] RIP: 0033:0x7f8d83ff01af
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.031663] Code: 08 89 3c 24 48 89 4c 24 18 e8 4d 84 f8 ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 04 24 e8 7d 84 f8 ff 48 8b
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.047894] RSP: 002b:00007ffccfa40e40 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.053501] RAX: ffffffffffffffda RBX: 00007ffccfa40f40 RCX: 00007f8d83ff01af
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.058511] RDX: 0000000000011702 RSI: 00007f8d81ddb018 RDI: 0000000000000006
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.064097] RBP: 00007f8d81dcaf30 R08: 0000000000000000 R09: 00007f8d81c00000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.069525] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.074146] R13: 0000000000011702 R14: 00007f8d81c13560 R15: 00007f8d63813ef0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.078947] </TASK>
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.080705] Modules linked in: iptable_filter binfmt_misc ceph libceph fscache netfs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua mlx5_ib ib_uverbs ib_core mlx5_core tls psample mlxfw xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd hid_hyperv hv_netvsc hid hv_balloon hv_utils hyperv_keyboard serio_raw hyperv_fb pata_acpi sch_fq_codel ipmi_devintf ipmi_msghandler msr drm i2c_core ip_tables x_tables autofs4
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.120304] CR2: 0000000000000402
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.124374] ---[ end trace cb03a5e9c52eb1e9 ]---
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.128647] RIP: 0010:netfs_rreq_assess+0x3db/0x730 [netfs]
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.132893] Code: 6f 03 00 00 49 8d 44 24 48 4c 8b 45 88 45 31 c9 45 31 ff 48 89 45 98 4c 89 e0 45 89 cc 4c 89 ad 78 ff ff ff 49 89 c1 4d 89 c5 <48> 8b 17 48 8b 47 20 48 2b 45 90 48 c1 ea 10 c1 e0 0c 83 e2 01 80
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.149385] RSP: 0018:ffffafb1c84a38e8 EFLAGS: 00010246
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.153693] RAX: ffff90f7e34a9f00 RBX: ffff90f839991ae0 RCX: 0000000000000002
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.160041] RDX: 0000000000000000 RSI: ffff90f80146eb58 RDI: 0000000000000402
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.166756] RBP: ffffafb1c84a3970 R08: 0000000000000000 R09: ffff90f7e34a9f00
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.172196] R10: ffffffffb0606a58 R11: 0000000000000000 R12: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.178187] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.183952] FS: 00007f8d81e815c0(0000) GS:ffff90fe23d00000(0000) knlGS:0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.190221] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.195561] CR2: 0000000000000402 CR3: 00000002eb4d4006 CR4: 00000000003706e0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.201397] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.207581] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.220824] ------------[ cut here ]------------
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.224620] WARNING: CPU: 1 PID: 2159765 at fs/ceph/file.c:829 ceph_release+0x12c/0x150 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.230373] Modules linked in: iptable_filter binfmt_misc ceph libceph fscache netfs nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua mlx5_ib ib_uverbs ib_core mlx5_core tls psample mlxfw xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c xt_owner iptable_security xt_tcpudp bpfilter joydev hid_generic crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd hid_hyperv hv_netvsc hid hv_balloon hv_utils hyperv_keyboard serio_raw hyperv_fb pata_acpi sch_fq_codel ipmi_devintf ipmi_msghandler msr drm i2c_core ip_tables x_tables autofs4
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.265289] CPU: 1 PID: 2159765 Comm: php-fpm7.1 Tainted: G D 5.13.0-1031-azure #37~20.04.1-Ubuntu
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.272484] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.278842] RIP: 0010:ceph_release+0x12c/0x150 [ceph]
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.282452] Code: 04 00 4c 89 e6 e8 b4 93 66 ee e9 57 ff ff ff 48 89 fa 48 c7 c6 60 21 8a c0 48 c7 c7 30 8d 8b c0 e8 09 88 94 ee e9 77 ff ff ff <0f> 0b e9 14 ff ff ff e8 18 02 02 00 eb af 0f 0b e9 75 ff ff ff be
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.294737] RSP: 0018:ffffafb1c84a3e48 EFLAGS: 00010212
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.298763] RAX: ffff90f790fbe988 RBX: ffff90f8baab25b0 RCX: ffff90f7c3fa1300
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.303842] RDX: ffffafb1c84a3cf0 RSI: ffff90f7c3fa1300 RDI: ffff90f8baab25b0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.308959] RBP: ffffafb1c84a3e60 R08: 0000000000000001 R09: ffff90f8baab2260
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.313981] R10: 0000000000000000 R11: 0000000000000000 R12: ffff90f790fbe980
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.319407] R13: ffff90f8baab2250 R14: ffff90f7916c7ce0 R15: ffff90f8b81e1cc0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.324692] FS: 0000000000000000(0000) GS:ffff90fe23c40000(0000) knlGS:0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.331013] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.335234] CR2: 0000559d2e289200 CR3: 0000000319610004 CR4: 00000000003706e0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.340390] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.346246] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.351853] Call Trace:
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.354190] <TASK>
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.356381] __fput+0x9f/0x250
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.358888] ____fput+0xe/0x10
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.361431] task_work_run+0x6a/0xa0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.364283] do_exit+0x371/0xad0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.366839] ? syscall_exit_to_user_mode+0x17/0x40
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.370603] ? do_syscall_64+0x6e/0xb0
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.373628] rewind_stack_do_exit+0x17/0x20
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.376949] RIP: 0033:0x7f8d83ff01af
Jul 11 09:11:52 lms-prod-01 kernel: [1141484.379763] Code: Unable to access opcode bytes at RIP 0x7f8d83ff0185.
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.384375] RSP: 002b:00007ffccfa40e40 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.389708] RAX: ffffffffffffffda RBX: 00007ffccfa40f40 RCX: 00007f8d83ff01af
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.394594] RDX: 0000000000011702 RSI: 00007f8d81ddb018 RDI: 0000000000000006
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.400475] RBP: 00007f8d81dcaf30 R08: 0000000000000000 R09: 00007f8d81c00000
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.405760] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.411264] R13: 0000000000011702 R14: 00007f8d81c13560 R15: 00007f8d63813ef0
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.416050] </TASK>
Jul 11 09:11:53 lms-prod-01 kernel: [1141484.418142] ---[ end trace cb03a5e9c52eb1ea ]---
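For triage, the key fields of a report like the one above (the BUG line, the kernel-mode RIP symbols, the WARNING source location, and the process name) can be pulled out mechanically. Below is a minimal Python sketch, assuming one log entry per line; the function name and the patterns are illustrative only and are not part of any kernel tooling.

```python
import re
from typing import Dict, Iterable, List

# Each pattern targets one field of a kernel oops or warning report.
# "RIP: 0010:" selects kernel-mode instruction pointers; user-mode ones
# (segment 0033) are deliberately skipped.
PATTERNS = {
    "bug": re.compile(r"BUG: (.+)$"),
    "warning_at": re.compile(r"WARNING: .*? at (\S+)"),
    "kernel_rip": re.compile(r"RIP: 0010:(\S+)"),
    "comm": re.compile(r"Comm: (\S+)"),
}


def summarize_oops(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect the distinct values of each field, in order of appearance."""
    summary: Dict[str, List[str]] = {key: [] for key in PATTERNS}
    for line in lines:
        for key, pattern in PATTERNS.items():
            match = pattern.search(line.rstrip())
            if match and match.group(1) not in summary[key]:
                summary[key].append(match.group(1))
    return summary


# A few entries copied from the log above.
SAMPLE = [
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141483.822919] BUG: kernel NULL pointer dereference, address: 0000000000000402",
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141483.841777] CPU: 4 PID: 2159765 Comm: php-fpm7.1 Not tainted 5.13.0-1031-azure #37~20.04.1-Ubuntu",
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141483.855101] RIP: 0010:netfs_rreq_assess+0x3db/0x730 [netfs]",
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141484.028821] RIP: 0033:0x7f8d83ff01af",
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141484.224620] WARNING: CPU: 1 PID: 2159765 at fs/ceph/file.c:829 ceph_release+0x12c/0x150 [ceph]",
    "Jul 11 09:11:52 lms-prod-01 kernel: [1141484.278842] RIP: 0010:ceph_release+0x12c/0x150 [ceph]",
]

if __name__ == "__main__":
    for field, values in summarize_oops(SAMPLE).items():
        print(field, values)
```

On this log the summary points at `netfs_rreq_assess` in the netfs module as the faulting function, with the `ceph_release` warning following in the same process.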
#3 Updated by Xiubo Li over 1 year ago
- Project changed from CephFS to Linux kernel client
- Assignee set to Xiubo Li
A kclient bug.
#4 Updated by Xiubo Li over 1 year ago
- Status changed from New to Need More Info
This kernel is somewhat old, and it crashed in the fs/netfs/ layer, a framework that has changed a lot since then. Could you try the latest kernel, if possible, to see whether this has been fixed?