Bug #50983: cephfs kernel client use-after-free
Status: Closed
Description
While testing the io500 benchmark I've periodically been running into hard lockups during the file-deletion phase of testing. This has occurred about 30-40% of the time, roughly 1 hour into tests. It may or may not be related to using ephemeral pinning, though all failures have occurred when ephemeral random pinning has been enabled. I was watching dmesg the most recent time this happened and managed to grab the output:
[53938.250488] ceph: ceph_add_cap: couldn't find snap realm 115
[53938.250527] ------------[ cut here ]------------
[53938.250528] WARNING: CPU: 82 PID: 26833 at fs/ceph/caps.c:729 ceph_add_cap.cold.51+0x14/0x1b [ceph]
[53938.250553] Modules linked in: ceph libceph dns_resolver iscsi_target_mod target_core_mod rfkill vfat fat ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common skx_edac nfit ipmi_ssif libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate intel_uncore pcspkr acpi_ipmi ioatdma i2c_i801 mei_me ipmi_si i2c_smbus mei lpc_ich dca ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ahci ttm libahci nvme ice drm i40e crc32c_intel libata nvme_core t10_pi wmi dm_mirror dm_region_hash dm_log dm_mod
[53938.250598] CPU: 82 PID: 26833 Comm: kworker/82:0 Tainted: G S 5.12.6-1.el8.elrepo.x86_64 #1
[53938.250601] Hardware name: Quanta Cloud Technology Inc. QuantaGrid D52B-1U/S5B-MB (LBG-1G), BIOS 3B13 03/27/2019
[53938.250603] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[53938.250624] RIP: 0010:ceph_add_cap.cold.51+0x14/0x1b [ceph]
[53938.250643] Code: 01 ab c0 03 4c 24 10 bd f4 ff ff ff e8 44 af 4d ce e9 97 69 fe ff 48 8b b4 24 80 00 00 00 48 c7 c7 e8 03 ab c0 e8 2b af 4d ce <0f> 0b e9 ef 78 fe ff 49 8b 54 24 08 4c 8b 43 30 48 c7 c7 38 05 ab
[53938.250645] RSP: 0018:ffffbe78a3dc7b78 EFLAGS: 00010246
[53938.250647] RAX: 0000000000000030 RBX: ffff9ca0995519f8 RCX: 0000000000000000
[53938.250648] RDX: 0000000000000000 RSI: ffff9c8ec1198370 RDI: ffff9c8ec1198370
[53938.250650] RBP: 0000000000000d45 R08: 0000000000000000 R09: c0000000ffff7fff
[53938.250651] R10: 0000000000000001 R11: ffffbe78a3dc7988 R12: 0000000000000015
[53938.250652] R13: ffff9c70f88745a0 R14: ffff9ca0995516a0 R15: 0000000000000001
[53938.250653] FS: 0000000000000000(0000) GS:ffff9c8ec1180000(0000) knlGS:0000000000000000
[53938.250655] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53938.250656] CR2: 0000563db1142100 CR3: 0000000218b02002 CR4: 00000000007706e0
[53938.250657] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[53938.250658] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[53938.250659] PKRU: 55555554
[53938.250660] Call Trace:
[53938.250663]  ? __cond_resched+0x15/0x30
[53938.250669]  ? __cap_is_valid+0x1c/0xa0 [ceph]
[53938.250683]  ceph_handle_caps+0xc23/0x17b0 [ceph]
[53938.250698]  mds_dispatch+0x136/0xc60 [ceph]
[53938.250713]  ? read_partial.isra.8+0x4a/0x70 [libceph]
[53938.250732]  ceph_con_process_message+0x79/0x140 [libceph]
[53938.250747]  ceph_con_v1_try_read+0x2ee/0x850 [libceph]
[53938.250764]  ceph_con_workfn+0x320/0x670 [libceph]
[53938.250777]  process_one_work+0x233/0x3d0
[53938.250782]  worker_thread+0x2d/0x3e0
[53938.250784]  ? process_one_work+0x3d0/0x3d0
[53938.250786]  kthread+0x116/0x130
[53938.250788]  ? kthread_park+0x80/0x80
[53938.250789]  ret_from_fork+0x1f/0x30
[53938.250795] ---[ end trace ce88d842cf359791 ]---
[53938.250797] ------------[ cut here ]------------
[53938.250798] refcount_t: underflow; use-after-free.
[53938.250802] WARNING: CPU: 82 PID: 26833 at lib/refcount.c:28 refcount_warn_saturate+0xab/0xf0
[53938.250809] Modules linked in: ceph libceph dns_resolver iscsi_target_mod target_core_mod rfkill vfat fat ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common skx_edac nfit ipmi_ssif libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate intel_uncore pcspkr acpi_ipmi ioatdma i2c_i801 mei_me ipmi_si i2c_smbus mei lpc_ich dca ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ahci ttm libahci nvme ice drm i40e crc32c_intel libata nvme_core t10_pi wmi dm_mirror dm_region_hash dm_log dm_mod
[53938.250841] CPU: 82 PID: 26833 Comm: kworker/82:0 Tainted: G S W 5.12.6-1.el8.elrepo.x86_64 #1
[53938.250843] Hardware name: Quanta Cloud Technology Inc. QuantaGrid D52B-1U/S5B-MB (LBG-1G), BIOS 3B13 03/27/2019
[53938.250844] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[53938.250856] RIP: 0010:refcount_warn_saturate+0xab/0xf0
[53938.250860] Code: 05 20 28 4c 01 01 e8 98 31 4d 00 0f 0b c3 80 3d 0e 28 4c 01 00 75 90 48 c7 c7 20 f0 75 8f c6 05 fe 27 4c 01 01 e8 79 31 4d 00 <0f> 0b c3 80 3d ed 27 4c 01 00 0f 85 6d ff ff ff 48 c7 c7 78 f0 75
[53938.250861] RSP: 0018:ffffbe78a3dc7bc8 EFLAGS: 00010282
[53938.250863] RAX: 0000000000000000 RBX: ffff9c61b7923500 RCX: 0000000000000027
[53938.250864] RDX: 0000000000000027 RSI: ffff9c8ec1198370 RDI: ffff9c8ec1198378
[53938.250865] RBP: ffff9c8f87f03800 R08: 0000000000000000 R09: c0000000ffff7fff
[53938.250865] R10: 0000000000000001 R11: ffffbe78a3dc79d0 R12: ffff9c8f87f038d0
[53938.250866] R13: 0000000000000000 R14: ffff9c63008ec000 R15: ffff9c6f8952cff0
[53938.250867] FS: 0000000000000000(0000) GS:ffff9c8ec1180000(0000) knlGS:0000000000000000
[53938.250868] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53938.250869] CR2: 0000563db1142100 CR3: 0000000218b02002 CR4: 00000000007706e0
[53938.250871] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[53938.250871] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[53938.250872] PKRU: 55555554
[53938.250873] Call Trace:
[53938.250874]  __destroy_snap_realm+0x8b/0xf0 [ceph]
[53938.250889]  ceph_put_snap_realm+0x64/0xc0 [ceph]
[53938.250903]  ceph_handle_caps+0xc8c/0x17b0 [ceph]
[53938.250917]  mds_dispatch+0x136/0xc60 [ceph]
[53938.250932]  ? read_partial.isra.8+0x4a/0x70 [libceph]
[53938.250950]  ceph_con_process_message+0x79/0x140 [libceph]
[53938.250964]  ceph_con_v1_try_read+0x2ee/0x850 [libceph]
[53938.250980]  ceph_con_workfn+0x320/0x670 [libceph]
[53938.251004]  process_one_work+0x233/0x3d0
[53938.251006]  worker_thread+0x2d/0x3e0
[53938.251008]  ? process_one_work+0x3d0/0x3d0
[53938.251010]  kthread+0x116/0x130
[53938.251011]  ? kthread_park+0x80/0x80
[53938.251013]  ret_from_fork+0x1f/0x30
[53938.251015] ---[ end trace ce88d842cf359792 ]---
[53938.251037] BUG: unable to handle page fault for address: fffffffffffffff8
[53938.251057] #PF: supervisor read access in kernel mode
[53938.251071] #PF: error_code(0x0000) - not-present page
[53938.251083] PGD 2602e0f067 P4D 2602e0f067 PUD 2602e11067 PMD 0
[53938.251119] Oops: 0000 [#1] SMP NOPTI
[53938.251129] CPU: 82 PID: 26833 Comm: kworker/82:0 Tainted: G S W 5.12.6-1.el8.elrepo.x86_64 #1
[53938.251152] Hardware name: Quanta Cloud Technology Inc. QuantaGrid D52B-1U/S5B-MB (LBG-1G), BIOS 3B13 03/27/2019
[53938.251175] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[53938.251203] RIP: 0010:__lookup_snap_realm.isra.9+0x18/0x70 [ceph]
[53938.251236] Code: ce eb a3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 48 89 f8 48 85 ff 75 0b eb 1e 48 8b 40 10 48 85 c0 74 14 <48> 8b 50 e8 48 39 f2 77 ee 73 0b 48 8b 40 08 48 85 c0 75 ec c3 c3
[53938.251277] RSP: 0018:ffffbe78a3dc7b80 EFLAGS: 00010202
[53938.251291] RAX: 0000000000000010 RBX: ffff9c6b689e9ae0 RCX: ffff9c6b689e9ab0
[53938.251309] RDX: 00d694df00000000 RSI: 0000000000000115 RDI: ffff9c6120a26a18
[53938.251325] RBP: ffff9c8f87f03800 R08: ffffbe78a3dc7c80 R09: 0000000000000015
[53938.251342] R10: 0000000000000000 R11: 0000000000000015 R12: ffff9c6b689e9ae0
[53938.251359] R13: ffff9c6b689e9ae0 R14: ffff9c6b689e9ab0 R15: 0000000000000000
[53938.251375] FS: 0000000000000000(0000) GS:ffff9c8ec1180000(0000) knlGS:0000000000000000
[53938.251394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[53938.251408] CR2: fffffffffffffff8 CR3: 0000000218b02002 CR4: 00000000007706e0
[53938.251424] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[53938.251441] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[53938.251457] PKRU: 55555554
[53938.251465] Call Trace:
[53938.251473]  ceph_lookup_snap_realm+0x16/0x30 [ceph]
[53938.251504]  ceph_update_snap_trace+0xe1/0x500 [ceph]
[53938.251530]  ceph_handle_caps+0x868/0x17b0 [ceph]
[53938.251556]  mds_dispatch+0x136/0xc60 [ceph]
[53938.251582]  ? read_partial.isra.8+0x4a/0x70 [libceph]
[53938.251618]  ceph_con_process_message+0x79/0x140 [libceph]
[53938.251646]  ceph_con_v1_try_read+0x2ee/0x850 [libceph]
[53938.251675]  ceph_con_workfn+0x320/0x670 [libceph]
[53938.251700]  process_one_work+0x233/0x3d0
[53938.251714]  worker_thread+0x2d/0x3e0
[53938.251725]  ? process_one_work+0x3d0/0x3d0
[53938.251738]  kthread+0x116/0x130
[53938.251750]  ? kthread_park+0x80/0x80
[53938.251761]  ret_from_fork+0x1f/0x30
[53938.251774] Modules linked in: ceph libceph dns_resolver iscsi_target_mod target_core_mod rfkill vfat fat ext4 mbcache jbd2 intel_rapl_msr intel_rapl_common skx_edac nfit ipmi_ssif libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt intel_pmc_bxt iTCO_vendor_support kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate intel_uncore pcspkr acpi_ipmi ioatdma i2c_i801 mei_me ipmi_si i2c_smbus mei lpc_ich dca ipmi_devintf ipmi_msghandler acpi_power_meter acpi_pad ip_tables xfs libcrc32c sd_mod sg ast i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_ttm_helper ahci ttm libahci nvme ice drm i40e crc32c_intel libata nvme_core t10_pi wmi dm_mirror dm_region_hash dm_log dm_mod
Updated by Jeff Layton almost 3 years ago
- Assignee set to Jeff Layton
This happened during an import, so ephemeral pinning is probably a factor. The bad address in the page fault looks a bit like an ERR_PTR value, but all __lookup_snap_realm does is walk an rbtree, so this may be indicative of some memory corruption (likely caused during the use-after-free that the refcount_t handling caught).
In any case, the snap_rwsem handling in this codepath looks a bit sketchy. __lookup_snap_realm has this comment:
/*
* lookup the realm rooted at @ino.
*
* caller must hold snap_rwsem for write.
*/
...but the snap_rwsem is not held for write in that codepath. It may not actually be necessary since we're just searching the rbtree at that point, but I'm not sure.
Updated by Jeff Layton almost 3 years ago
- Project changed from CephFS to Linux kernel client