Project

General

Profile

Actions

Bug #48569

closed

KASAN use after free warning when evicting inode

Added by Jeff Layton over 3 years ago. Updated almost 2 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

I got this KASAN warning when I was testing something with dynamic_debug turned up:

[13389.775362] ==================================================================
[13389.775364] BUG: KASAN: use-after-free in string+0xe1/0x1c0
[13389.775367] Read of size 1 at addr ffff888117cc8408 by task kworker/u32:0/149349
[13389.775368] 
[13389.775370] CPU: 9 PID: 149349 Comm: kworker/u32:0 Tainted: G            E     5.10.0-rc7+ #223
[13389.775376] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[13389.775380] Workqueue: ceph-inode ceph_inode_work [ceph]
[13389.777380] Call Trace:
[13389.777382]  dump_stack+0xae/0xe5
[13389.777383]  ? string+0xe1/0x1c0
[13389.777385]  print_address_description.constprop.0+0x1c/0x250
[13389.777387]  ? string+0xe1/0x1c0
[13389.777389]  ? string+0xe1/0x1c0
[13389.777390]  kasan_report.cold+0x37/0x7c
[13389.777392]  ? put_dec_full8+0x90/0xe0
[13389.777394]  ? string+0xe1/0x1c0
[13389.777395]  string+0xe1/0x1c0
[13389.778396]  ? ip6_addr_string_sa+0x400/0x400
[13389.778397]  vsnprintf+0x3f8/0x820
[13389.778399]  ? pointer+0x650/0x650
[13389.778400]  ? mark_lock+0xea/0xd30
[13389.778402]  va_format.constprop.0+0xe1/0x1d0
[13389.778404]  ? lock_chain_count+0x20/0x20
[13389.778405]  ? sprintf+0xb0/0xb0
[13389.778407]  ? widen_string+0x65/0x130
[13389.778409]  pointer+0xd1/0x650
[13389.778410]  ? format_decode+0x1cf/0x4e0
[13389.778412]  ? resource_string.isra.0+0xc10/0xc10
[13389.778413]  vsnprintf+0x2e0/0x820
[13389.780406]  ? pointer+0x650/0x650
[13389.780408]  ? pvclock_clocksource_read+0xd9/0x1a0
[13389.780410]  vscnprintf+0x9/0x30
[13389.780411]  vprintk_store+0x89/0x2c0
[13389.780413]  ? kvm_sched_clock_read+0x14/0x30
[13389.780415]  ? sched_clock+0x5/0x10
[13389.780416]  ? do_syslog+0x50/0x50
[13389.780418]  ? lock_contended+0x620/0x620
[13389.780420]  vprintk_emit+0x239/0x400
[13389.780421]  printk+0x96/0xb2
[13389.780423]  ? record_print_text.cold+0x11/0x11
[13389.781424]  ? __dynamic_pr_debug+0xeb/0x130
[13389.781426]  ? dynamic_emit_prefix+0x2a/0x200
[13389.781427]  __dynamic_pr_debug+0xeb/0x130
[13389.781429]  ? dynamic_emit_prefix+0x200/0x200
[13389.781431]  ? kvm_sched_clock_read+0x14/0x30
[13389.781433]  ? sched_clock+0x5/0x10
[13389.781434]  ? sched_clock_local+0x10/0xc0
[13389.781436]  ? __ceph_remove_cap+0x532/0x5e0 [ceph]
[13389.781438]  __ceph_destroy_xattrs+0xf7/0x1c0 [ceph]
[13389.781440]  ceph_evict_inode+0x267/0x4f0 [ceph]
[13389.783435]  evict+0x18c/0x320
[13389.783437]  process_one_work+0x501/0x990
[13389.783439]  ? pwq_dec_nr_in_flight+0x110/0x110
[13389.783441]  worker_thread+0x2ea/0x6e0
[13389.783442]  ? __kthread_parkme+0xc0/0xe0
[13389.783444]  ? process_one_work+0x990/0x990
[13389.783445]  kthread+0x1fb/0x220
[13389.783447]  ? kthread_create_worker_on_cpu+0xb0/0xb0
[13389.783449]  ret_from_fork+0x22/0x30
[13389.783450] 
[13389.783452] Allocated by task 107824:
[13389.784452]  kasan_save_stack+0x1b/0x40
[13389.784454]  __kasan_kmalloc.constprop.0+0xc2/0xd0
[13389.784456]  ceph_kvmalloc+0xc7/0x110 [libceph]
[13389.784458]  ceph_buffer_new+0x62/0x110 [libceph]
[13389.784460]  ceph_fill_inode+0x1072/0x1bb0 [ceph]
[13389.784462]  ceph_fill_trace+0x49b/0x1360 [ceph]
[13389.784463]  dispatch+0xe85/0x23f0 [ceph]
[13389.784465]  ceph_con_workfn+0x21c6/0x3570 [libceph]
[13389.784467]  process_one_work+0x501/0x990
[13389.784468]  worker_thread+0x2ea/0x6e0
[13389.784470]  kthread+0x1fb/0x220
[13389.784472]  ret_from_fork+0x22/0x30
[13389.784473] 
[13389.786458] Freed by task 107824:
[13389.786460]  kasan_save_stack+0x1b/0x40
[13389.786462]  kasan_set_track+0x1c/0x30
[13389.786464]  kasan_set_free_info+0x1b/0x30
[13389.786465]  __kasan_slab_free+0x110/0x150
[13389.786467]  slab_free_freelist_hook+0x82/0x1b0
[13389.786469]  kfree+0xdf/0x610
[13389.786470]  ceph_buffer_release+0x20/0x60 [libceph]
[13389.786472]  __send_cap+0xd0/0x180 [ceph]
[13389.786474]  ceph_check_caps+0x863/0xf30 [ceph]
[13389.786476]  ceph_check_delayed_caps+0x143/0x1d0 [ceph]
[13389.787472]  delayed_work+0x22f/0x300 [ceph]
[13389.787474]  process_one_work+0x501/0x990
[13389.787476]  worker_thread+0x2ea/0x6e0
[13389.787478]  kthread+0x1fb/0x220
[13389.787479]  ret_from_fork+0x22/0x30
[13389.787481] 
[13389.787483] The buggy address belongs to the object at ffff888117cc8400
[13389.787485]  which belongs to the cache kmalloc-128 of size 128
[13389.787487] The buggy address is located 8 bytes inside of
[13389.789482]  128-byte region [ffff888117cc8400, ffff888117cc8480)
[13389.789484] The buggy address belongs to the page:
[13389.789486] page:000000000e5b4508 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x117cc8
[13389.789488] head:000000000e5b4508 order:1 compound_mapcount:0
[13389.789490] flags: 0x17ffffc0010200(slab|head)
[13389.789493] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888100043680
[13389.789495] raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
[13389.789497] page dumped because: kasan: bad access detected
[13389.790489] 
[13389.790490] Memory state around the buggy address:
[13389.790493]  ffff888117cc8300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13389.790495]  ffff888117cc8380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[13389.790497] >ffff888117cc8400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13389.790499]                       ^
[13389.790501]  ffff888117cc8480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[13389.790504]  ffff888117cc8500: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[13389.790506] ==================================================================
[13389.792498] Disabling lock debugging due to kernel taint
[13481.660751] ------------[ cut here ]------------
[13481.660753] precision 65536 too large
[13481.660755] WARNING: CPU: 1 PID: 152995 at lib/vsprintf.c:2515 set_precision+0x6d/0x80
[13481.660756] Modules linked in: loop(E) ceph(E) libceph(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) ip_set(E) rfkill(E) nf_tables(E) nfnetlink(E) cachefiles(E) fscache(E) netfs(E) sunrpc(E) i2c_i801(E) iTCO_wdt(E) intel_pmc_bxt(E) iTCO_vendor_support(E) i2c_smbus(E) joydev(E) pcspkr(E) virtio_balloon(E) lpc_ich(E) zram(E) ip_tables(E) xfs(E) crct10dif_pclmul(E) crc32_pclmul(E) crc32c_intel(E) virtio_console(E) ghash_clmulni_intel(E) serio_raw(E) virtio_net(E) net_failover(E) failover(E) virtio_blk(E) qxl(E) drm_ttm_helper(E) ttm(E) drm_kms_helper(E) cec(E) drm(E) qemu_fw_cfg(E)
[13481.660810] CPU: 1 PID: 152995 Comm: attr Tainted: G    B       E     5.10.0-rc7+ #223
[13481.660812] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[13481.660813] RIP: 0010:set_precision+0x6d/0x80
[13481.660817] Code: 0f 48 de 81 fb ff 7f 00 00 0f 4f d8 88 5d 06 88 7d 07 5b 5d c3 89 de 48 c7 c7 c0 49 d1 b4 c6 05 52 4a 73 02 01 e8 c7 8f 8b 00 <0f> 0b eb b9 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 b8 00
[13481.660819] RSP: 0018:ffff88810ff77428 EFLAGS: 00010082
[13481.660821] RAX: 0000000000000000 RBX: 0000000000010000 RCX: 0000000000000000
[13481.660823] RDX: 0000000000000023 RSI: 0000000000000004 RDI: ffffed1021feee7b
[13481.660825] RBP: ffff88810ff77490 R08: ffffffffb3361d0e R09: ffff88841cde8293
[13481.660826] R10: ffffed10839bd052 R11: 0000000000000001 R12: ffffffffc128952c
[13481.660828] R13: ffff88810ff77538 R14: ffffffffb797c640 R15: ffffffffc128952c
[13481.660830] FS:  00007f693e8ab740(0000) GS:ffff88841cc00000(0000) knlGS:0000000000000000
[13481.660832] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13481.660833] CR2: 0000565511ac4000 CR3: 00000001296ba000 CR4: 00000000003506e0
[13481.660834] Call Trace:
[13481.660836]  vsnprintf+0x397/0x820
[13481.660837]  ? pointer+0x650/0x650
[13481.660838]  va_format.constprop.0+0xe1/0x1d0
[13481.660840]  ? sprintf+0xb0/0xb0
[13481.660841]  ? desc_read_finalized_seq+0xd0/0xd0
[13481.660842]  ? widen_string+0x65/0x130
[13481.660844]  ? lock_release+0x9c/0x6b0
[13481.660845]  pointer+0xd1/0x650
[13481.660846]  ? format_decode+0x1cf/0x4e0
[13481.660848]  ? resource_string.isra.0+0xc10/0xc10
[13481.660849]  ? lock_release+0x6b0/0x6b0
[13481.660851]  vsnprintf+0x2e0/0x820
[13481.660852]  ? pointer+0x650/0x650
[13481.660853]  vscnprintf+0x9/0x30
[13481.660855]  vprintk_store+0x89/0x2c0
[13481.660856]  ? do_syslog+0x50/0x50
[13481.660857]  ? lock_acquired+0xa3/0x550
[13481.661844]  ? lock_contended+0x620/0x620
[13481.661845]  vprintk_emit+0x239/0x400
[13481.661847]  printk+0x96/0xb2
[13481.661848]  ? record_print_text.cold+0x11/0x11
[13481.661850]  ? dynamic_emit_prefix+0x2a/0x200
[13481.661851]  ? __zone_watermark_ok+0x270/0x270
[13481.661852]  ? __dynamic_pr_debug+0xeb/0x130
[13481.661854]  ? dynamic_emit_prefix+0x2a/0x200
[13481.661855]  __dynamic_pr_debug+0xeb/0x130
[13481.661856]  ? dynamic_emit_prefix+0x200/0x200
[13481.661858]  ? __alloc_pages_slowpath.constprop.0+0x1670/0x1670
[13481.661859]  ? vma_policy_mof+0x120/0x120
[13481.661861]  ? policy_node+0x56/0x60
[13481.661862]  ? __might_sleep+0x31/0xd0
[13481.661863]  ? memcpy+0x39/0x60
[13481.662849]  ? ceph_pagelist_append+0x2c1/0x2e0 [libceph]
[13481.662850]  __ceph_setxattr+0x8d5/0xbe0 [ceph]
[13481.662852]  ? ceph_listxattr+0x300/0x300 [ceph]
[13481.662853]  ? selinux_inode_permission+0x1fc/0x2f0
[13481.662855]  ? ceph_is_valid_xattr+0x71/0x80 [ceph]
[13481.662856]  __vfs_setxattr+0xb8/0xf0
[13481.662857]  ? xattr_resolve_name+0x170/0x170
[13481.662859]  ? security_inode_setxattr+0x77/0xd0
[13481.662860]  __vfs_setxattr_noperm+0x88/0x270
[13481.662861]  vfs_setxattr+0xae/0x190
[13481.662863]  ? __vfs_setxattr_locked+0x150/0x150
[13481.662864]  setxattr+0x17a/0x2c0
[13481.662866]  ? vfs_setxattr+0x190/0x190
[13481.662867]  ? may_linkat+0xe0/0xe0
[13481.662868]  ? lock_acquire+0xb3/0x4d0
[13481.662870]  ? lock_release+0x6b0/0x6b0
[13481.662871]  ? strncpy_from_user+0x226/0x270
[13481.662872]  ? __might_sleep+0x31/0xd0
[13481.662874]  ? __mnt_want_write+0xc2/0xd0
[13481.662875]  path_setxattr+0xf7/0x110
[13481.662876]  ? setxattr+0x2c0/0x2c0
[13481.662878]  __x64_sys_lsetxattr+0x65/0x70
[13481.662879]  do_syscall_64+0x33/0x40
[13481.662881]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[13481.662882] RIP: 0033:0x7f693e9ad6be
[13481.662885] Code: 48 8b 0d bd 47 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 bd 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8a 47 0c 00 f7 d8 64 89 01 48
[13481.662887] RSP: 002b:00007ffce4b2bb48 EFLAGS: 00000202 ORIG_RAX: 00000000000000bd
[13481.662890] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f693e9ad6be
[13481.662892] RDX: 0000565511ab52f0 RSI: 00007ffce4b2bb70 RDI: 000

It looks like ceph_check_caps called from ceph_check_delayed_caps raced with inode eviction, which shouldn't happen. That function goes an igrab on the inode before calling ceph_check_caps. Either the igrab should return NULL, or that should prevent the eviction from occurring.

The inode work queuing functions take a reference with i_hold though. I wonder if we may be racing with I_WILL_FREE being set, and are taking an extra reference after the point where the eviction has begun.

Actions

Also available in: Atom PDF