Bug #53180 (closed): Attempt to access reserved inode number 0x101

Added by 玮文 胡 over 2 years ago. Updated almost 2 years ago.

Status:
Resolved
Priority:
Normal
% Done:
0%
Regression:
No
Severity:
3 - minor

Description

While investigating https://tracker.ceph.com/issues/49922, a new warning was added to the kernel CephFS client. We are now triggering this warning multiple times. The following is an example:

Nov 03 14:49:19 gpu015 kernel: ------------[ cut here ]------------
Nov 03 14:49:19 gpu015 kernel: Attempt to access reserved inode number 0x101
Nov 03 14:49:19 gpu015 kernel: WARNING: CPU: 15 PID: 1256107 at fs/ceph/super.h:548 __lookup_inode+0x162/0x1a0 [ceph]
Nov 03 14:49:19 gpu015 kernel: Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid ib_core erofs rbd ipt_rpfilter iptable_raw ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel xt_multiport xt_set ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipportnet ip_set_hash_ipport ip_set dummy ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs binfmt_misc ip6table_nat ip6_tables iptable_mangle xt_comment xt_mark ceph libceph fscache xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bpfilter br_netfilter bridge stp llc aufs overlay dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua ipmi_ssif intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi snd_hda_intel kvm_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core kvm snd_hwdep soundwire_bus snd_soc_core snd_compress
Nov 03 14:49:19 gpu015 kernel:  ac97_bus snd_pcm_dmaengine snd_pcm rapl snd_timer snd intel_cstate soundcore mei_me mei mxm_wmi acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler acpi_pad mac_hid nvidia_uvm(POE) sch_fq_codel msr sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear nvidia_drm(POE) nvidia_modeset(POE) nvidia(POE) ast drm_vram_helper i2c_algo_bit drm_ttm_helper ttm crct10dif_pclmul drm_kms_helper crc32_pclmul syscopyarea ghash_clmulni_intel sysfillrect ixgbe sysimgblt aesni_intel fb_sys_fops cec ahci xfrm_algo rc_core crypto_simd i2c_i801 dca cryptd libahci i2c_smbus glue_helper drm i40e mdio lpc_ich xhci_pci xhci_pci_renesas wmi
Nov 03 14:49:19 gpu015 kernel: CPU: 15 PID: 1256107 Comm: node Tainted: P        W  OE     5.11.0-34-generic #36~20.04.1-Ubuntu
Nov 03 14:49:19 gpu015 kernel: Hardware name: TYAN B7079F77CV10HR-2T-N/S7079GM2NR-2T-N, BIOS V2.05.B10 02/27/2018
Nov 03 14:49:19 gpu015 kernel: RIP: 0010:__lookup_inode+0x162/0x1a0 [ceph]
Nov 03 14:49:19 gpu015 kernel: Code: 7e 2f 48 85 c0 0f 85 21 ff ff ff 48 63 c3 85 db 0f 89 51 ff ff ff e9 11 ff ff ff 4c 89 e6 48 c7 c7 e0 1d e7 c0 e8 fb 78 34 e6 <0f> 0b e9 36 ff ff ff be 03 00 00 00 48 89 45 c0 e8 b9 4e d4 e5 48
Nov 03 14:49:19 gpu015 kernel: RSP: 0018:ffffa95d70aa7c30 EFLAGS: 00010286
Nov 03 14:49:19 gpu015 kernel: RAX: 0000000000000000 RBX: ffff98708a884540 RCX: 0000000000000027
Nov 03 14:49:19 gpu015 kernel: RDX: 0000000000000027 RSI: 000000010001ae5a RDI: ffff98a03f958ac8
Nov 03 14:49:19 gpu015 kernel: RBP: ffffa95d70aa7c70 R08: ffff98a03f958ac0 R09: ffffa95d70aa79f0
Nov 03 14:49:19 gpu015 kernel: R10: 000000000193a510 R11: 000000000193a570 R12: 0000000000000101
Nov 03 14:49:19 gpu015 kernel: R13: ffff98708a884568 R14: ffff98708a884540 R15: ffff9880c6dcd8a8
Nov 03 14:49:19 gpu015 kernel: FS:  00007f9d87540780(0000) GS:ffff98a03f940000(0000) knlGS:0000000000000000
Nov 03 14:49:19 gpu015 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 03 14:49:19 gpu015 kernel: CR2: 00007fa8f0003ba2 CR3: 0000003a5bbc0006 CR4: 00000000003706e0
Nov 03 14:49:19 gpu015 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 03 14:49:19 gpu015 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 03 14:49:19 gpu015 kernel: Call Trace:
Nov 03 14:49:19 gpu015 kernel:  ceph_lookup_inode+0xe/0x30 [ceph]
Nov 03 14:49:19 gpu015 kernel:  lookup_quotarealm_inode.isra.0+0x168/0x220 [ceph]
Nov 03 14:49:19 gpu015 kernel:  check_quota_exceeded+0x1c5/0x230 [ceph]
Nov 03 14:49:19 gpu015 kernel:  ceph_quota_is_max_bytes_exceeded+0x59/0x60 [ceph]
Nov 03 14:49:19 gpu015 kernel:  ceph_write_iter+0x1a3/0x780 [ceph]
Nov 03 14:49:19 gpu015 kernel:  ? aa_file_perm+0x118/0x480
Nov 03 14:49:19 gpu015 kernel:  new_sync_write+0x117/0x1b0
Nov 03 14:49:19 gpu015 kernel:  vfs_write+0x1ca/0x280
Nov 03 14:49:19 gpu015 kernel:  ksys_write+0x67/0xe0
Nov 03 14:49:19 gpu015 kernel:  __x64_sys_write+0x1a/0x20
Nov 03 14:49:19 gpu015 kernel:  do_syscall_64+0x38/0x90
Nov 03 14:49:19 gpu015 kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Nov 03 14:49:19 gpu015 kernel: RIP: 0033:0x7f9d8765621f
Nov 03 14:49:19 gpu015 kernel: Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 59 65 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 44 24 08 e8 8c 65 f8 ff 48
Nov 03 14:49:19 gpu015 kernel: RSP: 002b:00007ffec4811220 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
Nov 03 14:49:19 gpu015 kernel: RAX: ffffffffffffffda RBX: 0000000000000057 RCX: 00007f9d8765621f
Nov 03 14:49:19 gpu015 kernel: RDX: 0000000000000057 RSI: 00000000064fbd30 RDI: 0000000000000019
Nov 03 14:49:19 gpu015 kernel: RBP: 00000000064fbd30 R08: 0000000000000000 R09: 00007f9d84237f00
Nov 03 14:49:19 gpu015 kernel: R10: 0000000000000064 R11: 0000000000000293 R12: 0000000000000057
Nov 03 14:49:19 gpu015 kernel: R13: 0000000006513b50 R14: 00007f9d877324a0 R15: 00007f9d877318a0
Nov 03 14:49:19 gpu015 kernel: ---[ end trace 216b86ebc3c91378 ]---

This is another, slightly different stack trace:

ceph_lookup_inode+0xe/0x30 [ceph]
lookup_quotarealm_inode.isra.0+0x168/0x220 [ceph]
check_quota_exceeded+0x1c5/0x230 [ceph]
ceph_quota_is_max_bytes_exceeded+0x59/0x60 [ceph]
ceph_write_iter+0x1a3/0x780 [ceph]
? aa_file_perm+0x118/0x480
? do_wp_page+0x1bd/0x330
new_sync_write+0x117/0x1b0
vfs_write+0x1ca/0x280
ksys_write+0x67/0xe0
__x64_sys_write+0x1a/0x20
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9

This may be related to OOM; some of these warnings appear right after an OOM message.

Actions #1

Updated by Venky Shankar over 2 years ago

  • Status changed from New to Triaged
  • Assignee set to Jeff Layton
Actions #2

Updated by Jeff Layton over 2 years ago

Interesting. It looks like what happened is that this file got moved into a stray directory while the client application was still writing data to it. It then tried to get quota information from the parent which involved doing a lookup for it, at which point the warning popped.

That may mean that this warning is bogus and that we should just remove it, but I need a better understanding of what it means for a file to be in the strays directory.

Actions #3

Updated by Jeff Layton over 2 years ago

Probably what we should do is assume that stray dirs have no quotas on them. We already have a carveout for the root ino in ceph_has_realms_with_quotas(). We should be able to add one for MDS dirs as well. I'll see if I can draft up a patch.
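The carveout described here could look roughly like the following. This is an illustrative standalone sketch, not the merged patch: the real change lives in ceph_has_realms_with_quotas() and operates on kernel inode structures, and the range constants below are assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

#define CEPH_INO_ROOT            1ULL      /* the CephFS root inode */
#define MDS_INO_MDSDIR_OFFSET    0x100ULL  /* assumed start of MDS-internal range */
#define MDS_INO_SYSTEM_BASE      0x1000ULL /* assumed end of MDS-internal range */

static bool ino_is_mds_internal(uint64_t ino)
{
    return ino >= MDS_INO_MDSDIR_OFFSET && ino < MDS_INO_SYSTEM_BASE;
}

/* Sketch of the proposed short-circuit: quota checks bail out early for
 * the filesystem root (the existing carveout) and for MDS-internal
 * mdsdir/stray inodes (the new one), so the client never attempts a
 * quota-realm lookup above them and the reserved-inode warning cannot
 * fire from this path. */
static bool may_have_quota_realm(uint64_t ino)
{
    if (ino == CEPH_INO_ROOT)
        return false;
    if (ino_is_mds_internal(ino))
        return false;
    return true;
}
```

With this in place, a file that has been moved into a stray directory (as in the traces above) would simply skip the quota check instead of looking up a reserved inode.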

Actions #4

Updated by Jeff Layton over 2 years ago

Patch posted to the ceph-devel mailing list:

https://lore.kernel.org/ceph-devel/20211109171011.39571-1-jlayton@kernel.org/T/#u

玮文 胡, if you pass along your email address, I can give you Reported-by credit when we merge a patch for this.

Actions #5

Updated by Jeff Layton over 2 years ago

  • Status changed from Triaged to Fix Under Review
Actions #6

Updated by 玮文 胡 over 2 years ago

Thanks. My email address:

Reported-by: Hu Weiwen <>

Actions #7

Updated by Jeff Layton almost 2 years ago

  • Status changed from Fix Under Review to Resolved
Actions #8

Updated by 玮文 胡 almost 2 years ago

I think the above patch has not yet been pushed to the testing branch of https://github.com/ceph/ceph-client. Why is this issue marked as resolved?

Actions #9

Updated by Jeff Layton almost 2 years ago

This patch was merged into v5.17:

commit 0078ea3b0566e3da09ae8e1e4fbfd708702f2876
Author: Jeff Layton <jlayton@kernel.org>
Date:   Tue Nov 9 09:54:49 2021 -0500

    ceph: don't check for quotas on MDS stray dirs
Actions #10

Updated by Jeff Layton almost 2 years ago

  • Project changed from CephFS to Linux kernel client
Actions #11

Updated by 玮文 胡 almost 2 years ago

OK, thanks. I missed that [PATCH v2] because it is not listed under the same thread in https://lore.kernel.org/ceph-devel/. Sorry for the noise.
