Bug #55332
Closed
Failure in snaptest-git-ceph.sh (it's an async unlink/create bug)
Added by Venky Shankar about 2 years ago. Updated 10 months ago.
Description
Checking a snapshot fails (git diff --exit-code exits non-zero).
2022-04-11T15:29:53.327 INFO:tasks.workunit.client.0.smithi183.stderr:+ for v in $versions
2022-04-11T15:29:53.327 INFO:tasks.workunit.client.0.smithi183.stderr:+ ver=v0.1
2022-04-11T15:29:53.328 INFO:tasks.workunit.client.0.smithi183.stderr:+ echo checking v0.1
2022-04-11T15:29:53.328 INFO:tasks.workunit.client.0.smithi183.stderr:+ cd .snap/v0.1
2022-04-11T15:29:53.328 INFO:tasks.workunit.client.0.smithi183.stderr:+ git diff --exit-code
...
2022-04-11T15:30:16.495 DEBUG:teuthology.orchestra.run:got remote process result: 1
Similar to https://tracker.ceph.com/issues/17172, however this is w/ kernel client.
Files
mds.e.log.unlink_crash.tar.gz (209 KB) - Xiubo Li, 05/11/2022 09:13 AM
mds.a.log.unlink_crash.tar.gz (207 KB) - Xiubo Li, 05/11/2022 09:13 AM
Updated by Venky Shankar about 2 years ago
- Related to Bug #17172: Failure in snaptest-git-ceph.sh added
Updated by Venky Shankar about 2 years ago
- Status changed from New to Triaged
- Assignee set to Xiubo Li
Xiubo, please take a look.
Updated by Xiubo Li about 2 years ago
Venky Shankar wrote:
Xiubo, please take a look.
Sure.
Updated by Venky Shankar about 2 years ago
Updated by Xiubo Li almost 2 years ago
- Status changed from Triaged to In Progress
Updated by Xiubo Li almost 2 years ago
From remote/smithi183/syslog/kern.log.gz, the kernel crashed due to a NULL pointer dereference:
2022-04-11T15:24:00.701508+00:00 smithi183 kernel: Key type dns_resolver registered
2022-04-11T15:24:00.719985+00:00 smithi183 kernel: Key type ceph registered
2022-04-11T15:24:00.725509+00:00 smithi183 kernel: libceph: loaded (mon/osd proto 15/24)
2022-04-11T15:24:00.753541+00:00 smithi183 kernel: ceph: loaded (mds proto 32)
2022-04-11T15:24:00.770543+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 session established
2022-04-11T15:24:00.778531+00:00 smithi183 kernel: libceph: client4646 fsid 1c46d31f-e003-4951-9e08-b26bf9092c68
2022-04-11T15:24:08.421682+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 socket closed (con state OPEN)
2022-04-11T15:24:08.421793+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 session lost, hunting for new mon
2022-04-11T15:24:08.429506+00:00 smithi183 kernel: libceph: mon0 (2)172.21.15.183:3300 session established
2022-04-11T15:25:56.318797+00:00 smithi183 kernel: libceph: mon0 (2)172.21.15.183:3300 socket closed (con state OPEN)
2022-04-11T15:25:56.318984+00:00 smithi183 kernel: libceph: mon0 (2)172.21.15.183:3300 session lost, hunting for new mon
2022-04-11T15:25:56.326532+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 session established
2022-04-11T15:27:19.491580+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 socket closed (con state OPEN)
2022-04-11T15:27:19.491723+00:00 smithi183 kernel: libceph: mon2 (2)172.21.15.190:3301 session lost, hunting for new mon
2022-04-11T15:27:19.499524+00:00 smithi183 kernel: libceph: mon0 (2)172.21.15.183:3300 session established
2022-04-11T15:31:25.607191+00:00 smithi183 kernel: ceph: dump_cap_flushes: still waiting for cap flushes through 45572:
2022-04-11T15:31:25.607350+00:00 smithi183 kernel: ceph: 5000000008b:fffffffffffffffe Fw 23183 0 0
2022-04-11T15:31:25.607406+00:00 smithi183 kernel: ceph: 5000000008a:fffffffffffffffe Fw 23184 0 0
2022-04-11T15:31:25.612849+00:00 smithi183 kernel: ceph: 50000000089:fffffffffffffffe Fw 23185 0 0
2022-04-11T15:31:25.618548+00:00 smithi183 kernel: ceph: 50000000084:fffffffffffffffe Fw 23189 0 0
2022-04-11T15:31:25.624207+00:00 smithi183 kernel: ceph: 5000000007a:fffffffffffffffe Fw 23199 0 0
2022-04-11T15:31:25.629880+00:00 smithi183 kernel: ceph: 50000000094:fffffffffffffffe Fw 23374 0 0
2022-04-11T15:31:25.635911+00:00 smithi183 kernel: ceph: 50000000092:fffffffffffffffe Fw 23377 0 0
2022-04-11T15:31:25.647285+00:00 smithi183 kernel: ceph: 50000000091:fffffffffffffffe Fw 23378 0 0
2022-04-11T15:31:25.647363+00:00 smithi183 kernel: ceph: 5000000008e:fffffffffffffffe Fw 23380 0 0
2022-04-11T15:31:25.652953+00:00 smithi183 kernel: ceph: 50000000087:fffffffffffffffe Fw 23382 0 0
2022-04-11T15:31:25.664305+00:00 smithi183 kernel: ceph: 50000000086:fffffffffffffffe Fw 23383 0 0
2022-04-11T15:31:25.664380+00:00 smithi183 kernel: ceph: 50000000083:fffffffffffffffe Fw 23384 0 0
2022-04-11T15:31:25.669973+00:00 smithi183 kernel: ceph: 50000000082:fffffffffffffffe Fw 23385 0 0
2022-04-11T15:31:25.675641+00:00 smithi183 kernel: ceph: 50000000081:fffffffffffffffe Fw 23386 0 0
2022-04-11T15:31:25.681325+00:00 smithi183 kernel: ceph: 50000000080:fffffffffffffffe Fw 23387 0 0
2022-04-11T15:31:25.686968+00:00 smithi183 kernel: ceph: 5000000007e:fffffffffffffffe Fw 23389 0 0
2022-04-11T15:31:25.692622+00:00 smithi183 kernel: ceph: 5000000007b:fffffffffffffffe Fw 23392 0 0
2022-04-11T15:31:25.698327+00:00 smithi183 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000780
2022-04-11T15:31:26.214294+00:00 smithi183 kernel: #PF: supervisor read access in kernel mode
2022-04-11T15:31:26.214374+00:00 smithi183 kernel: #PF: error_code(0x0000) - not-present page
2022-04-11T15:31:26.214401+00:00 smithi183 kernel: PGD 0 P4D 0
2022-04-11T15:31:26.214426+00:00 smithi183 kernel: Oops: 0000 [#1] PREEMPT SMP PTI
2022-04-11T15:31:26.214449+00:00 smithi183 kernel: CPU: 3 PID: 46268 Comm: umount Tainted: G S 5.18.0-rc2-ceph-g1771083b2f18 #1
2022-04-11T15:31:26.214472+00:00 smithi183 kernel: Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 2.0 12/17/2015
2022-04-11T15:31:26.214504+00:00 smithi183 kernel: RIP: 0010:ceph_mdsc_sync.cold.64+0x77/0xc3 [ceph]
2022-04-11T15:31:26.214528+00:00 smithi183 kernel: Code: 08 4d 8b 07 4d 39 c5 73 0f 48 8b 7c 24 08 e8 d0 7e 05 e1 e9 7f 90 ff ff 49 8b 47 30 41 0f b6 4f 0c 4c 89 44 24 28 41 8b 7f 08 <4c> 8b 88 80 07 00 00 89 4c 24 24 4c 89 4c 24 18 e8 84 00 fe ff 49
2022-04-11T15:31:26.214553+00:00 smithi183 kernel: RSP: 0018:ffffc90009c4fda8 EFLAGS: 00010212
2022-04-11T15:31:26.214574+00:00 smithi183 kernel: RAX: 0000000000000000 RBX: ffff8881abf63000 RCX: 0000000000000000
2022-04-11T15:31:26.214596+00:00 smithi183 kernel: RDX: 0000000000000000 RSI: ffffffff823932ad RDI: 0000000000000000
2022-04-11T15:31:26.214618+00:00 smithi183 kernel: RBP: ffff8881abf634f0 R08: 0000000000005dc7 R09: c0000000ffffdfff
2022-04-11T15:31:26.214639+00:00 smithi183 kernel: R10: 0000000000000001 R11: ffffc90009c4fbc8 R12: 0000000000000001
2022-04-11T15:31:26.214661+00:00 smithi183 kernel: R13: 000000000000b204 R14: ffffffffa0ab3598 R15: ffff88815d36a110
2022-04-11T15:31:26.214685+00:00 smithi183 kernel: FS: 00007f50eb25e080(0000) GS:ffff88885fcc0000(0000) knlGS:0000000000000000
2022-04-11T15:31:26.214708+00:00 smithi183 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2022-04-11T15:31:26.214729+00:00 smithi183 kernel: CR2: 0000000000000780 CR3: 0000000116ea2003 CR4: 00000000003706e0
2022-04-11T15:31:26.214751+00:00 smithi183 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2022-04-11T15:31:26.214772+00:00 smithi183 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2022-04-11T15:31:26.214794+00:00 smithi183 kernel: Call Trace:
2022-04-11T15:31:26.214815+00:00 smithi183 kernel: <TASK>
2022-04-11T15:31:26.214837+00:00 smithi183 kernel: ? schedstat_stop+0x10/0x10
2022-04-11T15:31:26.214859+00:00 smithi183 kernel: ceph_sync_fs+0x2c/0x100 [ceph]
2022-04-11T15:31:26.214882+00:00 smithi183 kernel: sync_filesystem+0x6d/0x90
2022-04-11T15:31:26.214906+00:00 smithi183 kernel: generic_shutdown_super+0x22/0x120
2022-04-11T15:31:26.214928+00:00 smithi183 kernel: kill_anon_super+0x14/0x30
2022-04-11T15:31:26.214950+00:00 smithi183 kernel: ceph_kill_sb+0x36/0x90 [ceph]
2022-04-11T15:31:26.214973+00:00 smithi183 kernel: deactivate_locked_super+0x29/0x60
2022-04-11T15:31:26.214995+00:00 smithi183 kernel: cleanup_mnt+0xb8/0x140
2022-04-11T15:31:26.215017+00:00 smithi183 kernel: task_work_run+0x6d/0xb0
2022-04-11T15:31:26.215039+00:00 smithi183 kernel: exit_to_user_mode_prepare+0x226/0x230
2022-04-11T15:31:26.215061+00:00 smithi183 kernel: syscall_exit_to_user_mode+0x25/0x60
2022-04-11T15:31:26.215084+00:00 smithi183 kernel: do_syscall_64+0x40/0x80
2022-04-11T15:31:26.215106+00:00 smithi183 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xae
2022-04-11T15:31:26.215131+00:00 smithi183 kernel: RIP: 0033:0x7f50e98fcdfb
2022-04-11T15:31:26.215153+00:00 smithi183 kernel: Code: 20 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 0f 1f 44 00 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 5d 20 2c 00 f7 d8 64 89 01 48
2022-04-11T15:31:26.215176+00:00 smithi183 kernel: RSP: 002b:00007ffd3c78f468 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
2022-04-11T15:31:26.215198+00:00 smithi183 kernel: RAX: 0000000000000000 RBX: 000055af2de6b5d0 RCX: 00007f50e98fcdfb
2022-04-11T15:31:26.215220+00:00 smithi183 kernel: RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055af2de6b7b0
2022-04-11T15:31:26.215241+00:00 smithi183 kernel: RBP: 0000000000000000 R08: 000055af2de6b7e0 R09: 00007f50e9980620
2022-04-11T15:31:26.215263+00:00 smithi183 kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000055af2de6b7b0
2022-04-11T15:31:26.215286+00:00 smithi183 kernel: R13: 00007f50eae58184 R14: 0000000000000000 R15: 00000000ffffffff
2022-04-11T15:31:26.215308+00:00 smithi183 kernel: </TASK>
2022-04-11T15:31:26.215332+00:00 smithi183 kernel: Modules linked in: ceph libceph dns_resolver fscache netfs veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink bridge stp llc xfs libcrc32c intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal coretemp kvm_intel kvm i2c_i801 iTCO_wdt irqbypass crct10dif_pclmul crc32_pclmul iTCO_vendor_support mei_me ghash_clmulni_intel ipmi_ssif i2c_smbus mei lpc_ich joydev mfd_core ioatdma wmi acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler sunrpc acpi_pad acpi_power_meter igb ixgbe crc32c_intel mdio nvme ptp i2c_algo_bit pps_core dca nvme_core fuse
2022-04-11T15:31:26.215366+00:00 smithi183 kernel: CR2: 0000000000000780
2022-04-11T15:31:26.215389+00:00 smithi183 kernel: ---[ end trace 0000000000000000 ]---
2022-04-11T15:31:26.215412+00:00 smithi183 kernel: RIP: 0010:ceph_mdsc_sync.cold.64+0x77/0xc3 [ceph]
2022-04-11T15:31:26.215434+00:00 smithi183 kernel: Code: 08 4d 8b 07 4d 39 c5 73 0f 48 8b 7c 24 08 e8 d0 7e 05 e1 e9 7f 90 ff ff 49 8b 47 30 41 0f b6 4f 0c 4c 89 44 24 28 41 8b 7f 08 <4c> 8b 88 80 07 00 00 89 4c 24 24 4c 89 4c 24 18 e8 84 00 fe ff 49
2022-04-11T15:31:26.215456+00:00 smithi183 kernel: RSP: 0018:ffffc90009c4fda8 EFLAGS: 00010212
2022-04-11T15:31:26.215477+00:00 smithi183 kernel: RAX: 0000000000000000 RBX: ffff8881abf63000 RCX: 0000000000000000
2022-04-11T15:31:26.215509+00:00 smithi183 kernel: RDX: 0000000000000000 RSI: ffffffff823932ad RDI: 0000000000000000
2022-04-11T15:31:26.215533+00:00 smithi183 kernel: RBP: ffff8881abf634f0 R08: 0000000000005dc7 R09: c0000000ffffdfff
2022-04-11T15:31:26.215557+00:00 smithi183 kernel: R10: 0000000000000001 R11: ffffc90009c4fbc8 R12: 0000000000000001
2022-04-11T15:31:26.215579+00:00 smithi183 kernel: R13: 000000000000b204 R14: ffffffffa0ab3598 R15: ffff88815d36a110
2022-04-11T15:31:26.215601+00:00 smithi183 kernel: FS: 00007f50eb25e080(0000) GS:ffff88885fcc0000(0000) knlGS:0000000000000000
2022-04-11T15:31:26.215623+00:00 smithi183 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2022-04-11T15:31:26.215644+00:00 smithi183 kernel: CR2: 0000000000000780 CR3: 0000000116ea2003 CR4: 00000000003706e0
2022-04-11T15:31:26.215666+00:00 smithi183 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2022-04-11T15:31:26.215689+00:00 smithi183 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
2022-04-11T15:31:26.215711+00:00 smithi183 kernel: note: umount[46268] exited with preempt_count 1
It seems some dirty caps had not been flushed.
Updated by Xiubo Li almost 2 years ago
There is no debug kernel build for kernel-5.18.0_rc2_ceph_g1771083b2f18-1.x86_64.rpm, which can be fetched from [1], so I just ran objdump on ceph.ko to get the assembly around ceph_mdsc_sync.cold.64:
...
60965 00000000000012b0 <ceph_mdsc_sync.cold.64>:
60966     12b0: 48 c7 c6 00 00 00 00    mov    $0x0,%rsi
60967     12b7: 48 c7 c7 00 00 00 00    mov    $0x0,%rdi
60968     12be: e8 00 00 00 00          callq  12c3 <ceph_mdsc_sync.cold.64+0x13>
60969     12c3: e9 00 00 00 00          jmpq   12c8 <ceph_mdsc_sync.cold.64+0x18>
60970     12c8: 4c 89 ea                mov    %r13,%rdx
60971     12cb: 48 c7 c6 00 00 00 00    mov    $0x0,%rsi
60972     12d2: 48 c7 c7 00 00 00 00    mov    $0x0,%rdi
60973     12d9: e8 00 00 00 00          callq  12de <ceph_mdsc_sync.cold.64+0x2e>
60974     12de: 48 8b 7c 24 08          mov    0x8(%rsp),%rdi
60975     12e3: e8 00 00 00 00          callq  12e8 <ceph_mdsc_sync.cold.64+0x38>
60976     12e8: 48 8b 83 88 04 00 00    mov    0x488(%rbx),%rax
60977     12ef: 4c 8d 78 f0             lea    -0x10(%rax),%r15
60978     12f3: 49 8d 47 10             lea    0x10(%r15),%rax
60979     12f7: 48 39 44 24 10          cmp    %rax,0x10(%rsp)
60980     12fc: 74 08                   je     1306 <ceph_mdsc_sync.cold.64+0x56>
60981     12fe: 4d 8b 07                mov    (%r15),%r8
60982     1301: 4d 39 c5                cmp    %r8,%r13
60983     1304: 73 0f                   jae    1315 <ceph_mdsc_sync.cold.64+0x65>
60984     1306: 48 8b 7c 24 08          mov    0x8(%rsp),%rdi
60985     130b: e8 00 00 00 00          callq  1310 <ceph_mdsc_sync.cold.64+0x60>
60986     1310: e9 00 00 00 00          jmpq   1315 <ceph_mdsc_sync.cold.64+0x65>
60987     1315: 49 8b 47 30             mov    0x30(%r15),%rax
60988     1319: 41 0f b6 4f 0c          movzbl 0xc(%r15),%ecx
60989     131e: 4c 89 44 24 28          mov    %r8,0x28(%rsp)
60990     1323: 41 8b 7f 08             mov    0x8(%r15),%edi
60991     1327: 4c 8b 88 80 07 00 00    mov    0x780(%rax),%r9
60992     132e: 89 4c 24 24             mov    %ecx,0x24(%rsp)
60993     1332: 4c 89 4c 24 18          mov    %r9,0x18(%rsp)
60994     1337: e8 00 00 00 00          callq  133c <ceph_mdsc_sync.cold.64+0x8c>
60995     133c: 49 8b 77 30             mov    0x30(%r15),%rsi
60996     1340: 8b 4c 24 24             mov    0x24(%rsp),%ecx
60997     1344: 4c 89 f7                mov    %r14,%rdi
60998     1347: 48 8b 96 98 04 00 00    mov    0x498(%rsi),%rdx
60999     134e: 48 8b b6 90 04 00 00    mov    0x490(%rsi),%rsi
61000     1355: 51                      push   %rcx
61001     1356: 48 89 c1                mov    %rax,%rcx
61002     1359: 4c 8b 4c 24 20          mov    0x20(%rsp),%r9
61003     135e: 4c 8b 44 24 30          mov    0x30(%rsp),%r8
61004     1363: e8 00 00 00 00          callq  1368 <ceph_mdsc_sync.cold.64+0xb8>
61005     1368: 4d 8b 7f 10             mov    0x10(%r15),%r15
61006     136c: 5a                      pop    %rdx
61007     136d: 49 83 ef 10             sub    $0x10,%r15
61008     1371: eb 80                   jmp    12f3 <ceph_mdsc_sync.cold.64+0x43>
...
From the core trace:
2022-04-11T15:31:26.214504+00:00 smithi183 kernel: RIP: 0010:ceph_mdsc_sync.cold.64+0x77/0xc3 [ceph]
It crashed in line 60991:
60991 1327: 4c 8b 88 80 07 00 00 mov 0x780(%rax),%r9
Comparing this with the kernel C code, the crash is in:
1994 static void dump_cap_flushes(struct ceph_mds_client *mdsc, u64 want_tid)
1995 {
1996         struct ceph_cap_flush *cf;
1997
1998         pr_info("%s: still waiting for cap flushes through %llu:\n",
1999                 __func__, want_tid);
2000         spin_lock(&mdsc->cap_dirty_lock);
2001         list_for_each_entry(cf, &mdsc->cap_flush_list, g_list) {
2002                 if (cf->tid > want_tid)
2003                         break;
2004                 pr_info("%llx:%llx %s %llu %llu %d\n",
2005                         ceph_vinop(&cf->ci->vfs_inode),
2006                         ceph_cap_string(cf->caps), cf->tid,
2007                         cf->ci->i_last_cap_flush_ack, cf->wake);
2008         }
2009         spin_unlock(&mdsc->cap_dirty_lock);
2010 }
at line 2004 above. The faulting instruction reads offset 0x780 from %rax, which was NULL (RAX: 0000000000000000 in the register dump), and %rax had just been loaded from the cap-flush entry (line 60987), which suggests cf->ci was NULL when dump_cap_flushes() dereferenced it here.
Updated by Xiubo Li almost 2 years ago
The kernel crash issue has been fixed in https://patchwork.kernel.org/project/ceph-devel/list/?series=639983.
Updated by Xiubo Li almost 2 years ago
Venky Shankar wrote:
Another instance: https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/6818698/
Checked this failure. From remote/smithi040/syslog/kern.log.gz:
2022-05-02T17:28:32.749995+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:28:33.019006+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:28:33.287010+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:28:33.573010+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:34.366017+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:34.834007+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:37.116330+00:00 smithi040 kernel: libceph: osd4 (2)172.21.15.53:6816 socket closed (con state OPEN)
2022-05-02T17:28:37.491007+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:40.110003+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:41.281006+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:41.682001+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:42.501016+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:46.639008+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:28:47.699008+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:28:48.411189+00:00 smithi040 kernel: libceph: osd2 (2)172.21.15.40:6814 socket closed (con state OPEN)
2022-05-02T17:28:50.452009+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:28:51.006593+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/os/ObjectStore.h result=-17!
2022-05-02T17:28:51.006773+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 00000000c5772917 1099511650591
2022-05-02T17:28:51.006872+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 1000000591f.fffffffffffffffe is shut down
2022-05-02T17:28:51.054824+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/osd/OSDMap.h result=-17!
2022-05-02T17:28:51.054936+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 00000000945389c3 1099511650640
2022-05-02T17:28:51.055005+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005950.fffffffffffffffe is shut down
2022-05-02T17:28:51.064357+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/osd/PG.h result=-17!
2022-05-02T17:28:51.079525+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000006dd6ad56 1099511650642
2022-05-02T17:28:51.079622+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005952.fffffffffffffffe is shut down
2022-05-02T17:28:51.088005+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/osd/osd_types.h result=-17!
2022-05-02T17:28:51.104698+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000002db759f8 1099511650645
2022-05-02T17:28:51.104819+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005955.fffffffffffffffe is shut down
...
2022-05-02T17:29:03.486011+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:29:03.766004+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:29:04.040992+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket error on write
2022-05-02T17:29:04.756003+00:00 smithi040 kernel: libceph: mds0 (2)172.21.15.40:6834 socket closed (con state OPEN)
2022-05-02T17:29:05.035493+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/mon/Paxos.h result=-17!
2022-05-02T17:29:05.035659+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 00000000dff7f5ed 1099511651327
2022-05-02T17:29:05.035736+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/mon/PaxosService.h result=-17!
2022-05-02T17:29:05.045283+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005bff.fffffffffffffffe is shut down
2022-05-02T17:29:05.045423+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000008ccc5bbf 1099511651328
2022-05-02T17:29:05.069568+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005c00.fffffffffffffffe is shut down
2022-05-02T17:29:05.069738+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/mon/Session.h result=-17!
2022-05-02T17:29:05.086243+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000003a42c6e3 1099511651329
2022-05-02T17:29:05.086419+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/mon/mon_types.h result=-17!
2022-05-02T17:29:05.095761+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005c01.fffffffffffffffe is shut down
2022-05-02T17:29:05.104305+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 00000000f681d78e 1099511651330
2022-05-02T17:29:05.120052+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005c02.fffffffffffffffe is shut down
2022-05-02T17:29:05.120151+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/monmaptool.cc result=-17!
2022-05-02T17:29:05.136561+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000002987cb52 1099511651331
2022-05-02T17:29:05.136652+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005c03.fffffffffffffffe is shut down
There are both socket connection issues and async file-create failures here.
For the async create failures, take this one as an example:
2022-05-02T17:29:05.120151+00:00 smithi040 kernel: ceph: ceph: async create failure path=(1)client.0/tmp/ceph/src/monmaptool.cc result=-17!
2022-05-02T17:29:05.136561+00:00 smithi040 kernel: ceph: dropping dirty Fw state for 000000002987cb52 1099511651331
2022-05-02T17:29:05.136652+00:00 smithi040 kernel: ceph: ceph_do_invalidate_pages: inode 10000005c03.fffffffffffffffe is shut down
From remote/smithi040/log/ceph-mds.c.log.gz, the file monmaptool.cc was created at log line 22214281, 2022-05-02T17:29:04.926+0000:
22214281 2022-05-02T17:29:04.926+0000 7f45e13f6700 1 -- [v2:172.21.15.40:6834/2740274198,v1:172.21.15.40:6835/2740274198] <== client.4678 172.21.15.40:0/1181180694 100974 ==== client_request(client.4678:52091 create #0x10000002d2c/monmaptool.cc 2022-05-02T17:29:03.260624+0000 ASYNC caller_uid=1000, caller_gid=1262{6,36,1000,1262,}) v4 ==== 181+0+62 (unknown 0 0 0) 0x559592fe7b80 con 0x55958fc7ec00
...
22214322 2022-05-02T17:29:04.926+0000 7f45e13f6700 12 mds.0.cache traverse: path seg depth 0 'monmaptool.cc' snapid head
22214323 2022-05-02T17:29:04.926+0000 7f45e13f6700 20 mds.0.cache.dir(0x10000002d2c.0*) lookup (monmaptool.cc, 'head')
22214324 2022-05-02T17:29:04.926+0000 7f45e13f6700 20 mds.0.cache.dir(0x10000002d2c.0*) hit -> (monmaptool.cc,head)
22214325 2022-05-02T17:29:04.926+0000 7f45e13f6700 10 mds.0.locker acquire_locks request(client.4678:52091 nref=3 cr=0x559592fe7b80)
...
22214341 2022-05-02T17:29:04.926+0000 7f45e13f6700 7 mds.0.server reply_client_request -17 ((17) File exists) client_request(client.4678:52091 create #0x10000002d2c/monmaptool.cc 2022-05-02T17:29:03.260624+0000 ASYNC caller_uid=1000, caller_gid=1262{6,36,1000,1262,}) v4
...
Meanwhile the unlink of monmaptool.cc had arrived earlier, at log line 22214186 (2022-05-02T17:29:04.926+0000), but it was delayed repeatedly; only at line 22254030 (2022-05-02T17:29:05.127+0000) could it acquire the locks and continue the unlink work:
22214186 2022-05-02T17:29:04.926+0000 7f45e13f6700 1 -- [v2:172.21.15.40:6834/2740274198,v1:172.21.15.40:6835/2740274198] <== client.4678 172.21.15.40:0/1181180694 100973 ==== client_request(client.4678:52090 unlink #0x10000002d2c/monmaptool.cc 2022-05-02T17:29:03.260624+0000 ASYNC caller_uid=1000, caller_gid=1262{6,36,1000,1262,}) v4 ==== 225+0+0 (unknown 0 0 0) 0x559592fe6dc0 con 0x55958fc7ec00
...
22254030 2022-05-02T17:29:05.127+0000 7f45dcbed700 12 mds.0.cache traverse: path seg depth 0 'monmaptool.cc' snapid head
22254031 2022-05-02T17:29:05.127+0000 7f45dcbed700 20 mds.0.cache.dir(0x10000002d2c.0*) lookup (monmaptool.cc, 'head')
22254032 2022-05-02T17:29:05.127+0000 7f45dcbed700 20 mds.0.cache.dir(0x10000002d2c.0*) hit -> (monmaptool.cc,head)
22254033 2022-05-02T17:29:05.127+0000 7f45dcbed700 10 mds.0.locker acquire_locks request(client.4678:52090 nref=3 cr=0x559592fe6dc0)
...
22254107 2022-05-02T17:29:05.127+0000 7f45dcbed700 10 mds.0.server _unlink_local [dentry #0x1/client.0/tmp/ceph/src/monmaptool.cc [20e,head] auth (dn xlock x=1 by 0x5595910c8000) (dversion lock w=1 last_client=4678) v=13777 ap=2 ino=0x1000000571c state=1610612736 | request=1 lock=2 inodepin=1 dirty=1 authpin=1 0x55958fe2e780]
...
22254308 2022-05-02T17:29:05.127+0000 7f45dcbed700 10 mds.0.server early_reply 0 ((0) Success) client_request(client.4678:52090 unlink #0x10000002d2c/monmaptool.cc 2022-05-02T17:29:03.260624+0000 ASYNC caller_uid=1000, caller_gid=1262{6,36,1000,1262,}) v4
...
22329149 2022-05-02T17:29:05.911+0000 7f45db3ea700 10 mds.0.server _unlink_local_finish [dentry #0x1/client.0/tmp/ceph/src/monmaptool.cc [20e,head] auth (dn xlockdone x=1) (dversion lock w=1 last_client=4678) pv=14735 v=13777 ap=2 ino=0x1000000571c state=1610612736 | request=1 lock=2 inodepin=1 dirty=1 authpin=1 0x55958fe2e780]
That means that, while the file monmaptool.cc was still being unlinked (and the unlink was delayed), the async create request arrived and failed immediately.
From teuthology.log we can see the snaptest failed:
2022-05-02T17:29:13.916 INFO:tasks.workunit.client.0.smithi040.stderr:+ for v in $versions
2022-05-02T17:29:13.917 INFO:tasks.workunit.client.0.smithi040.stderr:+ ver=v0.1
2022-05-02T17:29:13.917 INFO:tasks.workunit.client.0.smithi040.stderr:+ echo checking v0.1
2022-05-02T17:29:13.918 INFO:tasks.workunit.client.0.smithi040.stdout:checking v0.1
2022-05-02T17:29:13.919 INFO:tasks.workunit.client.0.smithi040.stderr:+ cd .snap/v0.1
2022-05-02T17:29:13.919 INFO:tasks.workunit.client.0.smithi040.stderr:+ git diff --exit-code
...
2022-05-02T17:31:14.320 INFO:tasks.workunit.client.0.smithi040.stderr:+ ver=v0.19
2022-05-02T17:31:14.320 INFO:tasks.workunit.client.0.smithi040.stderr:+ echo checking v0.19
2022-05-02T17:31:14.320 INFO:tasks.workunit.client.0.smithi040.stdout:checking v0.19
2022-05-02T17:31:14.321 INFO:tasks.workunit.client.0.smithi040.stderr:+ cd .snap/v0.19
2022-05-02T17:31:14.321 INFO:tasks.workunit.client.0.smithi040.stderr:+ git diff --exit-code
...
2022-05-02T17:31:18.908 INFO:tasks.workunit.client.0.smithi040.stdout:diff --git a/src/os/ObjectStore.h b/src/os/ObjectStore.h
2022-05-02T17:31:18.909 INFO:tasks.workunit.client.0.smithi040.stdout:deleted file mode 100644
2022-05-02T17:31:18.909 INFO:tasks.workunit.client.0.smithi040.stdout:index 527313d631d..00000000000
2022-05-02T17:31:18.909 INFO:tasks.workunit.client.0.smithi040.stdout:--- a/src/os/ObjectStore.h
2022-05-02T17:31:18.909 INFO:tasks.workunit.client.0.smithi040.stdout:+++ /dev/null
...
2022-05-02T17:31:18.965 INFO:tasks.workunit.client.0.smithi040.stdout:diff --git a/src/osd/OSDMap.h b/src/osd/OSDMap.h
2022-05-02T17:31:18.965 INFO:tasks.workunit.client.0.smithi040.stdout:deleted file mode 100644
2022-05-02T17:31:18.965 INFO:tasks.workunit.client.0.smithi040.stdout:index ea388a75d60..00000000000
2022-05-02T17:31:18.965 INFO:tasks.workunit.client.0.smithi040.stdout:--- a/src/osd/OSDMap.h
2022-05-02T17:31:18.965 INFO:tasks.workunit.client.0.smithi040.stdout:+++ /dev/null
...
2022-05-02T17:31:19.062 INFO:tasks.workunit.client.0.smithi040.stdout:diff --git a/src/osd/PG.h b/src/osd/PG.h
2022-05-02T17:31:19.062 INFO:tasks.workunit.client.0.smithi040.stdout:deleted file mode 100644
2022-05-02T17:31:19.062 INFO:tasks.workunit.client.0.smithi040.stdout:index d085e449b64..00000000000
2022-05-02T17:31:19.062 INFO:tasks.workunit.client.0.smithi040.stdout:--- a/src/osd/PG.h
2022-05-02T17:31:19.062 INFO:tasks.workunit.client.0.smithi040.stdout:+++ /dev/null
...
2022-05-02T17:31:19.224 INFO:tasks.workunit.client.0.smithi040.stdout:diff --git a/src/osd/osd_types.h b/src/osd/osd_types.h
2022-05-02T17:31:19.224 INFO:tasks.workunit.client.0.smithi040.stdout:deleted file mode 100644
2022-05-02T17:31:19.224 INFO:tasks.workunit.client.0.smithi040.stdout:index 40024f2ac0f..00000000000
2022-05-02T17:31:19.224 INFO:tasks.workunit.client.0.smithi040.stdout:--- a/src/osd/osd_types.h
2022-05-02T17:31:19.224 INFO:tasks.workunit.client.0.smithi040.stdout:+++ /dev/null
...
Updated by Venky Shankar almost 2 years ago
This happens at least once when running fs suite. Latest - https://pulpito.ceph.com/vshankar-2022-05-09_10:08:21-fs-wip-vshankar-testing-20220509-125847-testing-default-smithi/6828291/ (seems to be the same issue).
Xiubo, could early_reply be the issue here (just guessing)? Server::_unlink_local_finish() unlinks the main dentry and responds back to the client. However, this is called after the unlink event is journaled, but the operation has been early replied. Could this be confusing the client?
Updated by Xiubo Li almost 2 years ago
Venky Shankar wrote:
This happens at least once when running fs suite. Latest - https://pulpito.ceph.com/vshankar-2022-05-09_10:08:21-fs-wip-vshankar-testing-20220509-125847-testing-default-smithi/6828291/ (seems to be the same issue).
Xiubo, could early_reply be the issue here (just guessing)? Server::_unlink_local_finish() unlinks the main dentry and responds back to the client. However, this is called after the unlink event is journaled, but the operation has been early replied. Could this be confusing the client?
Today I couldn't connect to the VPN, so I couldn't check the logs further.
I have re-read my comments above; the unlink and the create were both async requests. The kclient won't wait for the async unlink to finish on the MDS side; it just assumes the unlink succeeds once the request has been successfully sent out from the kclient.
I checked the MDS code too; it unlinks the inode from the dentry only after the journal log has been successfully submitted and applied to RADOS. That means that, before this happens, if the kclient sends an async create for the same file again, the create will fail with EEXIST directly.
For [1] I can see the async create failing with -EEXIST, while for [2] I couldn't see any async create failures. I think [2] was another corner case: it used a sync create, which the MDS handles by opening the existing file instead of failing with -EEXIST, so the create request succeeded, but the still-pending unlink on the MDS side deleted that file later.
[1] https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/6818698/
[2] https://pulpito.ceph.com/vshankar-2022-04-11_12:24:06-fs-wip-vshankar-testing1-20220411-144044-testing-default-smithi/6786388/
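To make the race concrete, here is a minimal userspace sketch (my illustration, not part of the original report; the filename is arbitrary) of the sequence git effectively performs during a checkout/reset, which is where the EEXIST surfaces:

/*
 * Illustration only: unlink() immediately followed by an exclusive
 * create of the same name. On a local filesystem the open() below can
 * never fail with EEXIST; with CephFS async dirops the client returns
 * from unlink() before the MDS has journaled it, so the create can
 * race against the still-pending unlink.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        const char *path = "testfile.c";   /* any file on a kclient CephFS mount */
        int fd;

        if (unlink(path) == -1 && errno != ENOENT) {
                perror("unlink");
                return 1;
        }
        fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd == -1) {
                perror("open");             /* EEXIST here reproduces the bug */
                return 1;
        }
        close(fd);
        return 0;
}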
Updated by Xiubo Li almost 2 years ago
Locally, when I was running the snaptest-git-ceph.sh tests, two MDS daemons crashed with:
-29> 2022-05-11T09:59:28.647+0800 14d80e1ee700 2 mds.4.cache Memory usage: total 586512, rss 166568, heap 51468, baseline 49420, 337 / 3560 inodes have caps, 337 caps, 0.0946629 caps per inode
-28> 2022-05-11T09:59:28.744+0800 14d80e9f2700 5 mds.beacon.a Sending beacon up:active seq 197
-27> 2022-05-11T09:59:28.744+0800 14d80e9f2700 10 monclient: _send_mon_message to mon.c at v1:10.72.47.117:40414/0
-26> 2022-05-11T09:59:28.748+0800 14d80f9fa700 5 mds.beacon.a received beacon reply up:active seq 197 rtt 0.00400014
-25> 2022-05-11T09:59:28.898+0800 14d80edf4700 10 monclient: tick
-24> 2022-05-11T09:59:28.898+0800 14d80edf4700 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2022-05-11T09:58:58.898949+0800)
-23> 2022-05-11T09:59:29.529+0800 14d80f1f6700 4 mds.4.16 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 1)
-22> 2022-05-11T09:59:29.529+0800 14d80f1f6700 10 monclient: _renew_subs
-21> 2022-05-11T09:59:29.529+0800 14d80f1f6700 10 monclient: _send_mon_message to mon.c at v1:10.72.47.117:40414/0
-20> 2022-05-11T09:59:29.648+0800 14d80e1ee700 2 mds.4.cache Memory usage: total 586512, rss 166568, heap 51468, baseline 49420, 337 / 3560 inodes have caps, 337 caps, 0.0946629 caps per inode
-19> 2022-05-11T09:59:29.898+0800 14d80edf4700 10 monclient: tick
-18> 2022-05-11T09:59:29.898+0800 14d80edf4700 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2022-05-11T09:58:59.899101+0800)
-17> 2022-05-11T09:59:30.044+0800 14d80f1f6700 3 mds.4.server handle_client_session client_session(request_renewcaps seq 27) from client.4240
-16> 2022-05-11T09:59:30.595+0800 14d80f1f6700 4 mds.4.16 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 1)
-15> 2022-05-11T09:59:30.595+0800 14d80f1f6700 10 monclient: _renew_subs
-14> 2022-05-11T09:59:30.595+0800 14d80f1f6700 10 monclient: _send_mon_message to mon.c at v1:10.72.47.117:40414/0
-13> 2022-05-11T09:59:30.650+0800 14d80e1ee700 2 mds.4.cache Memory usage: total 586512, rss 166568, heap 51468, baseline 49420, 337 / 3560 inodes have caps, 337 caps, 0.0946629 caps per inode
-12> 2022-05-11T09:59:30.905+0800 14d80edf4700 10 monclient: tick
-11> 2022-05-11T09:59:30.905+0800 14d80edf4700 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2022-05-11T09:59:00.905732+0800)
-10> 2022-05-11T09:59:31.604+0800 14d80f1f6700 4 mds.4.16 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 1)
-9> 2022-05-11T09:59:31.604+0800 14d80f1f6700 10 monclient: _renew_subs
-8> 2022-05-11T09:59:31.604+0800 14d80f1f6700 10 monclient: _send_mon_message to mon.c at v1:10.72.47.117:40414/0
-7> 2022-05-11T09:59:31.651+0800 14d80e1ee700 2 mds.4.cache Memory usage: total 586512, rss 166568, heap 51468, baseline 49420, 337 / 3560 inodes have caps, 337 caps, 0.0946629 caps per inode
-6> 2022-05-11T09:59:31.794+0800 14d80f1f6700 4 mds.4.16 apply_blocklist: killed 0, blocklisted sessions (0 blocklist entries, 1)
-5> 2022-05-11T09:59:31.794+0800 14d80f1f6700 10 monclient: _renew_subs
-4> 2022-05-11T09:59:31.794+0800 14d80f1f6700 10 monclient: _send_mon_message to mon.c at v1:10.72.47.117:40414/0
-3> 2022-05-11T09:59:31.905+0800 14d80edf4700 10 monclient: tick
-2> 2022-05-11T09:59:31.905+0800 14d80edf4700 10 monclient: _check_auth_rotating renewing rotating keys (they expired before 2022-05-11T09:59:01.905938+0800)
-1> 2022-05-11T09:59:32.099+0800 14d80f1f6700 -1 /data/ceph/src/mds/MDCache.cc: In function 'void MDCache::handle_dentry_unlink(ceph::cref_t<MDentryUnlink>&)' thread 14d80f1f6700 time 2022-05-11T09:59:32.078074+0800
/data/ceph/src/mds/MDCache.cc: 11233: FAILED ceph_assert(straydn->first >= in->first)
ceph version 17.0.0-11972-ga328d3264a0 (a328d3264a0b64c03cf90c2d39f37c6962fa45cb) quincy (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x14d81720cddf]
2: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x14d81720d011]
3: (MDCache::handle_dentry_unlink(boost::intrusive_ptr<MDentryUnlink const> const&)+0x6ad) [0x559db355c4c7]
4: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x1ee) [0x559db3591a48]
5: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x114) [0x559db3405074]
6: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x307) [0x559db340919b]
7: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x559db3409dd4]
8: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26c) [0x559db33f683e]
9: (DispatchQueue::entry()+0x1c2f) [0x14d817381265]
10: (DispatchQueue::DispatchThread::entry()+0xd) [0x14d817439db7]
11: (Thread::entry_wrapper()+0x3f) [0x14d8171df801]
12: (Thread::_entry_func(void*)+0x9) [0x14d8171df819]
13: /lib64/libpthread.so.0(+0x817a) [0x14d81540817a]
14: clone()
0> 2022-05-11T09:59:32.107+0800 14d80f1f6700 -1 *** Caught signal (Aborted) ** in thread 14d80f1f6700 thread_name:ms_dispatch
ceph version 17.0.0-11972-ga328d3264a0 (a328d3264a0b64c03cf90c2d39f37c6962fa45cb) quincy (dev)
1: /lib64/libpthread.so.0(+0x12c20) [0x14d815412c20]
2: gsignal()
3: abort()
4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x2b2) [0x14d81720cf63]
5: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x14d81720d011]
6: (MDCache::handle_dentry_unlink(boost::intrusive_ptr<MDentryUnlink const> const&)+0x6ad) [0x559db355c4c7]
7: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x1ee) [0x559db3591a48]
8: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x114) [0x559db3405074]
9: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x307) [0x559db340919b]
10: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x559db3409dd4]
11: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x26c) [0x559db33f683e]
12: (DispatchQueue::entry()+0x1c2f) [0x14d817381265]
13: (DispatchQueue::DispatchThread::entry()+0xd) [0x14d817439db7]
14: (Thread::entry_wrapper()+0x3f) [0x14d8171df801]
15: (Thread::_entry_func(void*)+0x9) [0x14d8171df819]
16: /lib64/libpthread.so.0(+0x817a) [0x14d81540817a]
17: clone()
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The fs dump output is:
[root@lxbceph1 build]# ./bin/ceph fs dump
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
2022-05-11T16:33:29.354+0800 1500b81ff700 -1 WARNING: all dangerous and experimental features are enabled.
2022-05-11T16:33:29.388+0800 1500b759d700 -1 WARNING: all dangerous and experimental features are enabled.
e62
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1

Filesystem 'a' (1)
fs_name a
epoch 62
flags 12 joinable allow_snaps allow_multimds_snaps
created 2022-05-11T09:46:43.239638+0800
modified 2022-05-11T10:02:10.773892+0800
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 0
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 5
in 0,1,2,3,4
up {0=4233,1=4200,2=4198,3=4189,4=4169}
failed
damaged
stopped
data_pools [3]
metadata_pool 2
inline_data disabled
balancer
standby_count_wanted 1
[mds.e{0:4233} state up:active seq 4 laggy since 2022-05-11T09:59:46.938548+0800 export targets 1,2,3,4 addr v1:10.72.47.117:6817/3511908233 compat {c=[1],r=[1],i=[7ff]}]
[mds.b{1:4200} state up:active seq 206 addr v1:10.72.47.117:6814/4045335419 compat {c=[1],r=[1],i=[7ff]}]
[mds.d{2:4198} state up:active seq 204 addr v1:10.72.47.117:6816/1568717300 compat {c=[1],r=[1],i=[7ff]}]
[mds.c{3:4189} state up:active seq 55 addr v1:10.72.47.117:6815/3715964172 compat {c=[1],r=[1],i=[7ff]}]
[mds.a{4:4169} state up:active seq 56 laggy since 2022-05-11T09:59:46.938537+0800 export targets 0,1,2,3 addr v1:10.72.47.117:6813/3882317668 compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 62
[root@lxbceph1 build]#
Updated by Venky Shankar almost 2 years ago
Xiubo Li wrote:
Venky Shankar wrote:
This happens at least once when running fs suite. Latest - https://pulpito.ceph.com/vshankar-2022-05-09_10:08:21-fs-wip-vshankar-testing-20220509-125847-testing-default-smithi/6828291/ (seems to be the same issue).
Xiubo, could early_reply be the issue here (just guessing)? Server::_unlink_local_finish() unlinks the main dentry and responds back to the client. However, this is called after the unlink event is journaled, but the operation has been early replied. Could this be confusing the client?
Today I couldn't connect to the VPN, so I couldn't check the logs further.
I have re-read my comments above; the unlink and the create were both async requests. The kclient won't wait for the async unlink to finish on the MDS side; it just assumes the unlink succeeds once the request has been successfully sent out from the kclient.
Shouldn't the client wait till it gets a response (and trace information) from the MDS for the operation? What happens if the (unlink) operation fails on the MDS? Does the client trim the dentry as soon as it sends out the unlink request?
I checked the MDS code too; it unlinks the inode from the dentry only after the journal log has been successfully submitted and applied to RADOS. That means that, before this happens, if the kclient sends an async create for the same file again, the create will fail with EEXIST directly.
FWIW, early reply too includes the trace information. Should the client at least wait till it receives early reply?
For [1] I can see the async create failing with -EEXIST, while for [2] I couldn't see any async create failures. I think [2] was another corner case: it used a sync create, which the MDS handles by opening the existing file instead of failing with -EEXIST, so the create request succeeded, but the still-pending unlink on the MDS side deleted that file later.
Do you think early reply could be the culprit in both cases?
[1] https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/6818698/
[2] https://pulpito.ceph.com/vshankar-2022-04-11_12:24:06-fs-wip-vshankar-testing1-20220411-144044-testing-default-smithi/6786388/
Updated by Xiubo Li almost 2 years ago
- File mds.a.log.unlink_crash.tar.gz mds.a.log.unlink_crash.tar.gz added
- File mds.e.log.unlink_crash.tar.gz mds.e.log.unlink_crash.tar.gz added
Xiubo Li wrote:
Locally, when I was running the snaptest-git-ceph.sh tests, two MDS daemons crashed with:
[...]
The fs dump output is:
[...]
Uploaded the log files.
Updated by Xiubo Li almost 2 years ago
Venky Shankar wrote:
Xiubo Li wrote:
Venky Shankar wrote:
This happens at least once when running fs suite. Latest - https://pulpito.ceph.com/vshankar-2022-05-09_10:08:21-fs-wip-vshankar-testing-20220509-125847-testing-default-smithi/6828291/ (seems to be the same issue).
Xiubo, could early_reply be the issue here (just guessing)? Server::_unlink_local_finish() unlinks the main dentry and responds back to the client. However, this is called after the unlink event is journaled, but the operation has been early replied. Could this be confusing the client?
Today I couldn't connect to the VPN, so I couldn't check the logs further.
I have re-read my comments above; the unlink and the create were both async requests. The kclient won't wait for the async unlink to finish on the MDS side; it just assumes the unlink succeeds once the request has been successfully sent out from the kclient.
Shouldn't the client wait till it gets a response (and trace information) from the MDS for the operation?
Currently it won't. It will try to check the Fxr caps, and only send out an async unlink when these caps are issued.
1198         err = ceph_mdsc_submit_request(mdsc, dir, req);
1199         if (!err) {
1200                 /*
1201                  * We have enough caps, so we assume that the unlink
1202                  * will succeed. Fix up the target inode and dcache.
1203                  */
1204                 drop_nlink(inode);
1205                 d_delete(dentry);
1206         } else if (err == -EJUKEBOX) {
1207                 try_async = false;
1208                 ceph_mdsc_put_request(req);
1209                 goto retry;
1210         }
Then it just assumes the unlink will succeed.
What happens if the (unlink) operation fails on the MDS? Does the client trim the dentry as soon as it sends out the unlink request?
Yeah, as the code above shows, the dentry is trimmed immediately after the async unlink request has been sent out successfully.
But if the async unlink later fails in the MDS, the kclient just records the error on the inode and notifies userspace:
1081         /* If op failed, mark everyone involved for errors */
1082         if (result) {
1083                 int pathlen = 0;
1084                 u64 base = 0;
1085                 char *path = ceph_mdsc_build_path(req->r_dentry, &pathlen,
1086                                                   &base, 0);
1087
1088                 /* mark error on parent + clear complete */
1089                 mapping_set_error(req->r_parent->i_mapping, result);
1090                 ceph_dir_clear_complete(req->r_parent);
1091
1092                 /* drop the dentry -- we don't know its status */
1093                 if (!d_unhashed(req->r_dentry))
1094                         d_drop(req->r_dentry);
1095
1096                 /* mark inode itself for an error (since metadata is bogus) */
1097                 mapping_set_error(req->r_old_inode->i_mapping, result);
1098
1099                 pr_warn("ceph: async unlink failure path=(%llx)%s result=%d!\n",
1100                         base, IS_ERR(path) ? "<<bad>>" : path, result);
1101                 ceph_mdsc_free_path(path, pathlen);
1102         }
I checked the MDS code too; it unlinks the inode from the dentry only after the journal log has been successfully submitted and applied to RADOS. That means that, before this happens, if the kclient sends an async create for the same file again, the create will fail with EEXIST directly.
FWIW, early reply too includes the trace information. Should the client at least wait till it receives early reply?
The async create/unlink paths won't wait for the early replies once the kclient has enough caps issued.
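To illustrate the distinction (a simplified sketch with hypothetical names, not the actual fs/ceph/mds_client.c source): async dirops only queue the request, while sync operations queue it and then block for a reply.

/* Simplified sketch, not the actual kernel source; the struct and the
 * queueing helper are hypothetical stand-ins. */
struct sketch_mds_request {
        struct completion r_completion; /* completed by the reply handler */
};

void sketch_queue_to_session(struct sketch_mds_request *req); /* hypothetical */

/* Async create/unlink path: queue the request and return at once. */
static int sketch_submit_request(struct sketch_mds_request *req)
{
        sketch_queue_to_session(req);   /* hand off to the MDS session */
        return 0;                       /* caller does not wait for any reply */
}

/* Sync path: queue, then sleep until the (early or safe) reply arrives. */
static int sketch_do_request(struct sketch_mds_request *req)
{
        int err = sketch_submit_request(req);

        if (!err)
                err = wait_for_completion_killable(&req->r_completion);
        return err;
}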
For [1] I can see the async create failure with -EEXIST errno, while for [2] I couldn't see any async create failures. For [2] I think it was another corner case that it was using the sync create, which in MDS side it will open the file instead of failing it with -EEXIST errno, so the create request succeeded but in MDS side the unlink just deleted that file later.
Do you think early reply could be the culprit in both cases?
From https://tracker.ceph.com/issues/55332?issue_count=63&issue_position=1&next_issue_id=55421#note-9 we can see that when the async create came in, the early reply for the async unlink request hadn't happened yet. NOTE: please check the log line numbers quoted in that note.
From the logs, the async unlink appears to have been delayed several times because it tried to acquire the locks and failed; I currently can't open that log file due to a teuthology server connection issue today, so I can't confirm this. While the async unlink request was waiting, the async create came in and failed. If a sync create had come instead, it would have succeeded, but that is buggy too.
[1] https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/6818698/
[2] https://pulpito.ceph.com/vshankar-2022-04-11_12:24:06-fs-wip-vshankar-testing1-20220411-144044-testing-default-smithi/6786388/
Anyway, we need to enforce the unlink/create ordering on either the client side or the MDS side.
Updated by Jeff Layton almost 2 years ago
Anyway, we need to enforce the unlink/create ordering on either the client side or the MDS side.
We do d_drop the dentry on a failed unlink, but that may not happen before the client attempts to send a create for the same dentry.
One idea: declare a new ceph_dentry_info flag for dentries that are being unlinked. If that's set, then the create could block and wait on the reply before proceeding.
Updated by Xiubo Li almost 2 years ago
Jeff Layton wrote:
Anyway, we need to enforce the unlink/create ordering on either the client side or the MDS side.
We do d_drop the dentry on a failed unlink, but that may not happen before the client attempts to send a create for the same dentry.
One idea: declare a new ceph_dentry_info flag for dentries that are being unlinked. If that's set, then the create could block and wait on the reply before proceeding.
Yeah, the waiting here would only cover the early reply, not the safe reply. But if the unlink on the MDS side has not yet been journaled and applied to RADOS, it seems an async openc could still fail in the MDS, and the sync openc case has the same issue: the file will be deleted after the openc succeeds. I need to check the MDS code more to confirm this.
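For illustration, a minimal sketch of the flag idea using the stock kernel bitops/waitbit primitives (the bit number, helper names, and the assumption that the flags word is an unsigned long are all mine; this is not the fix that eventually merged):

/*
 * Sketch only: mark the dentry while an async unlink is in flight, and
 * make a later create on the same name sleep until the unlink's first
 * reply clears the mark.
 */
#define SKETCH_DENTRY_ASYNC_UNLINK_BIT  2   /* assumed free bit in di->flags */

/* Before submitting the async unlink request: */
static void sketch_mark_async_unlink(unsigned long *flags)
{
        set_bit(SKETCH_DENTRY_ASYNC_UNLINK_BIT, flags);
}

/* From the unlink reply handler (first reply, success or error): */
static void sketch_clear_async_unlink(unsigned long *flags)
{
        clear_bit_unlock(SKETCH_DENTRY_ASYNC_UNLINK_BIT, flags);
        smp_mb__after_atomic();
        wake_up_bit(flags, SKETCH_DENTRY_ASYNC_UNLINK_BIT);
}

/* At the head of the create path for the same dentry: */
static int sketch_wait_on_async_unlink(unsigned long *flags)
{
        return wait_on_bit(flags, SKETCH_DENTRY_ASYNC_UNLINK_BIT,
                           TASK_KILLABLE);
}

As the comment above notes, waiting this way only orders the create behind the unlink's first reply; whether that is sufficient given the MDS-side journaling is exactly the open question here.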
Updated by Xiubo Li almost 2 years ago
Xiubo Li wrote:
Venky Shankar wrote:
Another instance: https://pulpito.ceph.com/vshankar-2022-05-02_16:58:59-fs-wip-vshankar-testing1-20220502-201957-testing-default-smithi/6818698/
Checked this failure. From remote/smithi040/syslog/kern.log.gz:
[...]
There are both socket connection issues and async file-create failures here.
For the async create failures, take this one as an example:
[...]
From remote/smithi040/log/ceph-mds.c.log.gz, the file monmaptool.cc was created at log line 22214281, 2022-05-02T17:29:04.926+0000:
[...]
Meanwhile the unlink of monmaptool.cc had arrived earlier, at log line 22214186 (2022-05-02T17:29:04.926+0000), but it was delayed repeatedly; only at line 22254030 (2022-05-02T17:29:05.127+0000) could it acquire the locks and continue the unlink work:
I checked the log again: it was not waiting to acquire the locks; it was waiting for the straydir to be unfrozen, so it released the mds_lock, and during that window the async create request was triggered:
22214268 2022-05-02T17:29:04.926+0000 7f45e13f6700 20 mds.0.cache.dir(0x605.011*) lookup (1000000571c, 'head')
22214269 2022-05-02T17:29:04.926+0000 7f45e13f6700 20 mds.0.cache.dir(0x605.011*) miss -> (10000005967,head)
22214270 2022-05-02T17:29:04.926+0000 7f45e13f6700 10 mds.0.server prepare_stray_dentry: [dir 0x605.011* ~mds0/stray5/ [2,head] auth v=14670 cv=0/0 state=1610883081|complete|frozendir|sticky f(v0 12=12+0) n(v0 rc2022-05-02T17:28:57.345695+0000 b150137 12=12+0) hs=12+2,ss=0+0 dirty=12 | child=1 frozen=1 sticky=1 dirty=1 waiter=10x559592519200] is frozen, waiting
22214271 2022-05-02T17:29:04.926+0000 7f45e13f6700 10 mds.0.locker xlock_finish on (dn xlock x=1 by 0x5595910c8000) [dentry #0x1/client.0/tmp/ceph/src/monmaptool.cc [20e,head] auth (dn xlock x=1 by 0x5595910c8000) (dversion lock w=1 last_client=4678) v=13777 ap=2 ino=0x1000000571c state=1610612736 | request=1 lock=2 inodepin=1 dirty=1 authpin=1 0x55958fe2e780]
[...]
That means that, while the file monmaptool.cc was still being unlinked (and the unlink was delayed), the async create request arrived and failed immediately.
From teuthology.log we can see the snaptest failed:
[...]
Updated by Xiubo Li almost 2 years ago
- Subject changed from Failure in snaptest-git-ceph.sh to Failure in snaptest-git-ceph.sh (async unlink/create bugs)
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 46331
Updated by Xiubo Li almost 2 years ago
- Subject changed from Failure in snaptest-git-ceph.sh (async unlink/create bugs) to Failure in snaptest-git-ceph.sh (it's an async unlink/create bug)
Updated by Xiubo Li almost 2 years ago
It's an async unlink/create bug, and it has been fixed on both the kclient and the MDS side:
[1] https://github.com/ceph/ceph-client/commit/b3c8646b2c1d93def4dc6f0737459d23582b4a07
[2] https://github.com/ceph/ceph/pull/46331
Updated by Venky Shankar almost 2 years ago
Xiubo, there is another class of failures for this test. See - https://pulpito.ceph.com/vshankar-2022-06-15_04:03:39-fs-wip-vshankar-testing1-20220615-072516-testing-default-smithi/6879444/
The failure is:
2022-06-15T05:22:58.260 INFO:tasks.ceph.mds.d:Started
2022-06-15T05:22:58.261 DEBUG:teuthology.orchestra.run.smithi017:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph fs dump --format=json
2022-06-15T05:22:58.277 INFO:tasks.workunit.client.0.smithi017.stderr:Updating files: 99% (306/309)^MUpdating files: 99% (307/309)^MUpdating files: 100% (309/309)^MUpdating files: 100% (309/309), done.
2022-06-15T05:22:58.278 INFO:tasks.workunit.client.0.smithi017.stderr:fatal: Could not reset index file to revision 'v0.81'.
2022-06-15T05:22:58.332 DEBUG:teuthology.orchestra.run:got remote process result: 128
2022-06-15T05:22:58.333 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2022-06-15T05:22:58.334 DEBUG:teuthology.orchestra.run.smithi017:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
This looks like a different issue, doesn't it?
Updated by Xiubo Li almost 2 years ago
Venky Shankar wrote:
Xiubo, there is another class of failures for this test. See - https://pulpito.ceph.com/vshankar-2022-06-15_04:03:39-fs-wip-vshankar-testing1-20220615-072516-testing-default-smithi/6879444/
The failure is:
[...]
This looks like a different issue, doesn't it?
`96% (297/309)^MUpdating files: 97% (300/309)^MUpdating files: 98% (303/309)^Merror: unable to create file src/tools/radosacl.cc: File exists`
It should be the same issue: the unlinked file still existed when the new one was being created.
Updated by Venky Shankar almost 2 years ago
Xiubo Li wrote:
Venky Shankar wrote:
Xiubo, there is another class of failures for this test. See - https://pulpito.ceph.com/vshankar-2022-06-15_04:03:39-fs-wip-vshankar-testing1-20220615-072516-testing-default-smithi/6879444/
The failure is:
[...]
This looks like a different issue, doesn't it?
`96% (297/309)^MUpdating files: 97% (300/309)^MUpdating files: 98% (303/309)^Merror: unable to create file src/tools/radosacl.cc: File exists`
It should be the same issue: the unlinked file still existed when the new one was being created.
ACK. Makes sense.
Updated by Patrick Donnelly over 1 year ago
- Has duplicate Bug #54462: Command failed (workunit test fs/snaps/snaptest-git-ceph.sh) on smithi055 with status 128 added
Updated by Rishabh Dave over 1 year ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot over 1 year ago
- Copied to Backport #57836: pacific: Failure in snaptest-git-ceph.sh (it's an async unlink/create bug) added
Updated by Backport Bot over 1 year ago
- Copied to Backport #57837: quincy: Failure in snaptest-git-ceph.sh (it's an async unlink/create bug) added
Updated by Venky Shankar over 1 year ago
- Related to Bug #55804: qa failure: pjd link tests failed added
Updated by Venky Shankar over 1 year ago
- Related to Bug #58041: mds: src/mds/Server.cc: 3231: FAILED ceph_assert(straydn->get_name() == straydname) added
Updated by Xiubo Li about 1 year ago
- Status changed from Pending Backport to Resolved
Updated by Kotresh Hiremath Ravishankar 12 months ago
This is seen in reef QA run - https://pulpito.ceph.com/yuriw-2023-05-15_15:22:39-fs-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/7274325
Venky/Xiubo,
I think this needs a backport to reef?
Updated by Xiubo Li 12 months ago
Kotresh Hiremath Ravishankar wrote:
This is seen in reef QA run - https://pulpito.ceph.com/yuriw-2023-05-15_15:22:39-fs-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/7274325
Venky/Xiubo,
I think this needs a backport to reef?
Yeah, I will backport this together with another unlink-related tracker, https://tracker.ceph.com/issues/58340, after that one gets merged. Maybe soon.
Updated by Venky Shankar 10 months ago
Xiubo, this needs to be backported to reef, yes?
Updated by Venky Shankar 10 months ago
A bit unrelated, but mentioning here for completeness:
/a/yuriw-2023-07-14_23:37:57-fs-wip-yuri8-testing-2023-07-14-0803-reef-distro-default-smithi/7338278
`git` dumping core is one instance:
file ./remote/smithi155/coredump/1689390465.141618.core
./remote/smithi155/coredump/1689390465.141618.core: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'git reset --hard v0.63', real uid: 1000, effective uid: 1000, real gid: 1265, effective gid: 1265, execfn: '/usr/bin/git', platform: 'x86_64'
Updated by Xiubo Li 8 months ago
- Copied to Bug #62810: Failure in snaptest-git-ceph.sh (it's an async unlink/create bug) -- Need to fix again added