Bug #61217 (closed)

ceph: corrupt snap message from mds1

Added by Xiubo Li 11 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
backport_processed
Backport:
reef,quincy,pacific
Regression:
Yes
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From the ceph-users mailing list: https://www.spinics.net/lists/ceph-users/msg77106.html

The kclient received a corrupted snap trace from mds1 and was marked read-only:

[Wed May 10 14:04:54 2023] libceph: loaded (mon/osd proto 15/24)
[Wed May 10 14:04:54 2023] ceph: loaded (mds proto 32)
[Wed May 10 14:04:54 2023] libceph: mon2 (1)192.168.32.67:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon0 (1)192.168.32.65:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon0 (1)192.168.32.65:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon1 (1)192.168.32.66:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon0 (1)192.168.32.65:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon1 (1)192.168.32.66:6789 session established
[Wed May 10 14:04:54 2023] libceph: client210370165 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210356766 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210295802 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210370170 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: mon2 (1)192.168.32.67:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon2 (1)192.168.32.67:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon0 (1)192.168.32.65:6789 session established
[Wed May 10 14:04:54 2023] libceph: mon2 (1)192.168.32.67:6789 session established
[Wed May 10 14:04:54 2023] libceph: client210370160 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210356771 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210295807 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210295812 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210370175 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:54 2023] libceph: client210295817 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 14:04:56 2023] ISO 9660 Extensions: Microsoft Joliet Level 3
[Wed May 10 14:04:56 2023] ISO 9660 Extensions: RRIP_1991A
[Wed May 10 14:05:12 2023] WARNING: CPU: 3 PID: 34 at fs/ceph/caps.c:689 ceph_add_cap+0x53e/0x550 [ceph]
[Wed May 10 14:05:12 2023] Modules linked in: ceph libceph dns_resolver nls_utf8 isofs cirrus drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt intel_rapl_msr intel_rapl_common fb_sys_fops virtio_net iTCO_wdt net_failover iTCO_vendor_support drm pcspkr failover virtio_balloon joydev lpc_ich i2c_i801 nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c sr_mod cdrom sg crct10dif_pclmul ahci crc32_pclmul libahci crc32c_intel libata ghash_clmulni_intel serio_raw virtio_blk virtio_console virtio_scsi dm_mirror dm_region_hash dm_log dm_mod fuse
[Wed May 10 14:05:12 2023] CPU: 3 PID: 34 Comm: kworker/3:0 Not tainted 4.18.0-486.el8.x86_64 #1
[Wed May 10 14:05:12 2023] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
[Wed May 10 14:05:12 2023] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[Wed May 10 14:05:12 2023] RIP: 0010:ceph_add_cap+0x53e/0x550 [ceph]
[Wed May 10 14:05:12 2023] Code: c0 48 c7 c7 c0 39 74 c0 e8 6c 7c 5d c7 0f 0b 44 89 7c 24 04 e9 7e fc ff ff 44 8b 7c 24 04 e9 68 fe ff ff 0f 0b e9 c9 fc ff ff <0f> 0b e9 0a fe ff ff 0f 0b e9 12 fe ff ff 0f 0b 66 90 0f 1f 44 00
[Wed May 10 14:05:12 2023] RSP: 0018:ffffa4c980d87b48 EFLAGS: 00010203
[Wed May 10 14:05:12 2023] RAX: 0000000000000000 RBX: 0000000000000005 RCX: dead000000000200
[Wed May 10 14:05:12 2023] RDX: ffff8cfc56dbf7d0 RSI: ffff8cfc56dbf7d0 RDI: ffff8cfc56dbf7c8
[Wed May 10 14:05:12 2023] RBP: ffff8cfc45503970 R08: ffff8cfc56dbf7d0 R09: 0000000000000001
[Wed May 10 14:05:12 2023] R10: ffff8cfc42658780 R11: 00000000ffff8ce0 R12: 0000000000000155
[Wed May 10 14:05:12 2023] R13: ffff8cfc42658780 R14: ffff8cfc42658788 R15: 0000000000000001
[Wed May 10 14:05:12 2023] FS:  0000000000000000(0000) GS:ffff8cfdb7d80000(0000) knlGS:0000000000000000
[Wed May 10 14:05:12 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 10 14:05:12 2023] CR2: 00007f7b7b013570 CR3: 0000000104cf4000 CR4: 00000000003506e0
[Wed May 10 14:05:12 2023] Call Trace:
[Wed May 10 14:05:12 2023]  ceph_handle_caps+0xdf2/0x1780 [ceph]
[Wed May 10 14:05:12 2023]  mds_dispatch+0x13a/0x670 [ceph]
[Wed May 10 14:05:12 2023]  ceph_con_process_message+0x79/0x140 [libceph]
[Wed May 10 14:05:12 2023]  ? calc_signature+0xdf/0x110 [libceph]
[Wed May 10 14:05:12 2023]  ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[Wed May 10 14:05:12 2023]  ? available_idle_cpu+0x41/0x50
[Wed May 10 14:05:12 2023]  ceph_con_workfn+0x329/0x680 [libceph]
[Wed May 10 14:05:12 2023]  process_one_work+0x1a7/0x360
[Wed May 10 14:05:12 2023]  worker_thread+0x30/0x390
[Wed May 10 14:05:12 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 14:05:12 2023]  kthread+0x134/0x150
[Wed May 10 14:05:12 2023]  ? set_kthread_struct+0x50/0x50
[Wed May 10 14:05:12 2023]  ret_from_fork+0x35/0x40
[Wed May 10 14:05:12 2023] ---[ end trace 03a1d82065fdafbd ]---
[Wed May 10 14:05:56 2023] ISO 9660 Extensions: Microsoft Joliet Level 3
[Wed May 10 14:05:56 2023] ISO 9660 Extensions: RRIP_1991A
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:24 2023] ceph: mds1 reconnect start
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:01:25 2023] ceph: mds1 reconnect success
[Wed May 10 16:02:12 2023] ceph: mds1 caps stale
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:02:19 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:06 2023] ceph: update_snap_trace error -5
[Wed May 10 16:03:06 2023] ceph: ceph_update_snap_trace failed to blocklist (3)192.168.48.142:0: -13
[Wed May 10 16:03:06 2023] ------------[ cut here ]------------
[Wed May 10 16:03:06 2023] ceph_update_snap_trace: do remount to continue after corrupted snaptrace is fixed
[Wed May 10 16:03:06 2023] WARNING: CPU: 0 PID: 5143 at fs/ceph/snap.c:841 ceph_update_snap_trace.cold.21+0x68/0x137 [ceph]
[Wed May 10 16:03:06 2023] Modules linked in: ceph libceph dns_resolver nls_utf8 isofs cirrus drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt intel_rapl_msr intel_rapl_common fb_sys_fops virtio_net iTCO_wdt net_failover iTCO_vendor_support drm pcspkr failover virtio_balloon joydev lpc_ich i2c_i801 nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c sr_mod cdrom sg crct10dif_pclmul ahci crc32_pclmul libahci crc32c_intel libata ghash_clmulni_intel serio_raw virtio_blk virtio_console virtio_scsi dm_mirror dm_region_hash dm_log dm_mod fuse
[Wed May 10 16:03:06 2023] CPU: 0 PID: 5143 Comm: kworker/0:0 Tainted: G        W        --------- -  - 4.18.0-486.el8.x86_64 #1
[Wed May 10 16:03:06 2023] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
[Wed May 10 16:03:06 2023] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[Wed May 10 16:03:06 2023] RIP: 0010:ceph_update_snap_trace.cold.21+0x68/0x137 [ceph]
[Wed May 10 16:03:06 2023] Code: c6 74 c0 48 c7 c5 10 5e 74 c0 48 89 c8 49 89 e8 48 89 c2 44 89 0c 24 48 c7 c6 60 bf 73 c0 48 c7 c7 48 61 74 c0 e8 08 e7 5b c7 <0f> 0b 44 8b 0c 24 e9 6c e8 fe ff 44 89 ce 48 c7 c7 60 5e 74 c0 44
[Wed May 10 16:03:06 2023] RSP: 0018:ffffa4c981fbbbb8 EFLAGS: 00010286
[Wed May 10 16:03:06 2023] RAX: 0000000000000000 RBX: ffff8cfc4649a089 RCX: 0000000000000027
[Wed May 10 16:03:06 2023] RDX: 0000000000000027 RSI: 00000000ffff7fff RDI: ffff8cfdb7c1e690
[Wed May 10 16:03:06 2023] RBP: ffffffffc0745e10 R08: 0000000000000000 R09: c0000000ffff7fff
[Wed May 10 16:03:06 2023] R10: 0000000000000001 R11: ffffa4c981fbb9d0 R12: ffff8cfc623c6b48
[Wed May 10 16:03:06 2023] R13: 0000000000000000 R14: ffff8cfc623c6b18 R15: ffff8cfb65855f00
[Wed May 10 16:03:06 2023] FS:  0000000000000000(0000) GS:ffff8cfdb7c00000(0000) knlGS:0000000000000000
[Wed May 10 16:03:06 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 10 16:03:06 2023] CR2: 00007fedace4d008 CR3: 000000025f67c000 CR4: 00000000003506f0
[Wed May 10 16:03:06 2023] Call Trace:
[Wed May 10 16:03:06 2023]  ceph_handle_snap+0x188/0x4e0 [ceph]
[Wed May 10 16:03:06 2023]  mds_dispatch+0x17a/0x670 [ceph]
[Wed May 10 16:03:06 2023]  ceph_con_process_message+0x79/0x140 [libceph]
[Wed May 10 16:03:06 2023]  ? calc_signature+0xdf/0x110 [libceph]
[Wed May 10 16:03:06 2023]  ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[Wed May 10 16:03:06 2023]  ? available_idle_cpu+0x41/0x50
[Wed May 10 16:03:06 2023]  ceph_con_workfn+0x329/0x680 [libceph]
[Wed May 10 16:03:06 2023]  process_one_work+0x1a7/0x360
[Wed May 10 16:03:06 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:03:06 2023]  worker_thread+0x30/0x390
[Wed May 10 16:03:06 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:03:06 2023]  kthread+0x134/0x150
[Wed May 10 16:03:06 2023]  ? set_kthread_struct+0x50/0x50
[Wed May 10 16:03:06 2023]  ret_from_fork+0x35/0x40
[Wed May 10 16:03:06 2023] ---[ end trace 03a1d82065fdafbe ]---
[Wed May 10 16:03:06 2023] ceph: corrupt snap message from mds1
[Wed May 10 16:03:06 2023] header: 00000000: 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000010: 12 03 7f 00 01 00 00 01 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000020: 00 00 00 00 02 01 00 00 00 00 00 00 00 01 00 00  ................
[Wed May 10 16:03:06 2023] header: 00000030: 00 98 0d 60 93                                   ...`.
[Wed May 10 16:03:06 2023]  front: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 00000010: 0c 00 00 00 88 00 00 00 d1 c0 71 38 00 01 00 00  ..........q8....
[Wed May 10 16:03:06 2023]  front: 00000020: 22 c8 71 38 00 01 00 00 d7 c7 71 38 00 01 00 00  ".q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000030: d9 c7 71 38 00 01 00 00 d4 c7 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000040: f1 c0 71 38 00 01 00 00 d4 c0 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000050: 20 c8 71 38 00 01 00 00 1d c8 71 38 00 01 00 00   .q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000060: ec c0 71 38 00 01 00 00 d6 c0 71 38 00 01 00 00  ..q8......q8....
[Wed May 10 16:03:06 2023]  front: 00000070: ef c0 71 38 00 01 00 00 6a 11 2d 1a 00 01 00 00  ..q8....j.-.....
[Wed May 10 16:03:06 2023]  front: 00000080: 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 00000090: ee 01 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000a0: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000b0: 01 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000c0: 01 00 00 00 00 00 00 00 02 09 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000d0: 05 00 00 00 00 00 00 00 01 09 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000e0: ff 08 00 00 00 00 00 00 fd 08 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023]  front: 000000f0: fb 08 00 00 00 00 00 00 f9 08 00 00 00 00 00 00  ................
[Wed May 10 16:03:06 2023] footer: 00000000: ca 39 06 07 00 00 00 00 00 00 00 00 42 06 63 61  .9..........B.ca
[Wed May 10 16:03:06 2023] footer: 00000010: 7b 4b 5d 2d 05                                   {K]-.
[Wed May 10 16:03:06 2023] ceph: ceph_do_invalidate_pages: inode 1001a2d116a.fffffffffffffffe is shut down
[Wed May 10 16:03:07 2023] WARNING: CPU: 3 PID: 5069 at fs/ceph/mds_client.c:4623 check_session_state+0x67/0x70 [ceph]
[Wed May 10 16:03:07 2023] Modules linked in: ceph libceph dns_resolver nls_utf8 isofs cirrus drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt intel_rapl_msr intel_rapl_common fb_sys_fops virtio_net iTCO_wdt net_failover iTCO_vendor_support drm pcspkr failover virtio_balloon joydev lpc_ich i2c_i801 nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c sr_mod cdrom sg crct10dif_pclmul ahci crc32_pclmul libahci crc32c_intel libata ghash_clmulni_intel serio_raw virtio_blk virtio_console virtio_scsi dm_mirror dm_region_hash dm_log dm_mod fuse
[Wed May 10 16:03:07 2023] CPU: 3 PID: 5069 Comm: kworker/3:1 Tainted: G        W        --------- -  - 4.18.0-486.el8.x86_64 #1
[Wed May 10 16:03:07 2023] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
[Wed May 10 16:03:07 2023] Workqueue: events delayed_work [ceph]
[Wed May 10 16:03:07 2023] RIP: 0010:check_session_state+0x67/0x70 [ceph]
[Wed May 10 16:03:07 2023] Code: 3f 36 ed c8 48 39 d0 0f 88 5a 6a 00 00 89 d8 5b e9 6e ea 2c c8 48 83 7f 10 00 74 f1 48 8b 07 48 8b 00 8b 40 28 83 f8 04 74 e3 <0f> 0b eb df 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56 49 89 f6 41
[Wed May 10 16:03:07 2023] RSP: 0018:ffffa4c9844b7e48 EFLAGS: 00010202
[Wed May 10 16:03:07 2023] RAX: 0000000000000006 RBX: 0000000000000000 RCX: 0000000000000007
[Wed May 10 16:03:07 2023] RDX: 0000000000000005 RSI: ffff8cfc469e07cc RDI: ffff8cfc469e0000
[Wed May 10 16:03:07 2023] RBP: ffff8cfc469e0000 R08: 0000000000000000 R09: 000073746e657665
[Wed May 10 16:03:07 2023] R10: 8080808080808080 R11: ffffffff8965c148 R12: ffff8cfc491d5130
[Wed May 10 16:03:07 2023] R13: 0000000000000000 R14: ffff8cfc491d5008 R15: ffff8cfc491d5000
[Wed May 10 16:03:07 2023] FS:  0000000000000000(0000) GS:ffff8cfdb7d80000(0000) knlGS:0000000000000000
[Wed May 10 16:03:07 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 10 16:03:07 2023] CR2: 00007fedace4d008 CR3: 00000001d4506000 CR4: 00000000003506e0
[Wed May 10 16:03:07 2023] Call Trace:
[Wed May 10 16:03:07 2023]  delayed_work+0x143/0x240 [ceph]
[Wed May 10 16:03:07 2023]  process_one_work+0x1a7/0x360
[Wed May 10 16:03:07 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:03:07 2023]  worker_thread+0x30/0x390
[Wed May 10 16:03:07 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:03:07 2023]  kthread+0x134/0x150
[Wed May 10 16:03:07 2023]  ? set_kthread_struct+0x50/0x50
[Wed May 10 16:03:07 2023]  ret_from_fork+0x35/0x40
[Wed May 10 16:03:07 2023] ---[ end trace 03a1d82065fdafbf ]---
[Wed May 10 16:03:09 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:09 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:09 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:09 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:09 2023] ceph: mds1 caps renewed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:03:09 2023] ceph: mds1 recovery completed
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 20011a825dd.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 200118b9384.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 1001c664b26.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 1001c665c34.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 1001c66a2b6.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 2000e6d3a15.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 100219a7adf.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 10023fb4137.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 10024484920.fffffffffffffffe is shut down
[Wed May 10 16:04:10 2023] ceph: ceph_do_invalidate_pages: inode 2000e6d3a4b.fffffffffffffffe is shut down
[Wed May 10 16:04:12 2023] libceph: mds1 (1)192.168.32.74:6801 socket closed (con state OPEN)
[Wed May 10 16:04:12 2023] libceph: mds1 (1)192.168.32.74:6801 socket closed (con state OPEN)
[Wed May 10 16:04:12 2023] libceph: mds1 (1)192.168.32.74:6801 socket closed (con state OPEN)
[Wed May 10 16:04:12 2023] libceph: mds1 (1)192.168.32.74:6801 socket closed (con state OPEN)
[Wed May 10 16:04:12 2023] libceph: mds1 (1)192.168.32.74:6801 socket closed (con state OPEN)
[Wed May 10 16:04:13 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:13 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:13 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:13 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:13 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:13 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:13 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:13 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:13 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:13 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:14 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:14 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:14 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:14 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:14 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:14 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:14 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:14 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:14 2023] libceph: wrong peer, want (1)192.168.32.74:6801/3586644765, got (1)192.168.32.74:6801/3778319964
[Wed May 10 16:04:14 2023] libceph: mds1 (1)192.168.32.74:6801 wrong peer at address
[Wed May 10 16:04:20 2023] ceph: mds1 reconnect start
[Wed May 10 16:04:20 2023] ceph: mds1 reconnect start
[Wed May 10 16:04:20 2023] ceph: mds1 reconnect start
[Wed May 10 16:04:20 2023] ceph: mds1 reconnect start
[Wed May 10 16:04:20 2023] ceph: mds1 reconnect start
[Wed May 10 16:04:23 2023] ceph: mds1 reconnect success
[Wed May 10 16:04:23 2023] ceph: mds1 reconnect success
[Wed May 10 16:04:23 2023] ceph: mds1 reconnect success
[Wed May 10 16:04:23 2023] ceph: mds1 reconnect success
[Wed May 10 16:04:23 2023] ceph: mds1 reconnect success
[Wed May 10 16:06:14 2023] ceph: mds1 caps stale
[Wed May 10 16:06:14 2023] ceph: mds1 caps stale
[Wed May 10 16:06:14 2023] ceph: mds1 caps stale
[Wed May 10 16:06:14 2023] ceph: mds1 caps stale
[Wed May 10 16:06:14 2023] ceph: mds1 caps stale
[Wed May 10 16:06:45 2023] ceph: mds1 caps renewed
[Wed May 10 16:06:45 2023] ceph: mds1 caps renewed
[Wed May 10 16:06:45 2023] ceph: mds1 caps renewed
[Wed May 10 16:06:45 2023] ceph: mds1 caps renewed
[Wed May 10 16:06:45 2023] ceph: mds1 caps renewed
[Wed May 10 16:06:45 2023] ceph: mds1 recovery completed
[Wed May 10 16:06:45 2023] ceph: mds1 recovery completed
[Wed May 10 16:06:45 2023] ceph: mds1 recovery completed
[Wed May 10 16:06:45 2023] ceph: mds1 recovery completed
[Wed May 10 16:06:45 2023] ceph: mds1 recovery completed
[Wed May 10 16:22:24 2023] WARNING: CPU: 1 PID: 5392 at fs/ceph/caps.c:689 ceph_add_cap+0x53e/0x550 [ceph]
[Wed May 10 16:22:24 2023] Modules linked in: ceph libceph dns_resolver nls_utf8 isofs cirrus drm_shmem_helper drm_kms_helper syscopyarea sysfillrect sysimgblt intel_rapl_msr intel_rapl_common fb_sys_fops virtio_net iTCO_wdt net_failover iTCO_vendor_support drm pcspkr failover virtio_balloon joydev lpc_ich i2c_i801 nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c sr_mod cdrom sg crct10dif_pclmul ahci crc32_pclmul libahci crc32c_intel libata ghash_clmulni_intel serio_raw virtio_blk virtio_console virtio_scsi dm_mirror dm_region_hash dm_log dm_mod fuse
[Wed May 10 16:22:24 2023] CPU: 1 PID: 5392 Comm: kworker/1:0 Tainted: G        W        --------- -  - 4.18.0-486.el8.x86_64 #1
[Wed May 10 16:22:24 2023] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module_el8.7.0+3346+68867adb 04/01/2014
[Wed May 10 16:22:24 2023] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[Wed May 10 16:22:24 2023] RIP: 0010:ceph_add_cap+0x53e/0x550 [ceph]
[Wed May 10 16:22:24 2023] Code: c0 48 c7 c7 c0 39 74 c0 e8 6c 7c 5d c7 0f 0b 44 89 7c 24 04 e9 7e fc ff ff 44 8b 7c 24 04 e9 68 fe ff ff 0f 0b e9 c9 fc ff ff <0f> 0b e9 0a fe ff ff 0f 0b e9 12 fe ff ff 0f 0b 66 90 0f 1f 44 00
[Wed May 10 16:22:24 2023] RSP: 0018:ffffa4c984a97b48 EFLAGS: 00010207
[Wed May 10 16:22:24 2023] RAX: 0000000000000000 RBX: 0000000000000005 RCX: dead000000000200
[Wed May 10 16:22:24 2023] RDX: ffff8cfc56d7d7d0 RSI: ffff8cfc56d7d7d0 RDI: ffff8cfc56d7d7c8
[Wed May 10 16:22:24 2023] RBP: ffff8cfc45507570 R08: ffff8cfc56d7d7d0 R09: 0000000000000001
[Wed May 10 16:22:24 2023] R10: ffff8cfce4206f00 R11: 00000000ffff8ce0 R12: 0000000000000155
[Wed May 10 16:22:24 2023] R13: ffff8cfce4206f00 R14: ffff8cfce4206f08 R15: 0000000000000001
[Wed May 10 16:22:24 2023] FS:  0000000000000000(0000) GS:ffff8cfdb7c80000(0000) knlGS:0000000000000000
[Wed May 10 16:22:24 2023] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed May 10 16:22:24 2023] CR2: 00007f7b61a629f0 CR3: 00000001049f0000 CR4: 00000000003506e0
[Wed May 10 16:22:24 2023] Call Trace:
[Wed May 10 16:22:24 2023]  ceph_handle_caps+0xdf2/0x1780 [ceph]
[Wed May 10 16:22:24 2023]  mds_dispatch+0x13a/0x670 [ceph]
[Wed May 10 16:22:24 2023]  ceph_con_process_message+0x79/0x140 [libceph]
[Wed May 10 16:22:24 2023]  ? calc_signature+0xdf/0x110 [libceph]
[Wed May 10 16:22:24 2023]  ceph_con_v1_try_read+0x5d7/0xf30 [libceph]
[Wed May 10 16:22:24 2023]  ceph_con_workfn+0x329/0x680 [libceph]
[Wed May 10 16:22:24 2023]  process_one_work+0x1a7/0x360
[Wed May 10 16:22:24 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:22:24 2023]  worker_thread+0x30/0x390
[Wed May 10 16:22:24 2023]  ? create_worker+0x1a0/0x1a0
[Wed May 10 16:22:24 2023]  kthread+0x134/0x150
[Wed May 10 16:22:24 2023]  ? set_kthread_struct+0x50/0x50
[Wed May 10 16:22:24 2023]  ret_from_fork+0x35/0x40
[Wed May 10 16:22:24 2023] ---[ end trace 03a1d82065fdafc0 ]---
[Wed May 10 17:54:14 2023] libceph: mon0 (1)192.168.32.65:6789 session established
[Wed May 10 17:54:14 2023] libceph: client210426330 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
[Wed May 10 17:54:16 2023] libceph: mon2 (1)192.168.32.67:6789 session established
[Wed May 10 17:54:16 2023] libceph: client210350757 fsid e4ece518-f2cb-4708-b00f-b6bf511e91d9
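
For reference, the first 24 bytes of the "front:" payload dumped above are the fixed snap message header (op, split, num_split_inos, num_split_realms, trace_len, packed little-endian; field layout from struct ceph_mds_snap_head in the Ceph wire-format headers). A minimal decoding sketch in Python, not part of the original report, with the bytes copied from the first two dump lines:

    import struct

    # First 24 bytes of the "front:" hex dump above.
    head = bytes.fromhex(
        "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
        "0c 00 00 00 88 00 00 00"
    )

    # op (u32), split (u64), num_split_inos (u32), num_split_realms (u32), trace_len (u32)
    op, split, num_split_inos, num_split_realms, trace_len = struct.unpack("<IQIII", head)

    SNAP_OPS = {0: "UPDATE", 1: "CREATE", 2: "DESTROY", 3: "SPLIT"}
    print(SNAP_OPS.get(op, op), split, num_split_inos, num_split_realms, trace_len)
    # -> UPDATE 0 0 12 136

So this is a CEPH_SNAP_OP_UPDATE message that nevertheless carries 12 split_realm entries (the twelve 8-byte realm inos visible in the dump) ahead of a 0x88-byte snap trace. Before the kernel-side fix the client only consumes those arrays for a SPLIT op, so the snap trace decode starts 96 bytes too early, which matches the update_snap_trace error -5 and the "corrupt snap message" dump above.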

Related issues 4 (0 open, 4 closed)

Copied from Linux kernel client - Bug #61200: ceph: corrupt snap message from mds1 (Resolved, Xiubo Li)

Copied to CephFS - Backport #62192: reef: ceph: corrupt snap message from mds1 (Resolved, Xiubo Li)
Copied to CephFS - Backport #62193: pacific: ceph: corrupt snap message from mds1 (Resolved, Xiubo Li)
Copied to CephFS - Backport #62194: quincy: ceph: corrupt snap message from mds1 (Resolved, Xiubo Li)
#1

Updated by Xiubo Li 11 months ago

  • Copied from Bug #61200: ceph: corrupt snap message from mds1 added
#2

Updated by Xiubo Li 11 months ago

Introduced by:

commit 93e7267757508520dfc22cff1ab20558bd4a44d4
Author: Yan, Zheng <zyan@redhat.com>
Date:   Fri Jul 21 21:40:46 2017 +0800

    mds: send snap related messages centrally during mds recovery

    sending CEPH_SNAP_OP_SPLIT and CEPH_SNAP_OP_UPDATE messages to
    clients centrally in MDCache::open_snaprealms()

    Signed-off-by: "Yan, Zheng" <zyan@redhat.com>

With this change the MDS also sends a useless split_realm list to clients in CEPH_SNAP_OP_UPDATE messages.
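
To make the decode failure concrete, here is a small illustrative sketch (Python; the helper names are hypothetical and this is not the actual MDS or kernel code) of how a stray split_realm list on an UPDATE message shifts where the client starts reading the snap trace:

    import struct

    SNAP_OP_UPDATE, SNAP_OP_SPLIT = 0, 3

    def build_snap_msg(op, split_realms, trace):
        # Hypothetical encoder mirroring the layout seen in the dump above:
        # fixed header, then split_inos[] (empty here), then split_realms[], then the trace blob.
        head = struct.pack("<IQIII", op, 0, 0, len(split_realms), len(trace))
        realms = b"".join(struct.pack("<Q", ino) for ino in split_realms)
        return head + realms + trace

    def client_trace_offset(msg, skip_split_lists_always=False):
        # Mimics where the client begins decoding the snap trace.
        op, _split, n_inos, n_realms, _trace_len = struct.unpack_from("<IQIII", msg, 0)
        offset = 24  # past the fixed header
        if op == SNAP_OP_SPLIT or skip_split_lists_always:
            offset += 8 * (n_inos + n_realms)
        return offset

    # An UPDATE message that (wrongly) carries 12 split_realm entries, as in the dump above.
    msg = build_snap_msg(SNAP_OP_UPDATE, [0x10003871c0d1] * 12, b"T" * 136)
    print(client_trace_offset(msg))                                # 24: lands inside the realm list
    print(client_trace_offset(msg, skip_split_lists_always=True))  # 120: actual start of the trace

With the realm list present, the trace decode starts 96 bytes early and fails, which the client reports as "corrupt snap message". The fix tracked here (PR 51536) stops the MDS from attaching the split_realm list to UPDATE messages; the kernel client was hardened separately in the copied-from tracker #61200.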

#3

Updated by Ilya Dryomov 11 months ago

  • Regression changed from No to Yes
#4

Updated by Xiubo Li 11 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 51536
#5

Updated by Venky Shankar 9 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to reef,quincy,pacific

Xiubo, this needs backporting to p/q/r, yes?

#6

Updated by Backport Bot 9 months ago

  • Copied to Backport #62192: reef: ceph: corrupt snap message from mds1 added
#7

Updated by Backport Bot 9 months ago

  • Copied to Backport #62193: pacific: ceph: corrupt snap message from mds1 added
#8

Updated by Backport Bot 9 months ago

  • Copied to Backport #62194: quincy: ceph: corrupt snap message from mds1 added
#9

Updated by Backport Bot 9 months ago

  • Tags set to backport_processed
#10

Updated by Xiubo Li 6 months ago

  • Status changed from Pending Backport to Resolved
