Bug #46419

closed

ceph: ceph_add_cap: couldn't find snap realm 110

Added by Jeff Layton almost 4 years ago. Updated almost 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: fs/ceph
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Mark Nelson saw this pop up in some testing. It looks like a case where we caught a cap IMPORT with a snap realm that the client was not aware of (yet):

[ 8663.340065] ceph: ceph_add_cap: couldn't find snap realm 110
[ 8663.347293] ------------[ cut here ]------------
[ 8663.353864] WARNING: CPU: 10 PID: 33414 at fs/ceph/caps.c:707 ceph_add_cap.cold+0x14/0x1b [ceph]
[ 8663.366633] Modules linked in: ceph libceph fscache xfs libcrc32c iscsi_target_mod target_core_mod intel_rapl_msr intel_rapl_common nfit libnvdimm ppdev parport_pc intel_rapl_perf ena parport i2c_piix4 drm ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel nvme nvme_core ghash_clmulni_intel serio_raw
[ 8663.396387] CPU: 10 PID: 33414 Comm: kworker/10:2 Not tainted 5.7.6-201.fc32.x86_64 #1
[ 8663.408770] Hardware name: Amazon EC2 i3en.3xlarge/, BIOS 1.0 10/16/2017
[ 8663.416409] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 8663.423487] RIP: 0010:ceph_add_cap.cold+0x14/0x1b [ceph]
[ 8663.430461] Code: 74 24 10 48 8b 6c 24 18 41 8d 0c 04 e8 22 20 99 c1 e9 b6 75 fe ff 48 8b b4 24 90 00 00 00 48 c7 c7 d8 24 7c c0 e8 09 20 99 c1 <0f> 0b e9 2f 86 fe ff 48 8b 55 08 4c 8b 43 30 48 c7 c7 28 26 7c c0
[ 8663.452325] RSP: 0018:ffffa84a8229bb58 EFLAGS: 00010282
[ 8663.459270] RAX: 0000000000000030 RBX: 0000000000000001 RCX: 0000000000000000
[ 8663.467196] RDX: ffff94c7d8ea7cc0 RSI: ffff94c7d8e99cc8 RDI: ffff94c7d8e99cc8
[ 8663.475009] RBP: 0000000000000045 R08: 0000000000000410 R09: 0000000000000019
[ 8663.482842] R10: 0000000000000730 R11: 0000000000000000 R12: ffff94c719be9a60
[ 8663.490640] R13: ffff94c65edb3528 R14: ffff94c7ca261508 R15: 0000000000000010
[ 8663.498496] FS:  0000000000000000(0000) GS:ffff94c7d8e80000(0000) knlGS:0000000000000000
[ 8663.511091] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8663.518307] CR2: 000055da16b73000 CR3: 000000178d7ac006 CR4: 00000000007606e0
[ 8663.526130] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8663.533945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8663.541790] PKRU: 55555554
[ 8663.547492] Call Trace:
[ 8663.553101]  ? kmem_cache_alloc+0x168/0x220
[ 8663.559513]  ? __cap_is_valid+0x1c/0xb0 [ceph]
[ 8663.566058]  ceph_handle_caps+0x90e/0x19b0 [ceph]
[ 8663.572725]  dispatch+0x283/0x1400 [ceph]
[ 8663.579035]  ? inet_recvmsg+0x4d/0xd0
[ 8663.585188]  ? ceph_tcp_recvmsg+0x62/0x80 [libceph]
[ 8663.591895]  ceph_con_workfn+0x19f7/0x2800 [libceph]
[ 8663.598697]  ? __switch_to_asm+0x40/0x70
[ 8663.604952]  ? __switch_to_asm+0x34/0x70
[ 8663.611203]  ? __switch_to_asm+0x40/0x70
[ 8663.617434]  ? __switch_to_asm+0x34/0x70
[ 8663.623670]  ? __switch_to_asm+0x40/0x70
[ 8663.629956]  ? __switch_to_asm+0x40/0x70
[ 8663.636235]  ? __switch_to_asm+0x34/0x70
[ 8663.642499]  ? __switch_to+0x80/0x420
[ 8663.648621]  ? __switch_to_asm+0x34/0x70
[ 8663.654868]  process_one_work+0x1b4/0x380
[ 8663.661178]  worker_thread+0x53/0x3e0
[ 8663.667333]  ? process_one_work+0x380/0x380
[ 8663.673709]  kthread+0x115/0x140
[ 8663.679605]  ? __kthread_bind_mask+0x60/0x60
[ 8663.686019]  ret_from_fork+0x35/0x40
[ 8663.692129] ---[ end trace 12f255404f1ebfed ]---
[ 8663.698783] ------------[ cut here ]------------
[ 8663.705411] refcount_t: underflow; use-after-free.
[ 8663.712156] WARNING: CPU: 10 PID: 33414 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[ 8663.724800] Modules linked in: ceph libceph fscache xfs libcrc32c iscsi_target_mod target_core_mod intel_rapl_msr intel_rapl_common nfit libnvdimm ppdev parport_pc intel_rapl_perf ena parport i2c_piix4 drm ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel nvme nvme_core ghash_clmulni_intel serio_raw
[ 8663.754526] CPU: 10 PID: 33414 Comm: kworker/10:2 Tainted: G        W         5.7.6-201.fc32.x86_64 #1
[ 8663.767549] Hardware name: Amazon EC2 i3en.3xlarge/, BIOS 1.0 10/16/2017
[ 8663.775165] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 8663.782180] RIP: 0010:refcount_warn_saturate+0xa6/0xf0
[ 8663.789022] Code: 05 7d a3 4f 01 01 e8 20 50 bc ff 0f 0b c3 80 3d 6b a3 4f 01 00 75 95 48 c7 c7 20 d6 3b 83 c6 05 5b a3 4f 01 01 e8 01 50 bc ff <0f> 0b c3 80 3d 4a a3 4f 01 00 0f 85 72 ff ff ff 48 c7 c7 78 d6 3b
[ 8663.810286] RSP: 0018:ffffa84a8229bba0 EFLAGS: 00010286
[ 8663.816418] RAX: 0000000000000026 RBX: ffff94c7829bed28 RCX: 0000000000000000
[ 8663.823359] RDX: ffff94c7d8ea7cc0 RSI: ffff94c7d8e99cc8 RDI: ffff94c7d8e99cc8
[ 8663.830276] RBP: ffff94c7cd2cb500 R08: 000000000000043e R09: 0000000000000019
[ 8663.837210] R10: 000000000000072e R11: 0000000000000000 R12: ffff94c65ea90800
[ 8663.844163] R13: ffff94c65ea908d0 R14: ffff94c7ca261000 R15: ffff94c65edb3528
[ 8663.851072] FS:  0000000000000000(0000) GS:ffff94c7d8e80000(0000) knlGS:0000000000000000
[ 8663.862170] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8663.868527] CR2: 000055da16b73000 CR3: 000000178d7ac006 CR4: 00000000007606e0
[ 8663.875445] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8663.882348] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8663.889237] PKRU: 55555554
[ 8663.894351] Call Trace:
[ 8663.899337]  __destroy_snap_realm+0xc2/0x100 [ceph]
[ 8663.905326]  ceph_put_snap_realm+0x71/0xf0 [ceph]
[ 8663.911346]  ceph_handle_caps+0x96a/0x19b0 [ceph]
[ 8663.917460]  dispatch+0x283/0x1400 [ceph]
[ 8663.923103]  ? inet_recvmsg+0x4d/0xd0
[ 8663.928622]  ? ceph_tcp_recvmsg+0x62/0x80 [libceph]
[ 8663.934672]  ceph_con_workfn+0x19f7/0x2800 [libceph]
[ 8663.940697]  ? __switch_to_asm+0x40/0x70
[ 8663.946302]  ? __switch_to_asm+0x34/0x70
[ 8663.951892]  ? __switch_to_asm+0x40/0x70
[ 8663.957529]  ? __switch_to_asm+0x34/0x70
[ 8663.963121]  ? __switch_to_asm+0x40/0x70
[ 8663.968740]  ? __switch_to_asm+0x40/0x70
[ 8663.974355]  ? __switch_to_asm+0x34/0x70
[ 8663.979935]  ? __switch_to+0x80/0x420
[ 8663.985519]  ? __switch_to_asm+0x34/0x70
[ 8663.991134]  process_one_work+0x1b4/0x380
[ 8663.996780]  worker_thread+0x53/0x3e0
[ 8664.002300]  ? process_one_work+0x380/0x380
[ 8664.008024]  kthread+0x115/0x140
[ 8664.013343]  ? __kthread_bind_mask+0x60/0x60
[ 8664.019104]  ret_from_fork+0x35/0x40
[ 8664.024550] ---[ end trace 12f255404f1ebfee ]---
[ 8664.030525] ------------[ cut here ]------------
[ 8664.036545] kernel BUG at mm/slub.c:304!
[ 8664.042135] invalid opcode: 0000 [#1] SMP PTI
[ 8664.047921] CPU: 10 PID: 33414 Comm: kworker/10:2 Tainted: G        W         5.7.6-201.fc32.x86_64 #1
[ 8664.059542] Hardware name: Amazon EC2 i3en.3xlarge/, BIOS 1.0 10/16/2017
[ 8664.066301] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 8664.072561] RIP: 0010:kfree+0x1ef/0x200
[ 8664.078147] Code: 5b 5d 41 5c 41 5d e9 d0 2d fd ff 4d 89 e1 41 b8 01 00 00 00 4c 89 d1 48 89 da 48 89 ee 4c 89 ef e8 26 fa ff ff e9 2a ff ff ff <0f> 0b 0f 0b 48 8b 05 16 1e 53 01 e9 43 fe ff ff 90 0f 1f 44 00 00
[ 8664.097449] RSP: 0018:ffffa84a8229bb90 EFLAGS: 00010246
[ 8664.103577] RAX: ffff94c7cd2cb500 RBX: ffff94c7cd2cb500 RCX: ffff94c7cd2cb580
[ 8664.110510] RDX: 0000000001351b7e RSI: ffffa84a8229bb90 RDI: ffff94c7d2806f40
[ 8664.117389] RBP: fffff858de34b280 R08: 000000000000043e R09: 0000000000000019
[ 8664.124293] R10: ffff94c7cd2cb500 R11: 0000000000000000 R12: ffffffffc07a7381
[ 8664.131234] R13: ffff94c7d2806f40 R14: ffff94c7ca261000 R15: ffff94c65edb3528
[ 8664.138144] FS:  0000000000000000(0000) GS:ffff94c7d8e80000(0000) knlGS:0000000000000000
[ 8664.149261] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8664.155612] CR2: 000055da16b73000 CR3: 000000178d7ac006 CR4: 00000000007606e0
[ 8664.162581] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8664.169482] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8664.176453] PKRU: 55555554
[ 8664.181526] Call Trace:
[ 8664.186516]  ceph_put_snap_realm+0x71/0xf0 [ceph]
[ 8664.192425]  ceph_handle_caps+0x96a/0x19b0 [ceph]
[ 8664.198348]  dispatch+0x283/0x1400 [ceph]
[ 8664.203949]  ? inet_recvmsg+0x4d/0xd0
[ 8664.209418]  ? ceph_tcp_recvmsg+0x62/0x80 [libceph]
[ 8664.215402]  ceph_con_workfn+0x19f7/0x2800 [libceph]
[ 8664.221427]  ? __switch_to_asm+0x40/0x70
[ 8664.227059]  ? __switch_to_asm+0x34/0x70
[ 8664.232662]  ? __switch_to_asm+0x40/0x70
[ 8664.238253]  ? __switch_to_asm+0x34/0x70
[ 8664.243856]  ? __switch_to_asm+0x40/0x70
[ 8664.249442]  ? __switch_to_asm+0x40/0x70
[ 8664.255083]  ? __switch_to_asm+0x34/0x70
[ 8664.260743]  ? __switch_to+0x80/0x420
[ 8664.266319]  ? __switch_to_asm+0x34/0x70
[ 8664.272222]  process_one_work+0x1b4/0x380
[ 8664.277970]  worker_thread+0x53/0x3e0
[ 8664.283552]  ? process_one_work+0x380/0x380
[ 8664.289378]  kthread+0x115/0x140
[ 8664.294683]  ? __kthread_bind_mask+0x60/0x60
[ 8664.300420]  ret_from_fork+0x35/0x40
[ 8664.305872] Modules linked in: ceph libceph fscache xfs libcrc32c iscsi_target_mod target_core_mod intel_rapl_msr intel_rapl_common nfit libnvdimm ppdev parport_pc intel_rapl_perf ena parport i2c_piix4 drm ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel nvme nvme_core ghash_clmulni_intel serio_raw
[ 8664.332441] ---[ end trace 12f255404f1ebfef ]---
[ 8664.338411] RIP: 0010:kfree+0x1ef/0x200
[ 8664.343944] Code: 5b 5d 41 5c 41 5d e9 d0 2d fd ff 4d 89 e1 41 b8 01 00 00 00 4c 89 d1 48 89 da 48 89 ee 4c 89 ef e8 26 fa ff ff e9 2a ff ff ff <0f> 0b 0f 0b 48 8b 05 16 1e 53 01 e9 43 fe ff ff 90 0f 1f 44 00 00
[ 8664.363345] RSP: 0018:ffffa84a8229bb90 EFLAGS: 00010246
[ 8664.369595] RAX: ffff94c7cd2cb500 RBX: ffff94c7cd2cb500 RCX: ffff94c7cd2cb580
[ 8664.376579] RDX: 0000000001351b7e RSI: ffffa84a8229bb90 RDI: ffff94c7d2806f40
[ 8664.383517] RBP: fffff858de34b280 R08: 000000000000043e R09: 0000000000000019
[ 8664.390448] R10: ffff94c7cd2cb500 R11: 0000000000000000 R12: ffffffffc07a7381
[ 8664.506212] R13: ffff94c7d2806f40 R14: ffff94c7ca261000 R15: ffff94c65edb3528
[ 8664.513158] FS:  0000000000000000(0000) GS:ffff94c7d8e80000(0000) knlGS:0000000000000000
[ 8664.524246] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8664.530612] CR2: 000055da16b73000 CR3: 000000178d7ac006 CR4: 00000000007606e0
[ 8664.537534] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8664.544428] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8664.551361] PKRU: 55555554
[ 8959.725126] libceph: mds1 (1)10.0.1.10:6811 socket closed (con state OPEN)
[ 8959.727768] libceph: mds3 (1)10.0.1.9:6811 socket closed (con state OPEN)
[ 8959.727815] libceph: mds6 (1)10.0.1.8:6809 socket closed (con state OPEN)
[ 8959.728119] libceph: mds17 (1)10.0.1.2:6811 socket closed (con state OPEN)
[ 8959.728484] libceph: mds15 (1)10.0.1.3:6811 socket closed (con state OPEN)
[ 8959.728638] libceph: mds12 (1)10.0.1.5:6809 socket closed (con state OPEN)
[ 8959.728663] libceph: mds14 (1)10.0.1.4:6809 socket closed (con state OPEN)
[ 8959.728825] libceph: mds2 (1)10.0.1.10:6809 socket closed (con state OPEN)
[ 8959.728924] libceph: mds7 (1)10.0.1.7:6811 socket closed (con state OPEN)
[ 8959.728971] libceph: mds11 (1)10.0.1.5:6811 socket closed (con state OPEN)
[ 8959.729221] libceph: mds13 (1)10.0.1.4:6811 socket closed (con state OPEN)
[ 8959.730620] libceph: mds5 (1)10.0.1.8:6811 socket closed (con state OPEN)
[ 8959.730716] libceph: mds9 (1)10.0.1.6:6811 socket closed (con state OPEN)
[ 8959.731398] libceph: mds20 (1)10.0.1.1:6811 socket closed (con state OPEN)
[ 8959.732099] libceph: mds8 (1)10.0.1.7:6809 socket closed (con state OPEN)
[ 8959.738976] libceph: mds19 (1)10.0.1.1:6813 socket closed (con state OPEN)
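
For triage, it can help to reduce the log above to its distinct events: the WARNING in ceph_add_cap, the refcount_t underflow WARNING, and the kernel BUG hit in kfree. The following is a minimal sketch that does this, assuming dmesg-style lines like the ones in this report. The function name `summarize_oops` and the `SAMPLE` excerpt are illustrative only and are not part of any Ceph or kernel tooling.

```python
import re

# Matches dmesg lines such as:
#   [ 8663.353864] WARNING: CPU: 10 PID: 33414 at fs/ceph/caps.c:707 ...
#   [ 8664.036545] kernel BUG at mm/slub.c:304!
EVENT_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"(?:(?P<warn>WARNING): .*? at (?P<wloc>\S+:\d+)"
    r"|kernel (?P<bug>BUG) at (?P<bloc>\S+:\d+)!)"
)


def summarize_oops(lines):
    """Return (timestamp, kind, source_location) for each WARNING/BUG line."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line)
        if m is None:
            continue  # register dumps, call traces, etc. are skipped
        kind = m.group("warn") or m.group("bug")
        loc = m.group("wloc") or m.group("bloc")
        events.append((float(m.group("ts")), kind, loc))
    return events


# Hypothetical excerpt: four lines copied from the log in this report.
SAMPLE = """\
[ 8663.340065] ceph: ceph_add_cap: couldn't find snap realm 110
[ 8663.353864] WARNING: CPU: 10 PID: 33414 at fs/ceph/caps.c:707 ceph_add_cap.cold+0x14/0x1b [ceph]
[ 8663.712156] WARNING: CPU: 10 PID: 33414 at lib/refcount.c:28 refcount_warn_saturate+0xa6/0xf0
[ 8664.036545] kernel BUG at mm/slub.c:304!
""".splitlines()

if __name__ == "__main__":
    for ts, kind, loc in summarize_oops(SAMPLE):
        print(f"{ts:.6f} {kind:7s} {loc}")
```

Run against the full log, this should list the same three events in order. All three occur in the same kworker (PID 33414) within about a second of the initial "couldn't find snap realm 110" message.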