Bug #55857

strange auth cap state leads to potential deadlock

Added by Jeff Layton almost 2 years ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Assignee: Xiubo Li
Category: -
Target version: -
% Done: 100%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

While testing for tracker #55823, I hit this bug:

[ 2064.595115] libceph: mon0 (2)172.21.15.5:3300 session established
[ 2064.602282] libceph: client15462 fsid e53a7408-e36e-11ec-be41-001a4aab830c
[ 2130.688660] ceph: mds1 is already opened
[ 2172.458428] ceph: mds2 is already opened
[ 2182.104930] ceph: mds4 is already opened
[ 2994.094237] ------------[ cut here ]------------
[ 2994.099095] WARNING: CPU: 2 PID: 184823 at fs/ceph/caps.c:740 ceph_add_cap+0x57d/0x5a0 [ceph]
[ 2994.107790] Modules linked in: ceph libceph dns_resolver fscache netfs veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink bridge stp llc binfmt_misc overlay xfs libcrc32c intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass iTCO_wdt joydev mei_me iTCO_vendor_support ipmi_ssif crct10dif_pclmul i2c_i801 crc32_pclmul mei lpc_ich ghash_clmulni_intel i2c_smbus mfd_core wmi acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler sunrpc ioatdma acpi_pad acpi_power_meter ixgbe igb mdio nvme crc32c_intel ptp nvme_core i2c_algo_bit pps_core dca fuse
[ 2994.164484] CPU: 2 PID: 184823 Comm: kworker/2:0 Tainted: G S                5.18.0-ceph-g4168fe3b02d6 #1
[ 2994.174187] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
[ 2994.181802] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 2994.187422] RIP: 0010:ceph_add_cap+0x57d/0x5a0 [ceph]
[ 2994.192589] Code: 00 48 c7 c6 d8 61 8e a0 48 c7 c7 08 38 8f a0 44 89 4c 24 04 e8 ae e5 1c e1 0f 0b e9 2f fc ff ff 44 8b 4c 24 04 e9 02 fe ff ff <0f> 0b e9 72 fc ff ff 0f 0b e9 a8 fd ff ff 0f 0b e9 b0 fd ff ff 0f
[ 2994.211519] RSP: 0018:ffffc9000b2379f0 EFLAGS: 00010246
[ 2994.216849] RAX: ffff888302cc0870 RBX: ffff88843feea680 RCX: 0000000000000000
[ 2994.224094] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88843feea680
[ 2994.231339] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 2994.238575] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000d55
[ 2994.245811] R13: 0000000000000002 R14: ffff888302cc0870 R15: ffff888302cc0878
[ 2994.253052] FS:  0000000000000000(0000) GS:ffff88885fc80000(0000) knlGS:0000000000000000
[ 2994.261283] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2994.267127] CR2: 000055da1f63ca80 CR3: 0000000002612006 CR4: 00000000001706e0
[ 2994.274361] Call Trace:
[ 2994.276902]  <TASK>
[ 2994.279105]  ceph_fill_inode+0xe93/0x1140 [ceph]
[ 2994.283856]  ceph_fill_trace+0x168/0xca0 [ceph]
[ 2994.288497]  ? mds_dispatch+0x2ca/0x1790 [ceph]
[ 2994.293151]  mds_dispatch+0x2ec/0x1790 [ceph]
[ 2994.297628]  ? ceph_con_process_message+0x61/0x150 [libceph]
[ 2994.303411]  ? lock_release+0x13d/0x2b0
[ 2994.307355]  ceph_con_process_message+0x70/0x150 [libceph]
[ 2994.312956]  ? inet_recvmsg+0x63/0x1f0
[ 2994.316813]  process_message+0xe/0x70 [libceph]
[ 2994.321461]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 2994.326913]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 2994.331938]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 2994.336744]  process_one_work+0x240/0x5a0
[ 2994.340860]  worker_thread+0x3c/0x370
[ 2994.344621]  ? process_one_work+0x5a0/0x5a0
[ 2994.348901]  kthread+0xf2/0x120
[ 2994.352143]  ? kthread_complete_and_exit+0x20/0x20
[ 2994.357036]  ret_from_fork+0x1f/0x30
[ 2994.360728]  </TASK>
[ 2994.363006] irq event stamp: 1195243
[ 2994.366676] hardirqs last  enabled at (1195253): [<ffffffff811607ee>] __up_console_sem+0x5e/0x70
[ 2994.375596] hardirqs last disabled at (1195262): [<ffffffff811607d3>] __up_console_sem+0x43/0x70
[ 2994.384507] softirqs last  enabled at (1195006): [<ffffffff81e00321>] __do_softirq+0x321/0x483
[ 2994.393250] softirqs last disabled at (1194993): [<ffffffff810e4a0c>] irq_exit_rcu+0xdc/0x120
[ 2994.401901] ---[ end trace 0000000000000000 ]---
[ 2994.774440] ------------[ cut here ]------------
[ 2994.779321] WARNING: CPU: 4 PID: 186196 at fs/ceph/caps.c:3403 handle_cap_grant+0xd0f/0xd60 [ceph]
[ 2994.788428] Modules linked in: ceph libceph dns_resolver fscache netfs veth nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables nfnetlink bridge stp llc binfmt_misc overlay xfs libcrc32c intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass iTCO_wdt joydev mei_me iTCO_vendor_support ipmi_ssif crct10dif_pclmul i2c_i801 crc32_pclmul mei lpc_ich ghash_clmulni_intel i2c_smbus mfd_core wmi acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler sunrpc ioatdma acpi_pad acpi_power_meter ixgbe igb mdio nvme crc32c_intel ptp nvme_core i2c_algo_bit pps_core dca fuse
[ 2994.852825] CPU: 4 PID: 186196 Comm: kworker/4:1 Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 2994.862524] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
[ 2994.870132] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 2994.875738] RIP: 0010:handle_cap_grant+0xd0f/0xd60 [ceph]
[ 2994.881252] Code: c7 c6 a8 43 8f a0 48 c7 c7 28 95 91 a0 e8 b9 6d cc e0 41 59 41 5a e9 9a fa ff ff be 03 00 00 00 e8 06 8f c9 e0 e9 e3 f4 ff ff <0f> 0b e9 96 fb ff ff 0f 0b e9 9d fb ff ff e8 2e 15 f6 ff e9 cb f4
[ 2994.900193] RSP: 0018:ffffc9000c52fa18 EFLAGS: 00010207
[ 2994.905518] RAX: 0000000000000007 RBX: ffff88843feea680 RCX: 0000000000000001
[ 2994.912761] RDX: 0000000000000000 RSI: ffff88816d916000 RDI: ffff88843feea680
[ 2994.920014] RBP: ffffc9000c52fad8 R08: 0000000000000000 R09: ffffc9000c52fb60
[ 2994.927252] R10: 0000000000000007 R11: ada33900964ccdb8 R12: 0000000000000d55
[ 2994.934490] R13: 0000000000000d55 R14: ffff88816a8e3400 R15: ffff88829aaad000
[ 2994.941734] FS:  0000000000000000(0000) GS:ffff88885fd00000(0000) knlGS:0000000000000000
[ 2994.949952] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2994.955804] CR2: 00007f65c9283000 CR3: 0000000002612001 CR4: 00000000001706e0
[ 2994.963046] Call Trace:
[ 2994.965590]  <TASK>
[ 2994.967796]  ? _raw_spin_unlock+0x29/0x40
[ 2994.971919]  ceph_handle_caps+0xdab/0x1870 [ceph]
[ 2994.976764]  mds_dispatch+0x48e/0x1790 [ceph]
[ 2994.981236]  ? ceph_con_process_message+0x61/0x150 [libceph]
[ 2994.987014]  ? lock_release+0x13d/0x2b0
[ 2994.990962]  ? __mutex_unlock_slowpath+0x43/0x2b0
[ 2994.995774]  ceph_con_process_message+0x70/0x150 [libceph]
[ 2995.001378]  ? inet_recvmsg+0x63/0x1f0
[ 2995.005243]  process_message+0xe/0x70 [libceph]
[ 2995.009896]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 2995.015334]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 2995.020335]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 2995.025153]  process_one_work+0x240/0x5a0
[ 2995.029282]  worker_thread+0x3c/0x370
[ 2995.033052]  ? process_one_work+0x5a0/0x5a0
[ 2995.037334]  kthread+0xf2/0x120
[ 2995.040574]  ? kthread_complete_and_exit+0x20/0x20
[ 2995.045474]  ret_from_fork+0x1f/0x30
[ 2995.049157]  </TASK>
[ 2995.051436] irq event stamp: 302279
[ 2995.055023] hardirqs last  enabled at (302289): [<ffffffff811607ee>] __up_console_sem+0x5e/0x70
[ 2995.063844] hardirqs last disabled at (302298): [<ffffffff811607d3>] __up_console_sem+0x43/0x70
[ 2995.072665] softirqs last  enabled at (301912): [<ffffffff81e00321>] __do_softirq+0x321/0x483
[ 2995.081318] softirqs last disabled at (301899): [<ffffffff810e4a0c>] irq_exit_rcu+0xdc/0x120
[ 2995.089877] ---[ end trace 0000000000000000 ]---
[ 2995.095109] 
[ 2995.096680] ======================================================
[ 2995.102941] WARNING: possible circular locking dependency detected
[ 2995.109208] 5.18.0-ceph-g4168fe3b02d6 #1 Tainted: G S      W        
[ 2995.115645] ------------------------------------------------------
[ 2995.121903] kworker/4:1/186196 is trying to acquire lock:
[ 2995.127380] ffff88816d916090 (&s->s_mutex){+.+.}-{3:3}, at: ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.136280] 
[ 2995.136280] but task is already holding lock:
[ 2995.142209] ffff88810dd56290 (&mdsc->snap_rwsem){++++}-{3:3}, at: ceph_handle_caps+0x8eb/0x1870 [ceph]
[ 2995.151630] 
[ 2995.151630] which lock already depends on the new lock.
[ 2995.151630] 
[ 2995.159923] 
[ 2995.159923] the existing dependency chain (in reverse order) is:
[ 2995.167502] 
[ 2995.167502] -> #1 (&mdsc->snap_rwsem){++++}-{3:3}:
[ 2995.173870]        down_write+0x3b/0x70
[ 2995.177791]        mds_dispatch+0x27e/0x1790 [ceph]
[ 2995.182758]        ceph_con_process_message+0x70/0x150 [libceph]
[ 2995.188862]        process_message+0xe/0x70 [libceph]
[ 2995.194009]        ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 2995.199929]        ceph_con_workfn+0xee/0x6d0 [libceph]
[ 2995.205242]        process_one_work+0x240/0x5a0
[ 2995.209859]        worker_thread+0x3c/0x370
[ 2995.214121]        kthread+0xf2/0x120
[ 2995.217866]        ret_from_fork+0x1f/0x30
[ 2995.222043] 
[ 2995.222043] -> #0 (&s->s_mutex){+.+.}-{3:3}:
[ 2995.227889]        __lock_acquire+0x10eb/0x1d80
[ 2995.232506]        lock_acquire+0xd4/0x2f0
[ 2995.236682]        __mutex_lock+0xa2/0x920
[ 2995.240859]        ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.246176]        mds_dispatch+0x48e/0x1790 [ceph]
[ 2995.251149]        ceph_con_process_message+0x70/0x150 [libceph]
[ 2995.257244]        process_message+0xe/0x70 [libceph]
[ 2995.262391]        ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 2995.268320]        ceph_con_workfn+0xee/0x6d0 [libceph]
[ 2995.273632]        process_one_work+0x240/0x5a0
[ 2995.278239]        worker_thread+0x3c/0x370
[ 2995.282504]        kthread+0xf2/0x120
[ 2995.286247]        ret_from_fork+0x1f/0x30
[ 2995.290424] 
[ 2995.290424] other info that might help us debug this:
[ 2995.290424] 
[ 2995.298547]  Possible unsafe locking scenario:
[ 2995.298547] 
[ 2995.304567]        CPU0                    CPU1
[ 2995.309181]        ----                    ----
[ 2995.313791]   lock(&mdsc->snap_rwsem);
[ 2995.317621]                                lock(&s->s_mutex);
[ 2995.323449]                                lock(&mdsc->snap_rwsem);
[ 2995.329795]   lock(&s->s_mutex);
[ 2995.333108] 
[ 2995.333108]  *** DEADLOCK ***
[ 2995.333108] 
[ 2995.339146] 3 locks held by kworker/4:1/186196:
[ 2995.343761]  #0: ffff88816a989338 ((wq_completion)ceph-msgr){+.+.}-{0:0}, at: process_one_work+0x1c4/0x5a0
[ 2995.353517]  #1: ffffc9000c52fe78 ((work_completion)(&(&con->work)->work)){+.+.}-{0:0}, at: process_one_work+0x1c4/0x5a0
[ 2995.364491]  #2: ffff88810dd56290 (&mdsc->snap_rwsem){++++}-{3:3}, at: ceph_handle_caps+0x8eb/0x1870 [ceph]
[ 2995.374345] 
[ 2995.374345] stack backtrace:
[ 2995.378803] CPU: 4 PID: 186196 Comm: kworker/4:1 Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 2995.388469] Hardware name: Supermicro SYS-5018R-WR/X10SRW-F, BIOS 1.0c 09/07/2015
[ 2995.396059] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 2995.401642] Call Trace:
[ 2995.404166]  <TASK>
[ 2995.406348]  dump_stack_lvl+0x55/0x6d
[ 2995.410096]  check_noncircular+0xfe/0x110
[ 2995.414189]  __lock_acquire+0x10eb/0x1d80
[ 2995.418279]  lock_acquire+0xd4/0x2f0
[ 2995.421935]  ? ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.426904]  __mutex_lock+0xa2/0x920
[ 2995.430559]  ? ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.435527]  ? lock_is_held_type+0xe3/0x140
[ 2995.439789]  ? ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.444757]  ? ceph_lookup_inode+0x30/0x30 [ceph]
[ 2995.449552]  ? ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.454525]  ceph_handle_caps+0x42f/0x1870 [ceph]
[ 2995.459317]  ? mds_dispatch+0x70/0x1790 [ceph]
[ 2995.463850]  mds_dispatch+0x48e/0x1790 [ceph]
[ 2995.468296]  ? ceph_con_process_message+0x61/0x150 [libceph]
[ 2995.474043]  ? lock_release+0x13d/0x2b0
[ 2995.477959]  ? __mutex_unlock_slowpath+0x43/0x2b0
[ 2995.482743]  ceph_con_process_message+0x70/0x150 [libceph]
[ 2995.488320]  ? inet_recvmsg+0x63/0x1f0
[ 2995.492154]  process_message+0xe/0x70 [libceph]
[ 2995.496774]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 2995.502179]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 2995.507152]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 2995.511944]  process_one_work+0x240/0x5a0
[ 2995.516034]  worker_thread+0x3c/0x370
[ 2995.519773]  ? process_one_work+0x5a0/0x5a0
[ 2995.524039]  kthread+0xf2/0x120
[ 2995.527262]  ? kthread_complete_and_exit+0x20/0x20
[ 2995.532136]  ret_from_fork+0x1f/0x30
[ 2995.535794]  </TASK>
[ 3198.420600] INFO: task kworker/2:0:184823 blocked for more than 122 seconds.
[ 3198.427761]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3198.434730] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3198.442672] task:kworker/2:0     state:D stack:    0 pid:184823 ppid:     2 flags:0x00004000
[ 3198.451224] Workqueue: events delayed_work [ceph]
[ 3198.456041] Call Trace:
[ 3198.458577]  <TASK>
[ 3198.460770]  __schedule+0x474/0xad0
[ 3198.464351]  schedule+0x44/0xb0
[ 3198.467583]  schedule_preempt_disabled+0x14/0x20
[ 3198.472299]  __mutex_lock+0x576/0x920
[ 3198.476061]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3198.480861]  ? lock_release+0x1f9/0x2b0
[ 3198.484793]  ? delayed_work+0x15c/0x230 [ceph]
[ 3198.489340]  ? delayed_work+0x15c/0x230 [ceph]
[ 3198.493890]  delayed_work+0x15c/0x230 [ceph]
[ 3198.498266]  process_one_work+0x240/0x5a0
[ 3198.502369]  worker_thread+0x3c/0x370
[ 3198.506124]  ? process_one_work+0x5a0/0x5a0
[ 3198.510397]  kthread+0xf2/0x120
[ 3198.513628]  ? kthread_complete_and_exit+0x20/0x20
[ 3198.526412]  ret_from_fork+0x1f/0x30
[ 3198.530079]  </TASK>
[ 3198.532357] INFO: task kworker/4:1:186196 blocked for more than 122 seconds.
[ 3198.539496]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3198.546456] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3198.554392] task:kworker/4:1     state:D stack:    0 pid:186196 ppid:     2 flags:0x00004000
[ 3198.562938] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 3198.568563] Call Trace:
[ 3198.571105]  <TASK>
[ 3198.573290]  __schedule+0x474/0xad0
[ 3198.576873]  schedule+0x44/0xb0
[ 3198.580102]  rwsem_down_write_slowpath+0x2e0/0x6a0
[ 3198.584988]  mds_dispatch+0x27e/0x1790 [ceph]
[ 3198.589453]  ceph_con_process_message+0x70/0x150 [libceph]
[ 3198.595042]  ? inet_recvmsg+0x63/0x1f0
[ 3198.598882]  process_message+0xe/0x70 [libceph]
[ 3198.603525]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 3198.608940]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3198.613735]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 3198.618710]  ? rcu_read_lock_held_common+0xe/0x40
[ 3198.623510]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3198.628304]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 3198.633114]  process_one_work+0x240/0x5a0
[ 3198.637210]  worker_thread+0x3c/0x370
[ 3198.640962]  ? process_one_work+0x5a0/0x5a0
[ 3198.645236]  kthread+0xf2/0x120
[ 3198.648466]  ? kthread_complete_and_exit+0x20/0x20
[ 3198.653348]  ret_from_fork+0x1f/0x30
[ 3198.657015]  </TASK>
[ 3198.659288] INFO: lockdep is turned off.
[ 3292.246477] ceph: mds3 hung
[ 3295.064449] libceph: mds1 (2)172.21.15.131:6836 socket closed (con state OPEN)
[ 3295.072317] libceph: mds3 (2)172.21.15.29:6834 socket closed (con state OPEN)
[ 3295.140457] libceph: mds0 (2)172.21.15.131:6838 socket closed (con state OPEN)
[ 3295.179892] libceph: mds2 (2)172.21.15.131:6834 socket closed (con state OPEN)
[ 3295.292969] libceph: mds3 (2)172.21.15.29:6834 session reset
[ 3295.298741] ceph: mds3 closed our session
[ 3295.302849] ceph: mds3 reconnect start
[ 3321.303210] INFO: task kworker/2:0:184823 blocked for more than 245 seconds.
[ 3321.310364]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3321.317330] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3321.325275] task:kworker/2:0     state:D stack:    0 pid:184823 ppid:     2 flags:0x00004000
[ 3321.333828] Workqueue: events delayed_work [ceph]
[ 3321.338659] Call Trace:
[ 3321.341192]  <TASK>
[ 3321.343380]  __schedule+0x474/0xad0
[ 3321.346958]  schedule+0x44/0xb0
[ 3321.350196]  schedule_preempt_disabled+0x14/0x20
[ 3321.354905]  __mutex_lock+0x576/0x920
[ 3321.358663]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3321.363464]  ? lock_release+0x1f9/0x2b0
[ 3321.367413]  ? delayed_work+0x15c/0x230 [ceph]
[ 3321.371961]  ? delayed_work+0x15c/0x230 [ceph]
[ 3321.376516]  delayed_work+0x15c/0x230 [ceph]
[ 3321.380894]  process_one_work+0x240/0x5a0
[ 3321.385008]  worker_thread+0x3c/0x370
[ 3321.388764]  ? process_one_work+0x5a0/0x5a0
[ 3321.393051]  kthread+0xf2/0x120
[ 3321.396292]  ? kthread_complete_and_exit+0x20/0x20
[ 3321.401180]  ret_from_fork+0x1f/0x30
[ 3321.404843]  </TASK>
[ 3321.407116] INFO: task kworker/4:1:186196 blocked for more than 245 seconds.
[ 3321.414265]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3321.421229] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3321.429164] task:kworker/4:1     state:D stack:    0 pid:186196 ppid:     2 flags:0x00004000
[ 3321.437716] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 3321.443307] Call Trace:
[ 3321.445840]  <TASK>
[ 3321.448026]  __schedule+0x474/0xad0
[ 3321.451598]  schedule+0x44/0xb0
[ 3321.454824]  rwsem_down_write_slowpath+0x2e0/0x6a0
[ 3321.459729]  mds_dispatch+0x27e/0x1790 [ceph]
[ 3321.464200]  ceph_con_process_message+0x70/0x150 [libceph]
[ 3321.469785]  ? inet_recvmsg+0x63/0x1f0
[ 3321.473620]  process_message+0xe/0x70 [libceph]
[ 3321.478249]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 3321.483660]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3321.488454]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 3321.493425]  ? rcu_read_lock_held_common+0xe/0x40
[ 3321.498220]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3321.503016]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 3321.507815]  process_one_work+0x240/0x5a0
[ 3321.511911]  worker_thread+0x3c/0x370
[ 3321.515653]  ? process_one_work+0x5a0/0x5a0
[ 3321.519919]  kthread+0xf2/0x120
[ 3321.523142]  ? kthread_complete_and_exit+0x20/0x20
[ 3321.528025]  ret_from_fork+0x1f/0x30
[ 3321.531690]  </TASK>
[ 3321.533963] INFO: lockdep is turned off.
[ 3444.186723] INFO: task kworker/7:2:138207 blocked for more than 122 seconds.
[ 3444.193894]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3444.200862] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3444.208806] task:kworker/7:2     state:D stack:    0 pid:138207 ppid:     2 flags:0x00004000
[ 3444.217370] Workqueue: ceph-cap ceph_cap_release_work [ceph]
[ 3444.223141] Call Trace:
[ 3444.225678]  <TASK>
[ 3444.227881]  __schedule+0x474/0xad0
[ 3444.231466]  schedule+0x44/0xb0
[ 3444.234691]  schedule_preempt_disabled+0x14/0x20
[ 3444.239393]  __mutex_lock+0x576/0x920
[ 3444.243143]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3444.247945]  ? ceph_cap_release_work+0x21/0x60 [ceph]
[ 3444.253120]  ? ceph_cap_release_work+0x21/0x60 [ceph]
[ 3444.258272]  ceph_cap_release_work+0x21/0x60 [ceph]
[ 3444.263249]  process_one_work+0x240/0x5a0
[ 3444.267353]  worker_thread+0x3c/0x370
[ 3444.271105]  ? process_one_work+0x5a0/0x5a0
[ 3444.275378]  kthread+0xf2/0x120
[ 3444.278605]  ? kthread_complete_and_exit+0x20/0x20
[ 3444.283483]  ret_from_fork+0x1f/0x30
[ 3444.287155]  </TASK>
[ 3444.289448] INFO: task kworker/7:4:182454 blocked for more than 122 seconds.
[ 3444.296583]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3444.303541] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3444.311479] task:kworker/7:4     state:D stack:    0 pid:182454 ppid:     2 flags:0x00004000
[ 3444.320034] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 3444.325623] Call Trace:
[ 3444.328163]  <TASK>
[ 3444.330348]  __schedule+0x474/0xad0
[ 3444.333923]  ? rwsem_down_read_slowpath+0x16a/0x3f0
[ 3444.338887]  schedule+0x44/0xb0
[ 3444.342119]  rwsem_down_read_slowpath+0x389/0x3f0
[ 3444.346919]  down_read+0xc2/0x150
[ 3444.350319]  send_mds_reconnect+0x2d9/0x6fd [ceph]
[ 3444.355211]  __handle_control+0x1186/0x1280 [libceph]
[ 3444.360370]  ? tcp_recvmsg+0xb4/0x1e0
[ 3444.364128]  ? tcp_recvmsg+0x1ba/0x1e0
[ 3444.367965]  ? __tcp_transmit_skb+0x568/0xb90
[ 3444.372411]  ? inet_recvmsg+0x63/0x1f0
[ 3444.376255]  ceph_con_v2_try_read+0x87a/0x1840 [libceph]
[ 3444.381669]  ? rcu_read_lock_held_common+0xe/0x40
[ 3444.386462]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 3444.391440]  ? rcu_read_lock_held_common+0xe/0x40
[ 3444.396232]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3444.401022]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 3444.405824]  process_one_work+0x240/0x5a0
[ 3444.409923]  worker_thread+0x3c/0x370
[ 3444.413674]  ? process_one_work+0x5a0/0x5a0
[ 3444.417952]  kthread+0xf2/0x120
[ 3444.421186]  ? kthread_complete_and_exit+0x20/0x20
[ 3444.426074]  ret_from_fork+0x1f/0x30
[ 3444.429747]  </TASK>
[ 3444.432039] INFO: task kworker/2:0:184823 blocked for more than 368 seconds.
[ 3444.439217]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3444.446203] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3444.454143] task:kworker/2:0     state:D stack:    0 pid:184823 ppid:     2 flags:0x00004000
[ 3444.462692] Workqueue: events delayed_work [ceph]
[ 3444.467498] Call Trace:
[ 3444.470029]  <TASK>
[ 3444.472214]  __schedule+0x474/0xad0
[ 3444.475795]  schedule+0x44/0xb0
[ 3444.479021]  schedule_preempt_disabled+0x14/0x20
[ 3444.483728]  __mutex_lock+0x576/0x920
[ 3444.487479]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3444.492276]  ? lock_release+0x1f9/0x2b0
[ 3444.496202]  ? delayed_work+0x15c/0x230 [ceph]
[ 3444.500743]  ? delayed_work+0x15c/0x230 [ceph]
[ 3444.505282]  delayed_work+0x15c/0x230 [ceph]
[ 3444.509649]  process_one_work+0x240/0x5a0
[ 3444.513749]  worker_thread+0x3c/0x370
[ 3444.517502]  ? process_one_work+0x5a0/0x5a0
[ 3444.521775]  kthread+0xf2/0x120
[ 3444.525003]  ? kthread_complete_and_exit+0x20/0x20
[ 3444.529877]  ret_from_fork+0x1f/0x30
[ 3444.533549]  </TASK>
[ 3444.543641] INFO: task kworker/4:1:186196 blocked for more than 368 seconds.
[ 3444.550780]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3444.557743] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3444.565689] task:kworker/4:1     state:D stack:    0 pid:186196 ppid:     2 flags:0x00004000
[ 3444.574239] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 3444.579828] Call Trace:
[ 3444.582359]  <TASK>
[ 3444.584543]  __schedule+0x474/0xad0
[ 3444.588118]  schedule+0x44/0xb0
[ 3444.591341]  rwsem_down_write_slowpath+0x2e0/0x6a0
[ 3444.596226]  mds_dispatch+0x27e/0x1790 [ceph]
[ 3444.600681]  ceph_con_process_message+0x70/0x150 [libceph]
[ 3444.606261]  ? inet_recvmsg+0x63/0x1f0
[ 3444.610099]  process_message+0xe/0x70 [libceph]
[ 3444.614729]  ceph_con_v2_try_read+0xf8d/0x1840 [libceph]
[ 3444.620143]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3444.624941]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 3444.629910]  ? rcu_read_lock_held_common+0xe/0x40
[ 3444.634707]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3444.639499]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 3444.644302]  process_one_work+0x240/0x5a0
[ 3444.648398]  worker_thread+0x3c/0x370
[ 3444.652151]  ? process_one_work+0x5a0/0x5a0
[ 3444.656423]  kthread+0xf2/0x120
[ 3444.659654]  ? kthread_complete_and_exit+0x20/0x20
[ 3444.664538]  ret_from_fork+0x1f/0x30
[ 3444.668202]  </TASK>
[ 3444.670489] INFO: lockdep is turned off.
[ 3567.068450] INFO: task kworker/7:2:138207 blocked for more than 245 seconds.
[ 3567.075604]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3567.082575] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3567.090513] task:kworker/7:2     state:D stack:    0 pid:138207 ppid:     2 flags:0x00004000
[ 3567.099068] Workqueue: ceph-cap ceph_cap_release_work [ceph]
[ 3567.104843] Call Trace:
[ 3567.107381]  <TASK>
[ 3567.109576]  __schedule+0x474/0xad0
[ 3567.113157]  schedule+0x44/0xb0
[ 3567.116390]  schedule_preempt_disabled+0x14/0x20
[ 3567.121097]  __mutex_lock+0x576/0x920
[ 3567.124858]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3567.129657]  ? ceph_cap_release_work+0x21/0x60 [ceph]
[ 3567.134823]  ? ceph_cap_release_work+0x21/0x60 [ceph]
[ 3567.139971]  ceph_cap_release_work+0x21/0x60 [ceph]
[ 3567.144958]  process_one_work+0x240/0x5a0
[ 3567.149064]  worker_thread+0x3c/0x370
[ 3567.152815]  ? process_one_work+0x5a0/0x5a0
[ 3567.157090]  kthread+0xf2/0x120
[ 3567.160329]  ? kthread_complete_and_exit+0x20/0x20
[ 3567.165216]  ret_from_fork+0x1f/0x30
[ 3567.168886]  </TASK>
[ 3567.171188] INFO: task kworker/7:4:182454 blocked for more than 245 seconds.
[ 3567.178328]       Tainted: G S      W         5.18.0-ceph-g4168fe3b02d6 #1
[ 3567.185294] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3567.193264] task:kworker/7:4     state:D stack:    0 pid:182454 ppid:     2 flags:0x00004000
[ 3567.201823] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[ 3567.207416] Call Trace:
[ 3567.209948]  <TASK>
[ 3567.212132]  __schedule+0x474/0xad0
[ 3567.215709]  ? rwsem_down_read_slowpath+0x16a/0x3f0
[ 3567.220673]  schedule+0x44/0xb0
[ 3567.223906]  rwsem_down_read_slowpath+0x389/0x3f0
[ 3567.228706]  down_read+0xc2/0x150
[ 3567.232110]  send_mds_reconnect+0x2d9/0x6fd [ceph]
[ 3567.237007]  __handle_control+0x1186/0x1280 [libceph]
[ 3567.242159]  ? tcp_recvmsg+0xb4/0x1e0
[ 3567.245908]  ? tcp_recvmsg+0x1ba/0x1e0
[ 3567.249739]  ? __tcp_transmit_skb+0x568/0xb90
[ 3567.254178]  ? inet_recvmsg+0x63/0x1f0
[ 3567.258009]  ceph_con_v2_try_read+0x87a/0x1840 [libceph]
[ 3567.263421]  ? rcu_read_lock_held_common+0xe/0x40
[ 3567.268216]  ? ceph_con_workfn+0x39/0x6d0 [libceph]
[ 3567.273189]  ? rcu_read_lock_held_common+0xe/0x40
[ 3567.277977]  ? rcu_read_lock_sched_held+0x23/0x90
[ 3567.282765]  ceph_con_workfn+0xee/0x6d0 [libceph]
[ 3567.287565]  process_one_work+0x240/0x5a0
[ 3567.291664]  worker_thread+0x3c/0x370
[ 3567.295413]  ? process_one_work+0x5a0/0x5a0
[ 3567.299678]  kthread+0xf2/0x120
[ 3567.302909]  ? kthread_complete_and_exit+0x20/0x20
[ 3567.307783]  ret_from_fork+0x1f/0x30
[ 3567.311452]  </TASK>
[ 3567.313739] INFO: lockdep is turned off.
[ 3647.452811] libceph: mds0 (2)172.21.15.131:6838 session reset
[ 3647.458656] ceph: mds0 closed our session
[ 3647.462755] ceph: mds0 reconnect start

Subtasks

CephFS - Bug #54106: kclient: hang during workunit cleanup (status: Duplicate, assignee: Xiubo Li)

History

#1 Updated by Jeff Layton almost 2 years ago

Full console log is at:

http://qa-proxy.ceph.com/teuthology/jlayton-2022-06-03_18:17:02-fs-wip-vshankar-testing1-20220603-134300-distro-default-smithi/6862802/console_logs/smithi005.log

The first warning is this in ceph_add_cap:

   WARN_ON(ci->i_auth_cap == cap);

So we have an i_auth_cap that doesn't have CEPH_CAP_FLAG_AUTH set. The second warning is this:

        /*
         * auth mds of the inode changed. we received the cap export message,
         * but still haven't received the cap import message. handle_cap_export
         * updated the new auth MDS' cap.
         *
         * "ceph_seq_cmp(seq, cap->seq) <= 0" means we are processing a message
         * that was sent before the cap import message. So don't remove caps.
         */
        if (ceph_seq_cmp(seq, cap->seq) <= 0) {
                WARN_ON(cap != ci->i_auth_cap);

That causes this condition to evaluate false:

        if (ci->i_auth_cap == cap &&
            le32_to_cpu(grant->op) == CEPH_CAP_OP_IMPORT) {

...which causes this function to leak the snap_rwsem. That's a clear bug that I'll send a patch for in a bit. The part I don't understand is how we ended up importing an auth cap we already had, and why the code is written to consider that to be so problematic.
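
To make the leak concrete, here is a minimal userspace model of the pattern (not the kernel code: handle_cap_grant_model, cap_is_auth and op_is_import are made-up stand-ins, and a pthread rwlock stands in for mdsc->snap_rwsem). It only shows how skipping the one branch that drops the read lock leaves any later writer, such as a reconnect needing the write side, stuck:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_rwlock_t snap_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* Stand-in for the grant handling described above: the read lock is only
 * dropped on the "this is the auth cap and the op is an IMPORT" branch. */
static void handle_cap_grant_model(bool cap_is_auth, bool op_is_import)
{
        pthread_rwlock_rdlock(&snap_rwsem);       /* taken before dispatch */

        if (cap_is_auth && op_is_import) {
                /* ... normal import processing ... */
                pthread_rwlock_unlock(&snap_rwsem);   /* only release point */
        }
        /* condition false -> the read lock is never released */
}

int main(void)
{
        /* the buggy state from this report: cap is no longer the auth cap */
        handle_cap_grant_model(false, true);

        /* a later path needing the write side (e.g. reconnect) now blocks */
        if (pthread_rwlock_trywrlock(&snap_rwsem) != 0)
                printf("write lock unavailable: read side was leaked\n");
        return 0;
}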

#2 Updated by Xiubo Li almost 2 years ago

Jeff Layton wrote:

> Full console log is at:
>
> http://qa-proxy.ceph.com/teuthology/jlayton-2022-06-03_18:17:02-fs-wip-vshankar-testing1-20220603-134300-distro-default-smithi/6862802/console_logs/smithi005.log
>
> The first warning is this in ceph_add_cap:
>
> [...]
>
> So we have an i_auth_cap that doesn't have CEPH_CAP_FLAG_AUTH set. The second warning is this:
>
> [...]
>
> That causes this condition to evaluate false:
>
> [...]
>
> ...which causes this function to leak the snap_rwsem. That's a clear bug that I'll send a patch for in a bit. The part I don't understand is how we ended up importing an auth cap we already had, and why the code is written to consider that to be so problematic.

There is one case that seems like it could cause this:

1. The kclient does an async create and sends the request to mds.0, which is the auth MDS of the parent directory at that point, so the auth cap is set to mds.0.
2. Before mds.0 can handle the async create request, the auth of the parent directory changes to mds.1.
3. mds.0 begins to handle the async create request but forwards it to mds.1.
4. mds.1 successfully handles the request and replies to the kclient.
5. The kclient then adds a new auth cap for mds.1.

Locally I have a patch to fix it, which is still under testing.

#3 Updated by Xiubo Li almost 2 years ago

Xiubo Li wrote:

> Jeff Layton wrote:
>
> > Full console log is at:
> >
> > http://qa-proxy.ceph.com/teuthology/jlayton-2022-06-03_18:17:02-fs-wip-vshankar-testing1-20220603-134300-distro-default-smithi/6862802/console_logs/smithi005.log
> >
> > The first warning is this in ceph_add_cap:
> >
> > [...]
> >
> > So we have an i_auth_cap that doesn't have CEPH_CAP_FLAG_AUTH set. The second warning is this:
> >
> > [...]
> >
> > That causes this condition to evaluate false:
> >
> > [...]
> >
> > ...which causes this function to leak the snap_rwsem. That's a clear bug that I'll send a patch for in a bit. The part I don't understand is how we ended up importing an auth cap we already had, and why the code is written to consider that to be so problematic.
>
> There is one case that seems like it could cause this:
>
> 1. The kclient does an async create and sends the request to mds.0, which is the auth MDS of the parent directory at that point, so the auth cap is set to mds.0.
> 2. Before mds.0 can handle the async create request, the auth of the parent directory changes to mds.1.
> 3. mds.0 begins to handle the async create request but forwards it to mds.1.
> 4. mds.1 successfully handles the request and replies to the kclient.
> 5. The kclient then adds a new auth cap for mds.1.
>
> Locally I have a patch to fix it, which is still under testing.

The patchwork link: https://patchwork.kernel.org/project/ceph-devel/list/?series=649077

#4 Updated by Xiubo Li almost 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Xiubo Li

#5 Updated by Xiubo Li almost 2 years ago

> [ 2995.290424] other info that might help us debug this:
> [ 2995.290424] 
> [ 2995.298547]  Possible unsafe locking scenario:
> [ 2995.298547] 
> [ 2995.304567]        CPU0                    CPU1
> [ 2995.309181]        ----                    ----
> [ 2995.313791]   lock(&mdsc->snap_rwsem);
> [ 2995.317621]                                lock(&s->s_mutex);
> [ 2995.323449]                                lock(&mdsc->snap_rwsem);
> [ 2995.329795]   lock(&s->s_mutex);
> [ 2995.333108] 
> [ 2995.333108]  *** DEADLOCK ***

Jeff has fixed the above snap_rwsem and s_mutex deadlock bug in https://patchwork.kernel.org/project/ceph-devel/list/?series=647320.
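
The inversion boils down to the same two locks being taken in opposite orders on two paths. Here is a minimal userspace illustration of the ordering the lockdep report shows (pthread primitives stand in for s_mutex and snap_rwsem; this is a model, not the kernel code):

#include <pthread.h>

static pthread_mutex_t  s_mutex    = PTHREAD_MUTEX_INITIALIZER;
static pthread_rwlock_t snap_rwsem = PTHREAD_RWLOCK_INITIALIZER;

/* Path A, like mds_dispatch(): s_mutex first, then snap_rwsem for write. */
static void *dispatch_path(void *arg)
{
        pthread_mutex_lock(&s_mutex);
        pthread_rwlock_wrlock(&snap_rwsem);    /* can wait behind path B's read lock */
        pthread_rwlock_unlock(&snap_rwsem);
        pthread_mutex_unlock(&s_mutex);
        return arg;
}

/* Path B, like the cap handling path: snap_rwsem first, then s_mutex. */
static void *caps_path(void *arg)
{
        pthread_rwlock_rdlock(&snap_rwsem);
        pthread_mutex_lock(&s_mutex);          /* can wait behind path A's mutex */
        pthread_mutex_unlock(&s_mutex);
        pthread_rwlock_unlock(&snap_rwsem);
        return arg;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, dispatch_path, NULL);
        pthread_create(&b, NULL, caps_path, NULL);
        /* with unlucky timing the joins below never complete: the classic
         * AB/BA deadlock the lockdep splat warns about */
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}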

#6 Updated by Xiubo Li almost 2 years ago

  • Status changed from In Progress to Fix Under Review

#7 Updated by Xiubo Li about 1 year ago

  • Status changed from Fix Under Review to Resolved

The fix has been applied to mainline; closing this tracker.
