Bug #45635
kclient: kclient node gets stuck due to a double lock (closed)
% Done: 0%
Regression: No
Severity: 3 - minor
ceph-qa-suite: kcephfs
Description
[183549.259361] Workqueue: ceph-msgr ceph_con_workfn [libceph]
[183549.260653] Call Trace:
[183549.261436] ? __schedule+0x272/0x5b0
[183549.262431] schedule+0x45/0xb0
[183549.263322] schedule_preempt_disabled+0x5/0x10
[183549.264472] __mutex_lock.isra.0+0x262/0x4b0
[183549.265549] ? get_page_from_freelist+0x710/0x1000
[183549.266719] ? __ceph_caps_issued+0x68/0xc0 [ceph]
[183549.267878] ceph_check_caps+0x4a9/0x980 [ceph]
[183549.268965] ? con_get+0xc/0x20 [ceph]
[183549.269892] ? msg_con_set.isra.0+0x31/0x50 [libceph]
[183549.271113] ? ceph_con_send+0xbc/0x1b0 [libceph]
[183549.272263] ? __send_request+0x683/0x890 [ceph]
[183549.273391] ? select_collect2+0xe0/0xe0
[183549.274474] ceph_put_cap_refs+0x24c/0x320 [ceph]
[183549.275643] send_mds_reconnect+0x268/0x679 [ceph]
[183549.276779] ceph_mdsc_handle_mdsmap+0x5b6/0x620 [ceph]
[183549.278007] ? extra_mon_dispatch+0x2f/0x40 [ceph]
[183549.279192] extra_mon_dispatch+0x2f/0x40 [ceph]
[183549.280325] dispatch+0x527/0x8d0 [libceph]
[183549.281423] ceph_con_workfn+0xcc6/0x29d0 [libceph]
[183549.282614] ? __switch_to_asm+0x34/0x70
[183549.283620] ? __switch_to_asm+0x40/0x70
[183549.284640] ? __switch_to_asm+0x34/0x70
[183549.285640] ? __switch_to_asm+0x40/0x70
[183549.286682] ? __switch_to_asm+0x34/0x70
[183549.287678] ? __switch_to_asm+0x34/0x70
[183549.288693] ? __switch_to_asm+0x40/0x70
[183549.289705] ? __switch_to_asm+0x34/0x70
[183549.290696] ? __switch_to+0x162/0x3f0
[183549.291634] process_one_work+0x1d2/0x3a0
[183549.292682] worker_thread+0x45/0x3c0
[183549.293647] kthread+0xf6/0x130
[183549.294539] ? process_one_work+0x3a0/0x3a0
[183549.295956] ? kthread_park+0x80/0x80
[183549.297424] ret_from_fork+0x22/0x40
Updated by Xiubo Li almost 4 years ago
ceph_check_caps() may lock and unlock the session mutex.
This can deadlock in some cases, for example when it is reached while mdsc->mutex is held:
handle_forward()
  ...
  mutex_lock(&mdsc->mutex)
  ...
  ceph_mdsc_put_request()
    --> ceph_mdsc_release_request()
      --> ceph_put_cap_request()
        --> ceph_put_cap_refs()
          --> ceph_check_caps()
  ...
  mutex_unlock(&mdsc->mutex)
There may also be cases where the session mutex is locked twice, for example:
send_mds_reconnect()
  ...
  mutex_lock(&session->s_mutex);
  ...
  --> replay_unsafe_requests()
    --> ceph_mdsc_release_dir_caps()
      --> ceph_put_cap_refs()
        --> ceph_check_caps()
  ...
  mutex_unlock(&session->s_mutex);
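The double session lock pattern can be reproduced in userspace with a minimal sketch. This is not the kernel code: it assumes POSIX threads, and every demo_* name is hypothetical and only mirrors the roles of send_mds_reconnect() and ceph_check_caps() in the chain above. An error-checking mutex is used so that the second lock attempt reports EDEADLK instead of hanging the thread.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the MDS session; only the mutex matters here. */
struct demo_session {
	pthread_mutex_t s_mutex;
};

/*
 * Initialise s_mutex as an error-checking mutex, so that a relock by the
 * owning thread returns EDEADLK rather than blocking forever.
 */
static int demo_session_init(struct demo_session *s)
{
	pthread_mutexattr_t attr;
	int ret = pthread_mutexattr_init(&attr);

	if (ret)
		return ret;
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!ret)
		ret = pthread_mutex_init(&s->s_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
	return ret;
}

/* Plays the role of ceph_check_caps(): takes the session mutex itself. */
static int demo_check_caps(struct demo_session *s)
{
	int ret = pthread_mutex_lock(&s->s_mutex);

	if (ret)
		return ret; /* EDEADLK if this thread already holds s_mutex */
	/* ... cap handling would go here ... */
	pthread_mutex_unlock(&s->s_mutex);
	return 0;
}

/* Plays the role of send_mds_reconnect(): calls down with s_mutex held. */
static int demo_reconnect(struct demo_session *s)
{
	int ret;

	pthread_mutex_lock(&s->s_mutex);
	ret = demo_check_caps(s); /* second lock attempt by the same thread */
	pthread_mutex_unlock(&s->s_mutex);
	return ret;
}
```

Calling demo_check_caps() on its own succeeds, while demo_reconnect() gets EDEADLK back from the nested call. A kernel mutex is not recursive and has no such error return, so the same pattern there leaves the task blocked in __mutex_lock(), which matches the trace in the description.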
Updated by Xiubo Li almost 4 years ago
- Status changed from In Progress to Fix Under Review
Updated by Xiubo Li over 3 years ago
- Status changed from Fix Under Review to Resolved