Bug #49605

closed

pybind/mgr/volumes: deadlock on async job hangs finisher thread

Added by Patrick Donnelly about 3 years ago. Updated about 1 year ago.

Status: Resolved
Priority: Immediate
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags: backport_processed
Backport: pacific,octopus,nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite: fs
Component(FS): mgr/volumes
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-03-04T14:37:08.589+0000 7fd38cd44700 10 mgr.server ms_handle_authentication ms_handle_authentication new session 0x56077af7f8c0 con 0x56077b160800 entity client.admin addr
2021-03-04T14:37:08.589+0000 7fd38cd44700 10 mgr.server ms_handle_authentication  session 0x56077af7f8c0 client.admin has caps allow * 'allow *'
2021-03-04T14:37:08.590+0000 7fd38cd44700  1 --2- [v2:172.21.15.136:6832/32141,v1:172.21.15.136:6833/32141] >> 172.21.15.136:0/205030034 conn(0x56077b160800 0x56077b1f1200 secure :-1 s=READY pgs=4 cs=0 l=1 rev1=1 rx=0x56077b0ee5d0 tx=0x56077b1b9200).ready entity=client.5841 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2021-03-04T14:37:08.818+0000 7fd36b1fb700  1 -- [v2:172.21.15.136:6832/32141,v1:172.21.15.136:6833/32141] <== client.5841 172.21.15.136:0/205030034 1 ==== mgr_command(tid 0: {"prefix": "fs volume ls", "target": ["mon-mgr", ""]}) v1 ==== 77+0+0 (secure 0 0 0) 0x56077adff760 con 0x56077b160800
2021-03-04T14:37:08.818+0000 7fd36b1fb700 10 mgr.server _handle_command decoded-size=2 prefix=fs volume ls
2021-03-04T14:37:08.819+0000 7fd36b1fb700 20 is_capable service=py module=volumes command=fs volume ls read addr - on cap allow *
2021-03-04T14:37:08.819+0000 7fd36b1fb700 20  allow so far , doing grant allow *
2021-03-04T14:37:08.819+0000 7fd36b1fb700 20  allow all
2021-03-04T14:37:08.819+0000 7fd36b1fb700 10 mgr.server _allowed_command  client.admin capable
2021-03-04T14:37:08.819+0000 7fd36b1fb700  0 log_channel(audit) log [DBG] : from='client.5841 -' entity='client.admin' cmd=[{"prefix": "fs volume ls", "target": ["mon-mgr", ""]}]: dispatch
2021-03-04T14:37:08.819+0000 7fd36b1fb700 10 mgr.server _handle_command passing through 2
2021-03-04T14:37:08.821+0000 7fd385535700 10 mgr tick tick

From: /ceph/teuthology-archive/pdonnell-2021-03-04_03:51:01-fs-wip-pdonnell-testing-20210303.195715-distro-basic-smithi/5932239/remote/smithi136/log/ceph-mgr.y.log.gz

The `volume ls` command was never dispatched. It looks like the queued finisher context was dropped somehow.
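
To illustrate the failure mode named in the title, here is a minimal sketch of how one blocked callback on a single-threaded finisher queue keeps every later callback from running. This is not the Ceph code: the `Finisher` class, `enqueue`, `job_lock`, and `demo` are hypothetical names for illustration, and the lock is only an assumed stand-in for whatever the async job was holding.

```python
import queue
import threading


class Finisher:
    """Toy single-threaded callback queue (hypothetical; not Ceph's Finisher)."""

    def __init__(self):
        self._q = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def enqueue(self, fn):
        self._q.put(fn)

    def _run(self):
        while True:
            fn = self._q.get()
            fn()  # if fn blocks, nothing queued after it can run


def demo(timeout=0.5):
    job_lock = threading.Lock()      # assumed stand-in for a lock held by an async job
    dispatched = threading.Event()   # set when the later command handler runs
    finisher = Finisher()

    job_lock.acquire()                          # the "async job" holds the lock
    finisher.enqueue(lambda: job_lock.acquire())  # this callback blocks the finisher thread
    finisher.enqueue(dispatched.set)            # stands in for the `fs volume ls` handler

    stuck = not dispatched.wait(timeout)  # handler has not run while the lock is held
    job_lock.release()                    # unblock the finisher thread
    recovered = dispatched.wait(timeout)  # handler runs once the queue drains
    return stuck, recovered
```

In this sketch `demo()` reports the later callback as stuck while the lock is held and as run once it is released, which matches the symptom above: the command is accepted by the server but its queued context never executes.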


Related issues 3 (0 open, 3 closed)

Copied to CephFS - Backport #50126: octopus: pybind/mgr/volumes: deadlock on async job hangs finisher thread (Rejected, Cory Snyder)
Copied to CephFS - Backport #50127: pacific: pybind/mgr/volumes: deadlock on async job hangs finisher thread (Resolved, Patrick Donnelly)
Copied to CephFS - Backport #50128: nautilus: pybind/mgr/volumes: deadlock on async job hangs finisher thread (Resolved, Patrick Donnelly)