Bug #44207

mgr/volumes: deadlock when trying to purge large number of trash entries

Added by Venky Shankar about 1 month ago. Updated about 1 month ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source: Community (dev)
Tags:
Backport: nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): mgr/volumes
Labels (FS):
Pull request ID:
Crash signature:

Description

There's a subtle deadlock when purge tasks (via the generic async job machinery) try to fetch the next job to execute. The volume (filesystem) should be opened in lockless mode since the main thread (command dispatcher thread) serializes access to the volume.

Hit this once when trying to remove a large number of trash entries.
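
To illustrate the general pattern, here is a minimal, single-threaded sketch. It is not the actual mgr/volumes code: `volume_lock`, `open_volume`, `get_next_job` and `fetch_under_serialization` are hypothetical names, and a short acquire timeout stands in for the hang so the example terminates. It assumes the job-fetch path runs while access to the volume is already serialized by a non-reentrant lock, so taking that lock again when opening the volume can never succeed, whereas a lockless open goes through.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for the lock that serializes access to a volume.
# threading.Lock is non-reentrant: a second acquire by the holder blocks.
volume_lock = threading.Lock()


@contextmanager
def open_volume(name, lockless=False, timeout=0.1):
    """Yield a (fake) volume handle, taking volume_lock unless lockless."""
    if lockless:
        # Caller guarantees access is already serialized elsewhere.
        yield name
        return
    # A real deadlock would block here forever; the timeout only exists
    # so that this illustration can report the problem and return.
    if not volume_lock.acquire(timeout=timeout):
        raise TimeoutError("volume lock already held")
    try:
        yield name
    finally:
        volume_lock.release()


def get_next_job(name, lockless):
    """Illustrative 'fetch the next purge job' step."""
    with open_volume(name, lockless=lockless) as vol:
        return "purge:" + vol


def fetch_under_serialization(lockless):
    """Fetch a job while access to the volume is already serialized."""
    with volume_lock:
        try:
            return get_next_job("vol1", lockless)
        except TimeoutError:
            return "deadlock"
```

Under these assumptions, `fetch_under_serialization(False)` reports the stuck acquire, while `fetch_under_serialization(True)` returns a job because the nested open no longer touches the lock.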


Related issues

Related to fs - Bug #44276: pybind/mgr/volumes: cleanup stale connection hang (In Progress)
Copied to fs - Backport #44282: nautilus: mgr/volumes: deadlock when trying to purge large number of trash entries (Resolved)

History

#1 Updated by Venky Shankar about 1 month ago

  • Status changed from New to In Progress

#2 Updated by Venky Shankar about 1 month ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 33413

#3 Updated by Patrick Donnelly about 1 month ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version set to v15.0.0

#4 Updated by Patrick Donnelly about 1 month ago

  • Related to Bug #44281: pybind/mgr/volumes: cleanup stale connection hang added

#5 Updated by Patrick Donnelly about 1 month ago

  • Copied to Backport #44282: nautilus: mgr/volumes: deadlock when trying to purge large number of trash entries added

#6 Updated by Patrick Donnelly about 1 month ago

  • Related to deleted (Bug #44281: pybind/mgr/volumes: cleanup stale connection hang)

#7 Updated by Patrick Donnelly about 1 month ago

  • Related to Bug #44276: pybind/mgr/volumes: cleanup stale connection hang added

#8 Updated by Nathan Cutler about 1 month ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
