Bug #55687

pacific: Regressions with holding the GIL while attempting to lock a mutex

Added by Cory Snyder 7 months ago. Updated 5 months ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The mgr process can deadlock if a thread holds the Python GIL while attempting to lock a mutex. Recent regressions make this scenario possible again. We have seen it cause all 5 of our managers to deadlock and become unavailable in a large cluster.
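
To illustrate the general failure mode, here is a minimal sketch of the lock-order inversion, not the actual mgr code paths (which this report does not spell out). It assumes two plain `threading.Lock` objects as hypothetical stand-ins: `gil` for the Python GIL and `mutex` for a C++ mutex inside the mgr. The function name `simulate` and the thread names are also illustrative. Timeouts replace the real indefinite wait so the example terminates.

```python
import threading


def simulate(release_gil_first: bool, timeout: float = 0.5) -> dict:
    """Run two threads that take two locks in opposite order.

    `gil` and `mutex` are hypothetical stand-ins, not Ceph or CPython objects.
    Returns whether each thread managed to take its second lock.
    """
    gil = threading.Lock()
    mutex = threading.Lock()
    both_holding = threading.Barrier(2)    # both threads hold their first lock
    both_attempted = threading.Barrier(2)  # both threads finished their attempt
    results = {}

    def python_caller():
        # Holds the "GIL" and then needs the mutex.
        gil.acquire()
        both_holding.wait()
        if release_gil_first:
            gil.release()  # the usual remedy: drop the GIL before blocking
        got = mutex.acquire(timeout=timeout)
        results["python_caller"] = got
        if got:
            mutex.release()
        if release_gil_first:
            gil.acquire()  # take the GIL back before returning to Python
        else:
            # Keep holding the GIL until the other attempt is over, as a
            # truly deadlocked thread would.
            both_attempted.wait()
        gil.release()

    def mutex_holder():
        # Holds the mutex and then needs the "GIL" to call into Python.
        mutex.acquire()
        both_holding.wait()
        got = gil.acquire(timeout=timeout)
        results["mutex_holder"] = got
        if got:
            gil.release()
        if not release_gil_first:
            both_attempted.wait()
        mutex.release()

    threads = [
        threading.Thread(target=python_caller),
        threading.Thread(target=mutex_holder),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print("GIL held while locking:", simulate(release_gil_first=False))
    print("GIL released first:    ", simulate(release_gil_first=True))
```

With the GIL held across the lock attempt, both threads should time out waiting on each other; without timeouts that wait would never end. Releasing the GIL first should let both threads make progress.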

History

#1 Updated by Cory Snyder 7 months ago

  • Affected Versions v16.2.8 added

These regressions appear to have been introduced here: https://github.com/ceph/ceph/pull/44750

Note that the issues do not exist on the master branch or on Quincy; they were introduced by mistakes in the Pacific backport.

#2 Updated by Cory Snyder 7 months ago

  • Backport deleted (quincy, pacific)

#3 Updated by Cory Snyder 7 months ago

  • Regression changed from No to Yes

#4 Updated by Cory Snyder 7 months ago

  • Pull request ID set to 46302

#5 Updated by Neha Ojha 7 months ago

  • Subject changed from Regressions with holding the GIL while attempting to lock a mutex to pacific: Regressions with holding the GIL while attempting to lock a mutex
  • Status changed from New to Resolved

#6 Updated by Ilya Dryomov 5 months ago

  • Target version set to v16.2.9
