Bug #45112

closed

OSD might fail to recover after ENOSPC crash

Added by Igor Fedotov about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus, nautilus, mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When the OSD is opened after such a crash, the KV store might need to flush some data and hence needs additional disk space. But the allocator isn't ready at this point, since an opened DB is needed to initialize it.
Hence the OSD gets stuck in a permanent "deadlock" state.
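
The circular dependency described above can be illustrated with a toy model. This is only a sketch, not BlueStore code: `ToyAllocator`, `ToyDB`, `NoSpace` and `start_osd` are hypothetical names, and the assumption is simply that opening the DB may request space while the allocator can only be loaded from an already opened DB.

```python
class NoSpace(Exception):
    """Raised when space is requested but cannot be provided."""


class ToyAllocator:
    """Hypothetical allocator whose free-space records live inside the DB."""

    def __init__(self):
        self.ready = False
        self.free_extents = []

    def init_from_db(self, db):
        # The allocator can only be populated once the DB is open.
        if not db.is_open:
            raise RuntimeError("DB must be open to load free-space records")
        self.free_extents = list(db.freelist)
        self.ready = True

    def allocate(self, length):
        if not self.ready:
            raise NoSpace("allocator not initialized")
        for i, (offset, size) in enumerate(self.free_extents):
            if size >= length:
                self.free_extents[i] = (offset + length, size - length)
                return offset
        raise NoSpace("no free extent large enough")


class ToyDB:
    """Hypothetical KV store that may need to flush data while opening."""

    def __init__(self, allocator, pending_flush_bytes, freelist):
        self.allocator = allocator
        self.pending_flush_bytes = pending_flush_bytes
        self.freelist = freelist
        self.is_open = False

    def open(self):
        # Data left over from before the crash must be flushed during open,
        # and flushing needs newly allocated disk space.
        if self.pending_flush_bytes:
            self.allocator.allocate(self.pending_flush_bytes)
            self.pending_flush_bytes = 0
        self.is_open = True


def start_osd(pending_flush_bytes):
    """Run the startup order from the report: open the DB, then init the allocator."""
    alloc = ToyAllocator()
    db = ToyDB(alloc, pending_flush_bytes, freelist=[(0, 4096)])
    try:
        db.open()
        alloc.init_from_db(db)
    except NoSpace as exc:
        return f"stuck: {exc}"
    return "started"
```

In this model `start_osd(0)` succeeds because nothing has to be flushed, while `start_osd(1024)` fails on every attempt: the DB cannot open without space, and the allocator cannot hand out space without an open DB.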


Related issues 3 (0 open, 3 closed)

Copied to bluestore - Backport #45122: octopus: OSD might fail to recover after ENOSPC crash (Resolved, Igor Fedotov)
Copied to bluestore - Backport #45123: nautilus: OSD might fail to recover after ENOSPC crash (Resolved, Igor Fedotov)
Copied to bluestore - Backport #45234: mimic: OSD might fail to recover after ENOSPC crash (Rejected, Igor Fedotov)

#1

Updated by Igor Fedotov about 4 years ago

  • Copied to Backport #45122: octopus: OSD might fail to recover after ENOSPC crash added

#2

Updated by Igor Fedotov about 4 years ago

  • Copied to Backport #45123: nautilus: OSD might fail to recover after ENOSPC crash added

#5

Updated by Igor Fedotov about 4 years ago

  • Related to Backport #45125: mimic: Extent leak after main device expand added

#6

Updated by Igor Fedotov about 4 years ago

  • Related to deleted (Backport #45125: mimic: Extent leak after main device expand)

#7

Updated by Igor Fedotov about 4 years ago

  • Copied to Backport #45125: mimic: Extent leak after main device expand added

#8

Updated by Nathan Cutler about 4 years ago

  • Copied to deleted (Backport #45125: mimic: Extent leak after main device expand)

#9

Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #45234: mimic: OSD might fail to recover after ENOSPC crash added

#10

Updated by Igor Fedotov almost 4 years ago

  • Status changed from Pending Backport to Resolved