Bug #8722

closed

osd: recovery op counting leak (dumpling)

Added by Sage Weil almost 10 years ago. Updated about 7 years ago.

Status:
Won't Fix
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Encountered PGs stuck during backfill, with rops=1 on the primary, no progress, and no blocked requests on the primary or replicas. Either there is a recovery op counting bug, or a message was lost, or ...

Need to capture this happening with logs. :/

Actions #1

Updated by Sage Weil almost 10 years ago

note that a pg query includes this:

              "peer_backfill_info": { "begin": "0\/\/0\/\/-1",
                  "end": "0\/\/0\/\/-1",
                  "objects": []},
              "backfills_in_flight": [],
              "pull_from_peer": [],
              "pushing": []},

i.e., no work in progress, even though the primary still reports rops=1. So I think this is a recovery op counting bug...
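For anyone trying to spot the same state in their own pg query output, here is a minimal sketch of the check described above. It only looks at the fields shown in the fragment; the helper name `in_flight_work` and the `IN_FLIGHT_KEYS` tuple are hypothetical, and the sketch assumes the fragment has already been extracted from the full query output as a standalone JSON object.

```python
import json

# Field names are taken from the pg query fragment in this comment.
# The helper itself is a hypothetical illustration, not part of Ceph.
IN_FLIGHT_KEYS = ("backfills_in_flight", "pull_from_peer", "pushing")


def in_flight_work(state):
    """Return the names of the fields in `state` that list outstanding work."""
    busy = [key for key in IN_FLIGHT_KEYS if state.get(key)]
    # peer_backfill_info is a nested object; only its "objects" list is checked.
    if state.get("peer_backfill_info", {}).get("objects"):
        busy.append("peer_backfill_info.objects")
    return busy


# The fragment from the pg query, wrapped in braces so it parses on its own.
fragment = json.loads(r'''
{"peer_backfill_info": {"begin": "0\/\/0\/\/-1",
                        "end": "0\/\/0\/\/-1",
                        "objects": []},
 "backfills_in_flight": [],
 "pull_from_peer": [],
 "pushing": []}
''')

print(in_flight_work(fragment))  # prints []
```

An empty result alongside rops=1 on the primary is the mismatch reported here: the counter says one recovery op is outstanding, but none of these fields lists any work.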

Actions #2

Updated by Sage Weil about 7 years ago

  • Status changed from Need More Info to Won't Fix