Bug #37439

Degraded PG does not discover remapped data on originating OSD

Added by Jonas Jelten over 5 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
Backfill/Recovery
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
nautilus, mimic, luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There seems to be an issue where an OSD is not queried for missing objects that were remapped, even though the OSD holding them is up. This happened in two different scenarios for us. In both, data is stored in EC pools (8+3).

Scenario 0

To remove a broken disk (e.g. osd.22), it is marked out (weight 0) with ceph osd out 22. Objects are remapped normally. If osd.22 is restarted (or crashes and then starts again) while objects are still being moved, the bug shows up: objects become degraded and stay degraded, because osd.22 is not queried. ceph pg query shows:

    "might_have_unfound": [
      {
        "osd": "22(3)",
        "status": "not queried" 
      }
    ],
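
For reference, a rough reconstruction of the steps above as commands, assuming osd.22, a systemd deployment, and a placeholder <pgid> (the jq path for might_have_unfound may differ between releases):

    # mark the failing disk out; objects start remapping away from it
    ceph osd out 22
    # restart the OSD while the remapping is still in progress
    systemctl restart ceph-osd@22
    # the PG now stays degraded and reports osd.22 as "not queried"
    ceph pg <pgid> query | jq '.recovery_state[] | select(.might_have_unfound != null) | .might_have_unfound'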

A workaround is to temporarily mark the broken-disk OSD in again. The OSD is then queried and the missing objects are discovered. After that, mark the OSD out again: no objects are degraded any more and the disk is emptied.
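
A minimal sketch of that workaround, assuming osd.22 as above:

    # temporarily mark the OSD in so the PG queries it and discovers the missing objects
    ceph osd in 22
    # once the degraded count drops, mark it out again so the disk drains
    ceph osd out 22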

Scenario 1

Add new disks to the cluster. Data is remapped to be transferred from the old disks (e.g. osd.19) to the new disks (e.g. osd.42).
When an OSD on one of the old disks is restarted (or restarts because of a crash), objects become degraded. The missing data is on osd.19, but again it is not queried. ceph pg query shows:

    "might_have_unfound": [
      {
        "osd": "19(6)",
        "status": "not queried" 
      }
    ],

Only remapped data seems to be missing; if osd.19 is taken down, much more data becomes degraded. Note that osd.19 is missing from the acting set in the current state of this PG (shard 6 is 2147483647, the placeholder for "none"):

    "up": [38, 36, 28, 17, 13, 39, 48, 10, 29, 5, 47],
    "acting": [36, 15, 28, 17, 13, 32, 2147483647, 10, 29, 5, 20],
    "backfill_targets": [
        "36(1)",
        "38(0)",
        "39(5)",
        "47(10)",
        "48(6)" 
    ],
    "acting_recovery_backfill": [
        "5(9)",
        "10(7)",
        "13(4)",
        "15(1)",
        "17(3)",
        "20(10)",
        "28(2)",
        "29(8)",
        "32(5)",
        "36(0)",
        "36(1)",
        "38(0)",
        "39(5)",
        "47(10)",
        "48(6)" 
    ],
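
As a side note, these fields can be pulled out of the query output directly; a sketch assuming pg 6.65 (the PG the attached primary log refers to) and that the fields sit at the top level of the ceph pg query JSON, which may differ between releases:

    ceph pg 6.65 query | jq '{up, acting, backfill_targets, acting_recovery_backfill}'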

For this scenario, I have not found a workaround yet. The cluster remains degraded until it has recovered by restoring the data.

So, overall I suspect there is a bug which prevents remapped PG data from being discovered. The PG already knows which OSD is the correct candidate, but does not query it.


Files

ceph-osd.18.log.xz (27.4 KB): log level 20 of OSD 18. Jonas Jelten, 12/13/2018 12:59 PM
ceph-osd.38.log.xz (314 KB): log of primary for pg 6.65. Jonas Jelten, 04/01/2019 10:55 PM

Related issues 4 (1 open, 3 closed)

Related to RADOS - Bug #46847: Loss of placement information on OSD reboot (Need More Info)

Copied to RADOS - Backport #39431: luminous: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)
Copied to RADOS - Backport #39432: nautilus: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)
Copied to RADOS - Backport #39433: mimic: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)