Bug #37439

Degraded PG does not discover remapped data on originating OSD

Added by Jonas Jelten over 5 years ago. Updated over 4 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
Backfill/Recovery
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
nautilus, mimic, luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

There seems to be an issue where an OSD is not queried for missing objects that were remapped, even though the OSD holding them is up. This happened in two different scenarios for us. In both, data is stored in EC pools (8+3).

Scenario 0

To remove a broken disk (e.g. osd.22), it is marked out (weight 0) with ceph osd out 22. Objects are remapped normally. If osd.22 is restarted (or crashes and then starts again) while objects are still being moved, the bug shows up: objects become degraded and stay degraded, because osd.22 is not queried. ceph pg query shows:

    "might_have_unfound": [
      {
        "osd": "22(3)",
        "status": "not queried" 
      }
    ],
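
For reference, a rough reconstruction of the steps above as commands, assuming osd.22, a systemd deployment, and a placeholder <pgid> (the jq path for might_have_unfound may differ between releases):

    # mark the failing disk out; objects start remapping away from it
    ceph osd out 22
    # restart the OSD while the remapping is still in progress
    systemctl restart ceph-osd@22
    # the PG now stays degraded and reports osd.22 as "not queried"
    ceph pg <pgid> query | jq '.recovery_state[] | select(.might_have_unfound != null) | .might_have_unfound'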

A workaround is to temporarily mark the broken-disk OSD in again. The OSD is then queried and the missing objects are discovered. After that, mark the OSD out again: no objects are degraded any more and the disk is emptied.
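
A minimal sketch of that workaround, assuming osd.22 as above:

    # temporarily mark the OSD in so the PG queries it and discovers the missing objects
    ceph osd in 22
    # once the degraded count drops, mark it out again so the disk drains
    ceph osd out 22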

Scenario 1

Add new disks to the cluster. Data is remapped to be transferred from the old disks (e.g. osd.19) to the new disks (e.g. osd.42).
When an OSD on one of the old disks is restarted (or restarts because of a crash), objects become degraded. The missing data is on osd.19, but again it is not queried. ceph pg query shows:

    "might_have_unfound": [
      {
        "osd": "19(6)",
        "status": "not queried" 
      }
    ],

Only remapped data seems to be missing; if osd.19 is taken down, much more data becomes degraded. Note that osd.19 is missing from the acting set in the current state of this PG (shard 6 is 2147483647, the placeholder for "none"):

    "up": [38, 36, 28, 17, 13, 39, 48, 10, 29, 5, 47],
    "acting": [36, 15, 28, 17, 13, 32, 2147483647, 10, 29, 5, 20],
    "backfill_targets": [
        "36(1)",
        "38(0)",
        "39(5)",
        "47(10)",
        "48(6)" 
    ],
    "acting_recovery_backfill": [
        "5(9)",
        "10(7)",
        "13(4)",
        "15(1)",
        "17(3)",
        "20(10)",
        "28(2)",
        "29(8)",
        "32(5)",
        "36(0)",
        "36(1)",
        "38(0)",
        "39(5)",
        "47(10)",
        "48(6)" 
    ],
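
As a side note, these fields can be pulled out of the query output directly; a sketch assuming pg 6.65 (the PG the attached primary log refers to) and that the fields sit at the top level of the ceph pg query JSON, which may differ between releases:

    ceph pg 6.65 query | jq '{up, acting, backfill_targets, acting_recovery_backfill}'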

For this scenario, I have not found a workaround yet. The cluster remains degraded until it has recovered by restoring the data.

So, overall I suspect there is a bug which prevents remapped PG data from being discovered. The PG already knows which OSD is the correct candidate, but does not query it.


Files

ceph-osd.18.log.xz (27.4 KB): log level 20 of OSD 18. Jonas Jelten, 12/13/2018 12:59 PM
ceph-osd.38.log.xz (314 KB): log of primary for pg 6.65. Jonas Jelten, 04/01/2019 10:55 PM

Related issues 4 (1 open, 3 closed)

Related to RADOS - Bug #46847: Loss of placement information on OSD reboot (Need More Info)

Copied to RADOS - Backport #39431: luminous: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)
Copied to RADOS - Backport #39432: nautilus: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)
Copied to RADOS - Backport #39433: mimic: Degraded PG does not discover remapped data on originating OSD (Resolved, Ashish Singh)