Bug #37968

maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps

Added by Ed Fisher 6 months ago. Updated 5 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 01/18/2019
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport: luminous,mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:

Description

The sanity checks in OSDMap::maybe_remove_pg_upmaps appear to be overzealous. With some CRUSH rules, osdmaptool can generate valid upmaps that maybe_remove_pg_upmaps then cancels.

It looks like the check relies on get_rule_failure_domain and rejects any upmap that results in two OSDs sharing a parent of that type. However, with a custom CRUSH rule like "choose indep 2 type host, choose indep 2 type osd", such an upmap would be valid. Would it be possible to use CrushWrapper::try_remap_rule or something similar to validate the upmap more thoroughly?
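
To make the mismatch concrete, here is a toy Python model of the two checks. It is not Ceph code: the topology, the function names, and the assumption that the failure domain resolves to "host" for this rule are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy topology (not from this report): two hosts, three OSDs each.
OSD_TO_HOST = {0: "host-a", 1: "host-a", 2: "host-a",
               3: "host-b", 4: "host-b", 5: "host-b"}


def apply_upmap_items(mapping, items):
    """Replace each source OSD with its target, like pg-upmap-items pairs."""
    swap = dict(items)
    return [swap.get(osd, osd) for osd in mapping]


def failure_domain_check(mapping):
    """Model of the check described above: no two OSDs may share a host.

    Assumes the failure domain for the rule is 'host'.
    """
    hosts = [OSD_TO_HOST[osd] for osd in mapping]
    return len(set(hosts)) == len(hosts)


def rule_aware_check(mapping, num_hosts=2, osds_per_host=2):
    """Model of what the rule allows: up to 2 hosts, up to 2 OSDs on each."""
    if len(set(mapping)) != len(mapping):
        return False  # the same OSD must not appear twice
    per_host = Counter(OSD_TO_HOST[osd] for osd in mapping)
    return (len(per_host) <= num_hosts
            and all(count <= osds_per_host for count in per_host.values()))


before = [0, 1, 3, 4]                        # two OSDs on each host
after = apply_upmap_items(before, [(1, 2)])  # move 1 -> 2 within host-a

print(after)
print("failure-domain check:", failure_domain_check(after))
print("rule-aware check:", rule_aware_check(after))
```

In this model even the mapping before the upmap fails the failure-domain check, which suggests why nearly every upmap on such a pool would be cancelled.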

To reproduce:

  1. ceph osd erasure-code-profile set upmaptest plugin=jerasure k=2 m=2 crush-device-class=hdd crush-failure-domain=osd
  2. Create a CRUSH rule for the pool:
    {
        "rule_id": 2,
        "rule_name": "upmaptest",
        "ruleset": 2,
        "type": 3,
        "min_size": 3,
        "max_size": 4,
        "steps": [
            {
                "op": "set_chooseleaf_tries",
                "num": 5
            },
            {
                "op": "set_choose_tries",
                "num": 100
            },
            {
                "op": "take",
                "item": -1,
                "item_name": "default" 
            },
            {
                "op": "choose_indep",
                "num": 2,
                "type": "host" 
            },
            {
                "op": "choose_indep",
                "num": 2,
                "type": "osd" 
            },
            {
                "op": "emit" 
            }
        ]
    }
    
  3. ceph osd pool create upmaptest 8 8 erasure upmaptest
  4. Submit an upmap whose source and target OSDs are on the same host: ceph osd pg-upmap-items 2.7 1 2
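
As a rough illustration of why the upmap in step 4 is legal under the rule from step 2, the sketch below reads the rule's choose steps and works out how many OSDs may share one host. The helper names are hypothetical, the rule is abbreviated to the steps that matter, and the sketch assumes every "num" is a positive literal count.

```python
import json

# Abbreviated copy of the rule from step 2 (tunable steps omitted).
RULE_JSON = """
{ "rule_name": "upmaptest", "steps": [
  {"op": "take", "item": -1, "item_name": "default"},
  {"op": "choose_indep", "num": 2, "type": "host"},
  {"op": "choose_indep", "num": 2, "type": "osd"},
  {"op": "emit"} ] }
"""


def placement_shape(rule):
    """Return [(bucket_type, num), ...] for the rule's choose steps, in order."""
    return [(step["type"], step["num"]) for step in rule["steps"]
            if step["op"].startswith("choose")]


def max_osds_per_bucket(rule, bucket_type):
    """Multiply the counts of the choose steps that follow `bucket_type`.

    Returns None if the rule never chooses buckets of that type.
    """
    shape = placement_shape(rule)
    types = [bucket for bucket, _ in shape]
    if bucket_type not in types:
        return None
    limit = 1
    for _, num in shape[types.index(bucket_type) + 1:]:
        limit *= num
    return limit


rule = json.loads(RULE_JSON)
shape = placement_shape(rule)
per_host = max_osds_per_bucket(rule, "host")

print(shape)
print("OSDs allowed per host:", per_host)
```

Since the rule places two OSDs on each chosen host, swapping one OSD for another on the same host keeps the mapping within what the rule describes.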

The mon's debug log will show "2019-01-18 19:16:32.044 7fdd4d0a2700 10 maybe_remove_pg_upmaps cancel invalid pending pg_upmap_items entry 2.7->[1,2]"
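
To check whether a cluster is hitting this, the mon debug log can be scanned for that message. The Python sketch below is matched only against the single line quoted above. Reading the bracketed list as flattened from/to pairs is an assumption based on that one example, and the function name is hypothetical.

```python
import re

# The debug line quoted in this report.
LINE = ("2019-01-18 19:16:32.044 7fdd4d0a2700 10 maybe_remove_pg_upmaps "
        "cancel invalid pending pg_upmap_items entry 2.7->[1,2]")

PATTERN = re.compile(
    r"maybe_remove_pg_upmaps cancel invalid pending pg_upmap_items entry "
    r"(?P<pgid>\d+\.[0-9a-f]+)->\[(?P<items>[\d,]*)\]")


def parse_cancelled_upmap(line):
    """Return (pgid, [(from_osd, to_osd), ...]) or None if the line differs."""
    match = PATTERN.search(line)
    if match is None:
        return None
    osds = [int(tok) for tok in match.group("items").split(",") if tok]
    # Assumption: the list is a flat sequence of from/to pairs.
    pairs = list(zip(osds[0::2], osds[1::2]))
    return match.group("pgid"), pairs


parsed = parse_cancelled_upmap(LINE)
print(parsed)
```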

This is an edge case since it depends on using a custom crush rule, but it almost completely breaks the upmap functionality for affected pools.


Related issues

Copied to RADOS - Backport #38162: luminous: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps (Resolved)
Copied to RADOS - Backport #38163: mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps (New)

History

#1 Updated by xie xingguo 6 months ago

  • Assignee set to xie xingguo

#2 Updated by xie xingguo 6 months ago

  • Status changed from New to Pending Backport
  • Backport set to luminous,mimic

#4 Updated by Nathan Cutler 5 months ago

  • Pull request ID set to 26179

#5 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #38162: luminous: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps added

#6 Updated by Nathan Cutler 5 months ago

  • Copied to Backport #38163: mimic: maybe_remove_pg_upmaps incorrectly cancels valid pending upmaps added
