Bug #43124

closed

Probably legal crush rules cause upmaps to be cleaned

Added by David Zafman over 4 years ago. Updated about 4 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I've seen multiple user sites whose CRUSH rules for EC pools cause verify_upmap() to report an error. When that happens, the clean-upmaps mechanism purges all upmaps from the PGs of the affected EC pool.

Pull request https://github.com/ceph/ceph/pull/31131

commit 712a39e5c9d9848f618ad55a768103d84c0a460f "crush: remove invalid upmap items"

{
    "rule_id": 5,
    "rule_name": "ecrule",
    "ruleset": 5,
    "type": 3,
    "min_size": 1,
    "max_size": 15,
    "steps": [
        {
            "op": "take",
            "item": -417,
            "item_name": "default~ssd"
        },
        {
            "op": "choose_firstn",
            "num": 4,
            "type": "rack"
        },
        {
            "op": "chooseleaf_indep",
            "num": 3,    <<<<<< This triggers the problem
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

I added some extra logging. The output below is what happens for every upmap on a PG in this pool, and it results in the upmap being removed:
2019-12-03 18:57:33.919715 7f57c70a63c0 10 verify_upmap rule_id 5 pool_size 11
2019-12-03 18:57:33.919717 7f57c70a63c0 10 verify_upmap step 0 op 1 arg1 -417 arg2 0
2019-12-03 18:57:33.919718 7f57c70a63c0 10 verify_upmap step 1 op 2 arg1 4 arg2 3
2019-12-03 18:57:33.919754 7f57c70a63c0 10 verify_upmap step 2 op 7 arg1 3 arg2 1
2019-12-03 18:57:33.919905 7f57c70a63c0 10 verify_upmap osds_by_parent {-633=2190,-618=2084,-582=1775,-579=1754,-561=1607,-468=2580,-438=2374,-432=2331,-72=588,-60=582,-3=20}
2019-12-03 18:57:33.919937 7f57c70a63c0 -1 verify_upmap expected 3 items in bucket -417 real 11
2019-12-03 18:57:33.919939 7f57c70a63c0 0 check_pg_upmaps verify_upmap of pg 7.23 returning -22
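
To make the mismatch in the log concrete, here is a minimal Python sketch. It is a simplified model of the comparison the log reports, not Ceph's actual C++ code. The helper names `naive_check` and `max_leaves` are hypothetical, and the sketch assumes every choose step has a positive `num`. The rule steps and the `osds_by_parent` values are copied from this report.

```python
import errno

# Rule steps taken from the rule dump in this report.
STEPS = [
    {"op": "take", "item": -417},
    {"op": "choose_firstn", "num": 4, "type": "rack"},
    {"op": "chooseleaf_indep", "num": 3, "type": "host"},
    {"op": "emit"},
]

# Host bucket -> OSD pairs copied from the osds_by_parent log line.
OSDS_BY_PARENT = {
    -633: 2190, -618: 2084, -582: 1775, -579: 1754, -561: 1607, -468: 2580,
    -438: 2374, -432: 2331, -72: 588, -60: 582, -3: 20,
}


def naive_check(steps, osds_by_parent):
    """Hypothetical model of the failing comparison: the chooseleaf 'num'
    is compared against every OSD found under the take root."""
    root = next(s["item"] for s in steps if s["op"] == "take")
    for step in steps:
        if step["op"].startswith("chooseleaf"):
            real = len(osds_by_parent)
            if real > step["num"]:
                msg = f"expected {step['num']} items in bucket {root} real {real}"
                return -errno.EINVAL, msg
    return 0, "ok"


def max_leaves(steps):
    """Upper bound on OSDs the rule can emit: the product of 'num' over
    all choose/chooseleaf steps (assumes each 'num' is positive)."""
    total = 1
    for step in steps:
        if step["op"].startswith("choose"):
            total *= step["num"]
    return total


if __name__ == "__main__":
    print(naive_check(STEPS, OSDS_BY_PARENT))
    print(max_leaves(STEPS))
```

Under these assumptions the rule can emit up to 4 racks x 3 hosts = 12 OSDs, which covers a pool of size 11, yet a check that compares the last step's `num` of 3 against all 11 OSDs under bucket -417 rejects every upmap.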

Actions #1

Updated by David Zafman over 4 years ago

We are reverting the original pull request, which changed verify_upmap(): https://github.com/ceph/ceph/pull/31131

This tracker could be used to track a re-implementation of that change.

Actions #2

Updated by David Zafman over 4 years ago

  • Pull request ID set to 32099

Actions #3

Updated by David Zafman about 4 years ago

  • Status changed from New to Resolved