Bug #52741

pg inconsistent state is lost after the primary osd restart

Added by Mykola Golub over 2 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Category: -
Target version: -
% Done: 0%
Regression: No
Severity: 3 - minor

Description

Steps to reproduce:

- Create a pool (either replicated or erasure-coded).
- Introduce an inconsistency (e.g. put an object and then remove one replica or shard).
- Run a pg deep-scrub to detect the inconsistency; the pg enters the 'active+clean+inconsistent' state.
- Restart the primary OSD.
- After the OSD is up again and the pg has peered, it is in the 'active+clean' state.

Expected result: the pg's 'inconsistent' state is not lost after the primary OSD restart.
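
A minimal command sketch of these steps, assuming a replicated pool; the pool name, object name, pg id and OSD id are placeholders, and the replica-removal step is discussed in note #4 below:

    # create a small replicated pool and write one object
    ceph osd pool create test 8 8
    rados -p test put obj1 /etc/hosts

    # find the pg holding the object and its acting set (primary first)
    ceph osd map test obj1

    # introduce an inconsistency: remove one replica out of band
    # (filestore: delete the object file; bluestore: ceph-objectstore-tool,
    #  see note #4 below)

    # detect it: the pg becomes active+clean+inconsistent
    ceph pg deep-scrub <pgid>
    ceph health detail

    # restart the pg's primary OSD
    systemctl restart ceph-osd@<id>

    # after peering, the pg reports active+clean again (the bug)
    ceph pg <pgid> query | grep '"state"'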

History

#1 Updated by Mykola Golub over 2 years ago

Just a note: I see that re-detecting the inconsistency only requires running a scrub (not a deep-scrub), which is supposed to run every day, so I am not sure whether this is a bug or expected behavior.
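
For example (a sketch; <pgid> is a placeholder), a shallow scrub is enough to bring the inconsistent flag back:

    ceph pg scrub <pgid>    # shallow scrub only, no deep-scrub needed
    ceph health detail      # the pg is reported inconsistent again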

#2 Updated by Neha Ojha over 2 years ago

  • Assignee set to Ronen Friedman

Can you please verify this behavior?

#3 Updated by yite gu over 2 years ago

How did you remove the replica?

#4 Updated by Mykola Golub over 2 years ago

yite gu wrote:

How did you remove the replica?

In my case it was filestore, so I just removed the file on the filesystem. With bluestore I would use ceph-objectstore-tool for this. But I don't think the nature of the inconsistency is important here. I first noticed the issue while investigating an inconsistency introduced by invalid hinfo; later, when checking that it was easily reproducible, I introduced the inconsistency by simply removing a replica.
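
For bluestore, the removal would look roughly like this (a sketch; the OSD id, pg id and object spec are placeholders, and the OSD must be stopped while the tool runs):

    systemctl stop ceph-osd@<id>

    # list the objects in the pg to obtain the object spec (JSON)
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --pgid <pgid> --op list

    # remove the replica/shard of the chosen object
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> \
        --pgid <pgid> '<object-json>' remove

    systemctl start ceph-osd@<id>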
