Bug #39402
Can't remove ghost PGs
Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This is on the downstream long-running cluster. I can grant SSH access to whoever needs it.
This bug is similar to http://tracker.ceph.com/issues/10411 except I can't perform the troubleshooting steps because most of the OSDs are bluestore.
All the OSDs on reesi004 were lost due to the journals being reused but not zapped prior to re-adding them. Alfredo created a ticket to fix that.
[root@reesi001 ~]# ceph health detail
HEALTH_ERR noout flag(s) set; 158255/11903128 objects misplaced (1.330%); Reduced data availability: 8 pgs inactive; 2 slow requests are blocked > 32 sec. Implicated osds ; 5 stuck requests are blocked > 4096 sec. Implicated osds 1,3,25,72,85
OSDMAP_FLAGS noout flag(s) set
OBJECT_MISPLACED 158255/11903128 objects misplaced (1.330%)
PG_AVAILABILITY Reduced data availability: 8 pgs inactive
    pg 1.83 is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.d2 is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.11b is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.15e is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.169 is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.173 is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.1a0 is stuck inactive for 4019.820958, current state unknown, last acting []
    pg 1.1ed is stuck inactive for 4019.820958, current state unknown, last acting []
REQUEST_SLOW 2 slow requests are blocked > 32 sec. Implicated osds
    2 ops are blocked > 2097.15 sec
REQUEST_STUCK 5 stuck requests are blocked > 4096 sec. Implicated osds 1,3,25,72,85
    5 ops are blocked > 4194.3 sec
    osds 1,3,25,72,85 have stuck requests > 4194.3 sec
[root@reesi001 ~]# ceph pg 1.83 query
Error ENOENT: i don't have pgid 1.83
[root@reesi001 ~]# ceph pg 1.83 mark_unfound_lost delete
Error ENOENT: i don't have pgid 1.83
[root@reesi001 ~]# ceph pg force_create_pg 1.83
Error ENOTSUP: this command is obsolete
I'm hoping the blocked requests are related.
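For anyone triaging this, the ghost PGs can be pulled out of the `ceph health detail` text above with a small script. This is only a rough sketch: it assumes the line format shown in the output above, and the names `STUCK_PG` and `ghost_pgs` are made up for illustration and are not part of any Ceph tooling.

```python
import re

# Matches lines of the form seen in the health output above, e.g.:
#   pg 1.83 is stuck inactive for 4019.820958, current state unknown, last acting []
# The pattern is an assumption based on that output only.
STUCK_PG = re.compile(
    r"pg (?P<pgid>\d+\.[0-9a-f]+) is stuck inactive for (?P<secs>[\d.]+), "
    r"current state (?P<state>[^,]+), last acting \[(?P<acting>[^\]]*)\]"
)


def ghost_pgs(health_detail: str) -> list[str]:
    """Return IDs of PGs reported as stuck inactive, in state 'unknown',
    with an empty acting set (hypothetical helper)."""
    return [
        m.group("pgid")
        for m in STUCK_PG.finditer(health_detail)
        if m.group("state") == "unknown" and not m.group("acting").strip()
    ]


if __name__ == "__main__":
    # Two lines copied from the output above, plus one made-up line with a
    # non-empty acting set to show that it is filtered out.
    sample = (
        "pg 1.83 is stuck inactive for 4019.820958, current state unknown, last acting []\n"
        "pg 1.d2 is stuck inactive for 4019.820958, current state unknown, last acting []\n"
        "pg 9.9 is stuck inactive for 10.5, current state peering, last acting [1,3]\n"
    )
    print(ghost_pgs(sample))
```

Feeding it the full health output should list the eight PGs reported under PG_AVAILABILITY, which makes it easier to loop over them when trying commands against each one.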