Bug #12716 (closed): Cluster health_warn stuck on active+remapped

Added by Steve Dainard over 8 years ago. Updated about 7 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: ceph cli
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I ran ceph osd reweight-by-utilization last week and partway through had a network interruption. After the network was restored the cluster continued to rebalance, but eventually idled with active+remapped and degraded PGs. I added 2 OSDs to the cluster and ran another reweight-by-utilization, and now the cluster is idle with 3 PGs stuck active+remapped.
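
For reference, a reweight pass of this kind is normally driven with the commands below; the threshold argument is illustrative, since the report does not say which value was used.

    # Lower the override reweight (0..1) of OSDs whose utilization is more
    # than <threshold>% of the cluster average. 120 is the default threshold
    # and is shown here only as an illustration.
    ceph osd reweight-by-utilization 120

    # Watch the resulting data movement settle.
    ceph -s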

[root@ceph1 media]# ceph -s
    cluster af859ff1-c394-4c9a-95e2-0e0e4c87445c
     health HEALTH_WARN
            3 pgs stuck unclean
            recovery 24379/66089446 objects misplaced (0.037%)
     monmap e24: 3 mons at {mon1=10.0.231.53:6789/0,mon2=10.0.231.54:6789/0,mon3=10.0.231.55:6789/0}
            election epoch 268, quorum 0,1,2 mon1,mon2,mon3
     osdmap e186553: 102 osds: 102 up, 102 in; 3 remapped pgs
      pgmap v3178336: 4144 pgs, 7 pools, 125 TB data, 32270 kobjects
            251 TB used, 118 TB / 370 TB avail
            24379/66089446 objects misplaced (0.037%)
                4141 active+clean
                   3 active+remapped
[root@ceph1 media]# ceph health detail
HEALTH_WARN 3 pgs stuck unclean; recovery 24379/66089446 objects misplaced (0.037%)
pg 2.e7f is stuck unclean for 517058.124297, current state active+remapped, last acting [58,5]
pg 2.b16 is stuck unclean for 434261.024579, current state active+remapped, last acting [40,90]
pg 2.782 is stuck unclean for 307997.053475, current state active+remapped, last acting [76,101]
recovery 24379/66089446 objects misplaced (0.037%)
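
The same stuck set can be re-listed in a scriptable form with ceph pg dump_stuck:

    # List only the PGs that have been stuck unclean past the stuck threshold.
    ceph pg dump_stuck unclean
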
[root@ceph1 media]# ceph pg 2.e7f query|head
{
    "state": "active+remapped",
    "snap_trimq": "[]",
    "epoch": 186553,
    "up": [
        58
    ],
    "acting": [
        58,
        5

[root@ceph1 media]# ceph pg 2.b16 query|head
{
    "state": "active+remapped",
    "snap_trimq": "[]",
    "epoch": 186553,
    "up": [
        40
    ],
    "acting": [
        40,
        90

[root@ceph1 media]# ceph pg 2.782 query|head
{
    "state": "active+remapped",
    "snap_trimq": "[]",
    "epoch": 186553,
    "up": [
        76
    ],
    "acting": [
        76,
        101
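
In each query the "up" set (what CRUSH currently computes for the PG) holds only one OSD, while the "acting" set (where the data currently lives) holds two; the PGs therefore stay active+remapped instead of going clean. The same up/acting comparison can be read per PG with ceph pg map, whose output looks like:

    [root@ceph1 media]# ceph pg map 2.e7f
    osdmap e186553 pg 2.e7f (2.e7f) -> up [58] acting [58,5]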

Full pg queries, the decompiled crushmap, and the OSD debug logs for the above PGs are attached.


Files

pg_query (28.2 KB) Steve Dainard, 08/17/2015 09:30 PM
decompiled-crushmap (5.25 KB) Steve Dainard, 08/17/2015 09:30 PM
osd-logs.tar.gz (355 KB) Steve Dainard, 08/17/2015 09:30 PM
Actions #1

Updated by Sage Weil over 8 years ago

  • Status changed from New to Need More Info

My guess is that you need to set the vary_r tunable. Or, can you attach the osdmap so we can see why those PGs are only getting 2 replicas?
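
For reference, the osdmap can be exported with ceph osd getmap, and the vary_r tunable (chooseleaf_vary_r) is set by round-tripping the crushmap through crushtool; the file paths below are illustrative. Note that enabling the tunable will itself trigger data movement.

    # Export the binary osdmap so it can be attached to the ticket.
    ceph osd getmap -o /tmp/osdmap

    # Decompile the crushmap, enable the tunable, recompile, and inject it.
    ceph osd getcrushmap -o /tmp/crushmap
    crushtool -d /tmp/crushmap -o /tmp/crushmap.txt
    # edit /tmp/crushmap.txt and add, in the tunables section:
    #   tunable chooseleaf_vary_r 1
    crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
    ceph osd setcrushmap -i /tmp/crushmap.new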

Actions #2

Updated by Sage Weil about 7 years ago

  • Status changed from Need More Info to Can't reproduce