Bug #15133

closed

pg stuck in down+peering state

Added by huang jun about 8 years ago. Updated about 8 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version: 0.94.5
kernel version: 3.18.25

The Ceph cluster includes 4 hosts:
server1: 192.168.10.1 (24 osd)
server2: 192.168.10.2 (24 osd)
server3: 192.168.10.3 (24 osd)
server4: 192.168.10.4 (1 mon, 1 mds, 24 osd)

Pool info (we have an EC pool with k:m = 3:1):
pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 393 flags hashpspool stripe_width 0
pool 1 'data' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 389 flags hashpspool crash_replay_interval 45 stripe_width 0
pool 2 'metadata' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 399 flags hashpspool stripe_width 0
pool 3 'ecpool-1' erasure size 4 min_size 3 crush_ruleset 1 object_hash rjenkins pg_num 1152 pgp_num 1152 last_change 408 lfor 408 flags hashpspool tiers 4 read_tier 4 write_tier 4 stripe_width 196608
pool 4 'capool-1' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 288 pgp_num 288 last_change 414 flags hashpspool,incomplete_clones tier_of 3 cache_mode readproxy target_bytes 2000000000000 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
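
For reference, a layout like the one above could be reproduced roughly as follows. This is only a sketch: the erasure-code profile name (ec31) is assumed, and these are not necessarily the exact commands we used.

# erasure-code profile matching k:m = 3:1 (profile name assumed)
ceph osd erasure-code-profile set ec31 k=3 m=1
# base EC pool and replicated cache pool
ceph osd pool create ecpool-1 1152 1152 erasure ec31
ceph osd pool create capool-1 288 288
# attach capool-1 as a readproxy cache tier in front of ecpool-1
ceph osd tier add ecpool-1 capool-1
ceph osd tier cache-mode capool-1 readproxy
ceph osd tier set-overlay ecpool-1 capool-1
# cache sizing and hit-set parameters as shown in the pool dump above
ceph osd pool set capool-1 target_max_bytes 2000000000000
ceph osd pool set capool-1 hit_set_type bloom
ceph osd pool set capool-1 hit_set_period 3600
ceph osd pool set capool-1 hit_set_count 1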

Our test steps:
1. Cut off power to server1 and server2 at the same time.
2. Wait for 10 minutes; the status of the 48 OSDs turns from DOWN to OUT.
During the down --> out transition there is no client IO.
3. Power on host server1.
4. The final cluster status is:
cluster 29613322-0857-4633-8f73-7f5ebe16f4b8
health HEALTH_WARN
119 pgs degraded
560 pgs down
560 pgs peering
119 pgs stuck degraded
560 pgs stuck inactive
1152 pgs stuck unclean
119 pgs stuck undersized
119 pgs undersized
monmap e1: 1 mons at {server4=192.168.10.4:6789/0}
election epoch 2, quorum 0 server4
mdsmap e6: 1/1/1 up {0=server4=up:active}
osdmap e1101: 96 osds: 72 up, 72 in; 473 remapped pgs
pgmap v6677: 2208 pgs, 5 pools, 45966 MB data, 33630 objects
891 GB used, 130 TB / 130 TB avail
1056 active+clean
560 down+peering
473 active+remapped
119 active+undersized+degraded
5. ceph pg 3.1ac query (a sketch for extracting the blocking info follows the output below):
"recovery_state": [ {
"name": "Started\/Primary\/Peering\/GetInfo",
"enter_time": "2016-03-15 16:21:50.002705",
"requested_info_from": []
}, {
"name": "Started\/Primary\/Peering",
"enter_time": "2016-03-15 16:21:50.002696",
"past_intervals": [ {
"first": 906,
"last": 923,
"maybe_went_rw": 1,
"up": [
94,
60,
27,
13
],
"acting": [
94,
60,
27,
13
],
"primary": 94,
"up_primary": 94
}, {
"first": 924,
"last": 925,
"maybe_went_rw": 1,
"up": [
94,
60,
2147483647,
13
],
"acting": [
94,
60,
2147483647,
13
],
"primary": 94,
"up_primary": 94
}, {
"first": 926,
"last": 934,
"maybe_went_rw": 0,
"up": [
94,
2147483647,
2147483647,
13
],
"acting": [
94,
2147483647,
2147483647,
13
],
"primary": 94,
"up_primary": 94
}, {
"first": 935,
"last": 937,
"maybe_went_rw": 1,
"up": [
94,
2147483647,
24,
13
],
"acting": [
94,
2147483647,
24,
13
],
"primary": 94,
"up_primary": 94
}
],
"probing_osds": [
"13(3)",
"24(2)",
"27(2)",
"94(0)"
],
"blocked": "peering is blocked due to down osds",
"down_osds_we_would_probe": [
60
],
"peering_blocked_by": [ {
"osd": 60,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us proceed"
}
]
}, {
"name": "Started",
"enter_time": "2016-03-15 16:21:50.002649"
}
],
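
To locate the stuck PGs and pull the blocking information out of the query output, commands along these lines can be used; a minimal sketch, assuming jq is available on the monitor host:

# list PGs stuck in an inactive state (down+peering here)
ceph pg dump_stuck inactive
# for a given PG, show what blocks peering and which down OSD it wants to probe
ceph pg 3.1ac query | jq '.recovery_state[] | select(.blocked != null) | {blocked, down_osds_we_would_probe, peering_blocked_by}'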

What we found:
During osdmap epochs [924~925], pg 3.1ac mapped to:
"up": [94, 60, 2147483647, 13]
"acting": [94, 60, 2147483647, 13]

But now osd.60 (which is on server2) is in the DOWN state,
so this PG's peering procedure is blocked on osd.60,
and the PG state is set to 'down+peering'.

osd.60 was not marked down during epochs 924~925, so the cluster assumes there
may have been writes/updates to this PG in that interval, and peering cannot continue
until we manually mark osd.60 'lost'. But we had no client IO at all.
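
The manual intervention referred to above would look roughly like the following; note that marking an OSD lost can discard any writes only that OSD received, so restarting osd.60 once server2 is back is the safer path when the host is recoverable.

# mark osd.60 lost so peering on pg 3.1ac can proceed
# (may discard writes that only osd.60 received during epochs 924~925)
ceph osd lost 60 --yes-i-really-mean-it
# afterwards, confirm the PG has left down+peering (again assuming jq)
ceph pg 3.1ac query | jq '.state'

For a planned power-off test like this one, setting the noout flag beforehand (ceph osd set noout) also avoids the down --> out transition and the resulting remapping.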

Is there a way to handle this special case more intelligently?
Do the fast-peering plans intend to resolve this?
