Bug #25103

closed

mgr: pgs show in unknown state despite being active

Added by Sage Weil over 5 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
High
Assignee:
-
Category:
ceph-mgr
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

- mgr restarts
- mgr receives reports

2018-07-24 23:43:40.372 7f6e07d00700  1 -- 172.21.15.97:6816/15444 <== osd.3 172.21.15.97:6812/15560 2 ==== mgrreport(osd.3 +54-0 packed 742) v6 ==== 5008+0+0 (2660646926 0 0) 0x5595bd20b180 con 0x5595bd1de000
2018-07-24 23:43:40.372 7f6e07d00700  4 mgr.server handle_report from 0x5595bd1de000 osd,3
2018-07-24 23:43:40.372 7f6e07d00700 20 mgr.server handle_report updating existing DaemonState for osd,3
2018-07-24 23:43:40.372 7f6e07d00700 20 mgr update loading 54 new types, 0 old types, had 522 types, got 742 bytes of data
...
2018-07-24 23:43:40.372 7f6e07d00700  1 -- 172.21.15.97:6816/15444 <== osd.3 172.21.15.97:6812/15560 3 ==== pg_stats(1 pgs tid 0 v 0) v1 ==== 743+0+0 (1419596817 0 0) 0x5595bd144ec0 con 0x5595bd1de000
2018-07-24 23:43:40.372 7f6e07d00700  4 mgr.server maybe_ready initial report from osd 3
2018-07-24 23:43:40.372 7f6e07d00700  4 mgr.server maybe_ready still waiting for 2 osds to report in before PGMap is ready

- mgr's PGMap still has most pgs unknown
2018-07-24 23:43:40.376 7f6e07d00700 10 mgr.server operator() 8 pgs: 2 peering, 6 unknown; 0 B data, 8.1 GiB used, 712 GiB / 720 GiB avail

/a/yuriw-2018-07-24_22:40:04-upgrade:luminous-x-mimic-distro-basic-smithi/2810446
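
For readers unfamiliar with the gate the log refers to, here is a toy model of the "wait for every OSD to report before the PGMap is ready" behaviour. It is only a sketch, not the ceph-mgr implementation: the class name `PGMapReadiness`, its methods, and the OSD ids 0 and 1 are made up for illustration (only osd 3 and the "still waiting for 2 osds" count come from the log above).

```python
class PGMapReadiness:
    """Toy model of the readiness gate seen in the log (hypothetical, not Ceph code)."""

    def __init__(self, expected_osds):
        # OSDs that have not yet sent an initial pg_stats report since the restart.
        self.pending = set(expected_osds)

    def handle_pg_stats(self, osd_id):
        # Record the first report from an OSD; repeated reports are harmless.
        self.pending.discard(osd_id)
        return self.is_ready()

    def is_ready(self):
        # The PGMap is treated as ready only once nothing is outstanding.
        return not self.pending


# OSD ids 0 and 1 are assumed here; the log only names osd 3.
gate = PGMapReadiness(expected_osds=[0, 1, 3])
gate.handle_pg_stats(3)
print(f"still waiting for {len(gate.pending)} osds")
```

Until the remaining OSDs report in, any PG whose stats have not arrived would still be listed as unknown, which matches the "6 unknown" summary line above.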

Actions #1

Updated by John Spray over 5 years ago

  • Category set to ceph-mgr
Actions #2

Updated by John Spray over 5 years ago

  • Status changed from 12 to Fix Under Review
Actions #3

Updated by huanwen ren over 5 years ago

The effect is as follows:
1. pgmap_ready is false

[root@cephfs103 ~]# ceph pg dump pgs_brief
pg_state_ready 0
dumped pgs_brief
PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY
3.1ff unknown [] -1 [] -1
3.1fe unknown [] -1 [] -1
......

2. pgmap_ready is true

[root@cephfs103 ~]# ceph pg dump pgs_brief
pg_state_ready 1
dumped pgs_brief
PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY
3.1ff active+undersized+degraded [8,16] 8 [8,16] 8
3.1fe active+undersized+degraded [2,17] 2 [2,17] 2
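
To compare the two dumps above without reading every row, the states can be tallied from the plain-text output. This is a minimal sketch that assumes the column layout shown above (PG id first, state second); the helper name `count_pg_states` and the pgid pattern are my own, not part of any Ceph tool.

```python
import re
from collections import Counter

# Assumed pgid shape: "<pool>.<hex seed>", e.g. "3.1ff".
PGID = re.compile(r"^\d+\.[0-9a-f]+$")


def count_pg_states(dump_text):
    """Count PG states in `ceph pg dump pgs_brief` style text output."""
    counts = Counter()
    for line in dump_text.splitlines():
        fields = line.split()
        # Skip headers and status lines; keep only rows that start with a pgid.
        if len(fields) >= 2 and PGID.match(fields[0]):
            counts[fields[1]] += 1
    return counts


sample = """\
pg_state_ready 0
dumped pgs_brief
PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY
3.1ff unknown [] -1 [] -1
3.1fe unknown [] -1 [] -1
"""
states = count_pg_states(sample)
print(dict(states))
```

A dump in which every row counts as `unknown` shortly after a mgr restart suggests the PGMap is not ready yet rather than a real cluster problem.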

Actions #4

Updated by Paul Emmerich over 5 years ago

We have also encountered this while upgrading to Mimic.

Actions #5

Updated by huanwen ren over 5 years ago

The final output is as follows:
(1) CLI output:

Warning: due to ceph-mgr restart, some PG states may not be up to date 
dumped pgs_brief
3.1ff   unknown []         -1     []             -1
3.1fe   unknown []         -1     []             -1

(2) JSON output:

{
    "pgmap_ready": {
        "pgmap_ready": false
    },
    "pg_stats": [
        {
            "pgid": "3.1ff",
            "state": "unknown",
            "up": [],
            "acting": [],
            "up_primary": -1,
            "acting_primary": -1
        },
        ......
    ]
}
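
A consumer of this JSON could use the flag to decide whether an `unknown` state is meaningful. The sketch below assumes the nested `pgmap_ready` layout proposed above; the helper name `genuinely_unknown_pgs`, the sample document, and the choice to treat a missing flag as "ready" are my own assumptions, not existing Ceph behaviour.

```python
import json


def genuinely_unknown_pgs(doc):
    """Return pgids reported as unknown, but only when the PGMap is ready.

    Assumption: a missing "pgmap_ready" field means the output predates the
    proposed flag, so it is treated as ready.
    """
    ready = doc.get("pgmap_ready", {}).get("pgmap_ready", True)
    if not ready:
        # The mgr has not heard from every OSD yet; "unknown" is not trustworthy.
        return []
    return [pg["pgid"] for pg in doc.get("pg_stats", []) if pg["state"] == "unknown"]


# Illustrative sample mirroring the structure above (single PG, nothing elided).
sample = json.loads("""
{
    "pgmap_ready": {"pgmap_ready": false},
    "pg_stats": [
        {"pgid": "3.1ff", "state": "unknown", "up": [], "acting": [],
         "up_primary": -1, "acting_primary": -1}
    ]
}
""")
print(genuinely_unknown_pgs(sample))
```

With the flag set to false the function reports nothing, which mirrors the intent of the CLI warning: the unknown states are a side effect of the mgr restart.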

Actions #6

Updated by John Spray over 5 years ago

  • Status changed from Fix Under Review to Resolved