Bug #9752
Closed
acting in past intervals contains primary and up_primary (looks like duplicates but is a display problem only)
80%
Description
On a 0.80.6 cluster, in the context of http://tracker.ceph.com/issues/9750, the following showed up (the full output can be found at http://tracker.ceph.com/issues/9750#note-10).
$ ceph pg 57.7d query
...
{ "first": 10934,
  "last": 10936,
  "maybe_went_rw": 0,
  "up": [ 3],
  "acting": [ 3, 3, 3]},
It does not look like it could be a display bug because the code is straightforward. There have been many actions on the cluster, which started as 0.80.5.
It's quite impossible to tell at which point this problem first showed up, and equally impossible to retrace the history of the cluster. I'm filing this bug just in case it shows up again, but I don't see what kind of diagnostic is possible at this point. From the bash history I found that a monmap was injected a few weeks ago, but I did not find traces of other injections that could be disruptive.
Files
Updated by David Zafman over 9 years ago
"kit" on #ceph was in a situation of having an incomplete pg. They sent the pg query output and it showed strange past_intervals information, as in this bug:
{ "first": 8194,
"last": 8194,
"maybe_went_rw": 0,
"up": [
13,
15],
"acting": [
13,
15,
13,
13]},
Updated by Loïc Dachary over 9 years ago
Full logs from pastebin to survive expiration.
Updated by Greg Farnum over 9 years ago
See the thread "[ceph-users] Troubleshooting Incomplete PGs" for another instance of this (and there are several more on the list archives if you want to search).
In that case, it's also reporting "-1" for some active slots, which is also of interest.
Updated by Samuel Just over 9 years ago
Actually, that thread is the same instance as David's.
Updated by Samuel Just over 9 years ago
First thing we want to get is an osdmap from the misbehaving epoch.
Loic: you can get the osdmap for a particular epoch with something like ceph osd getmap <epoch> -o <outfile>
You'd probably want epoch 10934 in this case.
Updated by Loïc Dachary over 9 years ago
It is unfortunately gone:
root@g3:~# ceph osd getmap 10934 > /tmp/osdmap.10934
Error ENOENT: there is no map for epoch 10934
Updated by David Zafman over 9 years ago
I happened to notice the issue because I happened to look at this guy's pastebin. I didn't interact with him at all. Not sure if he ever posted his information to ceph-users or ceph-devel. This might have been on Oct 19.
3:47:15 PM kitz: I've got a number of PGs that are "incomplete" and I'm not sure what to do about them. Can anyone help?
3:56:18 PM lincolnb: kitz: do a 'ceph pg [pgid] query'
3:56:39 PM lincolnb: it might give some suggestions near the end
3:56:45 PM lincolnb: like what OSDs its trying to probe
3:57:08 PM lincolnb: all of those OSDs should be up and in
3:57:34 PM kitz: http://pastebin.com/raw.php?i=AJ3RMjz6
3:58:35 PM lincolnb: im looking at the very bottom, "probing_osds"
3:58:42 PM kitz: all of my OSDs are up+in but some are doing a backfill. I don't think that's related though...
3:58:45 PM lincolnb: are osds 10,13,15,and 25 up ?
3:59:00 PM kitz: yeah, why so many? with this pool size=2
3:59:27 PM lincolnb: im not an expert but i think those are the potential locations for the pg to be if an osd goes down
3:59:58 PM lincolnb: like, the acting set is [13,15] but if 13 goes down it would put the pg on [10,15] (something like that)
4:00:07 PM lincolnb: im sure one of the experts can correct me :)
4:00:19 PM kitz: I was having trouble with 10 earlier...
4:00:30 PM lincolnb: ahh
4:02:34 PM kitz: But what can I do to recover the PG?
4:04:32 PM lincolnb: to be honest, i am not 100% sure. i had the same problem, but i ended up completely deleting and recreating the pg, but thats guaranteed data loss.. so you probably don't want to do that.
4:07:11 PM lincolnb: you might be able to get a more helpful response if you post to the problem to the ceph users list and attach the query
4:09:54 PM kitz: lincolnb: thanks
Updated by Loïc Dachary over 9 years ago
Could it be that the acting vector size is not reset when ceph osd pool set size 4 is followed by ceph osd pool set size 2?
Sam says he looked into this but found that acting is rebuilt one osd at a time, so it is unlikely to be the source of the problem.
Updated by Loïc Dachary over 9 years ago
root@g3:~# ceph osd getmap 11523 > /tmp/osdmap.11523
Error ENOENT: there is no map for epoch 11523
root@g3:~# ceph osd getmap 11522 > /tmp/osdmap.11522
Error ENOENT: there is no map for epoch 11522
Updated by Loïc Dachary over 9 years ago
- File osdmap.10818 added
- File osdmap.10831 added
- File osdmap.10832 added
- File osdmap.10858 added
- File osdmap.10859 added
- File osdmap.10882 added
Fortunately I saved an entire osd directory from which I was able to extract the osdmaps (with duplicates) related to q.txt.
Updated by Loïc Dachary over 9 years ago
$ for osdmap in /tmp/osdmap.108* ; do ./osdmaptool --test-map-pg 57.7d $osdmap ; done
./osdmaptool: osdmap file '/tmp/osdmap.10818'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([11,3], p11) acting ([3,6], p3)
./osdmaptool: osdmap file '/tmp/osdmap.10831'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([11,3], p11) acting ([3,6], p3)
./osdmaptool: osdmap file '/tmp/osdmap.10832'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([11,3], p11) acting ([11,3], p11)
./osdmaptool: osdmap file '/tmp/osdmap.10858'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([11,3], p11) acting ([11,3], p11)
./osdmaptool: osdmap file '/tmp/osdmap.10859'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([3], p3) acting ([3], p3)
./osdmaptool: osdmap file '/tmp/osdmap.10882'
parsed '57.7d' -> 57.7d
57.7d raw ([11,3], p11) up ([3], p3) acting ([3], p3)
{ "first": 10818,
  "last": 10831,
  "maybe_went_rw": 1,
  "up": [ 11, 3],
  "acting": [ 3, 6, 3, 11]},
{ "first": 10832,
  "last": 10858,
  "maybe_went_rw": 1,
  "up": [ 11, 3],
  "acting": [ 11, 3, 11, 11]},
{ "first": 10859,
  "last": 10882,
  "maybe_went_rw": 0,
  "up": [ 3],
  "acting": [ 3, 3, 3]},
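The three intervals share a pattern: the dumped acting list looks like the real acting set with the acting primary and the up primary appended, which matches the eventual subject of this bug (a display problem, not real duplicates). A minimal Python sketch of that hypothesized dump behavior (a hypothetical helper, not the actual Ceph C++ code):

```python
def dump_interval(first, last, up, acting, up_primary, acting_primary):
    # Hypothetical model of the display problem: the dumped "acting"
    # list is the real acting set with acting_primary and up_primary
    # appended, so it looks like it contains duplicates.
    return {"first": first, "last": last, "up": up,
            "acting": acting + [acting_primary, up_primary]}

# Reproduces the three intervals observed for pg 57.7d:
assert dump_interval(10818, 10831, [11, 3], [3, 6], 11, 3)["acting"] == [3, 6, 3, 11]
assert dump_interval(10832, 10858, [11, 3], [11, 3], 11, 11)["acting"] == [11, 3, 11, 11]
assert dump_interval(10859, 10882, [3], [3], 3, 3)["acting"] == [3, 3, 3]
```

Under this reading the real acting sets match what osdmaptool reports for the corresponding epochs, and the extra entries are just the two primaries repeated.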
Updated by Loïc Dachary over 9 years ago
On a cluster running from sources with
# ceph version 0.87-578-g105d30a (105d30a6d3546f26b0e052e0e97fff1c7a50b54c)
$ MON=1 OSD=3 ./vstart.sh -n -l mon osd
the following shows up during peering, immediately after the cluster is started:
$ ceph pg 0.2 query { "state": "peering", "snap_trimq": "[]", "epoch": 10, "up": [ 0, 1, 2], "acting": [ 0, 1, 2], "actingbackfill": [ "0", "1", "2"], "info": { "pgid": "0.2", "last_update": "0'0", "last_complete": "0'0", "log_tail": "0'0", "last_user_version": 0, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 1, "last_epoch_started": 9, "last_epoch_clean": 9, "last_epoch_split": 0, "same_up_since": 10, "same_interval_since": 10, "same_primary_since": 5, "last_scrub": "0'0", "last_scrub_stamp": "2014-10-30 22:33:15.919997", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-10-30 22:33:15.919997", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "0'0", "reported_seq": "23", "reported_epoch": "10", "state": "peering", "last_fresh": "2014-10-30 22:33:27.623073", "last_change": "2014-10-30 22:33:27.576033", "last_active": "2014-10-30 22:33:23.544610", "last_clean": "0.000000", "last_became_active": "0.000000", "last_unstale": "2014-10-30 22:33:27.623073", "last_undegraded": "2014-10-30 22:33:27.623073", "last_fullsized": "2014-10-30 22:33:27.623073", "mapping_epoch": 8, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 1, "last_epoch_clean": 9, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-10-30 22:33:15.919997", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-10-30 22:33:15.919997", "last_clean_scrub_stamp": "0.000000", "log_size": 0, "ondisk_log_size": 0, "stats_invalid": "0", "stat_sum": { "num_bytes": 0, "num_objects": 0, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 0, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 0, "num_write_kb": 0, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, 
"num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 0, 1, 2], "acting": [ 0, 1, 2], "blocked_by": [], "up_primary": 0, "acting_primary": 0}, "empty": 1, "dne": 0, "incomplete": 0, "last_epoch_started": 9, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, "peer_info": [ { "peer": "1", "pgid": "0.2", "last_update": "0'0", "last_complete": "0'0", "log_tail": "0'0", "last_user_version": 0, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 1, "last_epoch_started": 9, "last_epoch_clean": 9, "last_epoch_split": 0, "same_up_since": 10, "same_interval_since": 10, "same_primary_since": 5, "last_scrub": "0'0", "last_scrub_stamp": "2014-10-30 22:33:15.919997", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-10-30 22:33:15.919997", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "0'0", "reported_seq": "15", "reported_epoch": "9", "state": "peering", "last_fresh": "2014-10-30 22:33:22.580789", "last_change": "2014-10-30 22:33:21.195947", "last_active": "2014-10-30 22:33:19.939091", "last_clean": "0.000000", "last_became_active": "0.000000", "last_unstale": "2014-10-30 22:33:22.580789", "last_undegraded": "2014-10-30 22:33:22.580789", "last_fullsized": "2014-10-30 22:33:22.580789", "mapping_epoch": 8, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 1, "last_epoch_clean": 6, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "2014-10-30 22:33:15.919997", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "2014-10-30 22:33:15.919997", "last_clean_scrub_stamp": "0.000000", "log_size": 0, "ondisk_log_size": 0, "stats_invalid": "0", "stat_sum": { "num_bytes": 0, "num_objects": 0, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_degraded": 0, 
"num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 0, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 0, "num_write_kb": 0, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [ 0, 1, 2], "acting": [ 0, 1, 2], "blocked_by": [], "up_primary": 0, "acting_primary": 0}, "empty": 1, "dne": 0, "incomplete": 0, "last_epoch_started": 9, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}, { "peer": "2", "pgid": "0.2", "last_update": "0'0", "last_complete": "0'0", "log_tail": "0'0", "last_user_version": 0, "last_backfill": "MAX", "purged_snaps": "[]", "history": { "epoch_created": 0, "last_epoch_started": 0, "last_epoch_clean": 0, "last_epoch_split": 0, "same_up_since": 0, "same_interval_since": 0, "same_primary_since": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000"}, "stats": { "version": "0'0", "reported_seq": "0", "reported_epoch": "0", "state": "inactive", "last_fresh": "0.000000", "last_change": "0.000000", "last_active": "0.000000", "last_clean": "0.000000", "last_became_active": "0.000000", "last_unstale": "0.000000", "last_undegraded": "0.000000", "last_fullsized": "0.000000", "mapping_epoch": 0, "log_start": "0'0", "ondisk_log_start": "0'0", "created": 0, "last_epoch_clean": 0, "parent": "0.0", "parent_split_bits": 0, "last_scrub": "0'0", "last_scrub_stamp": "0.000000", "last_deep_scrub": "0'0", "last_deep_scrub_stamp": "0.000000", "last_clean_scrub_stamp": "0.000000", "log_size": 0, "ondisk_log_size": 0, "stats_invalid": "0", "stat_sum": { "num_bytes": 0, "num_objects": 
0, "num_object_clones": 0, "num_object_copies": 0, "num_objects_missing_on_primary": 0, "num_objects_degraded": 0, "num_objects_misplaced": 0, "num_objects_unfound": 0, "num_objects_dirty": 0, "num_whiteouts": 0, "num_read": 0, "num_read_kb": 0, "num_write": 0, "num_write_kb": 0, "num_scrub_errors": 0, "num_shallow_scrub_errors": 0, "num_deep_scrub_errors": 0, "num_objects_recovered": 0, "num_bytes_recovered": 0, "num_keys_recovered": 0, "num_objects_omap": 0, "num_objects_hit_set_archive": 0, "num_bytes_hit_set_archive": 0}, "stat_cat_sum": {}, "up": [], "acting": [], "blocked_by": [], "up_primary": -1, "acting_primary": -1}, "empty": 1, "dne": 1, "incomplete": 0, "last_epoch_started": 0, "hit_set_history": { "current_last_update": "0'0", "current_last_stamp": "0.000000", "current_info": { "begin": "0.000000", "end": "0.000000", "version": "0'0"}, "history": []}}], "recovery_state": [ { "name": "Started\/Primary\/Peering\/WaitUpThru", "enter_time": "2014-10-30 22:33:27.623078", "comment": "waiting for osdmap to reflect a new up_thru for this osd"}, { "name": "Started\/Primary\/Peering", "enter_time": "2014-10-30 22:33:27.575916", "past_intervals": [ { "first": 8, "last": 9, "maybe_went_rw": 1, "up": [ 0, 1], "acting": [ 0, 1, 0, 0]}], "probing_osds": [ "0", "1", "2"], "down_osds_we_would_probe": [], "peering_blocked_by": []}, { "name": "Started", "enter_time": "2014-10-30 22:33:27.575605"}], "agent_state": {}}
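Note that this fresh vstart cluster reproduces the same shape: past_intervals shows up [0,1] but acting [0,1,0,0]. One plausible (hypothetical, unverified against the C++ dump code) explanation is that acting_primary and up_primary are appended to the dumped acting set:

```python
def dumped_acting(acting, acting_primary, up_primary):
    # Hypothetical: the dump appends acting_primary and up_primary,
    # producing apparent duplicates in the displayed acting set.
    return acting + [acting_primary, up_primary]

# Fresh vstart cluster, pg 0.2: acting [0, 1], osd.0 primary for both roles
assert dumped_acting([0, 1], 0, 0) == [0, 1, 0, 0]
```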
Updated by Loïc Dachary over 9 years ago
- Status changed from Need More Info to Fix Under Review
- % Done changed from 0 to 80
Updated by Loïc Dachary over 9 years ago
- Subject changed from acting in past intervals contains duplicates to acting in past intervals contains primary and up_primary (looks like duplicates but is a display problem only)
Updated by Greg Farnum over 9 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Loïc Dachary over 9 years ago
- Status changed from Pending Backport to Fix Under Review
Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Resolved