Bug #20439

closed

PG never finishes getting created

Added by David Zafman almost 7 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

dzafman-2017-06-26_14:07:20-rados-wip-13837-distro-basic-smithi/1328370

description: rados/singleton/{all/divergent_priors.yaml msgr-failures/many.yaml msgr/random.yaml

This ended up as a dead job that never got past the initial wait for PGs to go clean. PG 0.2 was still "creating+peering" after many hours.

Repeated runs of ceph pg dump from 2017-06-26 22:16:01 to 2017-06-27 10:11:41 reported the following for PG 0.2:


{
  "pgid": "0.2",
  "version": "0'0",
  "reported_seq": "5",
  "reported_epoch": "9",
  "state": "creating+peering",
  "last_fresh": "2017-06-26 22:15:59.640357",
  "last_change": "2017-06-26 22:15:37.799546",
  "last_active": "2017-06-26 22:15:34.421810",
  "last_peered": "2017-06-26 22:15:34.421810",
  "last_clean": "2017-06-26 22:15:34.421810",
  "last_became_active": "0.000000",
  "last_became_peered": "0.000000",
  "last_unstale": "2017-06-26 22:15:59.640357",
  "last_undegraded": "2017-06-26 22:15:59.640357",
  "last_fullsized": "2017-06-26 22:15:59.640357",
  "mapping_epoch": 5,
  "log_start": "0'0",
  "ondisk_log_start": "0'0",
  "created": 2,
  "last_epoch_clean": 0,
  "parent": "0.0",
  "parent_split_bits": 0,
  "last_scrub": "0'0",
  "last_scrub_stamp": "2017-06-26 22:15:34.421810",
  "last_deep_scrub": "0'0",
  "last_deep_scrub_stamp": "2017-06-26 22:15:34.421810",
  "last_clean_scrub_stamp": "2017-06-26 22:15:34.421810",
  "log_size": 0,
  "ondisk_log_size": 0,
  "stats_invalid": false,
  "dirty_stats_invalid": false,
  "omap_stats_invalid": false,
  "hitset_stats_invalid": false,
  "hitset_bytes_stats_invalid": false,
  "pin_stats_invalid": false,
  "stat_sum": {
    "num_bytes": 0,
    "num_objects": 0,
    "num_object_clones": 0,
    "num_object_copies": 0,
    "num_objects_missing_on_primary": 0,
    "num_objects_missing": 0,
    "num_objects_degraded": 0,
    "num_objects_misplaced": 0,
    "num_objects_unfound": 0,
    "num_objects_dirty": 0,
    "num_whiteouts": 0,
    "num_read": 0,
    "num_read_kb": 0,
    "num_write": 0,
    "num_write_kb": 0,
    "num_scrub_errors": 0,
    "num_shallow_scrub_errors": 0,
    "num_deep_scrub_errors": 0,
    "num_objects_recovered": 0,
    "num_bytes_recovered": 0,
    "num_keys_recovered": 0,
    "num_objects_omap": 0,
    "num_objects_hit_set_archive": 0,
    "num_bytes_hit_set_archive": 0,
    "num_flush": 0,
    "num_flush_kb": 0,
    "num_evict": 0,
    "num_evict_kb": 0,
    "num_promote": 0,
    "num_flush_mode_high": 0,
    "num_flush_mode_low": 0,
    "num_evict_mode_some": 0,
    "num_evict_mode_full": 0,
    "num_objects_pinned": 0,
    "num_legacy_snapsets": 0
  },
  "up": [
    0,
    1
  ],
  "acting": [
    0,
    1
  ],
  "blocked_by": [
    1
  ],
  "up_primary": 0,
  "acting_primary": 0
}
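
A check like the one done by hand here can be sketched in Python. This is only a minimal sketch, assuming the per-PG stats have already been collected into a list of dicts with the fields shown in the dump above ("pgid", "state", "last_change", "blocked_by") and that timestamps use the format seen in this dump. The function name stuck_creating and the one-hour threshold are hypothetical, not part of any Ceph tool.

```python
import json
from datetime import datetime, timedelta

# Timestamp layout as it appears in the dump above (an assumption;
# other Ceph releases may format timestamps differently).
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def stuck_creating(pg_stats, now, threshold=timedelta(hours=1)):
    """Return (pgid, blocked_by, age) for PGs whose state still includes
    "creating" and whose last_change is older than the threshold."""
    stuck = []
    for pg in pg_stats:
        # PG states are joined with "+", e.g. "creating+peering".
        if "creating" not in pg["state"].split("+"):
            continue
        age = now - datetime.strptime(pg["last_change"], TS_FORMAT)
        if age >= threshold:
            stuck.append((pg["pgid"], pg.get("blocked_by", []), age))
    return stuck


# Trimmed copy of the PG 0.2 entry from this report.
sample = json.loads("""
[{"pgid": "0.2",
  "state": "creating+peering",
  "last_change": "2017-06-26 22:15:37.799546",
  "blocked_by": [1]}]
""")

# Time of the last ceph pg dump mentioned in the description.
now = datetime(2017, 6, 27, 10, 11, 41)
result = stuck_creating(sample, now)
for pgid, blocked_by, age in result:
    print(f"PG {pgid} stuck creating for {age}, blocked_by={blocked_by}")
```

Reporting blocked_by alongside the PG ID points at the OSD to inspect next (osd.1 for PG 0.2 in this run).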
Actions #1

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
Actions #2

Updated by Josh Durgin over 6 years ago

  • Priority changed from Normal to Urgent

Same thing in http://pulpito.ceph.com/yuriw-2018-01-04_20:43:14-rados-wip-yuri4-testing-2018-01-04-1750-distro-basic-smithi/2027040/

PG 1.6 is stuck creating+peering, up = acting = [2,6], blocked_by=6.

No indication of osd.6 crashing in teuthology.log.

Actions #3

Updated by Kefu Chai about 6 years ago

  • Assignee set to Kefu Chai
Actions #4

Updated by Sage Weil about 6 years ago

  • Status changed from New to Can't reproduce
Actions #5

Updated by David Zafman over 5 years ago

  • Status changed from Can't reproduce to 12

Seen again:

/a/dzafman-2018-09-26_22:31:44-rados-wip-zafman-testing-distro-basic-smithi/3074605

Actions #6

Updated by Sage Weil over 5 years ago

  • Status changed from 12 to Resolved

I'm going to guess this recurrence was actually #37775.
