Bug #43656

AssertionError: not all PGs are active or peered 15 seconds after marking out OSDs

Added by Sage Weil about 4 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
backport_processed
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-01-17T22:19:28.631 ERROR:tasks.thrashosds.thrasher:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_liewegas_ceph_wip-cephadm-cot/qa/tasks/ceph_manager.py", line 1040, in do_thrash
    self._do_thrash()
  File "/home/teuthworker/src/github.com_liewegas_ceph_wip-cephadm-cot/qa/tasks/ceph_manager.py", line 1052, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_liewegas_ceph_wip-cephadm-cot/qa/tasks/ceph_manager.py", line 1182, in _do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_liewegas_ceph_wip-cephadm-cot/qa/tasks/ceph_manager.py", line 847, in test_pool_min_size
    'not all PGs are active or peered 15 seconds after marking out OSDs'
AssertionError: not all PGs are active or peered 15 seconds after marking out OSDs

/a/sage-2020-01-17_21:45:24-rados:thrash-erasure-code-master-distro-basic-smithi/4679221

Related issues

Copied to RADOS - Backport #43776: nautilus: AssertionError: not all PGs are active or peered 15 seconds after marking out OSDs (Rejected)

History

#1 Updated by Sage Weil about 4 years ago

In this case, the workload happened to delete the old pool (and its PGs) and create a new pool right before the check, so the new pool's PGs were all in state 'unknown'. That was not because of the out OSD, but because the PGs were new.

i.e., the test is buggy.
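
One way the check could avoid this false positive, as a minimal sketch only: record which pools exist before the OSDs are marked out, and ignore PGs from any pool created afterwards. The helper name `pgs_active_or_peered`, the `pools_before` argument, and the shape of the PG stat dicts are hypothetical illustrations, not the code in ceph_manager.py and not necessarily how the fix was implemented.

```python
def pgs_active_or_peered(pg_stats, pools_before):
    """Return True if every PG in a pre-existing pool is active or peered.

    pg_stats: list of dicts with 'pgid' (e.g. "1.0") and 'state'
              (e.g. "active+clean"); this shape is assumed for illustration.
    pools_before: set of pool ids that existed before the OSDs were marked out.
    """
    for pg in pg_stats:
        # A pgid has the form "<pool id>.<pg seed>".
        pool_id = int(pg["pgid"].split(".")[0])
        if pool_id not in pools_before:
            # Pool was created after the snapshot; its PGs may still be
            # 'unknown' simply because they are new, so skip them.
            continue
        states = pg["state"].split("+")
        if "active" not in states and "peered" not in states:
            return False
    return True


stats = [
    {"pgid": "1.0", "state": "active+clean"},
    {"pgid": "2.0", "state": "unknown"},  # PG of a freshly created pool
]
print(pgs_active_or_peered(stats, pools_before={1}))     # True
print(pgs_active_or_peered(stats, pools_before={1, 2}))  # False
```

With pool 2 excluded as newly created, the 'unknown' PG no longer trips the check; if pool 2 had existed before the OSDs were marked out, the same PG would still be reported as a failure.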

#2 Updated by Sage Weil about 4 years ago

/a/sage-2020-01-20_14:10:17-rados:thrash-erasure-code-wip-sage-testing-2020-01-19-1713-distro-basic-smithi/4688160

#3 Updated by Sage Weil about 4 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 32737

#4 Updated by Sage Weil about 4 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to nautilus

#5 Updated by Nathan Cutler about 4 years ago

  • Copied to Backport #43776: nautilus: AssertionError: not all PGs are active or peered 15 seconds after marking out OSDs added

#6 Updated by Nathan Cutler about 4 years ago

Hi Sage:

This issue appears to have been introduced by https://github.com/ceph/ceph/pull/17619 - a major octopus feature which is not being backported to nautilus. So I'm not sure if the backport to nautilus is valid.

Marked #43776 "Need More Info" for now.

Thanks,
Nathan

#7 Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed

#8 Updated by Konstantin Shalygin over 1 year ago

  • Status changed from Pending Backport to Resolved
