Bug #45135: nautilus: "too few PGs per OSD (2 < min 30) (TOO_FEW_PGS)" in smoke (all suites seem broken) - Ceph - Ceph

closed

Added by Yuri Weinstein about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Urgent
Source: Q/A
Severity: 3 - minor
Affected Versions: v14.2.9
ceph-qa-suite: rados, smoke

Description

Run http://pulpito.ceph.com/teuthology-2020-04-17_07:00:05-smoke-nautilus-testing-basic-smithi/
Jobs: ['4960947', '4960951', '4960948', '4960959', '4960965', '4960950', '4960962', '4960952', '4960967', '4960945', '4960960', '4960944', '4960961', '4960964', '4960968', '4960966', '4960946', '4960957', '4960949']
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2020-04-17_07:00:05-smoke-nautilus-testing-basic-smithi/4960944/teuthology.log

description: smoke/basic/{clusters/{fixed-3-cephfs.yaml openstack.yaml} objectstore/bluestore-bitmap.yaml
  tasks/cfuse_workunit_suites_blogbench.yaml}
duration: 1587.0086629390717
failure_reason: '"2020-04-17 08:00:33.977883 mon.a (mon.0) 62 : cluster [WRN] Health
  check failed: too few PGs per OSD (2 < min 30) (TOO_FEW_PGS)" in cluster log'
flavor: basic

Note the timeout while waiting for the cluster to become healthy:

2020-04-17T08:19:59.075 INFO:teuthology.orchestra.run.smithi094.stderr:mon.a: injectargs:mon_health_to_clog = 'false'
2020-04-17T08:19:59.290 INFO:teuthology.orchestra.run.smithi094.stderr:mon.b: injectargs:mon_health_to_clog = 'false'
2020-04-17T08:19:59.505 INFO:teuthology.orchestra.run.smithi094.stderr:mon.c: injectargs:mon_health_to_clog = 'false'
2020-04-17T08:19:59.529 ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/contextutil.py", line 34, in nested
    yield vars
  File "/home/teuthworker/src/git.ceph.com_ceph_nautilus/qa/tasks/ceph.py", line 1922, in task
    healthy(ctx=ctx, config=dict(cluster=config['cluster']))
  File "/home/teuthworker/src/git.ceph.com_ceph_nautilus/qa/tasks/ceph.py", line 1484, in healthy
    ceph_cluster=cluster_name,
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/misc.py", line 867, in wait_until_healthy
    while proceed():
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_py2/teuthology/contextutil.py", line 134, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: 'wait_until_healthy' reached maximum tries (150) after waiting for 900 seconds
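
For readers unfamiliar with the message above, the numbers are consistent with a bounded polling loop: 150 tries at roughly 6 seconds apart gives 900 seconds. The following is only a minimal sketch of that pattern, not teuthology's actual code. The `wait_until` helper, its parameters, and the 6-second interval (inferred from 900 / 150) are assumptions for illustration; only the exception name and the wording of the message come from the traceback.

```python
class MaxWhileTries(Exception):
    """Raised when the retry budget is exhausted (name taken from the traceback)."""


def wait_until(check, tries=150, sleep_s=6, sleep=lambda seconds: None):
    """Poll check() up to `tries` times, pausing `sleep_s` seconds after each failure.

    Hypothetical helper: `sleep` is injectable (a no-op by default) so the
    sketch finishes instantly instead of really waiting 900 seconds.
    Returns the number of seconds waited before check() succeeded.
    """
    waited = 0
    for _ in range(tries):
        if check():
            return waited
        sleep(sleep_s)
        waited += sleep_s
    raise MaxWhileTries(
        "reached maximum tries (%d) after waiting for %d seconds" % (tries, waited)
    )


# A cluster stuck in HEALTH_WARN never passes the health check,
# so the loop runs out of tries and raises.
try:
    wait_until(lambda: False)
except MaxWhileTries as exc:
    message = str(exc)
    print(message)
```

In this run the health check can never pass, because the TOO_FEW_PGS warning does not clear on its own, so the job fails only after the full wait.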

Related issues 1 (0 open — 1 closed)

Updated by Neha Ojha about 4 years ago

Subject changed from "too few PGs per OSD (2 < min 30) (TOO_FEW_PGS)" in smoke (all suites seem broken) to nautilus: "too few PGs per OSD (2 < min 30) (TOO_FEW_PGS)" in smoke (all suites seem broken)
Status changed from New to Triaged

The problem is that https://github.com/ceph/ceph/pull/34055/commits/fd608af305745830778d826c8e29a8ecd14d4748 removed the "mon pg warn min per osd = 1" override for all tests. That change was made in master following commit 1ac34a5ea3d1aca299b02e574b295dd4bf6167f4. That commit is missing in nautilus, where mon_pg_warn_min_per_osd defaults to 30, so without the override the small test clusters trip the TOO_FEW_PGS warning. This is why most tests are failing.
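
To illustrate why the override matters, here is a rough sketch of the kind of threshold comparison behind the warning. It is not the monitor's actual implementation. The function name, the integer averaging, the rule that a non-positive threshold disables the check, and the cluster size (16 PG replicas on 8 OSDs) are all assumptions; only the message format and the values 2, 30 and 1 come from this report.

```python
def too_few_pgs(num_pg_replicas, num_osds, min_per_osd):
    """Return a warning string if the average PG count per OSD is below
    `min_per_osd`, otherwise None.

    Hypothetical simplification of the health check; assumes a
    non-positive threshold disables it.
    """
    if min_per_osd <= 0 or num_osds <= 0:
        return None
    per_osd = num_pg_replicas // num_osds
    if per_osd < min_per_osd:
        return "too few PGs per OSD (%d < min %d) (TOO_FEW_PGS)" % (
            per_osd,
            min_per_osd,
        )
    return None


# Hypothetical small test cluster: 16 PG replicas spread over 8 OSDs,
# which averages 2 PGs per OSD.
with_default = too_few_pgs(16, 8, min_per_osd=30)  # nautilus default
with_override = too_few_pgs(16, 8, min_per_osd=1)  # the removed test override

print(with_default)
print(with_override)
```

With the threshold at 30 the sketch produces the same warning seen in the cluster log. With the removed override value of 1 it produces no warning.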
