Bug #20326

Scrubbing terminated -- not all pgs were active and clean.

Added by Kefu Chai about 1 month ago. Updated 20 days ago.

Status: Resolved
Priority: Immediate
Assignee:
Category: Tests
Target version: -
Start date: 05/23/2017
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc: No
Component(RADOS):

Description

2017-06-15T19:19:22.556 INFO:teuthology.orchestra.run.smithi077:Running: "sudo TESTDIR=/home/ubuntu/cephtest bash -c 'ceph pg dump -f json-pretty'" 
...
2017-06-15T19:19:22.752 INFO:teuthology.orchestra.run.smithi077.stdout:    "pg_stats_sum": {
2017-06-15T19:19:22.752 INFO:teuthology.orchestra.run.smithi077.stdout:        "stat_sum": {
..
2017-06-15T19:19:22.754 INFO:teuthology.orchestra.run.smithi077.stdout:            "num_scrub_errors": 0,
2017-06-15T19:19:22.754 INFO:teuthology.orchestra.run.smithi077.stdout:            "num_shallow_scrub_errors": 0,
..
2017-06-15T19:19:22.755 INFO:teuthology.orchestra.run.smithi077.stdout:            "num_legacy_snapsets": 151
...
2017-06-15T19:25:59.872 DEBUG:teuthology.orchestra.console:expect after: smithi164 login:
CommandFailedError: Command failed on smithi077 with status 1: 'sudo TESTDIR=/home/ubuntu/cephtest bash -c \'ceph pg dump sum -f json-pretty | grep num_legacy_snapsets | head -1 | grep \'"\'"\': 0\'"\'"\'\''

/kchai-2017-06-15_17:39:27-rados-wip-kefu-testing---basic-smithi/1291196/
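The failing check greps the JSON dump for a `num_legacy_snapsets` value of 0. A minimal Python sketch of the same test, parsing the JSON instead of grepping it, might look like the following. The helper names (`find_key`, `legacy_snapsets_cleared`) are hypothetical, and the sample document is a trimmed reconstruction of the fields visible in the log above, not the complete `ceph pg dump` output. The key is searched recursively because the exact nesting of the `ceph pg dump sum` output is an assumption here.

```python
import json


def find_key(node, key):
    """Return the first value stored under `key` anywhere in nested JSON, or None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def legacy_snapsets_cleared(dump_text):
    """True only if the dump reports num_legacy_snapsets == 0."""
    count = find_key(json.loads(dump_text), "num_legacy_snapsets")
    return count == 0


# Trimmed sample built from the fields shown in the log excerpt (assumed layout).
SAMPLE = """
{"pg_stats_sum": {"stat_sum": {"num_scrub_errors": 0,
                               "num_shallow_scrub_errors": 0,
                               "num_legacy_snapsets": 151}}}
"""

if __name__ == "__main__":
    # With 151 legacy snapsets remaining, the check fails, as in the test run.
    print(legacy_snapsets_cleared(SAMPLE))
```

Parsing the JSON avoids depending on the pretty-printer's exact spacing, which the `grep ': 0'` form relies on.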

Related issues

Copied from Ceph - Bug #20058: 'ceph pg dump sum -f json-pretty | grep num_legacy_snapsets | ...' fails (Resolved, 05/23/2017)

History

#1 Updated by Kefu Chai about 1 month ago

  • Copied from Bug #20058: 'ceph pg dump sum -f json-pretty | grep num_legacy_snapsets | ...' fails added

#2 Updated by Kefu Chai about 1 month ago

  • Status changed from Resolved to New

#3 Updated by Sage Weil about 1 month ago

  • Subject changed from 'ceph pg dump sum -f json-pretty | grep num_legacy_snapsets | ...' fails to Scrubbing terminated -- not all pgs were active and clean.
  • Status changed from New to In Progress

2017-06-15T19:19:12.551 INFO:tasks.ceph:Waiting for all osds to be active and clean.
2017-06-15T19:19:22.553 INFO:tasks.ceph:Scrubbing terminated -- not all pgs were active and clean.

That's why the legacy snapsets are still there: scrubbing was terminated because not all PGs were active and clean. This is why the wip-qa-snap PR is in flight, to catch this earlier. See https://github.com/ceph/ceph/pull/15310

It looks like the problem might be the following pgmap line, with 3 PGs in active+clean+snaptrim, confusing the teuthology check, but I'm not sure:

2017-06-15 19:17:05.676252 mon.a mon.0 172.21.15.77:6789/0 4461 : cluster [INF] pgmap 202 pgs: 3 active+clean+snaptrim, 199 active+clean; 2862 MB data, 14227 MB used, 545 GB / 558 GB avail; 23075 kB/s, 8 objects/s recovering

See https://github.com/ceph/ceph/pull/15717
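To illustrate how a state such as active+clean+snaptrim could confuse an "all PGs active and clean" check, here is a hypothetical Python sketch. It is not teuthology's actual code, and the comment above is itself unsure that this is the cause. The functions `parse_states`, `count_exact`, and `count_by_flags` are made up for illustration. The sketch parses the pgmap summary line and compares an exact string match on "active+clean" with a match on the individual state flags.

```python
import re

# The pgmap summary from the monitor log line quoted above.
PGMAP = ("pgmap 202 pgs: 3 active+clean+snaptrim, 199 active+clean; "
         "2862 MB data, 14227 MB used, 545 GB / 558 GB avail; "
         "23075 kB/s, 8 objects/s recovering")


def parse_states(line):
    """Return (total_pgs, {state_string: count}) from a pgmap summary line."""
    match = re.search(r"(\d+) pgs: ([^;]+);", line)
    if match is None:
        raise ValueError("no pgmap summary found")
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        number, state = part.strip().split(" ", 1)
        counts[state] = int(number)
    return total, counts


def count_exact(counts):
    """Count PGs whose state string is exactly 'active+clean'."""
    return counts.get("active+clean", 0)


def count_by_flags(counts):
    """Count PGs whose state includes both the 'active' and 'clean' flags."""
    return sum(n for state, n in counts.items()
               if {"active", "clean"} <= set(state.split("+")))


if __name__ == "__main__":
    total, counts = parse_states(PGMAP)
    print(total, count_exact(counts), count_by_flags(counts))
```

Under the exact-match rule only 199 of the 202 PGs count as active and clean, so a check written that way would never see the cluster as ready while snaptrim is running. The flag-based rule counts all 202.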

#5 Updated by Greg Farnum about 1 month ago

  • Project changed from Ceph to RADOS
  • Category set to Tests

#6 Updated by Nathan Cutler about 1 month ago

  • Status changed from In Progress to Resolved

#7 Updated by Patrick Donnelly 20 days ago

Saw this error here:

/ceph/teuthology-archive/pdonnell-2017-07-01_01:07:39-fs-wip-pdonnell-20170630-distro-basic-smithi/1347377/teuthology.log

Test branch: https://github.com/ceph/ceph-ci/tree/wip-pdonnell-20170630

That branch has Sage's fix.
