Bug #45298

closed

cram: balancer/misplaced.t fails with 'Error EAGAIN: Some objects (0.008913) are degraded; try again later'

Added by Brad Hubbard about 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/teuthology-2020-04-26_07:01:02-rados-master-distro-basic-smithi/4985666

2020-04-26T19:18:59.437 INFO:teuthology.orchestra.run.smithi045:> CEPH_REF=master CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/virtualenv/bin/cram -v -- /home/ubuntu/cephtest/archive/cram.client.0/*.t
2020-04-26T19:19:54.159 INFO:tasks.ceph.mgr.x.smithi045.stderr:2020-04-26T19:19:54.157+0000 7fce37e0e700 -1 mgr.server reply reply (11) Resource temporarily unavailable Some objects (0.008913) are degraded; try again later
2020-04-26T19:19:54.662 INFO:tasks.ceph.mgr.x.smithi045.stderr:2020-04-26T19:19:54.659+0000 7fce37e0e700 -1 mgr.server reply reply (2) No such file or directory plan test_plan not found
2020-04-26T19:19:55.169 INFO:tasks.ceph.mgr.x.smithi045.stderr:2020-04-26T19:19:55.166+0000 7fce37e0e700 -1 mgr.server reply reply (2) No such file or directory plan test_plan not found
2020-04-26T19:19:55.817 INFO:tasks.cram.client.0.smithi045.stdout:/home/ubuntu/cephtest/archive/cram.client.0/misplaced.t: failed
2020-04-26T19:19:55.827 DEBUG:teuthology.orchestra.run:got remote process result: 1
2020-04-26T19:19:55.828 INFO:tasks.cram.client.0.smithi045.stdout:--- misplaced.t
2020-04-26T19:19:55.828 INFO:tasks.cram.client.0.smithi045.stdout:+++ misplaced.t.err
2020-04-26T19:19:55.828 INFO:tasks.cram.client.0.smithi045.stdout:@@ -13,11 +13,13 @@
2020-04-26T19:19:55.829 INFO:tasks.cram.client.0.smithi045.stdout: # Turn off active balancer to use manual commands
2020-04-26T19:19:55.829 INFO:tasks.cram.client.0.smithi045.stdout:   $ ceph balancer off
2020-04-26T19:19:55.829 INFO:tasks.cram.client.0.smithi045.stdout:   $ ceph balancer optimize test_plan balancer_opt
2020-04-26T19:19:55.829 INFO:tasks.cram.client.0.smithi045.stdout:+  Error EAGAIN: Some objects (0.008913) are degraded; try again later
2020-04-26T19:19:55.830 INFO:tasks.cram.client.0.smithi045.stdout:+  [11]
2020-04-26T19:19:55.830 INFO:tasks.cram.client.0.smithi045.stdout:   $ ceph balancer ls
2020-04-26T19:19:55.830 INFO:tasks.cram.client.0.smithi045.stdout:-  [
2020-04-26T19:19:55.836 INFO:tasks.cram.client.0.smithi045.stdout:-      "test_plan" 
2020-04-26T19:19:55.836 INFO:tasks.cram.client.0.smithi045.stdout:-  ]
2020-04-26T19:19:55.837 INFO:tasks.cram.client.0.smithi045.stdout:+  []
2020-04-26T19:19:55.837 INFO:tasks.cram.client.0.smithi045.stdout:   $ ceph balancer execute test_plan
2020-04-26T19:19:55.837 INFO:tasks.cram.client.0.smithi045.stdout:+  Error ENOENT: plan test_plan not found
2020-04-26T19:19:55.837 INFO:tasks.cram.client.0.smithi045.stdout:+  [2]
2020-04-26T19:19:55.838 INFO:tasks.cram.client.0.smithi045.stdout:   $ ceph balancer eval
2020-04-26T19:19:55.838 INFO:tasks.cram.client.0.smithi045.stdout:   current cluster score [0-9]*\.?[0-9]+.* (re)
2020-04-26T19:19:55.838 INFO:tasks.cram.client.0.smithi045.stdout: # Plan is gone after execution ?
2020-04-26T19:19:55.838 INFO:tasks.cram.client.0.smithi045.stdout:# Ran 1 tests, 0 skipped, 1 failed.
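
The EAGAIN reply in the diff above is transient ("try again later"). A minimal sketch of how a caller could tolerate it, assuming it is acceptable to simply retry the command until the degraded objects recover. This is only an illustration, not the change made for this bug. `run_with_retry`, `optimize_test_plan` and their parameters are hypothetical names; the status value 11 is taken from the `[11]` in the cram diff.

```python
import subprocess
import time

# Exit status cram recorded for "Error EAGAIN" in the diff above.
EAGAIN_STATUS = 11


def run_with_retry(run, attempts=5, delay=1.0, sleep=time.sleep):
    """Call run() until it returns something other than EAGAIN_STATUS.

    run is any zero-argument callable that returns an exit status.
    At most `attempts` calls are made; the last status is returned.
    """
    rc = run()
    for _ in range(attempts - 1):
        if rc != EAGAIN_STATUS:
            break
        sleep(delay)
        rc = run()
    return rc


def optimize_test_plan():
    """Run the command that failed in misplaced.t and return its exit status."""
    cmd = ["ceph", "balancer", "optimize", "test_plan", "balancer_opt"]
    return subprocess.run(cmd).returncode


# Usage on a node with the ceph CLI available (not executed here):
#   status = run_with_retry(optimize_test_plan, attempts=10, delay=1.0)
```

Any other non-zero status (for example the `[2]` from ENOENT) is returned immediately, so only the degraded case is retried.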

The degraded state cleared about two seconds later.

2020-04-26T19:19:57.061875+0000 mgr.x (mgr.4103) 90 : cluster [DBG] pgmap v93: 8 pgs: 8 active+clean; 0 B data, 59 GiB used, 238 GiB / 300 GiB avail
2020-04-26T19:19:57.759798+0000 mon.a (mon.0) 138 : cluster [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 194/21766 objects degraded (0.891%), 5 pgs degraded)
2020-04-26T19:19:57.759830+0000 mon.a (mon.0) 139 : cluster [INF] Cluster is now healthy
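
A small cross-check of the two log excerpts, as a sketch: it measures the gap between the mgr's EAGAIN reply and the "Health check cleared" message, and compares the ratio in the EAGAIN text (0.008913) with the 194/21766 objects reported by the monitor. The log lines are copied from this report; the helper names are made up for illustration, and treating both numbers as the same measurement is an assumption.

```python
import re
from datetime import datetime

EAGAIN_LINE = (
    "2020-04-26T19:19:54.157+0000 7fce37e0e700 -1 mgr.server reply reply (11) "
    "Resource temporarily unavailable Some objects (0.008913) are degraded; "
    "try again later"
)
CLEARED_LINE = (
    "2020-04-26T19:19:57.759798+0000 mon.a (mon.0) 138 : cluster [INF] "
    "Health check cleared: PG_DEGRADED (was: Degraded data redundancy: "
    "194/21766 objects degraded (0.891%), 5 pgs degraded)"
)


def timestamp(line):
    """Parse the leading ISO-8601 timestamp (with +0000 offset) of a log line."""
    return datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%S.%f%z")


def mgr_ratio(line):
    """Degraded ratio quoted in the mgr's EAGAIN message."""
    match = re.search(r"Some objects \(([0-9.]+)\) are degraded", line)
    return float(match.group(1))


def health_ratio(line):
    """Degraded ratio computed from the 'N/M objects degraded' counts."""
    match = re.search(r"(\d+)/(\d+) objects degraded", line)
    return int(match.group(1)) / int(match.group(2))


gap = (timestamp(CLEARED_LINE) - timestamp(EAGAIN_LINE)).total_seconds()
print(f"seconds between EAGAIN and health check cleared: {gap:.3f}")
print(f"mgr ratio: {mgr_ratio(EAGAIN_LINE):.6f}")
print(f"health check ratio: {health_ratio(CLEARED_LINE):.6f}")
```

The two ratios agree to six decimal places, and the health check cleared only a few seconds after the balancer command was rejected.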
Actions #1

Updated by Greg Farnum almost 4 years ago

  • Project changed from Ceph to RADOS
Actions #2

Updated by Neha Ojha almost 4 years ago

  • Assignee set to Neha Ojha
Actions #3

Updated by Neha Ojha almost 4 years ago

  • Status changed from New to In Progress
Actions #4

Updated by Brad Hubbard almost 4 years ago

This looks similar.

2020-04-28T22:42:23.804 INFO:teuthology.orchestra.run.smithi097:> cp -- /home/ubuntu/cephtest/clone.client.0/src/test/cli-integration/balancer/misplaced.t /home/ubuntu/cephtest/archive/cram.client.0
2020-04-28T22:42:23.851 INFO:tasks.cram:Running tests for client.0...
2020-04-28T22:42:23.852 INFO:teuthology.orchestra.run.smithi097:> true
2020-04-28T22:42:23.898 INFO:teuthology.orchestra.run.smithi097:> CEPH_REF=master CEPH_ID="0" PATH=$PATH:/usr/sbin adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/virtualenv/bin/cram -v -- /home/ubuntu/cephtest/archive/cram.client.0/*.t
2020-04-28T22:43:20.817 INFO:tasks.ceph.mgr.x.smithi097.stderr:2020-04-28T22:43:20.814+0000 7f177e7f8700 -1 mgr.server reply reply (2) No such file or directory plan test_plan not found
2020-04-28T22:43:21.296 INFO:tasks.cram.client.0.smithi097.stdout:/home/ubuntu/cephtest/archive/cram.client.0/misplaced.t: passed

The run ultimately failed with:

failure_reason: '"2020-04-28T22:43:21.226896+0000 mon.a (mon.0) 135 : cluster [WRN]
  Health check failed: Degraded data redundancy: 440/12848 objects degraded (3.425%),
  8 pgs degraded (PG_DEGRADED)" in cluster log'

/a/yuriw-2020-04-28_21:58:13-rados-wip-yuri-testing-2020-04-24-1941-master-distro-basic-smithi/4995269
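
To illustrate why the run can fail even though misplaced.t passed, here is a rough sketch of a cluster-log scan that flags the first warning or error line not covered by an ignore list. It assumes the check works roughly this way; `first_unexpected` and the `\(PG_DEGRADED\)` ignore pattern are illustrative, not copied from teuthology. The log line is the one quoted in the failure_reason above.

```python
import re

# Severities treated as failures in this sketch (an assumption).
SEVERITY = re.compile(r"\[(WRN|ERR|SEC)\]")

CLUSTER_LOG = [
    "2020-04-28T22:43:21.226896+0000 mon.a (mon.0) 135 : cluster [WRN] "
    "Health check failed: Degraded data redundancy: 440/12848 objects "
    "degraded (3.425%), 8 pgs degraded (PG_DEGRADED)",
]


def first_unexpected(lines, ignore_patterns=()):
    """Return the first WRN/ERR/SEC line matching no ignore pattern, else None."""
    ignores = [re.compile(pattern) for pattern in ignore_patterns]
    for line in lines:
        if not SEVERITY.search(line):
            continue
        if any(pattern.search(line) for pattern in ignores):
            continue
        return line
    return None


# Without an ignore entry the PG_DEGRADED warning is reported as a failure.
print(first_unexpected(CLUSTER_LOG))
# With a matching ignore entry nothing is reported.
print(first_unexpected(CLUSTER_LOG, [r"\(PG_DEGRADED\)"]))
```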

Actions #5

Updated by Neha Ojha almost 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 34897
Actions #6

Updated by Brad Hubbard almost 4 years ago

  • Backport set to octopus, nautilus

/a/yuriw-2020-05-04_17:54:17-rados-wip-yuri5-testing-2020-05-04-1554-nautilus-distro-basic-smithi/5022793

Actions #7

Updated by Kefu Chai almost 4 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #8

Updated by Neha Ojha almost 4 years ago

  • Status changed from Pending Backport to Resolved
  • Backport deleted (octopus, nautilus)

This was caused by commit d4fbaf7ea959fd945857abd327271a97fb1da631, which exists only in master, so no backports are needed.
