Backport #16239

'ceph tell osd.0 flush_pg_stats' fails in rados qa run

Added by Kefu Chai almost 8 years ago. Updated over 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
Release:
jewel
Crash signature (v1):
Crash signature (v2):


Related issues

Related to Ceph - Bug #16334: rados/singleton/{rados.yaml all/pg-removal-interruption.yaml fs/xfs.yaml -- remove the flush_pg_stats call Resolved 06/15/2016

History

#2 Updated by Nathan Cutler over 7 years ago

  • Related to Bug #16334: rados/singleton/{rados.yaml all/pg-removal-interruption.yaml fs/xfs.yaml -- remove the flush_pg_stats call added

#3 Updated by Nathan Cutler over 7 years ago

Possibly a duplicate of #16334

#4 Updated by David Zafman over 7 years ago

description: rados/singleton/{rados.yaml all/osd-backfill.yaml fs/xfs.yaml msgr/random.yaml
msgr-failures/many.yaml}

dzafman-2016-10-05_15:00:09-rados-wip-zafman-testing2-distro-basic-smithi/456237

In my run this was a race: osd.1 had only just started and was not yet ready to handle the ceph request. I am not sure whether msgr-failures contributed to it.
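
One way to tolerate this kind of startup race is to retry the command until the OSD answers. The following is only a minimal sketch: `run_ceph` and `tell_with_retry` are hypothetical helpers, not part of teuthology or the Ceph QA suite, and treating the `ENXIO` text on stderr as a transient "not ready yet" signal is an assumption based on the error seen in this ticket.

```python
import subprocess
import time


def run_ceph(args):
    """Run the ceph CLI with the given arguments; return (returncode, stderr)."""
    proc = subprocess.run(["ceph"] + list(args), capture_output=True, text=True)
    return proc.returncode, proc.stderr


def tell_with_retry(osd_id, command, attempts=10, delay=5.0,
                    runner=run_ceph, sleep=time.sleep):
    """Hypothetical helper: retry `ceph tell osd.<id> <command>` on ENXIO.

    Returns the number of attempts used. Raises RuntimeError on any other
    failure, or if the OSD still reports ENXIO after `attempts` tries.
    """
    args = ["tell", "osd.%d" % osd_id, command]
    err = ""
    for attempt in range(1, attempts + 1):
        rc, err = runner(args)
        if rc == 0:
            return attempt
        # Assumption: ENXIO here means the OSD has not finished starting.
        if "ENXIO" not in err:
            raise RuntimeError("ceph %s failed: %s" % (" ".join(args), err.strip()))
        if attempt < attempts:
            sleep(delay)
    raise RuntimeError("osd.%d still not ready after %d attempts: %s"
                       % (osd_id, attempts, err.strip()))
```

A test could then call `tell_with_retry(1, "flush_pg_stats")` instead of invoking the command once. The `runner` and `sleep` parameters exist only so the loop can be exercised without a cluster.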

#5 Updated by Nathan Cutler almost 7 years ago

This is showing in jewel 10.2.8 integration testing.

description: rados/singleton/{all/ec-lost-unfound-upgrade.yaml fs/xfs.yaml msgr-failures/few.yaml msgr/async.yaml rados.yaml}

2017-04-21T06:58:56.419 INFO:teuthology.orchestra.run.smithi018:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph tell osd.0 flush_pg_stats'
2017-04-21T06:58:56.577 INFO:teuthology.orchestra.run.smithi018.stderr:2017-04-21 06:58:56.582289 7f691bdc5700 -1 WARNING: the following dangerous and experimental features are enabled: *
2017-04-21T06:58:56.617 INFO:teuthology.orchestra.run.smithi018.stderr:2017-04-21 06:58:56.622204 7f691bdc5700 -1 WARNING: the following dangerous and experimental features are enabled: *
2017-04-21T06:58:56.655 INFO:teuthology.orchestra.run.smithi018.stderr:Error ENXIO: problem getting command descriptions from osd.0

Log: http://qa-proxy.ceph.com/teuthology/smithfarm-2017-04-21_05:45:14-rados-wip-jewel-backports-distro-basic-smithi/1052552/teuthology.log

Added a 300 second delay after reaching HEALTH_OK: https://github.com/ceph/ceph/pull/14710
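
For illustration only, the idea of waiting for HEALTH_OK and then pausing can be sketched as below. This is not the content of the pull request above: `ceph_health` and `wait_healthy_then_settle` are hypothetical names, and the polling interval and poll limit are assumed values. Only the 300 second settle time comes from the comment.

```python
import subprocess
import time


def ceph_health():
    """Return the output of `ceph health`, e.g. a line starting with HEALTH_OK."""
    proc = subprocess.run(["ceph", "health"], capture_output=True, text=True)
    return proc.stdout.strip()


def wait_healthy_then_settle(settle=300, poll=5, max_polls=120,
                             get_health=ceph_health, sleep=time.sleep):
    """Hypothetical helper: poll until HEALTH_OK, then sleep `settle` seconds.

    Returns the number of health polls made. Raises RuntimeError if the
    cluster does not report HEALTH_OK within `max_polls` polls.
    """
    for polls in range(1, max_polls + 1):
        if get_health().startswith("HEALTH_OK"):
            # Extra delay so recently started OSDs can finish coming up.
            sleep(settle)
            return polls
        sleep(poll)
    raise RuntimeError("cluster did not reach HEALTH_OK after %d polls" % max_polls)
```

The `get_health` and `sleep` parameters are injectable so the logic can be checked without a running cluster.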

#6 Updated by Kefu Chai almost 7 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Kefu Chai

#7 Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to Tests

#8 Updated by Nathan Cutler over 6 years ago

  • Tracker changed from Bug to Backport
  • Description updated (diff)
  • Status changed from Fix Under Review to Resolved
  • Target version set to v10.2.10
  • Release set to jewel

description

2016-06-11T02:41:49.170 INFO:teuthology.orchestra.run.mira117:Running: "sudo TESTDIR=/home/ubuntu/cephtest bash -c 'sudo ceph tell osd.0 flush_pg_stats'" 
2016-06-11T02:41:49.273 INFO:teuthology.orchestra.run.mira117.stderr:Error ENXIO: problem getting command descriptions from osd.0

see http://pulpito.ceph.com/kchai-2016-06-11_02:35:56-rados-master---basic-mira/251227/
