Bug #48906

closed

wait_for_recovery: failed before timeout expired with tests that override osd_async_recovery_min_cost

Added by Neha Ojha over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Urgent
Category:
Peering
Target version:
-
% Done:
0%

Source:
Development
Tags:
Backport:
pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-01-16T23:59:02.417 INFO:tasks.ceph.ceph_manager.ceph:PG 1.2c is not active+clean
2021-01-16T23:59:02.418 INFO:tasks.ceph.ceph_manager.ceph:{'pgid': '1.2c', 'version': "20'3", 'reported_seq': '1740', 'reported_epoch': '2128', 'state': 'remapped+peering', 'last_fresh': '2021-01-16T23:58:59.728986+0000', 'last_change': '2021-01-16T23:58:59.729116+0000', 'last_active': '2021-01-16T23:32:13.364853+0000', 'last_peered': '2021-01-16T23:32:11.361331+0000', 'last_clean': '2021-01-16T23:32:11.361331+0000', 'last_became_active': '2021-01-16T23:31:45.371944+0000', 'last_became_peered': '2021-01-16T23:31:45.371944+0000', 'last_unstale': '2021-01-16T23:58:59.728986+0000', 'last_undegraded': '2021-01-16T23:58:59.728986+0000', 'last_fullsized': '2021-01-16T23:58:59.728986+0000', 'mapping_epoch': 2128, 'log_start': "0'0", 'ondisk_log_start': "0'0", 'created': 423, 'last_epoch_clean': 494, 'parent': '0.0', 'parent_split_bits': 6, 'last_scrub': "20'3", 'last_scrub_stamp': '2021-01-16T23:31:03.884843+0000', 'last_deep_scrub': "0'0", 'last_deep_scrub_stamp': '2021-01-16T23:23:34.847625+0000', 'last_clean_scrub_stamp': '2021-01-16T23:31:03.884843+0000', 'log_size': 3, 'ondisk_log_size': 3, 'stats_invalid': False, 'dirty_stats_invalid': False, 'omap_stats_invalid': False, 'hitset_stats_invalid': False, 'hitset_bytes_stats_invalid': False, 'pin_stats_invalid': False, 'manifest_stats_invalid': False, 'snaptrimq_len': 0, 'stat_sum': {'num_bytes': 0, 'num_objects': 0, 'num_object_clones': 0, 'num_object_copies': 0, 'num_objects_missing_on_primary': 0, 'num_objects_missing': 0, 'num_objects_degraded': 0, 'num_objects_misplaced': 0, 'num_objects_unfound': 0, 'num_objects_dirty': 0, 'num_whiteouts': 0, 'num_read': 0, 'num_read_kb': 0, 'num_write': 0, 'num_write_kb': 0, 'num_scrub_errors': 0, 'num_shallow_scrub_errors': 0, 'num_deep_scrub_errors': 0, 'num_objects_recovered': 0, 'num_bytes_recovered': 0, 'num_keys_recovered': 0, 'num_objects_omap': 0, 'num_objects_hit_set_archive': 0, 'num_bytes_hit_set_archive': 0, 'num_flush': 0, 'num_flush_kb': 0, 'num_evict': 0, 
'num_evict_kb': 0, 'num_promote': 0, 'num_flush_mode_high': 0, 'num_flush_mode_low': 0, 'num_evict_mode_some': 0, 'num_evict_mode_full': 0, 'num_objects_pinned': 0, 'num_legacy_snapsets': 0, 'num_large_omap_objects': 0, 'num_objects_manifest': 0, 'num_omap_bytes': 0, 'num_omap_keys': 0, 'num_objects_repaired': 0}, 'up': [3, 7], 'acting': [7], 'avail_no_missing': [], 'object_location_counts': [], 'blocked_by': [1, 3], 'up_primary': 3, 'acting_primary': 7, 'purged_snaps': []}
2021-01-16T23:59:02.418 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 116, in wrapper
    return func(self)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 1195, in _do_thrash
    timeout=self.config.get('timeout')
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 2585, in wait_for_recovery
    'wait_for_recovery: failed before timeout expired'
AssertionError: wait_for_recovery: failed before timeout expired

2021-01-16T23:59:02.418 ERROR:tasks.thrashosds.thrasher:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 1070, in do_thrash
    self._do_thrash()
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 116, in wrapper
    return func(self)
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 1195, in _do_thrash
    timeout=self.config.get('timeout')
  File "/home/teuthworker/src/git.ceph.com_ceph-c_wip-config-sets-ci/qa/tasks/ceph_manager.py", line 2585, in wait_for_recovery
    'wait_for_recovery: failed before timeout expired'
AssertionError: wait_for_recovery: failed before timeout expired

/a/nojha-2021-01-16_20:47:46-rados-wip-config-sets-ci-distro-basic-smithi/5792796

This is caused by https://github.com/ceph/ceph/pull/38675 and https://github.com/ceph/ceph/pull/38920, and is a known issue for tests whose recovery-overrides set osd_async_recovery_min_cost. Sridhar/Sunny will be fixing them.
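
For anyone triaging similar runs, here is a minimal sketch of how the PG stats dict logged above can be read to see why the PG is not active+clean. The helper name summarize_stuck_pg is hypothetical and is not part of qa/tasks/ceph_manager.py; the input is only a subset of the fields logged for PG 1.2c in this report.

```python
# Hypothetical triage helper; NOT part of qa/tasks/ceph_manager.py.
def summarize_stuck_pg(pg):
    """Return a short diagnosis for a PG stats dict like the one logged above."""
    states = set(pg["state"].split("+"))
    return {
        "pgid": pg["pgid"],
        # A PG counts as recovered only when both states are present.
        "active_clean": {"active", "clean"} <= states,
        "states": sorted(states),
        # OSDs that are in the up set but not in the acting set.
        "up_not_acting": sorted(set(pg["up"]) - set(pg["acting"])),
        "blocked_by": sorted(pg["blocked_by"]),
    }


# Subset of the fields logged for PG 1.2c in this report.
pg_1_2c = {
    "pgid": "1.2c",
    "state": "remapped+peering",
    "up": [3, 7],
    "acting": [7],
    "blocked_by": [1, 3],
}

summary = summarize_stuck_pg(pg_1_2c)
print(summary)
```

For the dict in this report, the summary shows the PG still in remapped+peering, with osd.3 in the up set but not in the acting set, and peering blocked by osd.1 and osd.3.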


Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #48949: pacific: wait_for_recovery: failed before timeout expired with tests that override osd_async_recovery_min_cost (Resolved)
Actions #1

Updated by Sridhar Seshasayee over 3 years ago

  • Category set to Peering
  • Source set to Development
  • Pull request ID set to 38941
  • Component(RADOS) OSD added
Actions #2

Updated by Neha Ojha over 3 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Neha Ojha over 3 years ago

  • Backport set to pacific
Actions #4

Updated by Neha Ojha over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #48949: pacific: wait_for_recovery: failed before timeout expired with tests that override osd_async_recovery_min_cost added
Actions #7

Updated by Neha Ojha over 3 years ago

  • Status changed from Pending Backport to Resolved