Bug #51904

closed

test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to down PGs

Added by Neha Ojha almost 3 years ago. Updated about 1 year ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport: quincy, pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-07-25T15:18:21.688 INFO:tasks.ceph.ceph_manager.ceph:PG 1.b is not active+clean
2021-07-25T15:18:21.688 INFO:tasks.ceph.ceph_manager.ceph:{'pgid': '1.b', 'version': "22'3", 'reported_seq': 475, 'reported_epoch': 478, 'state': 'down', 'last_fresh': '2021-07-25T15:18:15.848533+0000', 'last_change': '2021-07-25T14:58:09.960725+0000', 'last_active': '2021-07-25T14:58:09.960471+0000', 'last_peered': '2021-07-25T14:58:01.054827+0000', 'last_clean': '2021-07-25T14:58:01.054827+0000', 'last_became_active': '2021-07-25T14:58:00.955866+0000', 'last_became_peered': '2021-07-25T14:58:00.955866+0000', 'last_unstale': '2021-07-25T15:18:15.848533+0000', 'last_undegraded': '2021-07-25T15:18:15.848533+0000', 'last_fullsized': '2021-07-25T15:18:15.848533+0000', 'mapping_epoch': 31, 'log_start': "0'0", 'ondisk_log_start': "0'0", 'created': 18, 'last_epoch_clean': 22, 'parent': '0.0', 'parent_split_bits': 0, 'last_scrub': "0'0", 'last_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'last_deep_scrub': "0'0", 'last_deep_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'last_clean_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'log_size': 3, 'ondisk_log_size': 3, 'stats_invalid': False, 'dirty_stats_invalid': False, 'omap_stats_invalid': False, 'hitset_stats_invalid': False, 'hitset_bytes_stats_invalid': False, 'pin_stats_invalid': False, 'manifest_stats_invalid': False, 'snaptrimq_len': 0, 'stat_sum': {'num_bytes': 2554631, 'num_objects': 1, 'num_object_clones': 0, 'num_object_copies': 3, 'num_objects_missing_on_primary': 0, 'num_objects_missing': 0, 'num_objects_degraded': 0, 'num_objects_misplaced': 0, 'num_objects_unfound': 0, 'num_objects_dirty': 1, 'num_whiteouts': 0, 'num_read': 1, 'num_read_kb': 0, 'num_write': 4, 'num_write_kb': 1011, 'num_scrub_errors': 0, 'num_shallow_scrub_errors': 0, 'num_deep_scrub_errors': 0, 'num_objects_recovered': 0, 'num_bytes_recovered': 0, 'num_keys_recovered': 0, 'num_objects_omap': 0, 'num_objects_hit_set_archive': 0, 'num_bytes_hit_set_archive': 0, 'num_flush': 0, 'num_flush_kb': 0, 'num_evict': 0, 'num_evict_kb': 0, 
'num_promote': 0, 'num_flush_mode_high': 0, 'num_flush_mode_low': 0, 'num_evict_mode_some': 0, 'num_evict_mode_full': 0, 'num_objects_pinned': 0, 'num_legacy_snapsets': 0, 'num_large_omap_objects': 0, 'num_objects_manifest': 0, 'num_omap_bytes': 0, 'num_omap_keys': 0, 'num_objects_repaired': 0}, 'up': [2, 0, 5], 'acting': [2, 0, 5], 'avail_no_missing': [], 'object_location_counts': [], 'blocked_by': [1], 'up_primary': 0, 'acting_primary': 0, 'purged_snaps': [{'start': '1', 'length': '1'}]}
2021-07-25T15:18:21.688 INFO:tasks.ceph.ceph_manager.ceph:PG 1.3 is not active+clean
2021-07-25T15:18:21.689 INFO:tasks.ceph.ceph_manager.ceph:{'pgid': '1.3', 'version': "0'0", 'reported_seq': 449, 'reported_epoch': 478, 'state': 'down', 'last_fresh': '2021-07-25T15:18:16.232816+0000', 'last_change': '2021-07-25T14:58:09.960717+0000', 'last_active': '0.000000', 'last_peered': '0.000000', 'last_clean': '0.000000', 'last_became_active': '0.000000', 'last_became_peered': '0.000000', 'last_unstale': '2021-07-25T15:18:16.232816+0000', 'last_undegraded': '2021-07-25T15:18:16.232816+0000', 'last_fullsized': '2021-07-25T15:18:16.232816+0000', 'mapping_epoch': 31, 'log_start': "0'0", 'ondisk_log_start': "0'0", 'created': 18, 'last_epoch_clean': 22, 'parent': '0.0', 'parent_split_bits': 0, 'last_scrub': "0'0", 'last_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'last_deep_scrub': "0'0", 'last_deep_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'last_clean_scrub_stamp': '2021-07-25T14:57:55.882472+0000', 'log_size': 0, 'ondisk_log_size': 0, 'stats_invalid': False, 'dirty_stats_invalid': False, 'omap_stats_invalid': False, 'hitset_stats_invalid': False, 'hitset_bytes_stats_invalid': False, 'pin_stats_invalid': False, 'manifest_stats_invalid': False, 'snaptrimq_len': 0, 'stat_sum': {'num_bytes': 0, 'num_objects': 0, 'num_object_clones': 0, 'num_object_copies': 0, 'num_objects_missing_on_primary': 0, 'num_objects_missing': 0, 'num_objects_degraded': 0, 'num_objects_misplaced': 0, 'num_objects_unfound': 0, 'num_objects_dirty': 0, 'num_whiteouts': 0, 'num_read': 0, 'num_read_kb': 0, 'num_write': 0, 'num_write_kb': 0, 'num_scrub_errors': 0, 'num_shallow_scrub_errors': 0, 'num_deep_scrub_errors': 0, 'num_objects_recovered': 0, 'num_bytes_recovered': 0, 'num_keys_recovered': 0, 'num_objects_omap': 0, 'num_objects_hit_set_archive': 0, 'num_bytes_hit_set_archive': 0, 'num_flush': 0, 'num_flush_kb': 0, 'num_evict': 0, 'num_evict_kb': 0, 'num_promote': 0, 'num_flush_mode_high': 0, 'num_flush_mode_low': 0, 'num_evict_mode_some': 0, 'num_evict_mode_full': 0, 
'num_objects_pinned': 0, 'num_legacy_snapsets': 0, 'num_large_omap_objects': 0, 'num_objects_manifest': 0, 'num_omap_bytes': 0, 'num_omap_keys': 0, 'num_objects_repaired': 0}, 'up': [4, 6, 5], 'acting': [4, 6, 5], 'avail_no_missing': [], 'object_location_counts': [], 'blocked_by': [1], 'up_primary': 4, 'acting_primary': 4, 'purged_snaps': []}
2021-07-25T15:18:21.689 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3df2a2c35be348281dd870cd64ba15314e6776a9/qa/tasks/ceph_manager.py", line 188, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3df2a2c35be348281dd870cd64ba15314e6776a9/qa/tasks/ceph_manager.py", line 1411, in _do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3df2a2c35be348281dd870cd64ba15314e6776a9/qa/tasks/ceph_manager.py", line 979, in test_pool_min_size
    self.ceph_manager.wait_for_clean(timeout=self.config.get('timeout'))
  File "/home/teuthworker/src/github.com_ceph_ceph-c_3df2a2c35be348281dd870cd64ba15314e6776a9/qa/tasks/ceph_manager.py", line 2719, in wait_for_clean
    'wait_for_clean: failed before timeout expired'
AssertionError: wait_for_clean: failed before timeout expired

rados/thrash-erasure-code-overwrites/{bluestore-bitmap ceph clusters/{fixed-2 openstack} fast/normal mon_election/connectivity msgr-failures/few rados recovery-overrides/{more-async-partial-recovery} supported-random-distro$/{rhel_8} thrashers/minsize_recovery thrashosds-health workloads/ec-pool-snaps-few-objects-overwrites}

/a/yuriw-2021-07-25_14:33:55-rados-wip-yuri7-testing-master-7.23.21-distro-basic-smithi/6292358 - no logs
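
For triage, the two PG dumps in the log can be reduced to the fields that matter here: the PG state and the `blocked_by` list. The following is a minimal sketch, assuming the logged dicts have already been parsed into Python objects. The helper name `summarize_unclean_pgs` is hypothetical and is not part of `qa/tasks/ceph_manager.py`, and the state check (splitting on `+`) is a simplification rather than the test's own logic.

```python
from collections import defaultdict


def summarize_unclean_pgs(pg_stats):
    """Group PGs that are not active+clean by the OSDs blocking them.

    Hypothetical helper for reading the log above; not a Ceph API.
    """
    blockers = defaultdict(list)
    for pg in pg_stats:
        states = set(pg["state"].split("+"))
        if {"active", "clean"} <= states:
            continue  # PG is healthy, nothing to report
        # PGs with an empty blocked_by list are filed under the key None
        for osd in pg.get("blocked_by") or [None]:
            blockers[osd].append((pg["pgid"], pg["state"]))
    return dict(blockers)


# Trimmed copies of the two entries logged in the description
pg_stats = [
    {"pgid": "1.b", "state": "down", "up": [2, 0, 5],
     "acting": [2, 0, 5], "blocked_by": [1]},
    {"pgid": "1.3", "state": "down", "up": [4, 6, 5],
     "acting": [4, 6, 5], "blocked_by": [1]},
]

summary = summarize_unclean_pgs(pg_stats)
print(summary)  # {1: [('1.b', 'down'), ('1.3', 'down')]}
```

With the values from this run, both down PGs are grouped under osd.1, which appears in neither PG's up nor acting set.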


Related issues 5 (1 open, 4 closed)

Related to RADOS - Bug #49777: test_pool_min_size: 'check for active or peered' reached maximum tries (5) after waiting for 25 seconds | Resolved | Kamoltat (Junior) Sirivadhna
Related to RADOS - Bug #54511: test_pool_min_size: AssertionError: not clean before minsize thrashing starts | Resolved | Kamoltat (Junior) Sirivadhna
Related to RADOS - Bug #59172: test_pool_min_size: AssertionError: wait_for_clean: failed before timeout expired due to down PGs | Pending Backport | Kamoltat (Junior) Sirivadhna
Copied to RADOS - Backport #57025: quincy: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to down PGs | Resolved | Kamoltat (Junior) Sirivadhna
Copied to RADOS - Backport #57026: pacific: test_pool_min_size:AssertionError:wait_for_clean:failed before timeout expired due to down PGs | Resolved
