Bug #61457: PgScrubber: shard blocked on an object for too long
Status: Open
% Done: 0%
Backport: quincy
Regression: No
Severity: 3 - minor
Description
/a/yuriw-2023-05-25_14:52:58-rados-wip-yuri3-testing-2023-05-24-1136-quincy-distro-default-smithi/7286563
2023-05-25T19:50:08.503 DEBUG:teuthology.orchestra.run.smithi040:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | head -n 1
2023-05-25T19:50:08.566 INFO:teuthology.orchestra.run.smithi040.stdout:2023-05-25T19:43:27.319078+0000 osd.1 (osd.1) 114 : cluster [WRN] osd.1 PgScrubber: 3.4s0 blocked on an object for too long (since 2023-05-25T19:38:27)
2023-05-25T19:50:08.566 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2023-05-25T19:50:08.567 DEBUG:teuthology.orchestra.run.smithi040:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | head -n 1
2023-05-25T19:50:08.641 DEBUG:teuthology.orchestra.run.smithi040:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | head -n 1
2023-05-25T19:50:08.714 DEBUG:teuthology.orchestra.run.smithi040:> sudo egrep '\[WRN\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | head -n 1
2023-05-25T19:50:08.789 INFO:teuthology.orchestra.run.smithi040.stdout:2023-05-25T19:43:27.319078+0000 osd.1 (osd.1) 114 : cluster [WRN] osd.1 PgScrubber: 3.4s0 blocked on an object for too long (since 2023-05-25T19:38:27)
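The teuthology scan above chains one `egrep -v` per whitelisted health string before reporting the first remaining ERR/WRN/SEC line. The same filter can be sketched more compactly as a single alternation; this is an illustrative helper, not teuthology code, and the sample log file path is made up for the demonstration:

```shell
# Hypothetical condensed form of the chained egrep pipeline above:
# report the first ERR/WRN/SEC cluster-log line that is not whitelisted.
scan_cluster_log() {
  grep -E '\[ERR\]|\[WRN\]|\[SEC\]' "$1" \
    | grep -Ev '\(MDS_ALL_DOWN\)|\(MDS_UP_LESS_THAN_MAX\)|\(OSD_SLOW_PING_TIME|but it is still running|objects unfound and apparently lost|overall HEALTH_|\(OSDMAP_FLAGS\)|\(OSD_|\(PG_|\(POOL_|\(CACHE_POOL_|\(SMALLER_PGP_NUM\)|\(OBJECT_|\(SLOW_OPS\)|\(REQUEST_SLOW\)|\(TOO_FEW_PGS\)|slow request' \
    | head -n 1
}

# Sample log: one whitelisted warning and the PgScrubber warning from this run.
cat > /tmp/sample_ceph.log <<'EOF'
2023-05-25T19:40:00.000000+0000 mon.a (mon.0) 1 : cluster [WRN] overall HEALTH_WARN
2023-05-25T19:43:27.319078+0000 osd.1 (osd.1) 114 : cluster [WRN] osd.1 PgScrubber: 3.4s0 blocked on an object for too long (since 2023-05-25T19:38:27)
EOF

scan_cluster_log /tmp/sample_ceph.log
```

With this input, only the PgScrubber line survives the filter, which is exactly why the run was flagged: "blocked on an object for too long" is not in the whitelist.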
Updated by Laura Flores 11 months ago
- Assignee set to Ronen Friedman
Ronen, could this be related to any recent scrub changes?
Updated by Laura Flores 11 months ago
The failure did not reproduce in 15 reruns: http://pulpito.front.sepia.ceph.com/lflores-2023-05-25_22:14:25-rados-wip-yuri3-testing-2023-05-24-1136-quincy-distro-default-smithi/