Bug #55009: Scrubbing exits due to error reading object head

Added by Laura Flores about 2 years ago. Updated 3 months ago.

Status: Fix Under Review
Priority: High
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description: rados/thrash/{0-size-min-size-overrides/3-size-2-min-size 1-pg-log-overrides/short_pg_log 2-recovery-overrides/{more-partial-recovery} 3-scrub-overrides/{max-simultaneous-scrubs-2} backoff/peering ceph clusters/{fixed-2 openstack} crc-failures/default d-balancer/on mon_election/connectivity msgr-failures/osd-delay msgr/async-v2only objectstore/bluestore-comp-zstd rados supported-random-distro$/{ubuntu_latest} thrashers/default thrashosds-health workloads/cache-agent-big}

/a/yuriw-2022-03-19_14:37:23-rados-wip-yuri6-testing-2022-03-18-1104-distro-default-smithi/6747129

2022-03-19T19:53:07.393 INFO:tasks.ceph:Still waiting for all pgs to be scrubbed.
2022-03-19T19:53:10.548 INFO:tasks.ceph.osd.3.smithi123.stderr:2022-03-19T19:53:10.545+0000 7f2cf3076700 -1 osd.3 pg_epoch: 816 pg[4.0( v 816'11798 (106'8797,816'11798] local-lis/les=801/802 n=1002 ec=24/24 lis/c=801/801 les/c/f=802/802/0 sis=801) [3,7,0] r=0 lpr=801 crt=816'11798 lcod 816'11797 mlcod 816'11797 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 7 (0x56314438d080)
2022-03-19T19:53:13.480 INFO:tasks.ceph.osd.3.smithi123.stderr:2022-03-19T19:53:13.477+0000 7f2cf2074700 -1 osd.3 pg_epoch: 818 pg[3.2s0( v 640'1548 (0'0,640'1548] local-lis/les=771/772 n=1321 ec=21/21 lis/c=771/771 les/c/f=772/772/0 sis=771) [3,5,6,4]p3(0) r=0 lpr=771 crt=640'1548 lcod 637'1547 mlcod 0'0 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 6(2) (0x5631425969a0)
2022-03-19T19:53:13.481 INFO:tasks.ceph.osd.3.smithi123.stderr:2022-03-19T19:53:13.477+0000 7f2cf2074700 -1 osd.3 pg_epoch: 818 pg[3.2s0( v 640'1548 (0'0,640'1548] local-lis/les=771/772 n=1321 ec=21/21 lis/c=771/771 les/c/f=772/772/0 sis=771) [3,5,6,4]p3(0) r=0 lpr=771 crt=640'1548 lcod 637'1547 mlcod 0'0 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 4(3) (0x563143a1b8c0)
2022-03-19T19:53:14.429 INFO:tasks.ceph.osd.3.smithi123.stderr:2022-03-19T19:53:14.429+0000 7f2cf707e700 -1 osd.3 pg_epoch: 818 pg[4.0( v 816'11798 (106'8797,816'11798] local-lis/les=801/802 n=1002 ec=24/24 lis/c=801/801 les/c/f=802/802/0 sis=801) [3,7,0] r=0 lpr=801 crt=816'11798 lcod 816'11797 mlcod 816'11797 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 7 (0x5631436badc0)
2022-03-19T19:53:14.781 INFO:tasks.ceph.osd.2.smithi123.stderr:2022-03-19T19:53:14.777+0000 7f0ef4cca700 -1 osd.2 pg_epoch: 818 pg[2.1( empty local-lis/les=725/726 n=0 ec=16/16 lis/c=725/725 les/c/f=726/726/0 sis=725) [2,3,0] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 3 (0x55b0c21691e0)
2022-03-19T19:53:16.483 INFO:tasks.ceph.osd.4.smithi150.stderr:2022-03-19T19:53:16.480+0000 7f0d5e64c700 -1 osd.4 pg_epoch: 818 pg[3.3s0( v 642'1500 (0'0,642'1500] local-lis/les=787/788 n=1242 ec=21/21 lis/c=787/787 les/c/f=788/788/0 sis=787) [4,5,0,6]p4(0) r=0 lpr=787 crt=642'1500 mlcod 0'0 active+clean MUST_DEEP_SCRUB MUST_SCRUB planned REQ_SCRUB] scrubber <NotActive/>: handle_scrub_reserve_grant: received unsolicited reservation grant from osd 6(3) (0x55b3ca33c000)

long pg dump...

2022-03-19T19:53:48.241 DEBUG:teuthology.orchestra.run.smithi123:> sudo ceph --cluster ceph config set global mon_health_to_clog false
2022-03-19T19:53:48.546 INFO:teuthology.misc:Shutting down mds daemons...
2022-03-19T19:53:48.547 INFO:teuthology.misc:Shutting down osd daemons...
2022-03-19T19:53:48.547 DEBUG:tasks.ceph.osd.0:waiting for process to exit
2022-03-19T19:53:48.547 INFO:teuthology.orchestra.run:waiting for 300

...

2022-03-19T19:53:55.366 INFO:tasks.ceph:Checking cluster log for badness...
2022-03-19T19:53:55.366 DEBUG:teuthology.orchestra.run.smithi123:> sudo egrep '\[ERR\]|\[WRN\]|\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | egrep -v 'must scrub before tier agent can activate' | head -n 1
2022-03-19T19:53:55.449 INFO:teuthology.orchestra.run.smithi123.stdout:2022-03-19T19:40:18.710528+0000 osd.5 (osd.5) 3 : cluster [ERR] Error -2 reading object 3:fbf04b8a:::smithi12323554-4688:head
2022-03-19T19:53:55.450 WARNING:tasks.ceph:Found errors (ERR|WRN|SEC) in cluster log
2022-03-19T19:53:55.451 DEBUG:teuthology.orchestra.run.smithi123:> sudo egrep '\[SEC\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | egrep -v 'must scrub before tier agent can activate' | head -n 1
2022-03-19T19:53:55.473 DEBUG:teuthology.orchestra.run.smithi123:> sudo egrep '\[ERR\]' /var/log/ceph/ceph.log | egrep -v '\(MDS_ALL_DOWN\)' | egrep -v '\(MDS_UP_LESS_THAN_MAX\)' | egrep -v '\(OSD_SLOW_PING_TIME' | egrep -v 'but it is still running' | egrep -v 'objects unfound and apparently lost' | egrep -v 'overall HEALTH_' | egrep -v '\(OSDMAP_FLAGS\)' | egrep -v '\(OSD_' | egrep -v '\(PG_' | egrep -v '\(POOL_' | egrep -v '\(CACHE_POOL_' | egrep -v '\(SMALLER_PGP_NUM\)' | egrep -v '\(OBJECT_' | egrep -v '\(SLOW_OPS\)' | egrep -v '\(REQUEST_SLOW\)' | egrep -v '\(TOO_FEW_PGS\)' | egrep -v 'slow request' | egrep -v 'must scrub before tier agent can activate' | head -n 1
2022-03-19T19:53:55.488 INFO:teuthology.orchestra.run.smithi123.stdout:2022-03-19T19:40:18.710528+0000 osd.5 (osd.5) 3 : cluster [ERR] Error -2 reading object 3:fbf04b8a:::smithi12323554-4688:head
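
The check that fails here is the cluster-log scrape above: the first [ERR] line it finds is "Error -2 reading object 3:fbf04b8a:::smithi12323554-4688:head". The -2 is a negated errno (ENOENT), i.e. the object's head could not be found when the scrubber tried to read it, which is what makes the scrub exit. A minimal sketch for decoding such codes (a hypothetical helper, not part of teuthology or Ceph):

import errno
import os

# Hypothetical helper: decode the negative errno embedded in cluster-log
# lines such as "Error -2 reading object ...:head".
def describe_error(code: int) -> str:
    n = abs(code)
    return f"{errno.errorcode.get(n, 'UNKNOWN')} ({os.strerror(n)})"

print(describe_error(-2))  # ENOENT (No such file or directory)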


Related issues: 1 (0 open, 1 closed)

Related to Ceph - Bug #64017: osd/scrub: 'max_osd_scrub' default of '1' is too low (Resolved, Ronen Friedman)
