Bug #8193

closed

HitSetTrim test in test/librados/tier.cc needs to be skipped if thrasher running

Added by Samuel Just about 10 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Command failed on 10.214.131.16 with status 1: 'mkdir -p --
/home/ubuntu/cephtest/mnt.0/client.0/tmp && cd --
/home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1
CEPH_REF=623014623851a4df10e6412380823ca68cf72d5b
TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage
/home/ubuntu/cephtest/archive/coverage timeout 3h
/home/ubuntu/cephtest/workunit.client.0/rados/test.sh'
ubuntu@teuthology:/a/sage-2014-04-22_09:16:16-rados-firefly-testing-basic-plana/209325

Actions #1

Updated by David Zafman about 10 years ago

  • Assignee set to David Zafman
Actions #2

Updated by David Zafman about 10 years ago

2014-04-22T16:26:11.096 INFO:teuthology.task.workunit.client.0.out:[10.214.131.16]: [ RUN ] LibRadosTierECPP.HitSetTrim

PG 86.6 repeatedly went in and out of recovery and changed primaries. It may finally have made progress after recovering the missing test object "foo" and the hit set archives and going active+clean, but by then the test had run out of time.

2014-04-22 16:28:00.922829 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=4] on_local_recover: 7fc1f406/foo/head//86
2014-04-22 16:28:00.922846 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=4] got missing 7fc1f406/foo/head//86 v 764'272

2014-04-22 16:28:00.932736 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=3] on_local_recover: 6/hit_set_86.6_archive_2014-04-22 16:26:34.669846_2014-04-22 16:26:38.255763/head/.ceph-internal/86
2014-04-22 16:28:00.932770 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=3] got missing 6/hit_set_86.6_archive_2014-04-22 16:26:34.669846_2014-04-22 16:26:38.255763/head/.ceph-internal/86 v 754'264
2014-04-22 16:28:00.932896 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=2] on_local_recover: 6/hit_set_86.6_archive_2014-04-22 16:27:10.746643_2014-04-22 16:27:14.660239/head/.ceph-internal/86
2014-04-22 16:28:00.932916 7f3e4a65a700 10 osd.5 pg_epoch: 781 pg[86.6s1( v 764'272 lc 679'1 (0'0,764'272] local-les=778 n=4 ec=675 les/c 778/764 776/776/776) [2147483647,5,0] r=1 lpr=778 pi=763-775/3 luod=0'0 crt=764'272 lcod 0'0 active m=2] got missing 6/hit_set_86.6_archive_2014-04-22 16:27:10.746643_2014-04-22 16:27:14.660239/head/.ceph-internal/86 v 764'269

2014-04-22 16:28:03.450236 7f3e4a65a700 10 osd.5 pg_epoch: 783 pg[86.6s1( v 783'274 (0'0,783'274] local-les=783 n=4 ec=675 les/c 783/783 782/782/782) [2147483647,5,0] r=1 lpr=782 pi=763-781/4 luod=781'273 crt=764'272 lcod 781'273 mlcod 0'0 active+clean] trim_past_intervals: trimming interval(763-764 [3,2,0]/[3,2,0] maybe_went_rw)

2014-04-22T16:28:03.468 INFO:teuthology.task.workunit.client.0.out:[10.214.131.16]: test/librados/tier.cc:4128: Failure

Actions #3

Updated by David Zafman about 10 years ago

  • Project changed from Ceph to teuthology
  • Subject changed from hitset trim fail to HitSetTrim test in test/librados/tier.cc needs to be skipped if thrasher running
  • Assignee deleted (David Zafman)

This particular test case is timing sensitive. It doesn't make sense to run it when the thrasher is running. This may require a new mechanism in order to skip this test.

Actions #4

Updated by Ian Colle about 10 years ago

  • Assignee set to Anonymous
Actions #5

Updated by David Zafman about 10 years ago

I should have mentioned that there are 2 HitSetTrim tests as is typical.

Actions #6

Updated by Sage Weil about 10 years ago

  • Status changed from New to Fix Under Review
Actions #7

Updated by Anonymous about 10 years ago

  • Status changed from Fix Under Review to Resolved

Request pulled

Actions #8

Updated by Samuel Just almost 9 years ago

  • Project changed from teuthology to Ceph
  • Status changed from Resolved to 12
  • Regression set to No

Heh, that PR isn't actually enough. The number of hitsets can still exceed count while backfilling. I'll just remove that assert and leave the time limit assert.
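
For readers unfamiliar with the test, the shape of a check that keeps only the time limit can be sketched roughly as follows. This is an illustrative sketch, not the code in test/librados/tier.cc: `HitSetLister` and `wait_for_trim` are hypothetical names, and the lister stands in for whatever call the real test uses to list a PG's hit set archives. The loop succeeds once the oldest archive changes (a trim happened), fails only if the deadline passes, and never checks how many hit sets exist.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for "list the hit set archives of a PG". Each entry
// is the start time of one archived hit set, oldest first.
using HitSetLister = std::function<std::vector<long>()>;

// Poll until the oldest archive changes (meaning a trim happened) or the
// deadline passes. Deliberately does NOT check the number of hit sets, since
// that number can temporarily exceed the configured count.
bool wait_for_trim(const HitSetLister& list_hit_sets,
                   std::chrono::milliseconds time_limit,
                   std::chrono::milliseconds poll_interval) {
  const auto deadline = std::chrono::steady_clock::now() + time_limit;
  bool have_first = false;
  long first = 0;
  while (true) {
    const std::vector<long> archives = list_hit_sets();
    if (!archives.empty()) {
      if (!have_first) {
        // Remember the oldest archive seen on the first non-empty listing.
        first = archives.front();
        have_first = true;
      } else if (archives.front() != first) {
        return true;  // the oldest archive was trimmed
      }
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      return false;  // the only failure condition: time ran out
    }
    std::this_thread::sleep_for(poll_interval);
  }
}
```

In a real test the boolean result would feed a single time-limit assertion, so a transient surplus of hit sets during backfill would not fail the run.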

Actions #9

Updated by Samuel Just almost 9 years ago

  • Assignee changed from Anonymous to Samuel Just
Actions #10

Updated by Samuel Just over 8 years ago

  • Status changed from 12 to Resolved