Bug #45734

open

ClsLock.TestExclusiveEphemeralStealExclusive fails

Added by Kefu Chai almost 4 years ago. Updated over 2 years ago.

Status: Need More Info
Priority: Normal
Target version: -
% Done: 0%
Source:
Tags: cls
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

IZE/gigantic/release/16.0.0-1952-g41868f1d942/rpm/el8/BUILD/ceph-16.0.0-1952-g41868f1d942/src/test/cls_lock/test_cls_lock.cc:562: Failure
2020-05-28T01:53:16.397 INFO:tasks.workunit.client.0.smithi047.stdout:Expected equality of these values:
2020-05-28T01:53:16.397 INFO:tasks.workunit.client.0.smithi047.stdout:  0
2020-05-28T01:53:16.397 INFO:tasks.workunit.client.0.smithi047.stdout:  l2.unlock(&ioctx, oid1)
2020-05-28T01:53:16.397 INFO:tasks.workunit.client.0.smithi047.stdout:    Which is: -2
2020-05-28T01:53:16.398 INFO:tasks.workunit.client.0.smithi047.stdout:[  FAILED  ] ClsLock.TestExclusiveEphemeralStealExclusive (67859 ms)
2020-05-28T01:53:16.398 INFO:tasks.workunit.client.0.smithi047.stdout:[----------] 11 tests from ClsLock (159372 ms total)
2020-05-28T01:53:16.398 INFO:tasks.workunit.client.0.smithi047.stdout:
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:[----------] Global test environment tear-down
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:[==========] 11 tests from 1 test suite ran. (159372 ms total)
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:[  PASSED  ] 10 tests.
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:[  FAILED  ] 1 test, listed below:
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:[  FAILED  ] ClsLock.TestExclusiveEphemeralStealExclusive
2020-05-28T01:53:16.399 INFO:tasks.workunit.client.0.smithi047.stdout:
2020-05-28T01:53:16.400 INFO:tasks.workunit.client.0.smithi047.stdout: 1 FAILED TEST
rados/verify/{centos_latest.yaml ceph.yaml clusters/{fixed-2.yaml openstack.yaml} d-thrash/default/{default.yaml thrashosds-health.yaml} msgr-failures/few.yaml msgr/async-v1only.yaml objectstore/bluestore-bitmap.yaml rados.yaml tasks/rados_cls_all.yaml validater/valgrind.yaml}

/a/kchai-2020-05-27_23:43:53-rados-wip-kefu-testing-2020-05-27-2242-distro-basic-smithi/5097057

Actions #1

Updated by Casey Bodley almost 4 years ago

  • Assignee set to J. Eric Ivancich
Actions #2

Updated by Casey Bodley almost 4 years ago

I suspect a timing issue with lock expirations, given the long runtime of these test cases.

Actions #3

Updated by Neha Ojha over 3 years ago

2020-09-22T16:46:47.678 INFO:tasks.workunit.client.0.smithi035.stdout:/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-5821-ge806c408/rpm/el8/BUILD/ceph-16.0.0-5821-ge806c408/src/test/cls_lock/test_cls_lock.cc:562: Failure
2020-09-22T16:46:47.678 INFO:tasks.workunit.client.0.smithi035.stdout:Expected equality of these values:
2020-09-22T16:46:47.678 INFO:tasks.workunit.client.0.smithi035.stdout:  0
2020-09-22T16:46:47.679 INFO:tasks.workunit.client.0.smithi035.stdout:  l2.unlock(&ioctx, oid1)
2020-09-22T16:46:47.679 INFO:tasks.workunit.client.0.smithi035.stdout:    Which is: -2
2020-09-22T16:46:47.679 INFO:tasks.workunit.client.0.smithi035.stdout:[  FAILED  ] ClsLock.TestExclusiveEphemeralStealExclusive (10974 ms)

rados/verify/{centos_latest ceph clusters/{fixed-2 openstack} d-thrash/default/{default thrashosds-health} mon_election/connectivity msgr-failures/few msgr/async objectstore/filestore-xfs rados tasks/rados_cls_all validater/lockdep}

/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458804

Actions #4

Updated by Casey Bodley over 2 years ago

  • Status changed from New to Need More Info
  • Tags set to cls

Has this shown up at all recently? rgw runs these tests under qa/suites/rgw/verify/tasks/cls.yaml and I haven't seen any failures there.

Actions #5

Updated by J. Eric Ivancich over 2 years ago

Casey Bodley wrote:

Has this shown up at all recently? rgw runs these tests under qa/suites/rgw/verify/tasks/cls.yaml and I haven't seen any failures there.

I haven't seen it, and I agree with your speculation above that it was likely a timing issue. If the unlock took place 3 seconds after the lock, when it's expected to run less than 1.5 seconds after the lock, it would generate this error.

I think we can close this as Can't Reproduce.
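
The timing explanation above can be illustrated with a small, self-contained sketch. It does not use the cls_lock API: ExpiringLock, unlock_after and the 3 second lock duration are hypothetical names and values chosen only to model a lock that lapses on its own, and time is simulated rather than measured. Under those assumptions, an unlock that arrives after the expiration finds no lock and returns -ENOENT, which would be consistent with the -2 seen in the failure output.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <string>

using ms = std::chrono::milliseconds;

// Hypothetical stand-in for a lock with an expiration time.
// This is NOT the cls_lock API; it only models the expiry behaviour.
class ExpiringLock {
public:
  // Acquire the lock at simulated time `now`; it lapses at now + duration.
  int lock(const std::string& owner, ms now, ms duration) {
    expire_if_due(now);
    if (held_) return -EBUSY;
    held_ = true;
    owner_ = owner;
    expires_at_ = now + duration;
    return 0;
  }

  // Release the lock at simulated time `now`.
  // Returns -ENOENT if no lock held by `owner` exists any more.
  int unlock(const std::string& owner, ms now) {
    expire_if_due(now);
    if (!held_ || owner_ != owner) return -ENOENT;
    held_ = false;
    return 0;
  }

private:
  // Drop the lock once the simulated clock reaches its expiration.
  void expire_if_due(ms now) {
    if (held_ && now >= expires_at_) held_ = false;
  }

  bool held_ = false;
  std::string owner_;
  ms expires_at_{0};
};

// Lock at t = 0 with an assumed 3 s duration, then unlock after `delay`.
int unlock_after(ms delay) {
  ExpiringLock l2;
  const ms assumed_duration{3000};  // assumption, not taken from the test source
  int r = l2.lock("l2", ms{0}, assumed_duration);
  assert(r == 0);
  (void)r;  // silence unused-variable warnings when NDEBUG is defined
  return l2.unlock("l2", delay);
}
```

With these assumed numbers, unlock_after(ms{1500}) returns 0, while unlock_after(ms{3000}) returns -ENOENT. A test that asserts the unlock result equals 0 therefore passes or fails depending only on how long the step between lock and unlock takes, which is more likely to run long under a slow validater such as valgrind.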
