Project

General

Profile

Bug #24910

run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run

Added by Kefu Chai over 5 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

 88/154 Test   #1: run-rbd-unit-tests.sh ...................***Failed   51.02 sec
++ dirname /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/run-rbd-unit-tests.sh
+ source /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/detect-build-env-vars.sh
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build/lib ']'
++ echo 'Enivronment Variables Already Set'
Enivronment Variables Already Set
+ PATH=/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin:/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin:/home/jenkins-build/build/workspace/ceph-pull-requests/src:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
+ unset RBD_FEATURES
+ unittest_librbd
seed 12581
[==========] Running 680 tests from 66 test cases.
[----------] Global test environment set-up.
[----------] 5 tests from TestMockDeepCopyRequest

...
...

[ RUN      ] TestLibRBD.TestCopyPP
using old format
num images is: 1
expected: 1
image: imaErrors while running CTest

not sure it's a test related issue or environment related one though..


Related issues

Copied to rbd - Backport #27060: mimic: run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run Resolved
Copied to rbd - Backport #27061: luminous: run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run Resolved

History

#1 Updated by Kefu Chai over 5 years ago

  • Description updated (diff)

#2 Updated by Kefu Chai over 5 years ago

  • Description updated (diff)

#3 Updated by Mykola Golub over 5 years ago

Kefu, why do you think that TestLibRBD.TestCopyPP fails to finish?

The log says that run-rbd-unit-tests.sh was running for 51.02 sec and failed. For me it rather looks like the test output was truncated and we can't see the real error.

#4 Updated by Kefu Chai over 5 years ago

Mykola, i am not sure on this. it just appears to be so. and i think you are also right, the output could be truncated as well.

#5 Updated by Kefu Chai over 5 years ago

  • Subject changed from TestLibRBD.TestCopyPP fails to finish in jenkin's "make check" run to run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run

another occurrence:

 
 98/157 Test   #1: run-rbd-unit-tests.sh ...................***Failed   67.42 sec
++ dirname /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/run-rbd-unit-tests.sh
+ source /home/jenkins-build/build/workspace/ceph-pull-requests/src/test/detect-build-env-vars.sh
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build/bin ']'
++ '[' -n /home/jenkins-build/build/workspace/ceph-pull-requests/build/lib ']'
++ echo 'Enivronment Variables Already Set'
Enivronment Variables Already Set
+ PATH=/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin:/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin:/home/jenkins-build/build/workspace/ceph-pull-requests/src:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
+ unset RBD_FEATURES
+ unittest_librbd
seed 20923
[==========] Running 681 tests from 66 test cases.
[----------] Global test environment set-up.
[----------] 5 tests from TestMockDeepCopyRequest

...
...
[----------] 13 tests from TestMockOperationSnapshotRemoveRequest
[ RUN      ] TestMockOperationSnapshotRemoveRequest.Success
[       OK ] TestMockOperationSnapshotRemoveRequest.Success (5 ms)

...
...

[ RUN      ] TestMockOperationSnapshotRemoveRequest.RemoveChildParentError
SKIPPING
[       OK ] TestMocErrors while running CTest
Build step 'Execute shell' marked build as failure

#6 Updated by Jason Dillaman over 5 years ago

I've seen the log truncation issue lots of times. It seems to only affect the x86_64 builds, so I always try to look at the arm build logs for the full failure log.

#7 Updated by Kefu Chai over 5 years ago

  • Priority changed from Normal to Urgent

marking it urgent as it happens very often now.

#8 Updated by Jason Dillaman over 5 years ago

@Kefu: any clue / suggestion for fixing the test output under jenkins? The logs are clearly cut off so we cannot see the actual failure and this is a change to the jenkins environment within the past several months.

#9 Updated by Kefu Chai over 5 years ago

Jason, what was the change?

#10 Updated by Jason Dillaman over 5 years ago

No clue -- that's the problem. The jenkins bots used to always include the full stdout but now they don't.

#11 Updated by Mykola Golub over 5 years ago

  • Status changed from New to Fix Under Review
  • Backport set to mimic,luminous

Due to truncated jenkin's log I am not sure I reproduced the same case that is observed on jenkin, but I can make unittest_librbd sporadically fail locally when running it on one cpu. It reports a leaked mock ioctx for a "SKIPPED" mock test. A way to reproduce:

maha:~/ceph/ceph.trociny/build% while RBD_FEATURES=0 taskset --cpu-list 1 unittest_librbd --gtest_filter=TestMockObjectMap.NonDetainedUpdate; do sleep 0; done 
...
seed 18952
Note: Google Test filter = TestMockObjectMap.NonDetainedUpdate
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from TestMockObjectMap
2018-08-18 19:09:45.709 7fbe59ca3a00 -1 WARNING: all dangerous and experimental features are enabled.
[ RUN      ] TestMockObjectMap.NonDetainedUpdate
Failed to load class: cephfs (lib/libcls_cephfs.so): lib/libcls_cephfs.so: undefined symbol: _ZTI10PGLSFilter
Failed to load class: hello (lib/libcls_hello.so): lib/libcls_hello.so: undefined symbol: _ZTI10PGLSFilter
Failed to load class: log (lib/libcls_log.so): lib/libcls_log.so: undefined symbol: _Z24cls_cxx_map_write_headerPvPN4ceph6buffer4listE
Failed to load class: lua (lib/libcls_lua.so): lib/libcls_lua.so: undefined symbol: _Z24cls_cxx_map_write_headerPvPN4ceph6buffer4listE
Failed to load class: refcount (lib/libcls_refcount.so): lib/libcls_refcount.so: undefined symbol: _Z14cls_cxx_removePv
Failed to load class: rgw (lib/libcls_rgw.so): lib/libcls_rgw.so: undefined symbol: _Z24cls_cxx_map_write_headerPvPN4ceph6buffer4listE
Failed to load class: sdk (lib/libcls_sdk.so): lib/libcls_sdk.so: undefined symbol: _Z14cls_cxx_removePv
Failed to load class: user (lib/libcls_user.so): lib/libcls_user.so: undefined symbol: _Z24cls_cxx_map_write_headerPvPN4ceph6buffer4listE
Failed to load class: version (lib/libcls_version.so): lib/libcls_version.so: undefined symbol: _Z19cls_gen_rand_base64Pci
SKIPPING
[       OK ] TestMockObjectMap.NonDetainedUpdate (8 ms)
[----------] 1 test from TestMockObjectMap (8 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (27 ms total)
[  PASSED  ] 1 test.

/home/mgolub/ceph/ceph.trociny/src/test/librados_test_stub/MockTestMemRadosClient.h:66: ERROR: this mock object should be deleted but never is. Its address is @0x55c2350157e0.
/home/mgolub/ceph/ceph.trociny/src/test/librados_test_stub/MockTestMemIoCtxImpl.h:197: ERROR: this mock object (used in test TestMockObjectMap.NonDetainedUpdate) should be deleted but never is. Its address is @0x55c235203910.
ERROR: 2 leaked mock objects found at program exit.

I traced this to a race when completing `image::CreateRequest`: m_on_finish callback is called (and test finishes) before the request (and data ioctx) is destroyed.

PR: https://github.com/ceph/ceph/pull/23639

#12 Updated by Nathan Cutler over 5 years ago

  • Assignee set to Mykola Golub

#13 Updated by Nathan Cutler over 5 years ago

  • Status changed from Fix Under Review to Pending Backport

#14 Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #27060: mimic: run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run added

#15 Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #27061: luminous: run-rbd-unit-tests.sh test fails to finish in jenkin's "make check" run added

#16 Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved

Also available in: Atom PDF