Bug #18018
Closed
tests: ceph-helpers.sh races when killing daemons
% Done: 0%
Regression: No
Severity: 3 - minor
Description
https://jenkins.ceph.com/job/ceph-pull-requests/14885/console
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:50: add_something: local dir=td/osd-scrub-repair
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:51: add_something: local poolname=ecpool
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:52: add_something: local obj=SOMETHING
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:54: add_something: ceph osd set noscrub
noscrub is set
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:55: add_something: ceph osd set nodeep-scrub
nodeep-scrub is set
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:57: add_something: local payload=ABCDEF
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:58: add_something: echo ABCDEF
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:59: add_something: rados --pool ecpool put SOMETHING td/osd-scrub-repair/ORIGINAL
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh: line 49: 10051 Terminated rados --pool $poolname put $obj $dir/ORIGINAL
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:59: add_something: return 1
One possible explanation for this unexpected kill is a kill_daemon that is still running in the background when it should not be.
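To make the suspected mechanism concrete, here is a minimal sketch of how a kill helper left running in the background could terminate a command that starts later. It is only an illustration under stated assumptions: delayed_kill, the sleep stand-in for the rados command, and the timings are all hypothetical and are not taken from ceph-helpers.sh.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only: none of these names come from ceph-helpers.sh.

# Stand-in for a kill helper that is still active after its caller moved on:
# it sends SIGTERM to the given pid after a delay.
delayed_kill() {
    local pid=$1 delay=$2
    sleep "$delay"
    kill -TERM "$pid" 2>/dev/null
}

# A long-running command standing in for "rados --pool ... put ...".
sleep 30 &
victim=$!

# The helper is started in the background and nobody waits for it.
delayed_kill "$victim" 1 &
killer=$!

# The test carries on; the command is terminated underneath it.
wait "$victim"
status=$?
wait "$killer"

# 143 = 128 + 15 (SIGTERM), the same kind of termination seen in the log.
echo "victim exit status: $status"
```

If something like this is what happens, waiting for the helper to finish before starting the next command would close the window, but that remains an assumption until the race is reproduced.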
Updated by Loïc Dachary over 7 years ago
- File osd-scrub-repair.txt.gz added
Updated by Loïc Dachary over 7 years ago
Blocked by http://tracker.ceph.com/issues/18019. Waiting for it to be resolved so that we are back to something stable.
Updated by Loïc Dachary over 7 years ago
- File osd-crush.txt.gz added
osd-crush.sh experienced a similar, unexplained termination (see the attached log).
Updated by Loïc Dachary over 7 years ago
I thought Jenkins might have been rebooted or something similar, but the osd-crush.sh and osd-scrub-repair.sh failures happened hours apart.
Updated by Loïc Dachary over 7 years ago
- Status changed from In Progress to Can't reproduce
Updated by Kefu Chai about 7 years ago
I spotted it again:
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:150: corrupt_and_repair_one: rados --pool ecpool get SOMETHING td/osd-scrub-repair/COPY
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh: line 132: 20631 Terminated rados --pool $poolname get SOMETHING $dir/COPY
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:150: corrupt_and_repair_one: return 1
2017-04-18 09:39:52.460333 7f4e62f9fb00 1 journal _open td/osd-scrub-repair/3/journal fd 25: 104857600 bytes, block size 4096 bytes, directio = 1, aio = 0
2017-04-18 09:39:52.460608 7f4e62f9fb00 1 filestore(td/osd-scrub-repair/3) upgrade
2017-04-18 09:39:52.460692 7f4e62f9fb00 -1 filestore(td/osd-scrub-repair/3) could not find #-1:7b3f43c4:::osd_superblock:0# in index: (2) No such file or directory
2017-04-18 09:39:52.521025 7f4e62f9fb00 1 journal close td/osd-scrub-repair/3/journal
2017-04-18 09:39:52.523247 7f4e62f9fb00 -1 created object store td/osd-scrub-repair/3 for osd.3 fsid 030953c5-acc6-4470-a57b-301b0d90ec27
2017-04-18 09:39:52.523286 7f4e62f9fb00 -1 auth: error reading file: td/osd-scrub-repair/3/keyring: can't open td/osd-scrub-repair/3/keyring: (2) No such file or directory
2017-04-18 09:39:52.523414 7f4e62f9fb00 -1 created new key in keyring td/osd-scrub-repair/3/keyring
2017-04-18 09:39:52.728703 7f2b38f62b00 0 ceph version 12.0.0-2702-g540e725 (540e7255e8ad650639353cb8461dfef832051d28), process ceph-osd, pid 7022
2017-04-18 09:39:52.728789 7f2b38f62b00 5 object store type is filestore
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:171: corrupt_and_repair_erasure_coded: return 1
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:241: TEST_corrupt_and_repair_jerasure: return 1
/home/jenkins-build/build/workspace/ceph-pull-requests/src/test/osd/osd-scrub-repair.sh:45: run: return 1
This time it happened in "TEST_corrupt_and_repair_jerasure".