Bug #11930
closed
test/ceph_objectstore_tool.py fails with OSD has the store locked
Description
When I run "run-make-check.sh" on master, all tests pass except ceph_objectstore_tool.py
$ cat ./src/test/ceph_objectstore_tool.py.log
vstarting.... DONE
Wait for health_ok... DONE
Created Replicated pool #1
Created Erasure coded pool #2
Creating 4 objects in replicated pool
Creating 4 objects in erasure coded pool
Test invalid parameters
Test --op dump-journal
Journal max_size = 104857600
Test --op list variants
Test --op list by generating json for all objects using default format
Test get-bytes and set-bytes
Test list-attrs get-attr
Test pg info
Test pg logging
Test list-pgs
Test pg export --dry-run
Test pg export
Test pg removal
Test pg import
Verify replicated import data
vstarting.... DONE
Wait for health_ok... DONE
Verify erasure coded import data
Test import-rados
Wait for health_ok... DONE
Remove pgs for another import
OSD has the store locked
ERROR:Removing failed for pg 1.0 on osd0 with 1
WARNING:SKIPPING IMPORT TESTS DUE TO PREVIOUS FAILURES
vstarting.... DONE
Wait for health_ok... DONE
TEST FAILED WITH 1 ERRORS
Updated by Nathan Cutler almost 9 years ago
(11:29:50) loicd: https://github.com/ceph/ceph/blob/master/src/init-ceph.in#L93
(11:31:17) loicd: https://github.com/ceph/ceph/blob/master/src/test/ceph_objectstore_tool.py#L325 is where it's called
Updated by Loïc Dachary almost 9 years ago
- Status changed from New to 12
- Assignee deleted (Loïc Dachary)
Updated by Loïc Dachary almost 9 years ago
The machine running the tests does not have an SSD and runs make -j4 (nproc == 8), so it is expected to be very slow at times.
Updated by Loïc Dachary almost 9 years ago
maybe the pid file does not exist for some reason: it's the only way kill_daemon could return without actually killing the daemon
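The hypothesis above can be illustrated with a minimal Python sketch. This is not the code of init-ceph or of the test: `stop_daemon`, its parameters and its return values are hypothetical names chosen for illustration. It shows how a stop helper that reads a pid file can report a missing or unreadable pid file explicitly, so the caller can tell "nothing was killed" apart from "the daemon was stopped".

```python
import errno
import os
import signal
import time


def stop_daemon(pid_file, timeout=30.0, poll=0.1):
    """Hypothetical helper: stop the daemon whose pid is stored in pid_file.

    Returns one of "no-pid", "not-running", "stopped" or "timeout", so a
    silent no-op (no usable pid file) is visible to the caller.
    """
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (IOError, OSError, ValueError):
        # Missing, unreadable or malformed pid file: nothing gets signalled.
        return "no-pid"

    try:
        os.kill(pid, signal.SIGTERM)
    except OSError as e:
        if e.errno == errno.ESRCH:
            return "not-running"
        raise

    # Poll until the process is gone. Signal 0 only checks for existence.
    # Note: a child of the calling process stays visible as a zombie until
    # it is waited on, so this loop is meant for unrelated daemon processes.
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            os.kill(pid, 0)
        except OSError as e:
            if e.errno == errno.ESRCH:
                return "stopped"
            raise
        time.sleep(poll)
    return "timeout"
```

With a helper of this shape, a test could log or fail on "no-pid" and "timeout" before it tries to open the object store, instead of continuing while the OSD may still be running.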
Updated by Loïc Dachary almost 9 years ago
- Subject changed from test/ceph_objectstore_tool.py fails on fast machine with HDD to test/ceph_objectstore_tool.py fails with OSD has the store locked
Updated by Nathan Cutler almost 9 years ago
I ran run-make-check.sh again on a clean checkout of master. Again ceph_objectstore_tool.py is the only failing test, but the log is different:
FAIL: ./src/test/ceph_objectstore_tool.py.log
vstarting.... DONE
Wait for health_ok... DONE
Created Replicated pool #1
Created Erasure coded pool #2
Creating 4 objects in replicated pool
Creating 4 objects in erasure coded pool
Test invalid parameters
Test all --op dump-journal
Journal max_size = 104857600
WARNING:No ops found in entry 66 trans 0
WARNING:No ops found in entry 74 trans 0
WARNING:No ops found in entry 79 trans 0
WARNING:No ops found in entry 84 trans 0
Journal max_size = 104857600
Journal max_size = 104857600
WARNING:No ops found in entry 7 trans 0
WARNING:No ops found in entry 11 trans 0
WARNING:No ops found in entry 15 trans 0
WARNING:No ops found in entry 28 trans 0
Journal max_size = 104857600
Test --op list variants
Test --op list by generating json for all objects using default format
Test get-bytes and set-bytes
Test list-attrs get-attr
Test pg info
Test pg logging
Test list-pgs
Test pg export --dry-run
Test pg export
Test pg removal
Test pg import
Verify replicated import data
Test all --op dump-journal again
Journal max_size = 104857600
Journal max_size = 104857600
Journal max_size = 104857600
Journal max_size = 104857600
vstarting.... DONE
Wait for health_ok... DONE
Verify erasure coded import data
Test import-rados
Testing import all objects after a split
Wait for health_ok... DONE
ERROR:Exporting failed for pg 4.0 on osd0 with 1
TEST FAILED WITH 1 ERRORS
Updated by Nathan Cutler almost 9 years ago
For completeness, here is how I am setting up the build environment:
test/docker-test.sh --os-type fedora --os-version 21 --shell
Updated by David Zafman almost 9 years ago
- Status changed from 12 to Need More Info
I've never run into this issue. Please try the following after you've built source to see if it fails:
# cd src
# test/ceph_objectstore_tool.py
If it does fail, let's get some extra output by making a change to the test code and running the test again. Also, I'd be curious whether it fails in different ways if you try it multiple times.
diff --git a/src/test/ceph_objectstore_tool.py b/src/test/ceph_objectstore_tool.py
index 3bb1d34..9b6901d 100755
--- a/src/test/ceph_objectstore_tool.py
+++ b/src/test/ceph_objectstore_tool.py
@@ -357,7 +357,7 @@ def check_data(DATADIR, TMPFILE, OSDDIR, SPLIT_NAME):
 def main(argv):
     sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
-    nullfd = open(os.devnull, "w")
+    nullfd = sys.stdout
     call("rm -fr {dir}; mkdir {dir}".format(dir=CEPH_DIR), shell=True)
     os.environ["CEPH_DIR"] = CEPH_DIR
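To see whether repeated runs fail in different ways, the logs can be compared by their ERROR: lines. The following is a minimal sketch, assuming the output of each run has been captured as text. `failure_signature` and `group_runs` are hypothetical helper names and are not part of the test suite.

```python
import collections


def failure_signature(log_text):
    """Return the ERROR: lines of one test log, in order, as a tuple."""
    return tuple(
        line.strip()
        for line in log_text.splitlines()
        if line.strip().startswith("ERROR:")
    )


def group_runs(logs):
    """Count how many runs produced each distinct failure signature."""
    return collections.Counter(failure_signature(text) for text in logs)
```

Applied to the two logs quoted in this ticket, the signatures differ: one run reports "ERROR:Removing failed for pg 1.0 on osd0 with 1" and the other "ERROR:Exporting failed for pg 4.0 on osd0 with 1", so they would be counted as two separate failure modes.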
Updated by David Zafman about 8 years ago
- Status changed from Need More Info to Can't reproduce
I bet this has been fixed; if not, please reopen.