Bug #11930
closed
test/ceph_objectstore_tool.py fails with OSD has the store locked
Added by Nathan Cutler almost 9 years ago.
Updated about 8 years ago.
Description
When I run "run-make-check.sh" on master, all tests pass except ceph_objectstore_tool.py
$ cat ./src/test/ceph_objectstore_tool.py.log
vstarting.... DONE
Wait for health_ok... DONE
Created Replicated pool #1
Created Erasure coded pool #2
Creating 4 objects in replicated pool
Creating 4 objects in erasure coded pool
Test invalid parameters
Test --op dump-journal
Journal max_size = 104857600
Test --op list variants
Test --op list by generating json for all objects using default format
Test get-bytes and set-bytes
Test list-attrs get-attr
Test pg info
Test pg logging
Test list-pgs
Test pg export --dry-run
Test pg export
Test pg removal
Test pg import
Verify replicated import data
vstarting.... DONE
Wait for health_ok... DONE
Verify erasure coded import data
Test import-rados
Wait for health_ok... DONE
Remove pgs for another import
OSD has the store locked
ERROR:Removing failed for pg 1.0 on osd0 with 1
WARNING:SKIPPING IMPORT TESTS DUE TO PREVIOUS FAILURES
vstarting.... DONE
Wait for health_ok... DONE
TEST FAILED WITH 1 ERRORS
- Status changed from New to 12
- Assignee deleted (Loïc Dachary)
The machine running the tests does not have an SSD and runs make -j4 (nproc == 8), so it is expected to be very slow at times.
Maybe the pid file does not exist for some reason: that is the only way kill_daemon could return without actually killing the daemon.
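To make the suspected failure mode concrete, here is a minimal sketch of a kill_daemon-style helper (the name and shape are illustrative assumptions, not the actual test code): if the pid file is missing, the function returns without killing anything, the old ceph-osd keeps its flock on the store, and the next start fails with "OSD has the store locked".

```python
import os
import signal

def kill_daemon(pidfile):
    """Hypothetical sketch of a kill_daemon-style helper.

    If the pid file is missing, the early return below leaves the
    daemon running; a later restart then finds the OSD store still
    locked by the surviving process.
    """
    if not os.path.exists(pidfile):
        # Early return without killing anything: the only path on
        # which the daemon survives the "kill".
        return
    with open(pidfile) as f:
        pid = int(f.read().strip())
    os.kill(pid, signal.SIGTERM)
```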
- Subject changed from test/ceph_objectstore_tool.py fails on fast machine with HDD to test/ceph_objectstore_tool.py fails with OSD has the store locked
- Description updated (diff)
I ran run-make-check.sh again on a clean checkout of master. Again ceph_objectstore_tool.py is the only failing test, but the log is different:
FAIL: ./src/test/ceph_objectstore_tool.py.log
vstarting.... DONE
Wait for health_ok... DONE
Created Replicated pool #1
Created Erasure coded pool #2
Creating 4 objects in replicated pool
Creating 4 objects in erasure coded pool
Test invalid parameters
Test all --op dump-journal
Journal max_size = 104857600
WARNING:No ops found in entry 66 trans 0
WARNING:No ops found in entry 74 trans 0
WARNING:No ops found in entry 79 trans 0
WARNING:No ops found in entry 84 trans 0
Journal max_size = 104857600
Journal max_size = 104857600
WARNING:No ops found in entry 7 trans 0
WARNING:No ops found in entry 11 trans 0
WARNING:No ops found in entry 15 trans 0
WARNING:No ops found in entry 28 trans 0
Journal max_size = 104857600
Test --op list variants
Test --op list by generating json for all objects using default format
Test get-bytes and set-bytes
Test list-attrs get-attr
Test pg info
Test pg logging
Test list-pgs
Test pg export --dry-run
Test pg export
Test pg removal
Test pg import
Verify replicated import data
Test all --op dump-journal again
Journal max_size = 104857600
Journal max_size = 104857600
Journal max_size = 104857600
Journal max_size = 104857600
vstarting.... DONE
Wait for health_ok... DONE
Verify erasure coded import data
Test import-rados
Testing import all objects after a split
Wait for health_ok... DONE
ERROR:Exporting failed for pg 4.0 on osd0 with 1
TEST FAILED WITH 1 ERRORS
For completeness, here is how I am setting up the build environment:
test/docker-test.sh --os-type fedora --os-version 21 --shell
- Assignee set to David Zafman
- Status changed from 12 to Need More Info
I've never run into this issue. Please try the following after you've built the source, to see if it fails:
# cd src
# test/ceph_objectstore_tool.py
If it does fail, let's get some extra output by making a change to the test code and running the test again. Also, I'd be curious whether it fails in different ways if you try it multiple times.
diff --git a/src/test/ceph_objectstore_tool.py b/src/test/ceph_objectstore_tool.py
index 3bb1d34..9b6901d 100755
--- a/src/test/ceph_objectstore_tool.py
+++ b/src/test/ceph_objectstore_tool.py
@@ -357,7 +357,7 @@ def check_data(DATADIR, TMPFILE, OSDDIR, SPLIT_NAME):
def main(argv):
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
- nullfd = open(os.devnull, "w")
+ nullfd = sys.stdout
call("rm -fr {dir}; mkdir {dir}".format(dir=CEPH_DIR), shell=True)
os.environ["CEPH_DIR"] = CEPH_DIR
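The effect of the patch above is simply to stop discarding the child processes' output. A standalone sketch of the same trick (the helper name and the debug flag are illustrative assumptions, not part of the test):

```python
import os
import subprocess
import sys

def run_and_log(cmd, debug=False):
    # debug=False mirrors the original test: output goes to /dev/null.
    # debug=True mirrors the suggested patch: output reaches stdout,
    # so any error text from the tool lands in the test log.
    out = sys.stdout if debug else open(os.devnull, "w")
    ret = subprocess.call(cmd, shell=True, stdout=out, stderr=out)
    if out is not sys.stdout:
        out.close()
    return ret
```

With debug=True, a failing command's diagnostics show up in ceph_objectstore_tool.py.log instead of vanishing, which is exactly what is needed to see why the export or removal returned 1.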
- Status changed from Need More Info to Can't reproduce
I bet this has been fixed, but if not, reopen.