Bug #9620 (Closed)
tests: qa/workunits/cephtool/test.sh race condition
% Done:
100%
Source:
other
Severity:
3 - minor
Description
osds are marked down, and immediately afterwards a loop uses ceph osd dump to check that no osd is still down. The following happened:
test_mon_osd: 600: ceph osd dump
test_mon_osd: 600: grep 'osd.0 up'
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osd.0 up in weight 1 up_from 143 up_thru 143 down_at 140 last_clean_interval [6,142) 127.0.0.1:6800/17838 127.0.0.1:6815/1017838 127.0.0.1:6816/1017838 127.0.0.1:6817/1017838 exists,up 16d58ecc-f79f-43cd-ad7f-074cc384e12b
test_mon_osd: 602: ceph osd thrash 10
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
will thrash map for 10 epochs
test_mon_osd: 603: seq 0 31
test_mon_osd: 603: ceph osd down 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
marked down osd.0.
osd.1 is already down.
osd.2 is already down.
osd.3 does not exist.
osd.4 does not exist.
osd.5 does not exist.
osd.6 does not exist.
osd.7 does not exist.
osd.8 does not exist.
osd.9 does not exist.
osd.10 does not exist.
osd.11 does not exist.
osd.12 does not exist.
osd.13 does not exist.
osd.14 does not exist.
osd.15 does not exist.
osd.16 does not exist.
osd.17 does not exist.
osd.18 does not exist.
osd.19 does not exist.
osd.20 does not exist.
osd.21 does not exist.
osd.22 does not exist.
osd.23 does not exist.
osd.24 does not exist.
osd.25 does not exist.
osd.26 does not exist.
osd.27 does not exist.
osd.28 does not exist.
osd.29 does not exist.
osd.30 does not exist.
osd.31 does not exist.
test_mon_osd: 604: wait_no_osd_down
wait_no_osd_down: 15: seq 1 300
wait_no_osd_down: 15: for i in '$(seq 1 300)'
wait_no_osd_down: 16: check_no_osd_down
check_no_osd_down: 10: ceph osd dump
check_no_osd_down: 10: grep ' down '
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osd.0 down out weight 0 up_from 143 up_thru 145 down_at 147 last_clean_interval [6,142) 127.0.0.1:6800/17838 127.0.0.1:6815/1017838 127.0.0.1:6816/1017838 127.0.0.1:6817/1017838 exists 16d58ecc-f79f-43cd-ad7f-074cc384e12b
osd.2 down in weight 1 up_from 12 up_thru 143 down_at 146 last_clean_interval [0,0) 127.0.0.1:6810/18282 127.0.0.1:6811/18282 127.0.0.1:6812/18282 127.0.0.1:6813/18282 exists c9d035f4-f848-45fd-8f56-16d5935d2d49
wait_no_osd_down: 17: echo 'waiting for osd(s) to come back up'
waiting for osd(s) to come back up
wait_no_osd_down: 18: sleep 1
wait_no_osd_down: 15: for i in '$(seq 1 300)'
wait_no_osd_down: 16: check_no_osd_down
check_no_osd_down: 10: ceph osd dump
check_no_osd_down: 10: grep ' down '
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osd.0 down out weight 0 up_from 143 up_thru 145 down_at 147 last_clean_interval [6,142) 127.0.0.1:6800/17838 127.0.0.1:6815/1017838 127.0.0.1:6816/1017838 127.0.0.1:6817/1017838 exists 16d58ecc-f79f-43cd-ad7f-074cc384e12b
osd.1 down in weight 1 up_from 148 up_thru 148 down_at 150 last_clean_interval [0,0) :/0 :/0 :/0 :/0 exists 4d383cb1-db68-4fa1-a94b-3f8a9931943c
osd.2 down out weight 0 up_from 149 up_thru 149 down_at 150 last_clean_interval [0,0) :/0 :/0 :/0 :/0 exists c9d035f4-f848-45fd-8f56-16d5935d2d49
wait_no_osd_down: 17: echo 'waiting for osd(s) to come back up'
waiting for osd(s) to come back up
wait_no_osd_down: 18: sleep 1
wait_no_osd_down: 15: for i in '$(seq 1 300)'
wait_no_osd_down: 16: check_no_osd_down
check_no_osd_down: 10: ceph osd dump
check_no_osd_down: 10: grep ' down '
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
wait_no_osd_down: 20: break
wait_no_osd_down: 23: check_no_osd_down
check_no_osd_down: 10: ceph osd dump
check_no_osd_down: 10: grep ' down '
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osd.2 down in weight 1 up_from 151 up_thru 151 down_at 155 last_clean_interval [0,0) :/0 :/0 :/0 :/0 exists c9d035f4-f848-45fd-8f56-16d5935d2d49
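For context, the helpers exercised in the trace above can be reconstructed roughly as follows. This is a sketch inferred from the function names and xtrace line numbers in the log, not verbatim code from test.sh:

```shell
# Reconstructed sketch of the polling helpers from the trace above.

check_no_osd_down() {
    # Succeeds only when "ceph osd dump" shows no osd in the down state.
    ! ceph osd dump | grep ' down '
}

wait_no_osd_down() {
    for i in $(seq 1 300) ; do
        if ! check_no_osd_down ; then
            echo "waiting for osd(s) to come back up"
            sleep 1
        else
            break
        fi
    done
    # Final assertion: fails if an osd is down again at this point,
    # which is exactly where the race in the trace above bites.
    check_no_osd_down
}
```

The window between the successful poll that triggers the break and the final check_no_osd_down is where the trace shows osd.2 going down again.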
Updated by Loïc Dachary over 9 years ago
The following sequence happens:
- ceph osd dump finds 3 osd "down"
- ceph osd dump finds no osd "down"
- ceph osd dump finds one osd "down"
Could it be a side effect of the ceph osd thrash 10 that happened a few lines above?
Updated by Loïc Dachary over 9 years ago
- Status changed from New to 12
- Assignee set to Loïc Dachary
The ceph osd thrash command randomly marks osds down and up, which explains the above.
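One way to make such a check robust against transient thrash-induced flapping is to require several consecutive clean polls before declaring the cluster stable. The variant below is an illustrative sketch (function name and STABLE_POLLS threshold are invented here), not necessarily the fix that was merged:

```shell
# Hypothetical hardened variant: require STABLE_POLLS consecutive polls
# with no osd down, so a single flap between two checks cannot produce
# a false "all osds up" verdict.
wait_no_osd_down_stable() {
    STABLE_POLLS=3
    consecutive=0
    for i in $(seq 1 300) ; do
        if ceph osd dump | grep -q ' down ' ; then
            # An osd is down: reset the streak and poll again.
            consecutive=0
            echo "waiting for osd(s) to come back up"
            sleep 1
        else
            consecutive=$((consecutive + 1))
            if [ "$consecutive" -ge "$STABLE_POLLS" ] ; then
                return 0
            fi
        fi
    done
    return 1
}
```

This narrows but does not fully eliminate the window; the reliable fix is to make sure the thrashing has finished (all thrash epochs applied) before starting to wait.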
Updated by Loïc Dachary over 9 years ago
- Status changed from 12 to Fix Under Review
- % Done changed from 0 to 80
Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Loïc Dachary over 9 years ago
- Status changed from Pending Backport to Fix Under Review
Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Resolved
I jumped the gun and merged, oops!
Updated by Loïc Dachary over 9 years ago
I will verify the results when they are ready, but I'm not too concerned ;-)
Updated by Loïc Dachary over 9 years ago
- Status changed from 7 to Resolved
- % Done changed from 80 to 100