Bug #9074
closed
- File osd.0.log added
- Status changed from New to 12
- Assignee set to Loïc Dachary
- Priority changed from Normal to High
test.sh fails to complete (~50% of the time) when testing noup (https://github.com/ceph/ceph/blob/ea731ae14216bb479eff1f86ed6bd4a7cb71fb56/qa/workunits/cephtool/test.sh#L517), with the following trace:
....
pool 0 'rbd' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 1 flags hashpspool stripe_width 0
max_osd 3
osd.0 down in weight 1 up_from 4 up_thru 108 down_at 140 last_clean_interval [0,0) 127.0.0.1:6800/31456 127.0.0.1:6801/31456 127.0.0.1:6802/31456 127.0.0.1:6803/31456 exists 5141c944-afcb-42b8-90d3-e7344a6fb169
osd.1 up in weight 1 up_from 8 up_thru 140 down_at 0 last_clean_interval [0,0) 127.0.0.1:6805/31667 127.0.0.1:6806/31667 127.0.0.1:6807/31667 127.0.0.1:6808/31667 exists,up 30553181-6a93-466b-9372-08baf202abd5
osd.2 up in weight 1 up_from 13 up_thru 140 down_at 0 last_clean_interval [0,0) 127.0.0.1:6810/31901 127.0.0.1:6811/31901 127.0.0.1:6812/31901 127.0.0.1:6813/31901 exists,up 23ab6473-d56c-4b9e-91f0-4f237e2bb7d0
test_mon_osd: 519: ceph osd dump
test_mon_osd: 519: grep 'osd.0 down'
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
osd.0 down in weight 1 up_from 4 up_thru 108 down_at 140 last_clean_interval [0,0) 127.0.0.1:6800/31456 127.0.0.1:6801/31456 127.0.0.1:6802/31456 127.0.0.1:6803/31456 exists 5141c944-afcb-42b8-90d3-e7344a6fb169
test_mon_osd: 520: ceph osd unset noup
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
unset noup
test_mon_osd: 521: (( i=0 ))
test_mon_osd: 521: (( i < 100 ))
test_mon_osd: 522: grep 'osd.0 up'
test_mon_osd: 522: ceph osd dump
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
test_mon_osd: 523: echo 'waiting for osd.0 to come back up'
waiting for osd.0 to come back up
test_mon_osd: 524: sleep 10
test_mon_osd: 521: (( i++ ))
test_mon_osd: 521: (( i < 100 ))
test_mon_osd: 522: ceph osd dump
test_mon_osd: 522: grep 'osd.0 up'
...
Attached are logs of the mon + osd 0 when it is ok and when it is not, for comparison.
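For reference, the trace above suggests a polling loop of up to 100 attempts with a 10 second sleep between them. Below is a minimal sketch of that pattern with an explicit failure when the attempts run out. The helper names wait_until and osd_is_up are hypothetical and are not taken from test.sh; only the ceph osd dump | grep check mirrors the trace.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of test.sh): poll a command until it
# succeeds or the attempts run out.
# Usage: wait_until <max_tries> <delay_seconds> <command> [args...]
wait_until() {
    local max_tries=$1 delay=$2
    shift 2
    local i
    for ((i = 0; i < max_tries; i++)); do
        # Run the check in the current shell; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        echo "waiting ($((i + 1))/$max_tries): $*" >&2
        sleep "$delay"
    done
    # Every attempt failed: report it instead of falling through silently.
    return 1
}

# Assumed check, mirroring the grep seen in the trace.
osd_is_up() {
    ceph osd dump | grep -q "osd\.$1 up"
}

# Example against a running cluster (commented out on purpose):
#   wait_until 100 10 osd_is_up 0 || { echo "osd.0 never came back up" >&2; exit 1; }
```

Returning non-zero once the attempts are exhausted makes a timeout visible in the caller, which helps tell this loop apart from a similar-looking one elsewhere in the script.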
Wrong diagnosis: the error does not come from here. The test loops while waiting for the OSDs to come back up a few lines below. I was confused because the error messages are similar.
- Status changed from 12 to Duplicate