Bug #50921

closed

--nuke --stale needs to retry

Added by David Galloway almost 3 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
High
Category:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Paddles throws a bunch of 500 errors when attempting to --nuke --stale lots of machines at a time.

2021-05-14 18:04:05,793.793 ERROR:teuthology.lock.ops:failed to unlock gibba041.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,795.795 ERROR:teuthology.lock.ops:failed to unlock gibba008.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,803.803 INFO:teuthology.lock.ops:unlocked smithi114.front.sepia.ceph.com
2021-05-14 18:04:05,810.810 ERROR:teuthology.lock.ops:failed to unlock gibba040.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,811.811 ERROR:teuthology.lock.ops:failed to unlock gibba036.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,812.812 ERROR:teuthology.lock.ops:failed to unlock gibba032.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,813.813 ERROR:teuthology.lock.ops:failed to unlock gibba018.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,814.814 INFO:teuthology.lock.ops:unlocked gibba030.front.sepia.ceph.com
2021-05-14 18:04:05,814.814 ERROR:teuthology.lock.ops:failed to unlock gibba020.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,820.820 INFO:teuthology.lock.ops:unlocked gibba013.front.sepia.ceph.com
2021-05-14 18:04:05,833.833 INFO:teuthology.lock.ops:unlocked gibba022.front.sepia.ceph.com
2021-05-14 18:04:05,834.834 INFO:teuthology.lock.ops:unlocked gibba002.front.sepia.ceph.com
2021-05-14 18:04:05,835.835 ERROR:teuthology.lock.ops:failed to unlock gibba005.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,839.839 WARNING:teuthology.lock.query:Failed to query lock server for status of gibba026.front.sepia.ceph.com
2021-05-14 18:04:05,859.859 INFO:teuthology.lock.ops:unlocked gibba026.front.sepia.ceph.com
2021-05-14 18:04:05,863.863 ERROR:teuthology.lock.ops:failed to unlock gibba038.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,872.872 ERROR:teuthology.lock.ops:failed to unlock gibba035.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,875.875 INFO:teuthology.lock.ops:unlocked gibba043.front.sepia.ceph.com
2021-05-14 18:04:05,883.883 INFO:teuthology.lock.ops:unlocked gibba028.front.sepia.ceph.com
2021-05-14 18:04:05,884.884 INFO:teuthology.lock.ops:unlocked gibba010.front.sepia.ceph.com
2021-05-14 18:04:05,923.923 ERROR:teuthology.lock.ops:failed to unlock gibba031.front.sepia.ceph.com. reason: 500
2021-05-14 18:04:05,936.936 INFO:teuthology.lock.ops:unlocked smithi039.front.sepia.ceph.com
2021-05-14 18:04:05,940.940 INFO:teuthology.lock.ops:unlocked gibba045.front.sepia.ceph.com
2021-05-14 18:04:05,940.940 INFO:teuthology.lock.ops:unlocked gibba027.front.sepia.ceph.com
2021-05-14 18:04:08,558.558 INFO:teuthology.lock.ops:unlocked gibba006.front.sepia.ceph.com
2021-05-14 18:04:13,890.890 INFO:teuthology.orchestra.console:Power off for gibba009 completed
2021-05-14 18:04:14,029.029 INFO:teuthology.lock.ops:unlocked gibba009.front.sepia.ceph.com
2021-05-14 18:04:14,054.054 INFO:teuthology.orchestra.console:Power off for gibba014 completed
2021-05-14 18:04:14,059.059 INFO:teuthology.orchestra.console:Power off for gibba003 completed
2021-05-14 18:04:14,191.191 INFO:teuthology.lock.ops:unlocked gibba014.front.sepia.ceph.com
2021-05-14 18:04:14,195.195 ERROR:teuthology.lock.ops:failed to unlock gibba003.front.sepia.ceph.com. reason: 500
Actions #1

Updated by David Galloway almost 3 years ago

This is killing us. The output below is from less than 24 hours after I last ran --nuke --stale.

dgalloway@teuthology:~$ for owner in $(teuthology-lock --summary | grep -i 'scheduled\|vshankar' | awk '{ print $4 }' | sort -u); do echo $owner; teuthology-nuke --stale --owner $owner --unlock --dry-run; done
scheduled_gsalomon@teuthology
2021-05-27 14:07:23,320.320 INFO:teuthology.nuke:targets:
  smithi081.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCjRGF3quvtdo0B6C0wAzJqegMDA9OsGamtJ8J44FneU/cKw8t5DY5PY9RQeTZi2xrYT1pHrwbDu3g32GbmfsaiERI2r9QrdoXmVlef+uh37TL8oSxufkHaO1QmvdgSspbnHkw9Ke2mA9pL08mf9c5ZQxjo0h3TRhTz2d46dz8RsNsLfb5UDFWaQA+TJZR/4Q35b/uf9y2brFE5v0A7z4AdROC3O/ap6YHG8Nlc/CGvWX6FpBp5nz5odN4vTL8f/iKD5DtK70MkDbA0+J0vg2RzC7JdHGedT6wpCkgA0s8Ll6a1t3a+uiuLgO2GjcnWc2oVHWVw7R6qaexyWr+B+M8RoUmR0Il1CKuhXOs07J04lVRkYI00Hdzc5MT+bAce6TmEmkfXL7cM7LTpFrO+PI1WniaDIdZeaDUHxENP6S+EZ/RNXcP+rztIBUuFm4qqkyr/MmnIEgFy7/kJoAcx71ubHbETamjgcAF5AjFORRit2wQi0MHAdIolViAMDruCRxE=
  smithi082.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDDCQ0Xnd72XvraJOaebyb63WJV9jh5kwvxOZTfgDiuYdTBIaINm4J7rZ3kxefTEkpJDW0KjxaLNVjFiouSXzkW3msPCZairy+zK8yJPM7J0QiMRLYZhnWE0Y6IYF1xWrm/d+Q13V7VcKHlE9SfpIHbefVJogddvsgJp7SLuCbXT0S95WsuY2VymItt/cK/eakipFxya6hctzI8Lz2f5+dVL94hYNTvADyVJ+zsfUoDdFvRw8YF2P6SRTvQc/V2V4Qmvfo/NVxJnB2I+xz6Oxv/07/CYVVUdB7WbGNZKrdGc10qp7NKeJRE+xuLL2YLFBeJGStOZtdU+yD1ZRjWImlr/nxg0oFMZi7I4RZEvlBcGkCsGWVd5gS7VOnku4e0J4/0mt6lJ5i+cwU6pUnvClfVA0w3jJsKr3gqoErvkQOMO6G+4nu/OOmsn6RvblsWm39GeYA2SrO8auuIeeqDKf56vLb+mVPFuYvOcCpF7V1wfD2jgyPSjz7YSOymnKFkDa0=
2021-05-27 14:07:23,320.320 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_ideepika@teuthology
2021-05-27 14:07:24,268.268 INFO:teuthology.nuke:targets:
  smithi097.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC3nt3vS/XzimPwf86fznbN6VFhYe2P9xaPKmHDXJXV4+/xvOoboGWV6owh3qDaNvdEV9wziVibNjlF0iMNBF0kLJ8OhVuwun2EfDFmDFnRnfupHuiMAlh8yOQl4/NOv5m3TLHNsiGELH8vavsfeFUEHuEV15arjwHLIkzBAaYFWjQNB6/MhnVrPvhEJJNVb5VbqHlvXupv/d0rVylU7azFAuR+49fXqbXsj2y8Cmomtm3NwLY89uglHDkkdSNIACLswKrRtP8lmdJPDiuL+f8WvOYEPb6aw77Qss6wIOeU502uzyzYfic0IqbOV9Zmro1t1AB9cEoF6ymz4A6Nvd8CrYs46DNm8QfoXVpWZ2HwlS0P0HOIlLnEmKNa4q2nScY38c6WcLiodYs1C4yGFKuqzgMoqyT2zE38JKiTVz22o4swYxBrACOBhVXHLRXO8PxVRqHq33kiT9whIWUWBbV3HEOker4XR5pAlKzoxEhEGaTp/YjlKJr7Y4HFfc7HxH8=
2021-05-27 14:07:24,268.268 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_kapandya@teuthology
2021-05-27 14:07:25,274.274 INFO:teuthology.nuke:targets:
  smithi171.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCnedXVhGCWtTNU6wefo1v3DLBq2/iNjomkX/bOdEklTEZ07v101ugtijfab+Rg+zKlPAKegcvN5uy0syv/yZnl/App7/seR4j7btHxcbuQ0tKeo9Ew27yHZi5TE1QkxSha3L0zA2u4Yd9Y1/yXwAuaU64P/fEiez9SeurQuYBpIz6vsxXeIUcArd9P9QhZbV2jWCwugeUQTnftJZ6AwA9+FCHE+PpSp8fHvcb9N72rY/uhipTsqP/8ScBLFOUi6e83qcZA6mB7hfTXVxDqrtAZ8r2rFSOXV1h+PXhTOqe3fH5E/teI4UC/fOWMv5KMEsnuS+3rI8eTqR5MGs3WDNfUeHV5WtAFB1JQ66RmwU7PyfVghkkUvNFUsrJ13/cHdOe4x1Yhtqv1JRAwQBDXhVRQbb6di2USIMYim924OtOaOKmIt245/m0Ftmc9YeoyIT2b+Oe4rfEcAOzLt1Q4VLZGIRDVYnfn+CWERc9OkUjIluHfgFX2RzbpixzVvHlSJvU=
2021-05-27 14:07:25,275.275 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_kchai@teuthology
2021-05-27 14:07:26,213.213 INFO:teuthology.nuke:targets:
  {}
2021-05-27 14:07:26,213.213 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_mkogan@teuthology
2021-05-27 14:07:27,225.225 INFO:teuthology.nuke:targets:
  smithi132.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC0PfYbB512eLcjCfG+hbgzZgr7VIGrY2mlOGzVi7NdZG8GTOU2A+b7vH9+KVs02WI/bPrMIt0iZz6IaOw8o0t55k5dfwi+y0kP/llqynui29VTV10xlt/T/IHRCfxtY273PWI6t4cQNRvWMqD8G4QkN2Cf03td65oQt40OS3Atydbup+vygmwcFVWsLByQsqyCG1i5A3R3mCt1FPeP/suByyGt22rcFMsK5ok9GInKQGh+kqHNQqefVTH4nD1so8bhivVItiCS39v1n1/+DaDsq8s17TNJ9zDRP2AOb6So0WSR/gWcPU+ksdVjhc1tHvFDj00oD/Ah+Un1GZkmY6a6bIB3ZQ6wSof47l58OlgFgMw8yG+UnJAoOGsxns8ZJK60DkUFnJfLXr8O1tFYvS5HkkZknxFmHdTvsjIA3B6jLivgQzSffsXb0mN7/anOotmGq16pveQcsQY1UzTS1TL4P1KzYQ5w+OjgUTJQGZo/maCf1OUcTwnZvV9pU2Ul2xs=
  smithi185.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCmWXnoDKv2IjBEpD5UMWsBGKr+hrAWtLNvTm4o+tSUz+4MVgWsQHfF1AkNrnU+3D4IFOjYFdUowIPw68hJHAuvB3jVf2iIG1YVcZXqISFCAOld7fHLmFCiDl8Dx7t6LU+K+KjQqLH64IxuL90Hqdhl1k7iYrIsoSUI2nx0NdZVlQp9anYbt0bvZbuBhfAGbK0bQqDD6vk1a/PvAJFo0waAAGlhyNE+oUazeR9OT2xNEwIUytZxHxGdAp/yyoYaPzk6sWMvV7bnkG8QR3LQJCjYBxo71l18ULZSWP7yD48tKQ0vb7ADYChVIIyUR0h9jaqqKgF7K+ZUqrEsCrK5EZNYgo1sEDLWCTdseRdzJ1MCt29J6y9o+7vEXjlzOWjmBBv5pzgxhWuTxB9Ls4lAh5752wrMBAjKeaUmexliltQZs2pduhFEafDth92XFZPXL9KoK4yNy/NBcIv1/UqDevri5SdNUx5J2CWYvw4KdSgadea+eR9PyHpB8Qm6y+BUc8s=
2021-05-27 14:07:27,225.225 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_myoungwon@teuthology
2021-05-27 14:07:28,176.176 INFO:teuthology.nuke:targets:
  {}
2021-05-27 14:07:28,176.176 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_ofriedma@teuthology
2021-05-27 14:07:29,177.177 INFO:teuthology.nuke:targets:
  smithi041.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC3rODtuCX4tjhzAqkKtLkqR0eVrQvOk+Yfk1U79IXy6Owtf/8jxfq9WLxjtL4KJg7JFrYrEjMihvsjcBU83jOrP6uYDDRKECX6HRhhImasgS1UrJ5f+OtafHxZ4lmvVjcGrUFShVxo3g0jOb0meXlXqNeP3vTb0togt5i1tE8jCVPLEs+iz0mdM9auC/QHVenEuwJH3WEnJWEXGdQBcLKmA7AV9ksFdyCtyxuTtiJG27IztE5lRDXL/mA0ZJBV3cAAb55VN2PiWMGs4MXmGD7Ms4MlRTmU5LOJ8ud0CbBQqZVoQCxleXflSsuEdJmmeWWkQQKIjHgG/cuwhtkpmhUpHFbH3i+Ai4RtjbruHOaH//1GNaTIRJZ0eplQxLBHytdqiWX974KFn+v381O4FxhveBZ9Nrh2hfBOOsxLEngvHXP2f1FpYOZ/OUGDMYVtdmD+msykfrkSRQRbzORtTazYnFfMerFT7MAKK26IJO788mqCHpvIoxb7lWFBcYJEFes=
2021-05-27 14:07:29,178.178 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_pdonnell@teuthology
2021-05-27 14:07:30,122.122 INFO:teuthology.nuke:targets:
  {}
2021-05-27 14:07:30,122.122 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_rfriedma@teuthology
2021-05-27 14:07:31,092.092 INFO:teuthology.nuke:targets:
  gibba030.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCp/pFRHuyyVh8rlXt9/8zP6wNPOUcDTKnQUDtyLPv1xYnb0k7t2q56auS4L+eEjdC2qwKwVj9yL5McI2DiHyJPpIpFpmqUg5ybw4GvV/+XVi6NsqF1a2fpwcPV1AzxHST3X12EmU7KwLZr7LFB8yu2F2wQ3ukij8sX4kCnMO9IcZ3MMAA0TLHRxc7tHW8jT/pclBCnwi72DJPSVTQoOQcwPhMRHP0EkErzlCxGuHcwTQQgAstGS/mV4SKAsyLk2Vjg0fKwqjdMYieoC34NkycFZbhiRfAA4/mndG/BW1dCtbETbm1K2LKc9RSxnflwn++f9tPMB+OuYXHSb10Loww0PJuO6Dg+TaR+c4Z4tVyXjLEdzG5UFH+YQ1F9HT+z9GOtlDrpmZ9id49VA04LabqtBOfq73NfdOnqHSLl+Km7IdsV3AIWgVlb+Ds6B9488i4FZrkzxpeyLXkXNQEii9EZRsWE5tnw8wtNzrcDurGTAY9XrYgwfAbT9iyX8Il/8E0=
2021-05-27 14:07:31,092.092 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_rzarzynski@teuthology
2021-05-27 14:07:32,084.084 INFO:teuthology.nuke:targets:
  smithi110.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQD0OFAKPNZbPN3J+RN9xZOAPBio5YmfS0vj8BYlfB/6mLmtXZFxXNZ2recyKJFTemChLkLMfwrkTYyM4wBvJA8J0sAA+cHljtLgNiYWwd4ApwGfhaGqPuRTZjP621RxqxcVLe3POPgcKF3q6DulkoAfawQh1kdPvtEWtbsCGfLKxhBHRklpGN1TaUCeXuTAZqWe8e5uY/PWyfuqIVU06CG85FSGClU+OXqkckXaYJpKEJew/3yQ/szzv2Ivtk7coLI0s9JAtyKfTf/B/h7cV3satV6W9D5Zoh9OPVaU8GDE7kRMnZMOjdHp4zkAiXELTAVkq5X+1TRxfRJ8gVcmrqb45WFt8cj0KYI+vucdZDLoGdYQjrutyQuAZg4ahhGjtxChkjlb4nza/6Dm6ieSdahYHGqFKDtYGbwTDpHSbMhe3AERgwHLrbhdrzzO32cDzyCcpn3ghWVDnVqQMMQs3sA82ILLkfQ403awk3uqlH1U0TBTeHwQ7fjxcQPuqwYKjKU=
2021-05-27 14:07:32,084.084 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_sage@teuthology
2021-05-27 14:07:33,544.544 INFO:teuthology.nuke:targets:
  {}
2021-05-27 14:07:33,544.544 INFO:teuthology.nuke:Not actually nuking anything since --dry-run was passed
scheduled_teuthology@teuthology
2021-05-27 14:07:35,250.250 INFO:teuthology.nuke:targets:
  gibba003.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCld0hdLKMrYfqhXRPeI3Jt4gFSPeVnAzya3a7OBXInfaiYAmRR1kKzpEBsgkMTBLmf9mEYGiYwBPXlzp3jHGSMiHsujAqkJUp9Li0T+Sk0hXdy/14IzZPIaOmR+6hKSB3T9S1h5adUOh7ziaFVtHl3IWes1sg56qjNuV/fensHkgnww59QnH4WuYX5UQg3Fky7F0angBxO9Kgbs7RbEMLoqPayWE1NxvWCnSM9YS3kSOlxNbggtfU2dIfm4IxTwLgCgqfk91FYPqaFPY6mPePZloIFREvQtcMqblFUTJXvinmDmtMgYe0X/GMlnidvYT/mAIqAuyfIxI75Wjv5L6wZd1KWPhf0rE2WZ645btD0kznJ9hLyzRHpayFdTIOrEahAW3GOsNx5gyBpdRAKDAbCKOnWOyKvc34HENl3m2oMqd8u9d3TrbOUpoe5lbNzOFqVJuwBaidSNcg2xuRYYukxtT5Tl5txzTfDSp+sH90SNAP/VQgxK6p66EjgWq3pLok=
  gibba005.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDfLj2jT95cDp/IB1dhkTzPsZKihcSkHH+rK/1Iy9BC7W2F85ComBGa1tL/XiBh7fPXSHHbuEBewFnhBdJF2nI990XZRi3onGMBV3obQivtjFb26KzPyW/tkENpbSrYNrQ37cMPC8zKxnu7leKXBO7h/XxetRDQLJW9nAmKZk8BKggnnj58Yt8Ts2LOzdw1pOoR0b/i5IPWgyvT31XNklo44QxWpmX4BDOaTShRDaDCzmGq85TM7X21TaCJjh2BuHIzBvM081NGDYN7vt6ah+sEv/TipIycMQwEJt6irpHKYITEdgcdnSMq7Q7yWsoqRxwxPA6U7CIcIWd9/B8x+8Xr
  gibba011.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCkmOIsZw76zv+3sukgiuDakHRVt0zRRUaDz1zMIXM3FFXZfse+4g6+uZNX6vbLxguO3m5swvA9sbKIFzWcnkPyfPkGCB7d0fwarpTev92y4wYtqdVZ7eqJM5SaDyyzWEppHZBLPWX2TglUgax0JReXaHcDX5SSIr18EHt4wv33Hk6d/0Mc6mdfo2lbq8ETIYA2b5Sl92s9367TUywY2dIr0VEzu+tR/41xOIsaxbsDMpiw0u23tQkSh9+nzVlQv9G1Wl3rJrdRVFklVvQ6Wg1FLEdtCKUqSj2OnW8BuDUSZBEawn3Lb+xyCX5eiY0oy6ey2Zfpd5J1sq747y1OgRXlSZRqo+kBOsBgqqhWhTonhh53ylXbxKO/a5LsjjitE6dd7JOMIOjATmYCqUu++v0K/zOGVSEsOgP5c/8X94r4QpDp3QWEGerF0crYnRHH5j6HaHO+wmuIfmsUOJPAnSm6a4M3SBVzAzNRLgsP4td5nwCVRpK7SeGn4FPOQkcQ8cU=
  gibba014.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDpQAOzuQgjsSxIawFO3gd8th4WBWPaUMnWE3Uy3AwC52PaVCVFqbKIeFxehsgeZiCkYdDwjLHLA9E64yn29zWPf8RLU3uLcnyj6eMz/drZZhMjbLsQm7mf7yTx9aIuWp2svTPIrVRQ+j/5gd3qzi58lGVB/xOoKwMoXsIJf2BNGsoQe1nAltgVCRu/lGLKS/iDZxdqPsIE6WMGxfTHKxb7PIp1UG+mvY3K1/EZvBaElBraymBcrD4WRRSxQ/9p2mu32feAYR12ULHFB7LVbFawqCuwoJFG6mHVGY9TalMcrD+I9qKTxAkT9S1xaSspvAzsOBMkufFoJxoXmGy+rcJXG1fPs18MAjslB2FhAXKHiwltN3xYRYLthNmzRTPGDzglGP53sIkKg8L4UQgVb8FQLo7frINON3fGd1jue3fNNKqTxFJkOpzLwgTtgyx+8gdTsYtfay0OlmS1ZwXvC76mLMEsof6P/S7tYatbY8GT30+tnE7HkR6DvbIJALmeHCU=
  gibba031.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDUNyeAm9eVpAQA7q7QDB66UhkKV/MvEFOqMgiJ6pyMC/Nr8Uqi6jGHxaOpAn8nAaYfBQjdg7ATtH5qstYo50aeb75aMl/pNrxa7Naz4BJPTFjsRQVZ8/gLNbmyQW6FKmB5+ajtLfCtXdZ9A0feKFOc53Y4R4YfNfEGud7iU9jNK9lFtUpVYk1QvDp7h+V1+bgacuXC/aXo0hRcdNKrjeBYHpuUWlIhocM2upR5nAojA+qgfV1QCU59YDA2d6Q1akhdPwvEmBJ/hnkACD54vdqgS5X1slwY5SotnYi1LN/yABJWv9BEzOQC5SHyEHw0x75bxah7upluy0uxGG5muCc7lm/zqtS7GMuxq+pJW+ZJkPGzZfsnFM3b+nHvVD1zK3xqML4r2jF5bEUZYCvclLwjfTYvDxL9SAyw7GJHGqWY5kOQW1V6wJUbqUnwQKcSuteByJlWVtr0ts9HyqmrXbLZgxvCIBGzH7PweKWPdINinS1ZPdv4sae+u9v8GQIgyvM=
  gibba034.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDXa+O0TxqAY436A1QUd0g2P1heTgulxdZCy4AQELjZ30NqXsoM/aWEcLGuDcF4n9/8R/obG51en6V405e2Qi8xGFs/WvvqDyu7PLzCQttKB2mmxQ9rUM4hatAx2cPARxt3U+JL+RS9QJQ0CHUMY9GVwNRh7nAgCko1NGppVc04MPubcUZM1bGuZIXD/V01OT76Y0Mgaq58G+1u0OSC9jbs0p0HYZsNw4Wb+i4yMceZVzDCpsJDaO6OpWXTC2b0SxYQjGDRMIhINpMY9n61hln50kS3nTPRk4itF0A+GiEycXvOL4g+o0V4GK827AlaZNr7jkWzil6ScqNKNlWh66pGdvVSE85iSokmCCp/T1R+l67A1J/GNd1QI/3lWYKVJInKIruqD6LkELDCVbLTE884EBwkA41ZSTP0Ce6GsWLFarHfjTuc755kHwkQ+5nfcxQ4ZsQe1s5grHYzMT7VVW330eFWax9JWmCqd57OgOYot+zXYE1mGzyT62gvubT4im8=
  gibba037.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDiQMl3x0INlBfZHDX+lQb4u5XaL9CM8wVuFCrSaNifdHN0bfn5GjAQ0io07CGVxshQC7J0D6eRc1Ui66C3bRSfCq6A5oln9a/eVV4gMf44HAWtTvOpseZt8d41ZENV2mRIeDvwS/C7y0hiiiwL7L4XBOu6wX4SBHKXkJWqLPovU9buF2z242fPEnEd6vjhS4XTrAOIFPY/8rA9dOu9ei83WAcK/uZSOqEDv8FknYV0+bhSxWqn7IuuavYL/mYzLp9aDa2dVhuXphW1AtelL6an8StyfOzAfOyIVP3XCTf0DKueDGKZW7kF3xk+a6994ch0tAIa+1VM+KbZtooDZ/Ah
  gibba038.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCvyN6UzgfY73JsY4CaPF5GHIXEz5hr0SojfNz+fASwE7C3jUirx5JU3ORPTyj4VBm7IlJdaCzs0fd4XxLCCemxgKzPCdVuLGmR+1Er14iCjEKfKAqYHNc/I80m/0GH9BcuomPy5MoU0Bek/lGygktQfV+V+Mubwp9TPqhZmUrodWKtPTIl55A1Vt+bP60YJkYyrV4hIUpx26eFZUbH+uAZOGF1ET6e2LuNaPaFk157CvHsIv92mmXejqNadRRBgfxpyI5fh67YNR84VIWo+yEUacphw5S+fihgFe7q17q3gIfvXyo9wSf0FYgVRzze/ClIY2gOK+jx1jxECN44Qu4WpGfJMAYVfmlNNRA2njk2k+ewSKFzj539YaU36F6hSDq4m0ps2CG/lILKK2UKkgjdXS5T7gDKw+uglwrwqR/CLW9XbUeWnlNh5D8YsKV0esjNW0oUciZiqcMTrJKuP9oj+hQ3iPoj+6Nsff+5xukYpKFT/zfiYLInFwoAhvKRtzk=
  gibba040.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCz4t+r89sadK2l8kHV3+miBfug651GWmWIohj6cH/e3KLgNJBPA7e/9oQATGkX8SrSTJxKvrByLWCLZaBxpjkTTX6JAGql/9/kJTFELUD/43RymeqZIoY50F2TJTSNyMY85+uRyO50GuX8qDxc8t6n837xQL3oRWvJnCYhrhuO9/iYd1R80FMKXpKfzwE/p9T4Mj34l6jkdVPWQC+hxhcknXnG7l7rdv3JdhZV+QoffGw+k2bGbVlAKynmxSYIA/efE3Mf5Bo9TOn0o5l8Mg3DOxjb0PS9m6+qluLlzwgOhck2VOKAbCazN5i1aN2LpDU+9bYL8cSf5hz7yjQwHQznm1iUHMKe6ustaoYMoEN1uJdyljRz8duUr1YxBYgdTl6zEJDoBdPzhxrPBQBxFHTEYlEOCS3mutfNNcYfkJDh0iagn+ypyS7oF3XwagjcXQvAB07vvl7M8cQKHqwycJr0cp2HoQrzrn+Esz/JSPVLjuZ/e6ajYZwioHfNhcra9qM=
  gibba042.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCvcvbwNjistr84ZeAYBSe/RuJCEt7WXUQu+msUPlov4Nr0n6nb8WxH2LkKio+QtXIrSeG21xbhdXUxSPownGhxZAQPHOpKt4rEbIL7cQWc6Qt1Z8eHkqKLbNBu4Sa3BGUj19Dj9ORQIhgMI8i9UGZOi3N8H8bko+VMfJXTn3cJiMQf6A+pFQi5XcIfaMQfL3AsiLDOPw+SoTiNiC7r3WseaWP5a/v6RCvMVDR/Isy9H3xWC+C+El1sJm7kk87/rnszSIiz4ozbt6qdy+j8u+ENJ13hUp9t/2HXkS10tXXyVyzfLAEsqx0rKOXoopjHpKIeOdvmmtANT0MHEsOzuES8dGowJ3YaSG4S3mzVYfn/JCL6S0LxhREX/Pj+EXOFw9tGthOl7GPkxOIvWOkLd7IfybQNvhcnv9ye+WeawsPP3PsU/aFiMhcrARc4JnvYDq6YeTjYDP9TXPkUZN5hs2RlAatjupaxB7PU4P34vxpynKzzh2oHRDCQmetqoV/vhTE=
  smithi005.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDNc8eiEzpT+1C4Ux1mKGSi2ngie6RibD30k672ucLrakfRDyF6aIbDBFOQPg8pFPy6w+hyZ5Iz120aOzp9yR9qjnnGC+PFkJhQZRrs03WjzUA89gejoSO/GO6Wllx7nGXXY3SJOYz8uSCrp0/hOj/gZM6pl3bAKUo4kX4osd/I2evX0bG3/dwr2m09io3YtWWD3tdfF+iROtECxeD2kMYKS2hA/UoI+fbsqyYA0OQ3AvFFISwuZCXWBKD6XEX5XmoyhoLe3V33n8JIszu6jYHZamxvVss8j8me54rVDDZUyMSijM8wQ/8AYkHzK/GklootrUHR232D2djdu9MFxrVZ
  smithi053.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCjjgHiWJpWAcsdAlz+BhdtMF3XE4GIYdArSxyCY9ZIfRzafFAg+R9BWkZoKwWDZphGstqx7GIA2k7FtSq0o6lO5+fP/rSdO6AKwk4MPm5k8zbsDcrHk5NY0obML/QZWNvrRS7+GgrAHizAdGpgh3AI1F0zz7Mlalh4nfsZjarNw5Pz+qrTNpvhNHDQc+WpV7jwtqbJJPNBtJXvvfWaOySITGD+x+p0suU83mTtacruIZmh2EPgukJEXu8+i7qkTYeRs1NVr+LcbHNDNAEZwrIGOzjd9WWf1osort3HUzCMDiAtqaw4eQUwtfJWm2LwWwsH9k+Gc0ewYGkbgWOOclUbGQkfLGoKGVtEuDzCxyH6uw3x92joTO9ZWQyFUJr9HeR2pY67jcgXdSeImG6EdMOMoyZq7hXw2LzHmdd8obV54RxAdZMGPUf4zyWvikB/HDu28QjI3S4thVYjlFbTDe0F2zqndrvbs2myr8IsOavn23xo06l0aKjvyZYKsuseaAk=
  smithi074.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQCgM7XkVjXwrJeQy50XRpEp2B5qYcZh7+kWl0wQXfhxQcN87DwZNneF8vsqAlx7suaTo+cGBFlUqdY/YtcCaHhfR1M0fasRmyhXLETtBdv69cZ5/Tp7mtNS9zsK02D9iB/HJYZ3kxg+NP8Bpo8JH+s6Rt14vlOmplaVFMhDaQXWdNbdS7HCKPNpcCmZZEe8k1xlAn18lVT4PlnGY0zL86za1rdNRggHHrYGFau3IByDPRABYo8ZGSDwP9UMN4PTIiYy5peK8YXKSWLfeeIb5Ip5sPsRbW5bER8pQ7yej0k1TlJAZkaIKdS+SGkSSEraXAM61lqazYk5GCFo9BA+cgwr7yG9rvksCD/0rNRWNNhPxxfYgsKLO3J2BZMDMSpTpTTKaCNg5ggCj2oXtdfeJI0LnPBYOWt8lkKJstjdXskERytL7g2KcVxWz7BzGObHKXlJo5CdihcNPvG0hL88Rf2L8dOsXb3fVP9mxx/Wu7aL7JiTkErb3yglbzupW1B1+qk=
  smithi092.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDo8qJMWHgHQDZEBAgmGwaAzlvLMR/9yGwsd0gR9OyZD2Y/MXhYva/5hHwvwhds3KTb1ylusEX7ZNcxaDQh/K8n85JYhP8M7tZ15WqiQ9M26CLb+5IWlKwckuKAllkskuwu2gpQouk2KDhHAbhpYqA6sYlpP/BEKUcLRhQvUb4E9DU0f93ekd3EgdsNa84O7P5Po2JC/QNm0l1qHEmZK0jF9rBl7Eq3tmMT4sxSH8r43Z4E2Zn6nO70/UVlCmk0W0B9AZgbughLNfnO626ZHMkfrsBydFfxq5g5uPCrpeNx+ZTu12D6fS3i+QI2JzYEEAouub0D7Hpu4EZiBM+awlRvmKe7aeb1v4gNkhu5IMm777VfMhzdresoEUtq15HhVSJSl9hQCgc600HEqOy68290Lag601qHsQa1ZbpbL6eUIrL4oAWItA/btc0YerndgvB4NwdP3r6cBnTYBSvm4FFo8Qqp8sOWupgE/dbTRKnqmaNPRRXdbT34pCknJY8OzmE=
  smithi115.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQC/XmjH+08RDSc/AwmmCWRTWIz//I9qR2yZCTsTKMz6XhgxbXFX8ZNvTyemWR1C3bqZH25bwnyMBm2sdf1O3zF8L4ufKshHeO2fTZ42Wg+OA4tjdH8Zgx5F+TDXClxFLLQwgK3yITWx0Pt4ocLDiNQ2lT6gKUArLJ3oHzUMGUlga8K5kVi+/yO+8HDNqD0LEKWBiemLSgcxRbhsL5kcygjwRcibRgo+8VgEOLHVJNpCTHg5RaEVvQ7CqD0L58fhxXUhrVGINp6oLJqUYb9Ff7fR3GG09HVRpaSiJaVE67M7AIjCfdXqpDAL47NkNEX7FsVILIHrRzJM5Qr4k3VxoIbE/viKKX04rpTekgWvnzfsNlxYCE78CrqETK6ZKGPBQYOjbeZY13URvSE7AdbaecfF7JjRHFZuFwGh3625tnfmrSY/BZ6bzPf5sM1WRTfNa1BSopKTOrNlRXwlWAL91E/iAycDrZ8ixai7hAqaqQaJb1aA15QOWkjT0AqOoyjlAwE=
Actions #2

Updated by David Galloway almost 3 years ago

Rishabh asked me:

I spent time with this and went through the ticket and PR, but I am not fully sure whether we want teuthology-nuke --stale --owner $owner --unlock to not give up until it has retried a few times, or whether we want it to retry only when it receives a 500 error or something else.

Along with the ideal behaviour, can you also please leave the recipe, here or on the ticket, for reproducing the issue? It would help me out a lot since I don't ever use these commands or review this codebase.

I personally think --nuke should retry X times if it fails for any reason.

You'd need sudo access on teuthology.front.sepia.ceph.com to reproduce. I can grant that temporarily. But what I run to nuke all stale machines is:

for owner in $(teuthology-lock --summary | grep -i 'scheduled' | awk '{ print $4 }' | sort -u); do echo $owner; teuthology-nuke --stale --owner $owner --unlock ; done
Actions #3

Updated by Josh Durgin almost 3 years ago

Essentially all of the paddles calls should retry. Most of them do already, but a few used by nuke --stale aren't wrapped in safe_while() yet. That should be rectified à la https://github.com/ceph/teuthology/pull/1633
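
To make the two options discussed above concrete (retry on any failure versus retry only on a 500), here is a minimal, self-contained sketch of a bounded retry loop. It is not teuthology's safe_while() and not the code from the linked PR; retry, UnlockError, make_flaky_unlock and the should_retry parameter are hypothetical names invented for illustration, and the flaky function only simulates a paddles call that answers 500 a few times before succeeding.

```python
import time


class UnlockError(Exception):
    """Hypothetical error standing in for a failed paddles request."""

    def __init__(self, status):
        super().__init__("failed to unlock. reason: %d" % status)
        self.status = status


def retry(action, tries=5, sleep=1.0, increment=0.5,
          should_retry=lambda exc: True):
    """Call action() until it succeeds or `tries` attempts are used up.

    The delay grows by `increment` after each failed attempt. By default
    every exception is retried; pass should_retry to narrow that down.
    """
    delay = sleep
    for attempt in range(1, tries + 1):
        try:
            return action()
        except Exception as exc:
            # Give up on the last attempt or on a non-retryable error.
            if attempt == tries or not should_retry(exc):
                raise
            time.sleep(delay)
            delay += increment


def make_flaky_unlock(failures, status=500):
    """Build a fake unlock call that fails `failures` times, then succeeds."""
    remaining = {"count": failures}

    def unlock():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise UnlockError(status)
        return "unlocked"

    return unlock


def only_500(exc):
    """Policy for the narrower option: retry only on HTTP 500."""
    return getattr(exc, "status", None) == 500
```

With this shape, retry(make_flaky_unlock(2), sleep=0, increment=0) succeeds on the third attempt, while passing should_retry=only_500 makes any other status fail immediately. Which policy is right for nuke is the open question in this thread; the sketch only shows that the choice can be a single predicate.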

Actions #4

Updated by David Galloway almost 3 years ago

Josh Durgin wrote:

Essentially all of the paddles calls should retry. Most of them do already, but a few used by nuke --stale aren't wrapped in safe_while() yet. That should be rectified à la https://github.com/ceph/teuthology/pull/1633

Not quite yet.

2021-06-16 17:18:29,045.045 ERROR:teuthology.lock.ops:failed to unlock gibba010.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,045.045 ERROR:teuthology.lock.ops:failed to unlock gibba008.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,045.045 ERROR:teuthology.lock.ops:failed to unlock gibba034.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,045.045 ERROR:teuthology.lock.ops:failed to unlock gibba012.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,046.046 ERROR:teuthology.lock.ops:failed to unlock gibba018.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,046.046 ERROR:teuthology.lock.ops:failed to unlock gibba001.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,047.047 ERROR:teuthology.lock.ops:failed to unlock gibba035.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,047.047 INFO:teuthology.orchestra.console:Power off for gibba039 completed
2021-06-16 17:18:29,051.051 ERROR:teuthology.lock.ops:failed to unlock gibba019.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,053.053 ERROR:teuthology.lock.ops:failed to unlock gibba040.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,054.054 ERROR:teuthology.lock.ops:failed to unlock gibba030.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,055.055 ERROR:teuthology.lock.ops:failed to unlock gibba026.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,055.055 ERROR:teuthology.lock.ops:failed to unlock gibba017.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,059.059 ERROR:teuthology.lock.ops:failed to unlock gibba032.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,059.059 ERROR:teuthology.lock.ops:failed to unlock gibba007.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,060.060 ERROR:teuthology.lock.ops:failed to unlock gibba023.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,060.060 ERROR:teuthology.lock.ops:failed to unlock gibba014.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,061.061 INFO:teuthology.lock.ops:unlocked gibba013.front.sepia.ceph.com
2021-06-16 17:18:29,062.062 ERROR:teuthology.lock.ops:failed to unlock gibba022.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,064.064 ERROR:teuthology.lock.ops:failed to unlock gibba025.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,071.071 INFO:teuthology.orchestra.console:Power off for gibba043 completed
2021-06-16 17:18:29,078.078 INFO:teuthology.orchestra.console:Power off for gibba036 completed
2021-06-16 17:18:29,080.080 ERROR:teuthology.lock.ops:failed to unlock gibba011.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,080.080 ERROR:teuthology.lock.ops:failed to unlock gibba037.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,084.084 ERROR:teuthology.lock.ops:failed to unlock gibba005.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,084.084 ERROR:teuthology.lock.ops:failed to unlock gibba044.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,090.090 INFO:teuthology.lock.ops:unlocked gibba002.front.sepia.ceph.com
2021-06-16 17:18:29,093.093 INFO:teuthology.lock.ops:unlocked gibba029.front.sepia.ceph.com
2021-06-16 17:18:29,093.093 INFO:teuthology.lock.ops:unlocked gibba041.front.sepia.ceph.com
2021-06-16 17:18:29,093.093 INFO:teuthology.lock.ops:unlocked gibba020.front.sepia.ceph.com
2021-06-16 17:18:29,171.171 INFO:teuthology.lock.ops:unlocked gibba015.front.sepia.ceph.com
2021-06-16 17:18:29,172.172 INFO:teuthology.lock.ops:unlocked gibba009.front.sepia.ceph.com
2021-06-16 17:18:29,201.201 ERROR:teuthology.lock.ops:failed to unlock gibba043.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,203.203 INFO:teuthology.lock.ops:unlocked gibba027.front.sepia.ceph.com
2021-06-16 17:18:29,210.210 INFO:teuthology.lock.ops:unlocked gibba039.front.sepia.ceph.com
2021-06-16 17:18:30,144.144 INFO:teuthology.lock.ops:unlocked gibba038.front.sepia.ceph.com
2021-06-16 17:18:30,177.177 INFO:teuthology.lock.ops:unlocked gibba028.front.sepia.ceph.com
2021-06-16 17:18:31,525.525 INFO:teuthology.lock.ops:unlocked gibba036.front.sepia.ceph.com

dgalloway@teuthology:~$ teuthology --version
1.1.0-f359b10d
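
For anyone comparing runs before and after a fix, the hosts that never got unlocked can be pulled out of log output like the above with a few lines of Python. This is only a sketch: summarize_unlocks is a hypothetical helper, and the pattern assumes the two teuthology.lock.ops message shapes that appear in this ticket ("unlocked <host>" and "failed to unlock <host>. reason: <code>").

```python
import re

# Matches the two teuthology.lock.ops messages seen in the logs in this
# ticket. The host in a failure line carries a trailing "." before
# " reason:", which is stripped below.
PATTERN = re.compile(r"teuthology\.lock\.ops:(failed to unlock|unlocked) (\S+)")


def summarize_unlocks(log_text):
    """Return (still_locked, unlocked) as sorted lists of hostnames.

    A host counts as still locked if it has at least one failure line
    and no success line anywhere in the log.
    """
    failed, unlocked = set(), set()
    for line in log_text.splitlines():
        match = PATTERN.search(line)
        if not match:
            continue  # e.g. console power-off messages
        host = match.group(2).rstrip(".")
        if match.group(1) == "unlocked":
            unlocked.add(host)
        else:
            failed.add(host)
    return sorted(failed - unlocked), sorted(unlocked)


SAMPLE = """\
2021-06-16 17:18:29,061.061 INFO:teuthology.lock.ops:unlocked gibba013.front.sepia.ceph.com
2021-06-16 17:18:29,062.062 ERROR:teuthology.lock.ops:failed to unlock gibba022.front.sepia.ceph.com. reason: 500
2021-06-16 17:18:29,071.071 INFO:teuthology.orchestra.console:Power off for gibba043 completed
"""

still_locked, unlocked = summarize_unlocks(SAMPLE)
print("still locked:", still_locked)
print("unlocked:", unlocked)
```

The still-locked list is what would need another unlock pass. An empty list after a run is the same signal as "did not get any 500 errors".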
Actions #5

Updated by Aishwarya Mathuria almost 3 years ago

  • Assignee set to Aishwarya Mathuria
Actions #6

Updated by Aishwarya Mathuria over 2 years ago

Looks like there is a problem with the retry mechanism of this particular unlock request to paddles. I have a fix out for review: https://github.com/ceph/teuthology/pull/1675

Actions #7

Updated by David Galloway over 2 years ago

I just used https://github.com/ceph/teuthology/pull/1675 to run --nuke --stale and did not get any 500 errors!

Actions #8

Updated by David Galloway over 2 years ago

Apparently we need this fix applied to unlocking as well.

http://qa-proxy.ceph.com/teuthology/teuthology-2021-09-05_12:15:02-powercycle-octopus-distro-basic-gibba/6376171/supervisor.6376171.log

2021-09-20T17:55:03.486 ERROR:teuthology.lock.ops:failed to unlock gibba015.front.sepia.ceph.com. reason: 500
Actions #9

Updated by Josh Durgin over 2 years ago

The run where David saw this was using an older version of teuthology without this fix. Since he cleared out older jobs from the queue last week, we shouldn't see this anymore.

Actions #10

Updated by Aishwarya Mathuria over 2 years ago

  • Status changed from New to Resolved