Bug #22645

chacra tries only three times at 30 sec. intervals for callbacks

Added by Alfredo Deza about 6 years ago. Updated over 2 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This callback is what notifies shaman about the status of a repo, which in turn can set the status of a server.

The following log indicates that just three retries are not enough:

[2018-01-09 19:17:52,153: DEBUG/Worker-3] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:17:56,087: INFO/Worker-4] polling repos....
[2018-01-09 19:17:56,122: INFO/Worker-4] repo <Repo ceph/wip-yuri2-testing-2018-01-09-1813/d46c090673a3203b36a7bac66097eb7b64b6bd2e/centos/7> needs to be updated/created
[2018-01-09 19:17:56,169: DEBUG/Worker-1] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:17:56,171: INFO/Worker-4] completed repo polling
[2018-01-09 19:18:02,170: WARNING/Worker-3] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f338a3ad450>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:18:16,266: INFO/Worker-3] processing repository: <Repo ceph/wip-yuri2-testing-2018-01-09-1813/d46c090673a3203b36a7bac66097eb7b64b6bd2e/centos/7>
[2018-01-09 19:18:16,268: DEBUG/Worker-3] checking if repository should be disabled for project: ceph
[2018-01-09 19:18:16,268: DEBUG/Worker-2] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:18:16,292: WARNING/Worker-3] ceph-deploy does not exist but is configured, no binaries fetched
[2018-01-09 19:18:16,294: WARNING/Worker-3] ceph-medic does not exist but is configured, no binaries fetched
[2018-01-09 19:18:26,279: WARNING/Worker-2] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fecff404950>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:18:28,065: INFO/Worker-3] finished processing repository: <Repo ceph/wip-yuri2-testing-2018-01-09-1813/d46c090673a3203b36a7bac66097eb7b64b6bd2e/centos/7>
[2018-01-09 19:18:28,132: DEBUG/Worker-3] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:18:32,201: DEBUG/Worker-1] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:18:38,171: WARNING/Worker-3] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fecff3d4450>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:18:38,895: WARNING/Worker-1] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f338a401dd0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:18:56,917: DEBUG/Worker-2] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:19:03,754: WARNING/Worker-2] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f338a3ed350>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:19:08,243: DEBUG/Worker-3] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:19:08,930: DEBUG/Worker-4] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:19:18,949: WARNING/Worker-4] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7fecff4290d0>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
[2018-01-09 19:19:18,972: ERROR/MainProcess] Task chacra.async.recurring.callback[34690862-b064-48a9-9f52-49ed631637bf] raised unexpected: ConnectionError(MaxRetryError('None: Max retries exceeded with url: /api/repos/ceph/ (Caused by None)',),)
Traceback (most recent call last):
  File "/opt/chacra/local/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/opt/chacra/local/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/chacra/src/chacra/chacra/async/recurring.py", line 205, in callback
    raise self.retry(exc=exc)
  File "/opt/chacra/local/lib/python2.7/site-packages/celery/app/task.py", line 676, in retry
    maybe_reraise()
  File "/opt/chacra/local/lib/python2.7/site-packages/celery/utils/__init__.py", line 248, in maybe_reraise
    reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/opt/chacra/src/chacra/chacra/async/recurring.py", line 200, in callback
    headers=headers
  File "/opt/chacra/local/lib/python2.7/site-packages/requests/api.py", line 111, in post
    return request('post', url, data=data, json=json, **kwargs)
  File "/opt/chacra/local/lib/python2.7/site-packages/requests/api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/chacra/local/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/chacra/local/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
    r = adapter.send(request, **kwargs)
  File "/opt/chacra/local/lib/python2.7/site-packages/requests/adapters.py", line 467, in send
    raise ConnectionError(e, request=request)
ConnectionError: None: Max retries exceeded with url: /api/repos/ceph/ (Caused by None)
[2018-01-09 19:19:34,275: DEBUG/Worker-4] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2018-01-09 19:19:44,294: WARNING/Worker-4] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f338a448290>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))

Yuri suggested 33 retries. But I think the right step forward is to increase not only the retry count but also the interval between retries. Progressively adding 30 more seconds to each retry delay could help, in addition to raising the retry count.
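As a rough illustration of the progressive delay suggested above, here is a minimal sketch. It is not chacra's code: the names `retry_delay` and `call_with_backoff`, and the values for `max_retries` and `step`, are hypothetical and chosen only to show a schedule that grows by 30 seconds per attempt.

```python
import time


def retry_delay(attempt, step=30):
    """Seconds to wait before retry number `attempt` (0-based): 30, 60, 90, ..."""
    return step * (attempt + 1)


def call_with_backoff(func, max_retries=10, step=30, sleep=time.sleep):
    """Call func(); on failure wait progressively longer, then try again.

    max_retries and step are illustrative values, not chacra's settings.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            sleep(retry_delay(attempt, step))
```

With these assumed values, ten retries would span 30 + 60 + ... + 300 = 1650 seconds (27.5 minutes), instead of the 90 seconds covered by three retries at a fixed 30-second interval. In a bound Celery task, the same delay could presumably be passed through the `countdown` argument, e.g. `self.retry(exc=exc, countdown=retry_delay(self.request.retries))`, though how that fits into chacra's existing callback task is an assumption.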

History

#1 Updated by David Galloway over 2 years ago

  • Status changed from New to Can't reproduce

The log in this bug looks different than anything I'm seeing. Perhaps things have just stabilized enough that this bug is no longer valid.

In 5 years' worth of chacra logs on 2.chacra.ceph.com, "Max retries" showed up 9 times. Here's the most recent:

[2021-07-23 01:32:01,841: DEBUG/Worker-2] callback for url: https://shaman.ceph.com/api/repos/ceph/
[2021-07-23 01:32:19,154: WARNING/Worker-2] callback failed: HTTPSConnectionPool(host='shaman.ceph.com', port=443): Max retries exceeded with url: /api/repos/ceph/ (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fc61ce0ee50>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution',))
