Fix #7969

Internal error executing OsdMapModifyingRequest

Added by John Spray almost 10 years ago. Updated almost 10 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Backend (services)
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Someone apparently hit this on mira106 without noticing that their request had failed with an internal error. I spotted it in the logs while looking for unrelated errors.

2014-04-02 16:16:08,918 - DEBUG - cthulhu Eventer.on_sync_object: osd_map
2014-04-02 16:16:08,919 - DEBUG - cthulhu.OsdMapModifyingRequest check passed (1677 >= None)
2014-04-02 16:16:08,919 - ERROR - cthulhu Request ea956d97-3b6f-4b86-bc6e-fc40aa0d0c38 threw exception in on_map
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 150, in on_map
    request.on_map(sync_type, sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 248, in on_map
    self.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError
2014-04-02 16:16:08,919 - ERROR - cthulhu Exception handling message with tag salt/job/20140402161608799668/ret/mira110.front.sepia.ceph.com
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 283, in _run
    self.on_sync_object(data['id'], data['return'])
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/gevent_util.py", line 35, in wrapped
    return func(*args, **kwargs)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 411, in on_sync_object
    self._requests.on_map(sync_type, self._sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 154, in on_map
    request.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError

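The bug is that we did not cope with a new OSD map arriving in the interval between issuing a remote command to update the map and that command completing, which is the point at which we learn what map version to wait for. In the log above, the version check passes against None because no version has been recorded yet, and complete() then fails its assertion because the job ID (jid) is still set.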
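The "check passed (1677 >= None)" line is consistent with Python 2 semantics (the traceback paths show python2.7), where CPython orders None below any integer, so the comparison succeeds even though no target version is known. A minimal sketch of an explicit guard follows; the function name map_version_reached is hypothetical and does not come from the cthulhu code.

```python
from typing import Optional


def map_version_reached(current: int, awaited: Optional[int]) -> bool:
    """Return True only when a target version is known and has been reached.

    Hypothetical helper for illustration, not part of cthulhu.
    """
    if awaited is None:
        # The remote command has not completed yet, so there is no
        # version to compare against. Treat this as "not reached".
        return False
    return current >= awaited
```

Treating an unset version as "not reached" keeps the check from depending on how the interpreter happens to order None against integers.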

Associated revisions

Revision 46534761 (diff)
Added by John Spray almost 10 years ago

cthulhu: Fix handling maps during first part of PgCreatingRequest

There was an inconsistency in how awaiting_versions was handled.
This lines things up by making sure nothing calls on_map for
anything that isn't listed in awaiting_versions.
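
The approach described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual change: PendingRequest, dispatch_map, and the dict shape of awaiting_versions are hypothetical, and only the names on_map, jid, and awaiting_versions appear in the report.

```python
class PendingRequest:
    """Hypothetical request; not the real cthulhu request class."""

    def __init__(self, request_id):
        self.id = request_id
        self.jid = None              # set while a remote job is in flight
        self.awaiting_versions = {}  # sync_type -> version, empty until the job returns
        self.completed = False

    def submit(self, jid):
        # The remote command has been issued; the target version is unknown.
        self.jid = jid

    def on_job_complete(self, sync_type, version):
        # The job returned, so we now know which map version to wait for.
        self.jid = None
        self.awaiting_versions[sync_type] = version

    def on_map(self, sync_type, version):
        if version >= self.awaiting_versions[sync_type]:
            assert self.jid is None
            self.completed = True


def dispatch_map(requests, sync_type, version):
    """Call on_map only for requests that list sync_type in awaiting_versions."""
    notified = []
    for request in requests:
        if sync_type not in request.awaiting_versions:
            # Still waiting on its remote job: skip this map entirely.
            continue
        request.on_map(sync_type, version)
        notified.append(request.id)
    return notified
```

With this shape, a map that arrives while the job is still running is never delivered to the request, so the jid assertion cannot fire in that window.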

Fixes: #7969

History

#1 Updated by John Spray almost 10 years ago

  • Status changed from New to Fix Under Review

#2 Updated by John Spray almost 10 years ago

  • Status changed from Fix Under Review to Resolved
