Fix #7969

Internal error executing OsdMapModifyingRequest

Added by John Spray over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Backend (services)
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Someone apparently hit this on mira106 but didn't notice that their request had failed with an internal error. I spotted it in the logs while looking for errors related to something else.

2014-04-02 16:16:08,918 - DEBUG - cthulhu Eventer.on_sync_object: osd_map
2014-04-02 16:16:08,919 - DEBUG - cthulhu.OsdMapModifyingRequest check passed (1677 >= None)
2014-04-02 16:16:08,919 - ERROR - cthulhu Request ea956d97-3b6f-4b86-bc6e-fc40aa0d0c38 threw exception in on_map
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 150, in on_map
    request.on_map(sync_type, sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 248, in on_map
    self.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError
2014-04-02 16:16:08,919 - ERROR - cthulhu Exception handling message with tag salt/job/20140402161608799668/ret/mira110.front.sepia.ceph.com
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 283, in _run
    self.on_sync_object(data['id'], data['return'])
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/gevent_util.py", line 35, in wrapped
    return func(*args, **kwargs)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 411, in on_sync_object
    self._requests.on_map(sync_type, self._sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 154, in on_map
    request.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError

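The bug here is that we weren't coping with new OSD maps arriving in the interval between issuing a remote command to update the map and that command completing. Only once the command completes do we know what map version to wait for, so a map arriving inside that interval was checked against an unset version.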
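
The log line "check passed (1677 >= None)" hints at how the early map got through: on CPython 2.7 an ordering comparison between an int and None does not raise and evaluates to True. The following is a minimal sketch of the race and of one way to guard against it, not the actual cthulhu code. The class ToyOsdMapRequest, its await_version attribute and its on_job_complete method are hypothetical names chosen for illustration; only jid, on_map and complete appear in the traceback above.

```python
class ToyOsdMapRequest(object):
    """Hypothetical stand-in for a request that modifies the OSD map."""

    def __init__(self):
        self.jid = "job-1"         # remote command is still in flight
        self.await_version = None  # unknown until the command returns
        self.done = False

    def on_job_complete(self, new_version):
        # The remote command finished, so the version to wait for is now known.
        self.jid = None
        self.await_version = new_version

    def on_map(self, seen_version):
        # Guard: ignore maps that arrive before we know what to wait for,
        # instead of comparing the seen version against None.
        if self.await_version is None:
            return
        if seen_version >= self.await_version:
            self.complete()

    def complete(self):
        # Same invariant as the assertion in the traceback.
        assert self.jid is None
        self.done = True


req = ToyOsdMapRequest()
req.on_map(1677)           # map arrives mid-command: ignored, no AssertionError
req.on_job_complete(1678)  # now waiting for version 1678
req.on_map(1677)           # too old, keep waiting
req.on_map(1678)           # awaited version seen, request completes
```

Without the None guard in on_map, the first call would reach complete() while jid is still set and trip the assertion, which matches the failure in the log.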

Associated revisions

Revision 46534761
Added by John Spray over 7 years ago

cthulhu: Fix handling maps during first part of PgCreatingRequest

There was an inconsistency in how awaiting_versions was handled.
This lines things up by making sure nothing calls on_map for
anything that isn't listed in awaiting_versions.

Fixes: #7969
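
The rule in the commit message, that nothing calls on_map for a request unless the map type is listed in its awaiting_versions, can be sketched roughly as below. This is an illustrative sketch under assumptions, not the code from the revision: ToyRequest and dispatch_map are hypothetical names, and awaiting_versions is assumed to be a dict from map type to awaited version that stays empty while the version is still unknown.

```python
class ToyRequest(object):
    """Hypothetical request that records the on_map calls it receives."""

    def __init__(self, name, awaiting_versions):
        self.name = name
        # Assumed shape: {sync_type: version}; empty while nothing is awaited.
        self.awaiting_versions = awaiting_versions
        self.seen = []

    def on_map(self, sync_type, version):
        self.seen.append((sync_type, version))


def dispatch_map(requests, sync_type, version):
    """Call on_map only on requests that list sync_type in awaiting_versions."""
    notified = []
    for request in requests:
        if sync_type not in request.awaiting_versions:
            continue  # request is not waiting on this map type yet
        request.on_map(sync_type, version)
        notified.append(request.name)
    return notified


in_flight = ToyRequest("in_flight", {})             # remote command not finished
waiting = ToyRequest("waiting", {"osd_map": 1678})  # knows what it is waiting for
notified = dispatch_map([in_flight, waiting], "osd_map", 1677)
```

Here only the request named "waiting" is notified. The in-flight request never sees the map, so it cannot complete before its remote command has returned.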

History

#1 Updated by John Spray over 7 years ago

  • Status changed from New to Fix Under Review

#2 Updated by John Spray over 7 years ago

  • Status changed from Fix Under Review to Resolved
