Fix #7969
Internal error executing OsdMapModifyingRequest
Status: Resolved
Priority: Normal
Category: Backend (services)
% Done: 0%
Source: other
Description
Someone apparently hit this on mira106 but didn't notice that their request had failed with an internal error. I spotted it in the logs while looking for unrelated errors.
2014-04-02 16:16:08,918 - DEBUG - cthulhu Eventer.on_sync_object: osd_map
2014-04-02 16:16:08,919 - DEBUG - cthulhu.OsdMapModifyingRequest check passed (1677 >= None)
2014-04-02 16:16:08,919 - ERROR - cthulhu Request ea956d97-3b6f-4b86-bc6e-fc40aa0d0c38 threw exception in on_map
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 150, in on_map
    request.on_map(sync_type, sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 248, in on_map
    self.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError
2014-04-02 16:16:08,919 - ERROR - cthulhu Exception handling message with tag salt/job/20140402161608799668/ret/mira110.front.sepia.ceph.com
Traceback (most recent call last):
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 283, in _run
    self.on_sync_object(data['id'], data['return'])
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/gevent_util.py", line 35, in wrapped
    return func(*args, **kwargs)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/cluster_monitor.py", line 411, in on_sync_object
    self._requests.on_map(sync_type, self._sync_objects)
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/request_collection.py", line 154, in on_map
    request.complete()
  File "/opt/calamari/venv/local/lib/python2.7/site-packages/calamari_cthulhu-0.1-py2.7.egg/cthulhu/manager/user_request.py", line 174, in complete
    assert self.jid is None
AssertionError
The bug here is that we weren't coping with new OSD maps arriving in the interval between issuing the remote command that updates the map and that command completing, which is the point at which we learn what map version to wait for.
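To make the race concrete, here is a minimal sketch of how it could arise. It is not the Calamari code: `RacyRequest`, `awaiting_version` and `reproduce_race` are hypothetical names, and only the `assert self.jid is None` line is taken from the traceback. The log line `check passed (1677 >= None)` suggests that the version check compared the new epoch against a target that was still `None`; under Python 2 that comparison evaluates to true, so the sketch emulates it explicitly.

```python
class RacyRequest:
    """Hypothetical stand-in for a map-modifying request (not Calamari code)."""

    def __init__(self, jid):
        self.jid = jid                # set while the remote job is in flight
        self.awaiting_version = None  # unknown until the remote job returns

    def on_map(self, epoch):
        # Python 2 ordered None below every integer, so `epoch >= None` was
        # True. Python 3 raises TypeError instead, so emulate the old result.
        if self.awaiting_version is None or epoch >= self.awaiting_version:
            self.complete()

    def complete(self):
        # Same assertion as in the traceback: completing is only valid once
        # the remote job has finished and jid has been cleared.
        assert self.jid is None


def reproduce_race():
    """Deliver a map while the job is still running; True if it asserts."""
    request = RacyRequest(jid="job-1")  # hypothetical job id
    try:
        request.on_map(1677)
    except AssertionError:
        return True
    return False
```

Calling `reproduce_race()` delivers a map during the first phase of the request, which trips the same assertion seen in the log.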
Associated revisions
cthulhu: Fix handling maps during first part of PgCreatingRequest
There was an inconsistency in how awaiting_versions was handled.
This lines things up by making sure nothing calls on_map for
anything that isn't listed in awaiting_versions.
Fixes: #7969
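The idea in the commit message can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: `GuardedRequest`, `on_job_complete` and `dispatch_map` are hypothetical names, and `awaiting_versions` is assumed to be a mapping from sync type to the epoch being waited for, empty until the remote job has returned.

```python
class GuardedRequest:
    """Hypothetical request; awaiting_versions is filled in when the job ends."""

    def __init__(self):
        self.jid = "job-1"           # hypothetical id of the in-flight job
        self.awaiting_versions = {}  # assumed shape: sync type -> epoch
        self.done = False

    def on_job_complete(self, sync_type, epoch):
        # The remote command has finished, so the target version is now known.
        self.jid = None
        self.awaiting_versions[sync_type] = epoch

    def on_map(self, sync_type, epoch):
        if epoch >= self.awaiting_versions[sync_type]:
            self.complete()

    def complete(self):
        assert self.jid is None
        self.done = True


def dispatch_map(requests, sync_type, epoch):
    """Notify only the requests that list sync_type in awaiting_versions."""
    for request in requests:
        if sync_type in request.awaiting_versions:
            request.on_map(sync_type, epoch)
```

With this guard, a map that arrives while the job is still in flight is simply not delivered to the request, so `complete()` cannot run before `jid` is cleared.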
History
#1 Updated by John Spray almost 10 years ago
- Status changed from New to Fix Under Review
#2 Updated by John Spray almost 10 years ago
- Status changed from Fix Under Review to Resolved