Bug #9321
pgmap updates from OSDMap can be delayed indefinitely
Status:
closed
% Done:
0%
Source:
Support
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
We saw a customer cluster in which a full OSD had been removed from the OSDMap, but almost two hours later that change had still not propagated to the pgmap's list of full OSDs. The monitor logs show that every time the PGMonitor tried to run update_from_osdmap, either the osdmap was unreadable or the pgmap was unwriteable, so the update never ran.
After discussion with Joao and Sage, we think it's safe in our current implementation to simply drop the is_readable() check on the OSDMonitor in this case: we might see an out-of-date map, but we won't see an invalid one. In the future we'll need to always provide a stable, readable map for situations like this.
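To illustrate the starvation pattern described above, here is a minimal toy model in Python. It is not Ceph code: the classes ToyOSDMonitor and ToyPGMonitor, the require_readable flag, and the alternating tick schedule are all assumptions made up for this sketch. Only the names update_from_osdmap and the readable/writeable conditions come from the report. The sketch shows how an update gated on both conditions can be skipped on every attempt, and how dropping the readable check lets it go through at the cost of possibly using an older map.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOSDMonitor:
    """Hypothetical stand-in for the OSDMonitor's last committed map."""
    osds: set = field(default_factory=set)  # OSDs that still exist in the map
    readable: bool = True                   # models the is_readable() result


@dataclass
class ToyPGMonitor:
    """Hypothetical stand-in for the PGMonitor and its pgmap."""
    full_osds: set = field(default_factory=set)
    writeable: bool = True

    def update_from_osdmap(self, osdmon, require_readable):
        # Original behaviour (require_readable=True): skip unless the
        # osdmap is readable AND the pgmap is writeable.
        if require_readable and not osdmon.readable:
            return False
        if not self.writeable:
            return False
        # Drop full-OSD entries for OSDs that no longer exist in the map.
        self.full_osds &= osdmon.osds
        return True


def run(ticks, require_readable):
    # osd.2 was full and has already been removed from the OSDMap.
    osdmon = ToyOSDMonitor(osds={0, 1})
    pgmon = ToyPGMonitor(full_osds={2})
    for readable, writeable in ticks:
        osdmon.readable = readable
        pgmon.writeable = writeable
        pgmon.update_from_osdmap(osdmon, require_readable)
    return pgmon.full_osds


# Assumed schedule: the two conditions are never true on the same tick.
ticks = [(False, True), (True, False)] * 3

print(run(ticks, require_readable=True))   # stale entry survives: {2}
print(run(ticks, require_readable=False))  # entry is cleared: set()
```

In this toy model the map read without the check is always the last committed one, which mirrors the argument that the map may be out of date but not invalid. Whether that holds in a real monitor depends on the implementation, as the description notes.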