Fix #8075
Remove workaround for saltstack #11928
Status:
New
Priority:
Normal
Assignee:
-
Category:
Backend (services)
Target version:
-
% Done:
0%
Source:
other
Description
This was a workaround:
commit 2422a2115f09853177d51780655ad0debf033e7e
Author: John Spray <john.spray@inktank.com>
Date:   Mon Mar 31 14:03:19 2014 +0100

    cthulhu: send pings to late minions

    This is a bit hacky, but necessary. The problem is that
    when the salt master AES key changes, minions don't clue
    in until they receive a message they don't understand.
    Our minions are mostly only sending messages, not receiving
    them, so they don't learn about the new key.

    The workaround is to send a ping to any minions that seem to
    have gone quiet, hopefully in time for them to learn about the
    new key before generating any spurious "server lost contact"
    type alerts.

    The alternative would be to do full PKI negotiation on every
    event, with all the CPU cycles and roundtrips that that entails.
    We may see a better solution in the long run as salt adds
    crypto at the transport level.

    Because this change makes us functionally dependent on the
    expected contact interval, not just for alerting, now is the
    time to make the interval robustly defined via the scheduled
    interval in the pillar, rather than via an absolute time in
    a config setting, so do that too.

    Fixes: #7836
I've sent a PR to fix the underlying issue in salt: https://github.com/saltstack/salt/pull/11928
As soon as the upstream bug is fixed and a new salt release is out, we should update our salt requirement to that version and remove the workaround.
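For anyone reviewing what is being removed, the workaround described in the commit message above amounts to roughly the following. This is only a sketch, not the cthulhu code: the function names, the `grace_factor` threshold, and the `send_ping` callable are all hypothetical, and the commit message does not say how long a minion must be quiet before it is pinged. In a real deployment `send_ping` would presumably wrap a salt `test.ping` call targeted at the listed minions.

```python
from typing import Callable, Dict, List


def find_late_minions(last_contact: Dict[str, float], now: float,
                      contact_interval: float,
                      grace_factor: float = 2.0) -> List[str]:
    """Return the IDs of minions that have been quiet for too long.

    last_contact maps minion ID -> timestamp of the last message received.
    contact_interval is the expected interval between messages (seconds).
    grace_factor is an assumed multiplier, not taken from the original code.
    """
    deadline = contact_interval * grace_factor
    return sorted(minion for minion, seen in last_contact.items()
                  if now - seen > deadline)


def ping_late_minions(last_contact: Dict[str, float], now: float,
                      contact_interval: float,
                      send_ping: Callable[[List[str]], None],
                      grace_factor: float = 2.0) -> List[str]:
    """Ping quiet minions so they receive a message from the master.

    Receiving a message they cannot decrypt is what prompts minions to
    learn about a changed master AES key.
    """
    late = find_late_minions(last_contact, now, contact_interval, grace_factor)
    if late:
        # Hypothetical hook: the caller decides how the ping is delivered.
        send_ping(late)
    return late
```

With the fix from the upstream PR in place, none of this bookkeeping should be needed, which is the point of this ticket.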
Associated revisions
cthulhu: remove workaround for saltstack #11928
The upstream release version now includes the fix,
since 2014.1.2.
Fixes: #8075
Signed-off-by: John Spray <john.spray@inktank.com>