osd: marking peers down
On benjamin, I'm reliably seeing osds mark their peers down when they shouldn't. There are ~21 osds across 3 nodes, and simply restarting them all sets off a storm of spurious down markings. Something is broken in the heartbeat exchange.
The workaround is to temporarily increase osd heartbeat grace until everything is up and then lower it again.
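For reference, a minimal sketch of the workaround as a ceph.conf fragment. The value 60 (seconds) is only an illustrative assumption, not a tested recommendation; pick whatever is long enough for all osds to come up on the cluster in question.

```
[osd]
    ; temporary workaround: widen the heartbeat grace while all osds restart
    ; 60 is an assumed, illustrative value (seconds), not a recommendation
    osd heartbeat grace = 60
```

Once everything is up, remove the override or set it back to the previous value. Whether that needs another restart or can be done at runtime depends on the Ceph version in use.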