Feature #3511

figure out how to best set the heartbeat grace periods

Added by Greg Farnum over 11 years ago.

Status: New
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%

Description

In Journal Club today we discussed failure detection, and we realized that while much of our failure detection has some thought behind it, most of the constants were just plucked out of the air. In particular, I'd like to see us adjust the heartbeat grace: I bet that our guess about a node after 10 seconds of no heartbeat is going to be very, very close to our guess about a node after 30 seconds of no heartbeat. We should figure out how to gather whatever statistics we need on this, and then use them to set saner parameters.

(In particular, a general rule of thumb is to set the grace to double the heartbeat interval. We are way off from that target, and I don't have a rational reason why.)
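
As a rough illustration of the kind of statistic that could answer this, here is a minimal sketch that estimates how likely a peer is to be really down once it has been silent for longer than a candidate grace. Everything in it is an assumption for illustration: the function name `dead_probability`, the 6-second `HEARTBEAT_INTERVAL`, and the sample numbers are invented and do not come from a real cluster or from the OSD code. It also assumes that a peer which really failed stays silent past any grace under test.

```python
import math


def dead_probability(recovered_gaps, confirmed_failures, grace):
    """Estimate P(peer is really down | silent for more than `grace` seconds).

    recovered_gaps:     silence durations (seconds) that ended with a
                        heartbeat, i.e. the peer was alive after all.
    confirmed_failures: number of silences that ended in a real failure
                        (assumed to outlast any grace under test).
    """
    # Silences longer than the grace that later recovered would have been
    # reported as failures by mistake.
    false_alarms = sum(1 for gap in recovered_gaps if gap > grace)
    flagged = false_alarms + confirmed_failures
    return confirmed_failures / flagged if flagged else math.nan


# Invented sample data, for illustration only.
HEARTBEAT_INTERVAL = 6.0  # hypothetical interval, in seconds
recovered_gaps = [6.1, 6.3, 7.0, 8.2, 9.5, 12.0, 14.5, 22.0, 41.0]
confirmed_failures = 20

# Compare the "double the interval" rule of thumb with 10 s and 30 s.
for grace in (2 * HEARTBEAT_INTERVAL, 10.0, 30.0):
    p = dead_probability(recovered_gaps, confirmed_failures, grace)
    print(f"grace={grace:5.1f}s  P(down | silent > grace) = {p:.3f}")
```

If the estimates for 10 and 30 seconds come out close on real measurements, that would support shortening the grace; if they differ a lot, the longer grace is buying real accuracy.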
