Documentation #2274

Basic Availability Model

Added by Anonymous almost 12 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: common
Target version: -
% Done: 0%


Description

(1) Construct a continuous-time Markov availability model for a basic cluster (3 mons, 4 OSDs, 2 copies).
(Petri nets are actually better suited, but few people understand them and tools are harder to find.)
(2) Plug in standard FIT rates for nodes, controllers, disks, NICs, switches, fans, and power supplies.
(3) Plug in measured reboot and recovery times.
(4) Use embedded Linux software FIT rates until we have data from nightlies and long-running clusters.
(5) Estimate the percentage of coupled software failures based on known anecdotes.
(6) Publish the model and open it for critique by the community.
(7) Maintain and publish internal and field failure rate data.
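
As a starting point for discussion, steps (1) to (3) could be sketched roughly as below. This is only an illustrative sketch, not the proposed model: it treats the mons and the OSDs as two independent birth-death Markov chains over the number of failed units, assumes the cluster is available while at most one mon and at most one OSD is down, and assumes repairs proceed independently. The FIT values and repair times are placeholders, not measured data, and all function and variable names are hypothetical. Coupled software failures from step (5) are not modelled.

```python
from math import prod

FIT_HOURS = 1e9  # one FIT = one failure per 10^9 device-hours


def fit_to_rate(fit):
    """Convert a FIT value to a failure rate per hour."""
    return fit / FIT_HOURS


def steady_state(n, fail_rate, repair_rate):
    """Steady-state distribution of a birth-death CTMC.

    State k means k of n units are down. Each up unit fails independently
    at fail_rate; each down unit is repaired independently at repair_rate.
    """
    weights = []
    for k in range(n + 1):
        # Product of (failure rate out of state i) / (repair rate out of
        # state i + 1) along the path 0 -> k; the empty product is 1.
        w = prod(((n - i) * fail_rate) / ((i + 1) * repair_rate) for i in range(k))
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def availability(n, max_down, fail_rate, repair_rate):
    """Probability that at most max_down of n units are down."""
    pi = steady_state(n, fail_rate, repair_rate)
    return sum(pi[: max_down + 1])


# Placeholder inputs, NOT measured values (assumptions for illustration only).
MON_FIT = 10_000.0
OSD_FIT = 20_000.0
MON_REPAIR_HOURS = 1.0
OSD_REPAIR_HOURS = 4.0

# Assumption: 3 mons tolerate 1 failure; 4 OSDs with 2 copies tolerate 1 failure.
mon_avail = availability(3, 1, fit_to_rate(MON_FIT), 1.0 / MON_REPAIR_HOURS)
osd_avail = availability(4, 1, fit_to_rate(OSD_FIT), 1.0 / OSD_REPAIR_HOURS)

# Assumption: mon and OSD failures are independent of each other.
cluster_avail = mon_avail * osd_avail

print(f"mon availability:     {mon_avail:.12f}")
print(f"OSD availability:     {osd_avail:.12f}")
print(f"cluster availability: {cluster_avail:.12f}")
```

A fuller model would replace the "at most one OSD down" rule with the actual placement of the two copies, and would add per-component FIT rates (disks, NICs, switches, fans, power supplies) as separate failure modes.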

History

#1 Updated by Zac Dover about 4 years ago

  • Status changed from New to Closed

This bug has been judged too old to fix. Either 1) it was raised against a version of Ceph prior to Luminous, or 2) it is simply old and has gone untouched for so long that it is unlikely to represent a live documentation concern.

If you think that the closing of this bug is an error, raise another bug of a similar kind. If you think that the matter requires urgent attention, please let Zac Dover know at .
