Hardware Reliability

9 Hardware Reliability

Irene Eusgeld 1, Bernhard Fechner 2, Felix Salfner 3, Max Walter 4, Philipp Limbourg 6, and Lijun Zhang 5

1 Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
2 University of Hagen, Germany
3 Humboldt University Berlin, Germany
4 Technische Universität München, Germany
5 Saarland University, Germany
6 University of Duisburg-Essen, Germany

Reliability is an important part of dependability. This chapter aims to support readers in using the classical definitions, modelling techniques, and measures of (hardware) reliability metrics.

9.1 Introduction

In the IT field, the term "fault tolerance" is often used loosely to mean "reliability improvement". The question to be clarified is the relationship between reliability and fault tolerance. In a general sense, reliability is understood as the ability of a component or system to function correctly over a specified period of time, mostly under predefined conditions. Fault tolerance is defined as the ability of the system to continue operation in the event of a failure. Fault tolerance means that a computer system or component is designed such that, if a component fails, a backup component or backup procedure can immediately take its place with no loss of functionality. Reliability can be improved through fault tolerance. Metrics of "classical" reliability theory are well known and numerous. Metrics of fault tolerance are less common, e.g. number of tolerated faults, number of checkpoints, reconfiguration time, etc.

The most important method supporting fault tolerance and reliability is redundancy. Redundancy is the duplication of components or the repetition of operations to provide alternative functional channels in case of failure. Redundancy can be implemented in different ways: structural (hot and standby redundancy), temporal, functional, etc. Applying redundancy always increases cost and/or complexity, and sometimes introduces synchronisation problems.
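As a rough illustration of how structural redundancy can improve reliability, the following Python sketch computes the reliability of a single component and of a group of hot-redundant copies of it. The formulas are not taken from this introduction; they are the standard textbook model under two stated assumptions: a constant failure rate (exponential lifetimes) and statistically independent failures. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def component_reliability(failure_rate: float, hours: float) -> float:
    """Probability that one component survives `hours` of operation.

    Assumption: constant failure rate, i.e. R(t) = exp(-lambda * t).
    """
    return math.exp(-failure_rate * hours)


def parallel_reliability(r: float, n: int) -> float:
    """Reliability of n identical hot-redundant components.

    Assumption: failures are independent, and the system works as long
    as at least one of the n components works.
    """
    return 1.0 - (1.0 - r) ** n


# Hypothetical example values: 1e-4 failures per hour, 1000 hours of operation.
r_single = component_reliability(failure_rate=1e-4, hours=1000.0)
for n in (1, 2, 3):
    print(f"{n} component(s): R = {parallel_reliability(r_single, n):.6f}")
```

Under these assumptions, each added copy reduces the probability of system failure, which is the sense in which redundancy improves reliability. The sketch deliberately ignores the cost, complexity, and synchronisation issues mentioned above, as well as imperfect failure detection and switch-over.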
Predicting system reliability by modelling during the design phase and measuring the parameters of a real system are two completely different approaches. This chapter is subdivided into five sections depending on the primary goal of the reader. The sections are presented as a set of references structured according to the various reliability metrics (RM). An index is provided at the end of the book so that specific issues can be referenced directly. The chapter is organised as follows: Sect. 9.2 deals with the motivation for applying reliability metrics. The reader should be able to define the reliability problem he/she is interested in.

I. Eusgeld, F.C. Freiling, and R. Reussner (Eds.): Dependability Metrics, LNCS 4909, pp. 59–103, 2008.

This note was uploaded on 06/03/2011 for the course TCS 402 taught by Professor Nitin during the Spring '11 term at Century College.
