Fault tolerant system with imperfect coverage, reboot and server vacation

Authors

Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India

Abstract

This study is concerned with the performance modeling of a fault-tolerant system consisting of operating units supported by a combination of warm and cold spares. The on-line as well as the warm standby units are subject to failure and are sent for repair to a repair facility with a single repairman that is itself prone to failure. If a failed unit is not detected, the system enters an unsafe state, from which it is cleared by a reboot and recovery action. The server is allowed to go on vacation when no failed unit is present in the system. A Markov model is developed to obtain the transient probabilities associated with the system states. The Runge–Kutta method is used to evaluate the system state probabilities and queueing measures. Numerical simulation is conducted to explore the sensitivity and cost associated with the system.
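As an illustration of the approach named in the abstract, the transient state probabilities of a continuous-time Markov model satisfy the forward equations dp/dt = pQ, which a fourth-order Runge–Kutta scheme can integrate numerically. The sketch below is not the model of the paper: it assumes a hypothetical three-state chain (0 = all units working, 1 = one failed unit under repair, 2 = unsafe state awaiting reboot), and the rates `lam`, `mu`, `beta` and the coverage probability `c` are illustrative values chosen here, not taken from the study.

```python
def build_generator(lam, c, mu, beta):
    """Generator matrix Q of a toy 3-state chain (hypothetical, for illustration).

    State 0: all units working.
    State 1: failure detected (probability c), unit under repair.
    State 2: failure not detected (probability 1 - c), unsafe until reboot.
    """
    return [
        [-lam, lam * c, lam * (1.0 - c)],  # failure, covered or uncovered
        [mu, -mu, 0.0],                    # repair completes
        [0.0, beta, -beta],                # reboot completes, unit goes to repair
    ]


def derivative(p, Q):
    """Right-hand side of the forward equations dp/dt = p Q (p is a row vector)."""
    n = len(p)
    return [sum(p[i] * Q[i][j] for i in range(n)) for j in range(n)]


def rk4_step(p, Q, h):
    """Advance the probability vector by one classical Runge-Kutta step of size h."""
    k1 = derivative(p, Q)
    k2 = derivative([x + 0.5 * h * k for x, k in zip(p, k1)], Q)
    k3 = derivative([x + 0.5 * h * k for x, k in zip(p, k2)], Q)
    k4 = derivative([x + h * k for x, k in zip(p, k3)], Q)
    return [
        x + (h / 6.0) * (a + 2.0 * b + 2.0 * d + e)
        for x, a, b, d, e in zip(p, k1, k2, k3, k4)
    ]


def transient_probabilities(Q, p0, t_end, steps):
    """Integrate from the initial distribution p0 up to time t_end."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        p = rk4_step(p, Q, h)
    return p


# Illustrative rates only; the system starts with all units working.
Q = build_generator(lam=0.5, c=0.9, mu=2.0, beta=5.0)
p_short = transient_probabilities(Q, [1.0, 0.0, 0.0], t_end=1.0, steps=100)
p_long = transient_probabilities(Q, [1.0, 0.0, 0.0], t_end=50.0, steps=5000)
```

Because every row of Q sums to zero, the components of the computed vector continue to sum to one, and for large `t_end` the vector settles at the stationary distribution of this toy chain. Performance measures such as availability can then be read off as sums of the probabilities of the relevant states.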

Keywords