Issues of organizing computations in multicomputer systems with the software-controlled failure- and fault-tolerance. Part III
This three-part paper analyzes existing approaches and methods for organizing failure- and fault-tolerant computing in distributed multicomputer systems (DMCS), and identifies and justifies a list of issues that remain to be solved. We review the application areas of failure- and fault-tolerant control systems for complex network and distributed objects. Part III continues the study, begun in Parts I and II, of the problems of organizing failure- and fault-tolerant computing in DMCS and addresses the diagnosis of multiple faults. It also describes the main differences in ensuring fault tolerance between systems with broadcast communication channels and systems with point-to-point communication channels.