Fault Tolerance for Distributed and Networked Systems

Author(s): Wenbing Zhao, Louise E. Moser, P. Michael Melliar-Smith

The services provided by computers and communication networks are becoming more critical to our society. This growing dependence increases the need for computers and their applications to operate reliably, even in the presence of faults. Fault tolerance is particularly important for distributed and networked systems (Mullender, 1993), including telecommunication, power distribution, transportation, manufacturing, and financial systems.

2020, Vol 174 (3-4), pp. 229-258
Author(s): Qian Matteo Chen, Alberto Finzi, Toni Mancini, Igor Melatti, Enrico Tronci

In critical infrastructures such as airports, great care must be devoted to protecting radio communication networks from external electromagnetic interference. Protection of such mission-critical radio communication networks is usually tackled by exploiting radiogoniometers: at least three suitably deployed radiogoniometers, and a gateway gathering information from them, make it possible to monitor and localise sources of electromagnetic emissions that are not supposed to be present in the monitored area. Typically, radiogoniometers are connected to the gateway through relay nodes. As a result, some degree of fault tolerance for the network of relay nodes is essential in order to offer reliable monitoring. On the other hand, deployment of relay nodes is typically quite expensive. We therefore have two conflicting requirements: minimise costs while guaranteeing a given level of fault tolerance. In this paper, we address the problem of computing a deployment for relay nodes that minimises the overall cost while at the same time guaranteeing proper working of the network even when some of the relay nodes (up to a given maximum number) become faulty (fault tolerance). We show that, by means of computation-intensive pre-processing on an HPC infrastructure, the above optimisation problem can be encoded as a 0/1 Linear Program, making it suitable for standard Artificial Intelligence reasoners such as MILP, PB-SAT, and SMT/OMT solvers. Our problem formulation enables us to present experimental results comparing the performance of these three solving technologies on a real case study of a relay node network deployment in areas of the Leonardo da Vinci Airport in Rome, Italy.
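
To make the optimisation problem in the abstract above concrete, the following is a minimal brute-force sketch on a toy instance. It is not the authors' 0/1 Linear Program encoding and uses no MILP, PB-SAT, or SMT/OMT solver; it simply enumerates relay subsets and failure sets. The site names, costs, links, and function names (`connected`, `tolerant`, `cheapest_deployment`) are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

def connected(alive, edges, src, dst):
    """Return True if dst is reachable from src using only nodes in `alive`."""
    if src not in alive or dst not in alive:
        return False
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        if u == dst:
            return True
        for v in edges.get(u, ()):
            if v in alive and v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def tolerant(relays, sensors, gateway, edges, k):
    """Check that every sensor reaches the gateway after any <= k relay failures."""
    for f in range(k + 1):
        for failed in combinations(relays, f):
            alive = (set(relays) - set(failed)) | set(sensors) | {gateway}
            if not all(connected(alive, edges, s, gateway) for s in sensors):
                return False
    return True

def cheapest_deployment(cost, sensors, gateway, edges, k):
    """Exhaustively search relay subsets; return (total cost, chosen set) or None."""
    best = None
    sites = sorted(cost)
    for r in range(len(sites) + 1):
        for chosen in combinations(sites, r):
            total = sum(cost[c] for c in chosen)
            if (best is None or total < best[0]) and tolerant(
                chosen, sensors, gateway, edges, k
            ):
                best = (total, set(chosen))
    return best

# Hypothetical toy instance: three radiogoniometers (S1..S3), a gateway G,
# and four candidate relay sites (A..D) with made-up deployment costs.
cost = {"A": 3, "B": 2, "C": 2, "D": 1}
sensors = ["S1", "S2", "S3"]
links = [("S1", "A"), ("S1", "B"), ("S1", "D"), ("S2", "B"), ("S2", "C"),
         ("S3", "A"), ("S3", "C"),
         ("A", "G"), ("B", "G"), ("C", "G"), ("D", "G")]
edges = {}
for u, v in links:  # build an undirected adjacency map
    edges.setdefault(u, set()).add(v)
    edges.setdefault(v, set()).add(u)

for k in (0, 1, 2):
    print(k, cheapest_deployment(cost, sensors, "G", edges, k))
```

Exhaustive enumeration grows exponentially with the number of candidate sites and tolerated failures, so it is only workable for toy instances like this one. That blow-up is the kind of cost that motivates handing the problem to dedicated solvers, as the abstract describes.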

