A Novel Method for Fault Tolerance Intelligence Advisor System (FT-IAS) for Mission Critical Operations

Author(s):  
M. Balaji ◽  
C. Mala ◽  
M. S. Siva
2018 ◽  
Vol 8 (3) ◽  
pp. 20-31 ◽  
Author(s):  
Sam Goundar ◽  
Akashdeep Bhardwaj

With mission critical web applications and resources being hosted on cloud environments, and cloud services growing fast, the need for having greater level of service assurance regarding fault tolerance for availability and reliability has increased. The high priority now is ensuring a fault tolerant environment that can keep the systems up and running. To minimize the impact of downtime or accessibility failure due to systems, network devices or hardware, the expectations are that such failures need to be anticipated and handled proactively in fast, intelligent way. This article discusses the fault tolerance system for cloud computing environments, analyzes whether this is effective for Cloud environments.


Author(s):  
Sergey Bratus ◽  
James Oakley ◽  
Ashwin Ramaswamy ◽  
Sean W. Smith ◽  
Michael E. Locasto

The mechanics of hot patching (the process of upgrading a program while it executes) remain understudied, even though it offers capabilities that act as practical benefits for both consumer and mission-critical systems. A reliable hot patching procedure would serve particularly well by reducing the downtime necessary for critical functionality or security upgrades. However, hot patching also carries the risk—real or perceived—of leaving the system in an inconsistent state, which leads many owners to forgo its benefits as too risky; for systems where availability is critical, this decision may result in leaving systems un-patched and vulnerable. In this paper, the authors present a novel method for hot patching ELF binaries that supports synchronized global data and code updates, and reasoning about the results of applying the hot patch. In this regard, the Patch Object format was developed to encode patches as a special type of ELF re-locatable object file. The authors then built a tool, Katana, which automatically creates these patch objects as a by-product of the standard source build process. Katana also allows an end-user to apply the Patch Objects to a running process.


2010 ◽  
Vol 1 (3) ◽  
pp. 1-17 ◽  
Author(s):  
Sergey Bratus ◽  
James Oakley ◽  
Ashwin Ramaswamy ◽  
Sean W. Smith ◽  
Michael E. Locasto

The mechanics of hot patching (the process of upgrading a program while it executes) remain understudied, even though it offers capabilities that act as practical benefits for both consumer and mission-critical systems. A reliable hot patching procedure would serve particularly well by reducing the downtime necessary for critical functionality or security upgrades. However, hot patching also carries the risk—real or perceived—of leaving the system in an inconsistent state, which leads many owners to forgo its benefits as too risky; for systems where availability is critical, this decision may result in leaving systems un-patched and vulnerable. In this paper, the authors present a novel method for hot patching ELF binaries that supports synchronized global data and code updates, and reasoning about the results of applying the hot patch. In this regard, the Patch Object format was developed to encode patches as a special type of ELF re-locatable object file. The authors then built a tool, Katana, which automatically creates these patch objects as a by-product of the standard source build process. Katana also allows an end-user to apply the Patch Objects to a running process.


Author(s):  
Wenbing Zhao

The use of good random numbers is crucial to the security of many mission-critical systems. However, when such systems are replicated for Byzantine fault tolerance, a serious issue arises, i.e., how do we preserve the integrity of the systems while ensuring strong replica consistency? Despite the fact that there exists a large body of work on how to render replicas deterministic under the benign fault model, the solutions regarding the random number control are often overly simplistic without regard to the security requirement, and hence, they are not suitable for practical Byzantine fault tolerance. In this chapter, we present a novel integrity-preserving replica coordination algorithm for Byzantine fault tolerant systems. The central idea behind our CD-BFT algorithm is that all random numbers to be used by the replicas are collectively determined, based on the contributions made by a quorum of replicas, at least f+1 of which are not faulty.


2020 ◽  
Vol 174 (3-4) ◽  
pp. 229-258
Author(s):  
Qian Matteo Chen ◽  
Alberto Finzi ◽  
Toni Mancini ◽  
Igor Melatti ◽  
Enrico Tronci

In critical infrastructures like airports, much care has to be devoted in protecting radio communication networks from external electromagnetic interference. Protection of such mission-critical radio communication networks is usually tackled by exploiting radiogoniometers: at least three suitably deployed radiogoniometers, and a gateway gathering information from them, permit to monitor and localise sources of electromagnetic emissions that are not supposed to be present in the monitored area. Typically, radiogoniometers are connected to the gateway through relay nodes. As a result, some degree of fault-tolerance for the network of relay nodes is essential in order to offer a reliable monitoring. On the other hand, deployment of relay nodes is typically quite expensive. As a result, we have two conflicting requirements: minimise costs while guaranteeing a given fault-tolerance. In this paper, we address the problem of computing a deployment for relay nodes that minimises the overall cost while at the same time guaranteeing proper working of the network even when some of the relay nodes (up to a given maximum number) become faulty (fault-tolerance). We show that, by means of a computation-intensive pre-processing on a HPC infrastructure, the above optimisation problem can be encoded as a 0/1 Linear Program, becoming suitable to be approached with standard Artificial Intelligence reasoners like MILP, PB-SAT, and SMT/OMT solvers. Our problem formulation enables us to present experimental results comparing the performance of these three solving technologies on a real case study of a relay node network deployment in areas of the Leonardo da Vinci Airport in Rome, Italy.


Sign in / Sign up

Export Citation Format

Share Document