A Novel Method for Fault Tolerance Intelligence Advisor System (FT-IAS) for Mission Critical Operations

With mission-critical web applications and resources being hosted in cloud environments, and cloud services growing fast, the need for a greater level of service assurance regarding fault tolerance for availability and reliability has increased. The top priority now is ensuring a fault-tolerant environment that can keep systems up and running. To minimize the impact of downtime or accessibility failures caused by systems, network devices, or hardware, such failures need to be anticipated and handled proactively in a fast, intelligent way. This article discusses a fault tolerance system for cloud computing environments and analyzes whether it is effective for them.

Katana

Security-Aware Systems Applications and Software Development Methods ◽

10.4018/978-1-4666-1580-9.ch012 ◽

2012 ◽

pp. 217-233

Author(s):

Sergey Bratus ◽

James Oakley ◽

Ashwin Ramaswamy ◽

Sean W. Smith ◽

Michael E. Locasto

Keyword(s):

Object File ◽

Critical Systems ◽

End User ◽

Standard Source ◽

Novel Method ◽

Mission Critical ◽

Global Data

The mechanics of hot patching (the process of upgrading a program while it executes) remain understudied, even though hot patching offers practical benefits for both consumer and mission-critical systems. A reliable hot patching procedure would be particularly valuable for reducing the downtime required for critical functionality or security upgrades. However, hot patching also carries the risk, real or perceived, of leaving the system in an inconsistent state, which leads many owners to forgo its benefits as too risky; for systems where availability is critical, this decision may leave systems unpatched and vulnerable. In this paper, the authors present a novel method for hot patching ELF binaries that supports synchronized global data and code updates, as well as reasoning about the results of applying the hot patch. To this end, the Patch Object format was developed to encode patches as a special type of ELF relocatable object file. The authors then built a tool, Katana, which automatically creates these Patch Objects as a by-product of the standard source build process. Katana also allows an end user to apply the Patch Objects to a running process.

Specifying fault tolerance in mission critical systems

Proceedings. IEEE High-Assurance Systems Engineering Workshop (Cat. No.96TB100076) ◽

10.1109/hase.1996.618557 ◽

2002 ◽

Cited By ~ 1

Author(s):

T.S. Perraju ◽

S.P. Rana ◽

S.P. Sarkar

Keyword(s):

Fault Tolerance ◽

Critical Systems ◽

Mission Critical

Methodology for cost-effective software fault tolerance for mission-critical systems

IEEE Aerospace and Electronic Systems Magazine ◽

10.1109/62.618016 ◽

1997 ◽

Vol 12 (9) ◽

pp. 25-30 ◽

Cited By ~ 3

Author(s):

R.J. Kreutzfeld ◽

R.E. Neese

Keyword(s):

Fault Tolerance ◽

Cost Effective ◽

Critical Systems ◽

Software Fault Tolerance ◽

Mission Critical ◽

Software Fault

HB4: special session -- defect and fault tolerance for dependability: mission-critical measurement systems 3

Proceedings of the 21st IEEE Instrumentation and Measurement Technology Conference (IEEE Cat. No.04CH37510) ◽

10.1109/imtc.2004.1351423 ◽

2004 ◽

Keyword(s):

Fault Tolerance ◽

Special Session ◽

Measurement Systems ◽

Mission Critical

Katana

International Journal of Secure Software Engineering ◽

10.4018/jsse.2010070101 ◽

2010 ◽

Vol 1 (3) ◽

pp. 1-17 ◽

Cited By ~ 4

Author(s):

Sergey Bratus ◽

James Oakley ◽

Ashwin Ramaswamy ◽

Sean W. Smith ◽

Michael E. Locasto

Keyword(s):

Object File ◽

Critical Systems ◽

End User ◽

Standard Source ◽

Novel Method ◽

Mission Critical ◽

Global Data

The mechanics of hot patching (the process of upgrading a program while it executes) remain understudied, even though hot patching offers practical benefits for both consumer and mission-critical systems. A reliable hot patching procedure would be particularly valuable for reducing the downtime required for critical functionality or security upgrades. However, hot patching also carries the risk, real or perceived, of leaving the system in an inconsistent state, which leads many owners to forgo its benefits as too risky; for systems where availability is critical, this decision may leave systems unpatched and vulnerable. In this paper, the authors present a novel method for hot patching ELF binaries that supports synchronized global data and code updates, as well as reasoning about the results of applying the hot patch. To this end, the Patch Object format was developed to encode patches as a special type of ELF relocatable object file. The authors then built a tool, Katana, which automatically creates these Patch Objects as a by-product of the standard source build process. Katana also allows an end user to apply the Patch Objects to a running process.

Consistency Is Not Enough in Byzantine Fault Tolerance

Encyclopedia of Information Science and Technology, Fourth Edition ◽

10.4018/978-1-5225-2255-3.ch107 ◽

2018 ◽

pp. 1238-1247

Author(s):

Wenbing Zhao

Keyword(s):

Fault Tolerance ◽

Fault Tolerant ◽

Large Body ◽

Random Numbers ◽

Security Requirement ◽

Critical Systems ◽

Byzantine Fault Tolerance ◽

Byzantine Fault ◽

Mission Critical ◽

Fault Tolerant Systems

The use of good random numbers is crucial to the security of many mission-critical systems. However, when such systems are replicated for Byzantine fault tolerance, a serious issue arises, i.e., how do we preserve the integrity of the systems while ensuring strong replica consistency? Despite the fact that there exists a large body of work on how to render replicas deterministic under the benign fault model, the solutions regarding the random number control are often overly simplistic without regard to the security requirement, and hence, they are not suitable for practical Byzantine fault tolerance. In this chapter, we present a novel integrity-preserving replica coordination algorithm for Byzantine fault tolerant systems. The central idea behind our CD-BFT algorithm is that all random numbers to be used by the replicas are collectively determined, based on the contributions made by a quorum of replicas, at least f+1 of which are not faulty.
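The idea of collectively determined random numbers can be illustrated with a minimal sketch. It is not the CD-BFT algorithm itself: the abstract does not give the combining function, the quorum size, or the message exchange, so the quorum of 2f+1 contributions (which guarantees at least f+1 non-faulty contributors when at most f replicas are faulty), the SHA-256 combiner, and the names `contribute` and `combine` are all assumptions made for illustration. The agreement step that fixes which contributions are used is omitted entirely.

```python
import hashlib
import secrets


def contribute(nbytes: int = 32) -> bytes:
    """One replica's local random contribution (hypothetical helper)."""
    return secrets.token_bytes(nbytes)


def combine(contributions: dict[int, bytes], f: int) -> bytes:
    """Derive a shared random value from an agreed set of contributions.

    `contributions` maps replica id -> contributed bytes. The set is assumed
    to have been agreed on by all replicas already (that step is not shown).
    Requiring 2f+1 entries is an assumption of this sketch: with at most f
    faulty replicas, at least f+1 contributions then come from correct ones.
    """
    quorum = 2 * f + 1
    if len(contributions) < quorum:
        raise ValueError(f"need at least {quorum} contributions")
    digest = hashlib.sha256()
    # Sort by replica id so every replica hashes the same byte sequence.
    for replica_id in sorted(contributions):
        digest.update(replica_id.to_bytes(4, "big"))
        digest.update(contributions[replica_id])
    return digest.digest()


if __name__ == "__main__":
    f = 1
    agreed = {replica_id: contribute() for replica_id in range(2 * f + 1)}
    print(combine(agreed, f).hex())
```

Because every replica hashes the same agreed set in the same order, all correct replicas obtain the same value, and no single replica chooses it alone. A practical protocol would also have to stop a faulty replica from picking its contribution after seeing the others, which this sketch does not address.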

MILP, Pseudo-Boolean, and OMT Solvers for Optimal Fault-Tolerant Placements of Relay Nodes in Mission Critical Wireless Networks*

Fundamenta Informaticae ◽

10.3233/fi-2020-1941 ◽

2020 ◽

Vol 174 (3-4) ◽

pp. 229-258

Author(s):

Qian Matteo Chen ◽

Alberto Finzi ◽

Toni Mancini ◽

Igor Melatti ◽

Enrico Tronci

Keyword(s):

Fault Tolerance ◽

Communication Networks ◽

Electromagnetic Interference ◽

Fault Tolerant ◽

Relay Node ◽

Problem Formulation ◽

Leonardo Da Vinci ◽

Radio Communication ◽

Relay Nodes ◽

Mission Critical

In critical infrastructures like airports, much care has to be devoted to protecting radio communication networks from external electromagnetic interference. Protection of such mission-critical radio communication networks is usually tackled by exploiting radiogoniometers: at least three suitably deployed radiogoniometers, and a gateway gathering information from them, make it possible to monitor and localise sources of electromagnetic emissions that are not supposed to be present in the monitored area. Typically, radiogoniometers are connected to the gateway through relay nodes. As a result, some degree of fault-tolerance for the network of relay nodes is essential in order to offer reliable monitoring. On the other hand, deployment of relay nodes is typically quite expensive. We therefore have two conflicting requirements: minimise costs while guaranteeing a given fault-tolerance. In this paper, we address the problem of computing a deployment for relay nodes that minimises the overall cost while at the same time guaranteeing proper working of the network even when some of the relay nodes (up to a given maximum number) become faulty (fault-tolerance). We show that, by means of a computation-intensive pre-processing on an HPC infrastructure, the above optimisation problem can be encoded as a 0/1 Linear Program, becoming suitable to be approached with standard Artificial Intelligence reasoners like MILP, PB-SAT, and SMT/OMT solvers. Our problem formulation enables us to present experimental results comparing the performance of these three solving technologies on a real case study of a relay node network deployment in areas of the Leonardo da Vinci Airport in Rome, Italy.
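To make the shape of such a 0/1 program concrete, here is a toy sketch under strong simplifying assumptions that are not taken from the paper: every radiogoniometer reaches the gateway through a single relay hop, and tolerating up to k relay failures is modelled as requiring at least k+1 selected relays within range of each radiogoniometer. The function name `cheapest_placement`, the data layout, and the example numbers are hypothetical, and exhaustive enumeration stands in for the MILP, PB-SAT, and SMT/OMT solvers the abstract mentions.

```python
from itertools import product


def cheapest_placement(costs, covers, k):
    """Brute-force a tiny 0/1 program:

        minimise    sum_j costs[j] * x[j]
        subject to  sum_{j in covers[i]} x[j] >= k + 1   for every sensor i
                    x[j] in {0, 1}

    costs[j]  : cost of deploying a relay at candidate site j
    covers[i] : candidate sites within range of sensor i (toy one-hop model)
    k         : maximum number of relay failures to tolerate

    Returns (best_cost, best_x), or (None, None) if no placement is feasible.
    """
    best_cost, best_x = None, None
    for x in product((0, 1), repeat=len(costs)):
        # Each sensor must keep at least one relay after any k failures.
        feasible = all(sum(x[j] for j in sites) >= k + 1 for sites in covers)
        if not feasible:
            continue
        cost = sum(c * xj for c, xj in zip(costs, x))
        if best_cost is None or cost < best_cost:
            best_cost, best_x = cost, x
    return best_cost, best_x


if __name__ == "__main__":
    costs = [4, 3, 5, 2]                      # four candidate relay sites
    covers = [[0, 1, 2], [1, 2, 3], [0, 3]]   # three sensors
    print(cheapest_placement(costs, covers, k=1))  # (9, (1, 1, 0, 1))
```

Enumeration visits 2^n assignments and is only workable for a handful of sites; the same objective and constraints could be passed unchanged to a dedicated solver for realistic instances.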

WF5: special session -- defect and fault tolerance for dependability: mission-critical measurement systems 1

Proceedings of the 21st IEEE Instrumentation and Measurement Technology Conference (IEEE Cat. No.04CH37510) ◽

10.1109/imtc.2004.1351339 ◽

2004 ◽

Keyword(s):

Fault Tolerance ◽

Special Session ◽

Measurement Systems ◽

Mission Critical
