Cache Tag Array Fault Tolerance Method Based on Redundancy and Similarity of Adjacent Cache Line Tag Bits

2021 ◽  
Author(s):  
Xiaozhi Du ◽  
Honglei Dong ◽  
Hehe Yue
2020 ◽  
Vol 8 (5) ◽  
pp. 2040-2044

Cloud technologies are booming in the field of information technology, but cloud computing is also prone to failures. These failures demand more reliable frameworks in which the computers acting as nodes are highly available. A request made by the user is replicated and sent to several VMs; if one of the VMs fails, another can respond, which increases reliability. Much research has been done, and more is under way, on fault tolerance schemes that increase reliability. Earlier schemes focus on only one way of dealing with faults, whereas the scheme proposed in this paper is adaptive and deals with fault tolerance issues across various cloud infrastructures. The proposed scheme adaptively selects between replication and fine-grained checkpointing methods to attain a reliable cloud infrastructure that can handle different client requirements. In addition, the algorithm determines the best-suited fault tolerance method for every designated virtual node. Zheng, Zhou, Lyu and I. King (2012).
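The replication idea in this abstract (one request, several VMs, another replica answers if one fails) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the names `call_with_replicas`, `ReplicaFailure`, `failed_vm` and `healthy_vm` are hypothetical, and the replicas are tried one after another rather than in parallel to keep the example short.

```python
from typing import Callable, List, Sequence


class ReplicaFailure(Exception):
    """Raised by a replica that cannot serve the request (hypothetical)."""


def call_with_replicas(request: str,
                       replicas: Sequence[Callable[[str], str]]) -> str:
    """Try each replica in turn and return the first successful reply."""
    errors: List[str] = []
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaFailure as exc:
            # Remember the failure and fall through to the next replica.
            errors.append(str(exc))
    raise ReplicaFailure(f"all {len(replicas)} replicas failed: {errors}")


def failed_vm(request: str) -> str:
    # Stand-in for a VM that has crashed.
    raise ReplicaFailure("vm-1 is down")


def healthy_vm(request: str) -> str:
    # Stand-in for a VM that serves the request normally.
    return f"handled {request}"
```

Calling `call_with_replicas("job-7", [failed_vm, healthy_vm])` returns the reply from the second VM; the call only fails when every replica fails.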


2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Yang Liu ◽  
Wei Wei

MapReduce is a programming model and an associated implementation for processing and generating large data sets with a parallel, distributed algorithm on a cluster. In a cloud environment, node and task failures are no longer accidental but a common feature of large-scale systems. The current rescheduling-based fault tolerance method in the MapReduce framework fails to fully consider the location of distributed data and the computation and storage overhead of rescheduling failed tasks. As a result, a single node failure increases the completion time dramatically. In this paper, a replication-based mechanism is proposed that takes both task and node failure into consideration. Experimental results show that, compared with the default mechanism in Hadoop, our mechanism significantly improves performance at failure time, reducing execution time by more than 30%.


Author(s):  
Ahmad Shukri Mohd Noor ◽  
Nur Farhah Mat Zian ◽  
Noor Hafhizah Abd Rahim ◽  
Rabiei Mamat ◽  
Wan Nur Amira Wan Azman

The availability of data in a distributed system can be increased by implementing a fault tolerance mechanism in the system. Reactive fault tolerance methods deal with restarting failed services, placing redundant copies of data on multiple nodes across the network (in other words, data replication), and migrating data for recovery. Although the idea of data replication is sound, the challenge is to choose a replication technique that provides better data availability as well as consistency across read and write operations on the redundant copies. The Circular Neighboring Replication (CNR) technique exploits a neighboring policy when replicating data items in the system, and performs well in that it needs fewer copies to keep system availability at its highest. In a performance analysis against existing techniques, results show that CNR improves system availability by an average of 37% while needing only two replicas to maintain data availability and consistency. The study demonstrates the feasibility of the proposed technique and its potential for deployment in larger and more complex environments.
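The abstract does not spell out the CNR placement rule, so the following Python sketch rests on an assumption: nodes are arranged in a logical ring and each data item is kept on its primary node plus the next node around the ring, giving the two replicas mentioned above. The function names `replica_nodes` and `is_available` are hypothetical.

```python
from typing import List, Set


def replica_nodes(primary: int, node_count: int, copies: int = 2) -> List[int]:
    """Return the ring positions holding an item: the primary and its neighbors.

    Assumed rule: copies go to consecutive nodes, wrapping at the ring's end.
    """
    return [(primary + offset) % node_count for offset in range(copies)]


def is_available(primary: int, node_count: int, failed: Set[int]) -> bool:
    """An item stays readable while at least one of its replica nodes is up."""
    return any(node not in failed
               for node in replica_nodes(primary, node_count))
```

With five nodes, an item whose primary is node 4 is also held on node 0. Under this assumed rule the item survives the loss of either node, but not of both.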


Author(s):  
Erin M. Gillespie ◽  
Wayne Walter

The purpose of this project is to assess the feasibility of a Kalman Filter approach for fault detection in a highly unstable system, specifically a heart pump that is currently under development at RIT. Simulations and experimental work were completed to determine the effects of possible position sensor fault conditions on the system; that information was then used in conjunction with a pair of Kalman filters to create a method of detecting faults and providing fault-tolerant operation. The estimator system was designed and tested using SIMULINK™. The simulations showed the filters were able to calculate and remove bias caused by any type of position sensor error, provided the estimated plant model is nearly identical to the actual plant model. Sensitivity analysis showed that the fault detection/fault-tolerance method is extremely sensitive to discrepancies between the estimated plant model and actual pump behavior. Consequently, it is considered unfeasible for implementation on a real system. Experimental results confirmed these findings, demonstrating the drawbacks of model-based fault detection and tolerance methods.
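As a rough illustration of model-based sensor fault detection, the Python sketch below runs a single scalar Kalman filter and flags the first sample whose innovation (measurement minus prediction) exceeds a threshold. This is a simpler stand-in, not the pair of filters described in the abstract. The constant-position model, the noise values `q` and `r`, the threshold, and the names `kalman_step` and `detect_fault` are all assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple


def kalman_step(x: float, p: float, z: float,
                q: float = 1e-4, r: float = 1e-2) -> Tuple[float, float, float]:
    """One predict/update step for a constant-position model.

    Returns the updated estimate, its variance, and the innovation.
    """
    p_pred = p + q               # predict: state unchanged, uncertainty grows
    innovation = z - x           # residual between measurement and prediction
    gain = p_pred / (p_pred + r)  # Kalman gain
    x_new = x + gain * innovation
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new, innovation


def detect_fault(measurements: Sequence[float],
                 threshold: float = 0.5) -> Optional[int]:
    """Return the index of the first sample with a large innovation, else None."""
    x, p = measurements[0], 1.0  # initialise the estimate from the first sample
    for index, z in enumerate(measurements[1:], start=1):
        x, p, innovation = kalman_step(x, p, z)
        if abs(innovation) > threshold:
            return index
    return None
```

A sudden sensor bias appears as a jump in the innovation. As the abstract notes, this kind of check is only as good as the plant model behind the prediction, so a model mismatch would produce large innovations even with a healthy sensor.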

