Enhancing NameNode fault tolerance in Hadoop over cloud environment

Author(s):  
Lino Abraham Varghese ◽  
V. P. Sreejith ◽  
S. Bose
Author(s):  
M. Chaitanya ◽  
K. Durga Charan

Load balancing makes cloud computing more efficient and can increase user satisfaction. At present, cloud computing is among the platforms that offer data storage at very low cost and are available at all times over the internet. However, it has critical problems such as security, load management, and fault tolerance. Load balancing in the cloud computing environment has a large impact on performance. The algorithm applies game theory to the load balancing strategy to improve efficiency in the public cloud environment. This article introduces an improved load balancing model for the public cloud, based on the cloud partitioning concept, with a switch mechanism to select different strategies for different situations.


2015 ◽  
Vol 37 ◽  
pp. 427
Author(s):  
Minoo Soltanshahi ◽  
Aliakbar Niknafs

Cloud computing is the latest distributed technology providing a rich environment of dynamically shared resources through virtualization, which can fulfill the requirements of users by allocating resources to programs. Any program in a cloud environment is delivered by workflows, which are series of interlinked tasks that accomplish a goal. One of the most important tasks in cloud computing is the correct mapping of tasks onto resources. Scheduling processes is essential in distributed systems such as the cloud, since it has a major impact on system performance. This is done by scheduling algorithms. Therefore, it is crucial to present and adopt an efficient algorithm in the cloud environment. This article examines the parameters that affect the efficiency of scheduling algorithms, including deadline, cost constraint, load balance, power consumption, and fault tolerance. The performance of several algorithms is also briefly discussed.


Author(s):  
Bakhta Meroufel ◽  
Ghalem Belalem

One of the most important requirements for using the cloud effectively is the study of the reliability and robustness of its services. Fault tolerance is therefore necessary to ensure reliability and reduce SLA violations. Checkpointing is a popular fault tolerance technique in large-scale systems. However, its major disadvantage is the overhead caused by the time needed to store checkpoint files, which increases execution time and reduces the chance of meeting the desired deadlines. In this chapter, the authors propose a checkpointing strategy with lightweight storage. The storage is provided by creating a virtual topology, VRbIO, and by using an intelligent and fault-tolerant I/O technique, CSDS (collective and selective data sieving). The proposal is executed by active and reactive agents, and it solves many problems of checkpointing with standard I/O. To evaluate the approach, the authors compare it with checkpointing that uses ROMIO as its I/O strategy. Experimental results show the effectiveness and reliability of the proposed approach.


Author(s):  
Deepanshu Jain ◽  
Nabeel Zaidi ◽  
Raghav Bansal ◽  
Praveen Kumar ◽  
Tanupriya Choudhury

2021 ◽  
Vol 22 (1) ◽  
pp. 67-79
Author(s):  
Özdinç Çelikel ◽  
Tolga Ovatman

Application checkpointing is a widely used recovery mechanism that consists of saving an application's state periodically so that it can be restored in case of a failure. In this study we investigate the use of distributed checkpointing for replicated state machines. Conventionally, for replicated state machines, checkpointing information is either replicated in each of the replicas or stored separately in a single instance. Applying distributed checkpointing provides a means to adjust the level of fault tolerance of the checkpointing approach at the cost of recovery time. We use a local cluster and a cloud environment to examine the effects of distributed checkpointing in a simple state machine example and compare the results with conventional approaches. As expected, distributed checkpointing reduces memory consumption and allows different levels of fault tolerance, while performing worse in terms of recovery time.
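
The basic mechanism described above (save state periodically, restore it after a failure) can be illustrated with a minimal sketch. This is not the distributed scheme studied in the paper: it assumes a single node, a local JSON file as the checkpoint store, and a command log that survives the failure. The class `CounterStateMachine`, its methods, and the `demo` function are hypothetical names introduced only for illustration.

```python
import json
import os
import tempfile


class CounterStateMachine:
    """Toy state machine (hypothetical): state is a running total,
    commands are integers to add to it."""

    def __init__(self, checkpoint_path, checkpoint_every=3):
        self.checkpoint_path = checkpoint_path
        self.checkpoint_every = checkpoint_every
        self.total = 0
        self.applied = 0  # number of commands applied so far

    def apply(self, command):
        self.total += command
        self.applied += 1
        # Periodic checkpoint: persist state every N applied commands.
        if self.applied % self.checkpoint_every == 0:
            self.save_checkpoint()

    def save_checkpoint(self):
        # Write to a temporary file, then rename it over the old checkpoint,
        # so a crash mid-write does not leave a half-written file behind.
        tmp_path = self.checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"total": self.total, "applied": self.applied}, f)
        os.replace(tmp_path, self.checkpoint_path)

    @classmethod
    def recover(cls, checkpoint_path, command_log, checkpoint_every=3):
        # Assumption: the full command log is still available after the failure.
        machine = cls(checkpoint_path, checkpoint_every)
        if os.path.exists(checkpoint_path):
            with open(checkpoint_path) as f:
                saved = json.load(f)
            machine.total = saved["total"]
            machine.applied = saved["applied"]
        # Replay only the commands issued after the last checkpoint.
        for command in command_log[machine.applied:]:
            machine.apply(command)
        return machine


def demo():
    log = [5, 1, 4, 2, 7]
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "state.ckpt")
        original = CounterStateMachine(path)
        for command in log:
            original.apply(command)
        # Simulate a failure: ignore the in-memory object and rebuild
        # the state from the checkpoint plus the tail of the log.
        recovered = CounterStateMachine.recover(path, log)
        return original.total, recovered.total, recovered.applied
```

In this sketch a smaller `checkpoint_every` shortens the replay after a failure but writes checkpoints more often, which is the same kind of trade-off between storage overhead and recovery time that the abstract discusses.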


Author(s):  
Ajay Rawat ◽  
Rama Sushil ◽  
Amit Agarwal

Fault tolerance is the most critical issue in providing reliable services in the cloud. Inherent vulnerability to failure hampers the performance and reliability of cloud services. Hence, to achieve reliability, fault tolerance becomes a mandatory feature, one that is hard to implement because of the dynamic infrastructure and complex interdependencies. Numerous fault tolerance techniques have been developed in the literature to address the challenges of cloud reliability. The survey presented in this paper attempts to integrate the different fault tolerance architectures. This study presents a critical review of existing fault tolerance techniques for improving service reliability, availability, and application execution in the cloud. A comparative analysis of the reviewed frameworks, based on critical metrics such as failure prediction, detection strategy, failure history, VM placement, and limitations, is also included in the paper. This review intends to facilitate the development of new fault tolerance techniques for the cloud environment.


2020 ◽  
Vol 29 (15) ◽  
pp. 2050240
Author(s):  
Vahid Mohammadian ◽  
Nima Jafari Navimipour ◽  
Mehdi Hosseinzadeh ◽  
Aso Darwesh

Providing dynamic resources is based on the virtualization features of the cloud environment. Cloud computing, as an emerging technology, offers highly available services at any time, in any place, and independent of the hardware. However, fault tolerance is one of the main problems and challenges in cloud computing. This subject has an important effect on cloud computing, but, as far as we know, there is no comprehensive and systematic study in this field. Accordingly, in this paper, the existing methods and mechanisms are discussed in different groups, such as proactive and reactive, types of fault detection, etc. Various fault tolerance techniques are presented and discussed. The advantages and disadvantages of these techniques are shown on the basis of the technology that they use. Overall, the contributions of this research are a summary of the open challenges associated with fault tolerance, a description of several important fault tolerance methods in cloud computing, and the key areas for improving fault tolerance techniques in future work. The advantages and disadvantages of the selected articles in each category are also highlighted, and their significant challenges are discussed to provide research lines for further studies.


2017 ◽  
Vol 6 (2) ◽  
pp. 36
Author(s):  
Mridula Dhingra ◽  
Neha Gupta

Cloud Computing is a vital platform for viable and non-viable clients. It provides reliable services to clients through data centers, which contain servers, storage, etc. One of the major challenges in a cloud computing environment is that services should run without errors or faults. In a cloud computing environment, various computations are run on real-time applications, so the chance of errors becomes high; for this reason, applications running in a cloud environment should be reliable and must be fault tolerant. In this paper, the authors discuss many fault tolerance techniques and compare various models of fault tolerance.


2019 ◽  
Vol 23 (24) ◽  
pp. 13591-13602
Author(s):  
Vipul Gupta ◽  
Bikram Pal Kaur ◽  
Surender Jangra
