Dynamic Load Aware Scheduler of Map Reduce Tasks for Cloud Environments

Most present-day applications are data and compute intensive, which led to the invention of technologies such as Hadoop. Hadoop uses the Map Reduce framework to process big data applications in parallel across the computing resources of multiple nodes. Hadoop is designed for cluster environments and has a few limitations when executed in cloud environments. Hadoop on cloud has become a common choice because the infrastructure is easy to set up and billing follows a pay-as-you-use model. However, Hadoop performance on cloud infrastructures is affected by the virtualization overhead of the cloud environment. The execution times of Hadoop on cloud can be improved if tasks are scheduled to use the virtual resources effectively, by studying the resource usage characteristics of the tasks and the resource availability of the nodes. The proposed work builds a dynamic scheduler for the Hadoop framework that makes scheduling decisions dynamically based on job resource usage and node load. The results indicate an improvement of up to 23% in the execution time of Hadoop Map Reduce applications.
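
The idea of scheduling on job resource usage and node load can be illustrated with a minimal sketch. It is not the scheduler proposed in the article: the `Node` and `Task` types, the two resource dimensions (CPU and I/O), and the "most remaining headroom" scoring rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    # Hypothetical node model: fractions of capacity currently free (0.0 to 1.0).
    name: str
    cpu_free: float
    io_free: float


@dataclass
class Task:
    # Hypothetical task profile: fractions of a node's capacity the task needs.
    name: str
    cpu_demand: float
    io_demand: float


def pick_node(task: Task, nodes: List[Node]) -> Optional[Node]:
    """Return the node left with the most headroom after placing the task.

    Nodes that cannot satisfy the task's CPU or I/O demand are skipped.
    Returns None when no node can host the task.
    """
    best, best_score = None, -1.0
    for node in nodes:
        cpu_left = node.cpu_free - task.cpu_demand
        io_left = node.io_free - task.io_demand
        if cpu_left < 0 or io_left < 0:
            continue
        # Assumed scoring rule: the tighter of the two resources decides.
        score = min(cpu_left, io_left)
        if score > best_score:
            best, best_score = node, score
    return best


def schedule(tasks: List[Task], nodes: List[Node]) -> Dict[str, Optional[str]]:
    """Place tasks one at a time, updating node load after each placement."""
    placement: Dict[str, Optional[str]] = {}
    for task in tasks:
        node = pick_node(task, nodes)
        if node is None:
            placement[task.name] = None  # no node has enough free capacity
            continue
        node.cpu_free -= task.cpu_demand
        node.io_free -= task.io_demand
        placement[task.name] = node.name
    return placement
```

Because node load is updated after every placement, a CPU-heavy task and an I/O-heavy task tend to land on different nodes, which is the kind of behaviour a load-aware scheduler aims for. A real Hadoop scheduler would obtain these figures from runtime monitoring rather than from fixed values.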

Author(s):  
Peer Hasselmeyer ◽  
Gregory Katsaros ◽  
Bastian Koller ◽  
Philipp Wieder

The management of the entire service landscape comprising a Cloud environment is a complex and challenging venture. One task of utmost importance is the generation and processing of information about the state, health, and performance of the various services and IT components, which is generally referred to as monitoring. Such information is the foundation for proper assessment and management of the whole Cloud. This chapter pursues two objectives: first, to provide an overview of monitoring in Cloud environments and, second, to propose a solution for interoperable and vendor-independent Cloud monitoring. Along the way, the authors motivate the necessity of monitoring at the different levels of Cloud infrastructures, introduce selected state-of-the-art approaches, and extract requirements for Cloud monitoring. Based on these requirements, the following sections depict a Cloud monitoring solution and describe current developments towards interoperable, open, and extensible Cloud monitoring frameworks.


2019 ◽  
Vol 214 ◽  
pp. 07006
Author(s):  
Paolo Andreetto ◽  
Fabrizio Chiarello ◽  
Sergio Traldi

The analysis and understanding of resource utilization in shared infrastructures, such as cloud environments, is crucial in order to provide better performance, administration and capacity planning. The management of resource usage of the OpenStack-based cloud infrastructures hosted at INFN-Padova, the Cloud Area Padovana and the INFN-PADOVA-STACK instance of the EGI Federated cloud, started with the deployment of Ceilometer, the OpenStack component responsible for collecting and managing accounting information. However, by using Ceilometer alone we found some limiting problems related to the way it handles information: among others, the imbalance between storage and data retention requirements, and the complexity in computing custom metrics. In this contribution we present a tool, called CAOS, which we have been implementing to overcome the aforementioned issues. CAOS collects, manages and presents the data concerning resource usage of our OpenStack-based cloud infrastructures. By gathering data from both the Ceilometer service and the OpenStack API, CAOS enables us to track resource usage at different levels (e.g. per project), in such a way that both current and past consumption of resources can be easily determined, stored and presented.
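
Tracking usage per project amounts to aggregating raw usage samples by project. The sketch below illustrates that aggregation only; it is not CAOS code and does not call Ceilometer or the OpenStack API. The sample layout `(project, vcpus, hours)` and the function name are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical sample layout: (project name, number of vCPUs, hours running).
Sample = Tuple[str, int, float]


def cpu_hours_per_project(samples: Iterable[Sample]) -> Dict[str, float]:
    """Sum vCPU-hours for each project across all usage samples."""
    totals: Dict[str, float] = defaultdict(float)
    for project, vcpus, hours in samples:
        totals[project] += vcpus * hours
    return dict(totals)
```

Storing such pre-aggregated totals rather than every raw sample is one plausible way to ease the tension between storage and data retention that the abstract mentions.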


Author(s):  
P. Lalitha Surya Kumari

This chapter describes how computing infrastructures should be configured and intelligently managed to fulfill the most notable security requirements of big data applications. Big data is an area concerned with storing, extracting, and processing large amounts of data, which are very often unstructured. With big data, security functions are required to work over a heterogeneous composition of diverse hardware, operating systems, and network domains. Conventional security solutions that rely on a clearly defined security boundary, such as firewalls and demilitarized zones (DMZs), are not effective for big data as it expands with the help of public clouds. This chapter discusses concepts including the characteristics, risks, life cycle, and data collection of big data, Map Reduce components, issues and challenges in big data, the Cloud Security Alliance, approaches to solving security issues, an introduction to cybercrime, YARN, and Hadoop components.


Author(s):  
Rajinder Sandhu ◽  
Adel Nadjaran Toosi ◽  
Rajkumar Buyya

Cloud computing provides resources using a multitenant architecture where infrastructure is created from one or more distributed datacenters. Scheduling of applications in cloud infrastructures is one of the main research areas in cloud computing. Researchers have developed many scheduling algorithms and evaluated them using simulators such as CloudSim. Their performance needs to be validated in real-time cloud environments to improve their usefulness. Aneka is a prominent PaaS software platform that allows users to develop cloud applications using various programming models and underlying infrastructure. This chapter presents a scheduling API developed for the Aneka software platform. Users can develop their own scheduling algorithms using this API and integrate them with Aneka to test them in real cloud environments. The proposed API provides all the functionality required to integrate and schedule private, public, or hybrid clouds with the Aneka software.


2018 ◽  
Vol 14 (2) ◽  
pp. 43-58 ◽  
Author(s):  
S. Kirthica ◽  
Rajeswari Sridhar

One of the principal features on which cloud environments operate is the scaling up and down of resources based on users' needs, called elasticity. This feature is limited to the cloud's physical resources. This article proposes to enhance the elasticity of a cloud in an intelligent manner by communicating with an optimal external cloud (EC) and borrowing additional resources from it when the cloud runs out of resources. This inter-cloud communication is secured by a model whose structure is similar to the Kerberos protocol. To choose the optimal EC for a particular user request, a list of parameters, collectively termed RePVoCRaD, is enumerated. Once an EC is chosen, trust is established with it and inter-cloud communication begins. While existing works rely on third parties to establish or secure inter-cloud communication, this work is novel in that no third parties are involved in the entire process, thereby reducing security threats and additional costs. Evaluated on turnaround time and transaction success rate in a real-time cloud environment, the approach enhances the cloud's elasticity so that it successfully accommodates its users' additional demands by the fastest means possible.


2017 ◽  
Vol 49 (3) ◽  
pp. 179-182 ◽  
Author(s):  
Keerthi Bangari ◽  
Sujitha Meduri ◽  
CY Rao
