Energy-Aware Task Scheduling Using Hybrid Firefly-BAT (FFABAT) in Big Data

2018 ◽  
Vol 18 (2) ◽  
pp. 98-111
Author(s):  
M. Senthilkumar

Abstract Applications that handle Big data are increasingly common, yet processing data at this scale remains extremely difficult. The MapReduce framework has recently received serious attention for this purpose. The aim of this study is to perform task scheduling over Big data using Hadoop. First, the tasks are prioritized with the k-means clustering algorithm. Then the MapReduce framework is employed: in the map phase, the available resource is selected using an optimization technique. The proposed method combines the Firefly Algorithm and the Bat Algorithm (FFABAT) to choose the optimal resource with the minimum cost value. The bat-inspired algorithm is a meta-heuristic optimization method developed by Xin-She Yang (2010); it is based on the echolocation behaviour of micro-bats, with variable pulse emission rates and loudness. Finally, in the reduce phase, the tasks are scheduled on the optimal resource and stored in the cloud. The performance of the algorithm is analysed in terms of total cost, time and memory utilization.
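
To make the bat-inspired search concrete, the following is a minimal sketch of a generic bat algorithm minimizing a cost function. It is not the FFABAT method from the paper: the `sphere` cost, the bounds, and all parameter values (`f_max`, `alpha`, `gamma`, `r0`) are illustrative assumptions, and the acceptance rule follows one common reading of Yang's description.

```python
import math
import random

def sphere(x):
    """Stand-in cost function (assumption); a scheduler would use resource cost."""
    return sum(v * v for v in x)

def bat_minimize(cost, dim, n_bats=20, n_iter=200, lower=-5.0, upper=5.0,
                 f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, r0=0.5, seed=0):
    rng = random.Random(seed)

    def clip(v):
        return max(lower, min(upper, v))

    pos = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [cost(p) for p in pos]
    loud = [1.0] * n_bats   # loudness A_i, decays as a bat improves
    pulse = [r0] * n_bats   # pulse emission rate r_i
    best_i = min(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[best_i][:], fit[best_i]

    for t in range(1, n_iter + 1):
        mean_loud = sum(loud) / n_bats
        for i in range(n_bats):
            # Frequency-tuned velocity update relative to the best position so far.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (x - b) * freq for v, x, b in zip(vel[i], pos[i], best)]
            cand = [clip(x + v) for x, v in zip(pos[i], vel[i])]
            # Local random walk around the best solution, scaled by mean loudness.
            if rng.random() > pulse[i]:
                cand = [clip(b + rng.uniform(-1.0, 1.0) * mean_loud) for b in best]
            cand_fit = cost(cand)
            # Accept improvements with probability tied to loudness.
            if cand_fit <= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = r0 * (1.0 - math.exp(-gamma * t))
            if cand_fit < best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit

if __name__ == "__main__":
    print(bat_minimize(sphere, dim=2))
```

For resource selection, the continuous position would have to be mapped to a discrete resource index, which is a design choice this sketch leaves open.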

2012 ◽  
Vol 6-7 ◽  
pp. 82-87 ◽  
Author(s):  
Yuan Ming Yuan ◽  
Chan Le Wu

The volume of Big Data is too large to be processed with traditional clustering analysis technologies: processing takes too long and runs into computability problems. After an analysis of the k-means clustering algorithm, the part of k-means that can be parallelized is identified and a new algorithm is proposed that redesigns the k-means flow on the MapReduce framework. This resolves the problems mentioned above. Experiments show that the new algorithm is feasible and effective.
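
The part of k-means that parallelizes naturally is the assignment of points to centroids. A minimal single-process sketch of that idea, assuming a map step that emits (centroid index, point) pairs and a reduce step that averages them, might look as follows. The function names are hypothetical and this is not the paper's algorithm; a real job would run the map step on separate workers.

```python
from collections import defaultdict

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])))

def map_phase(points, centroids):
    """Map: emit (centroid index, (point, 1)) for every input point."""
    for p in points:
        yield nearest(p, centroids), (p, 1)

def reduce_phase(pairs, centroids):
    """Reduce: average the points per key; empty clusters keep their old centroid."""
    sums, counts = {}, defaultdict(int)
    for k, (p, n) in pairs:
        acc = sums.setdefault(k, [0.0] * len(p))
        for d, v in enumerate(p):
            acc[d] += v
        counts[k] += n
    return [tuple(s / counts[k] for s in sums[k]) if k in sums else centroids[k]
            for k in range(len(centroids))]

def kmeans(points, centroids, n_iter=10):
    """Run a fixed number of map/reduce rounds."""
    for _ in range(n_iter):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids

if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```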


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Zhihan Liu ◽  
Yi Jia ◽  
Xiaolu Zhu

Car sharing is a type of car rental service in which consumers rent cars for short periods of time, often charged by the hour. Analysing urban traffic big data is important for determining the locations of depots for a car-sharing system. Taxi OD (Origin-Destination) data is a typical urban traffic dataset. Its volume is so large that traditional data processing applications do not work well. This paper presents an optimization method that determines depot locations by clustering taxi OD points with the AP (Affinity Propagation) clustering algorithm. Based on an analysis of the characteristics of AP clustering, the algorithm is optimized hierarchically using administrative region segmentation. The input parameters of AP clustering are adapted to account for the sparse similarity matrix of taxi OD points. In the case study, we use OD pair information from Beijing’s taxi GPS trajectory data. The number and locations of depots are determined by clustering the OD points with the optimized AP clustering. We describe the experimental results of our approach and compare it with the standard K-means method using quantitative and stationarity indices. Experiments on the real datasets show that the proposed method for determining car-sharing depots has superior performance.


Author(s):  
Mujeeb Shaik Mohammed ◽  
Praveen Sam Rachapudy ◽  
Madhavi Kasa

With technical advances, the amount of big data is increasing day by day, to the point that traditional software tools struggle to handle it. In addition, the presence of imbalanced data within big data is a major concern for researchers. To manage big data effectively and to deal with imbalanced data, this paper proposes a new optimization algorithm. Big data classification is performed using the MapReduce framework, in which the map and reduce functions are based on the proposed optimization algorithm. The algorithm is named the Exponential Bat algorithm (E-Bat) and integrates the Exponential Weighted Moving Average (EWMA) with the Bat Algorithm (BA). The map function selects the features, which are then passed to the reducer module for classification using a Neural Network (NN). The classification of big data is thus performed using the proposed E-Bat algorithm-based MapReduce framework, and the experiments use four standard datasets: Breast cancer, Hepatitis, Pima Indian diabetes, and Heart disease. The experimental results show that the proposed method achieved a maximal accuracy of 0.8829 and a maximal True Positive Rate (TPR) of 0.9090.
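
For readers unfamiliar with EWMA, a minimal sketch of the standard recurrence E_t = λ·x_t + (1 − λ)·E_{t−1} is given here. How E-Bat actually combines this with the bat update is not stated in the abstract, so the sketch covers only the smoothing step; the parameter name `lam` and the choice to seed the average with the first value are assumptions.

```python
def ewma(values, lam=0.3):
    """Exponentially weighted moving average, seeded with the first value."""
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must be in (0, 1]")
    smoothed = []
    prev = None
    for x in values:
        # The first sample initializes the average; later ones blend in with weight lam.
        prev = x if prev is None else lam * x + (1.0 - lam) * prev
        smoothed.append(prev)
    return smoothed

if __name__ == "__main__":
    print(ewma([1.0, 2.0, 3.0], lam=0.5))
```

A larger `lam` tracks recent values more closely, while a smaller one retains more history.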


2019 ◽  
Vol 29 (1) ◽  
pp. 1496-1513 ◽  
Author(s):  
Omkaresh Kulkarni ◽  
Sudarson Jena ◽  
C. H. Sanjay

Abstract The recent advancements in information technology and the web tend to increase the volume of data used in day-to-day life. The result is a big data era, which has become a key issue in research due to the complexity in the analysis of big data. This paper presents a technique called FPWhale-MRF for big data clustering using the MapReduce framework (MRF), by proposing two clustering algorithms. In FPWhale-MRF, the mapper function estimates the cluster centroids using the Fractional Tangential-Spherical Kernel clustering algorithm, which is developed by integrating the fractional theory into a Tangential-Spherical Kernel clustering approach. The reducer combines the mapper outputs to find the optimal centroids using the proposed Particle-Whale (P-Whale) algorithm, for the clustering. The P-Whale algorithm is proposed by combining Whale Optimization Algorithm with Particle Swarm Optimization, for effective clustering such that its performance is improved. Two datasets, namely localization and skin segmentation datasets, are used for the experimentation and the performance is evaluated regarding two performance evaluation metrics: clustering accuracy and DB-index. The maximum accuracy attained by the proposed FPWhale-MRF technique is 87.91% and 90% for the localization and skin segmentation datasets, respectively, thus proving its effectiveness in big data clustering.
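
The DB-index mentioned above is the Davies-Bouldin index, for which lower values indicate more compact, better separated clusters. A minimal sketch of the standard definition, assuming Euclidean distance, at least two non-empty clusters, and distinct centroids, is shown here; it is not the paper's evaluation code.

```python
import math

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters, each a list of point tuples."""
    centroids = [tuple(sum(c) / len(pts) for c in zip(*pts)) for pts in clusters]
    # Scatter: mean distance of each cluster's points to its own centroid.
    scatter = [sum(math.dist(p, c) for p in pts) / len(pts)
               for pts, c in zip(clusters, centroids)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                     for j in range(k) if j != i)
    return total / k

if __name__ == "__main__":
    print(davies_bouldin([[(0, 0), (0, 2)], [(10, 0), (10, 2)]]))
```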


Cloud computing is an emerging technology that offers highly scalable services and is used by a wide range of people around the world. Task scheduling is one of the major problems in cloud environments, and most existing algorithms are not optimal. The proposed hybrid optimization method combines Elephant Herd Optimization (EHO) and a Genetic Algorithm (GA) to find an optimal resource on which to schedule each task in the cloud. The method improves task-scheduling performance by considering response time, makespan, and cost. It was implemented in the CloudSim 3.0 toolkit, and its performance was compared with that of existing algorithms. The experimental results indicate that the proposed algorithm performs better than the other scheduling algorithms.
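
Makespan and cost can be illustrated with a small evaluation function of the kind a metaheuristic scheduler might minimize. This is a sketch under simple assumptions (tasks on one machine run sequentially, runtime equals task length divided by machine speed, and cost is billed per unit of runtime); the names and the model are hypothetical and are not taken from the paper or from CloudSim.

```python
def evaluate_schedule(task_lengths, vm_speeds, vm_prices, assignment):
    """Return (makespan, total cost) for a task-to-VM assignment.

    assignment[t] is the index of the VM that runs task t.
    """
    finish = [0.0] * len(vm_speeds)
    cost = 0.0
    for task, vm in enumerate(assignment):
        runtime = task_lengths[task] / vm_speeds[vm]
        finish[vm] += runtime          # tasks on one VM run back to back
        cost += runtime * vm_prices[vm]
    # Makespan is the time at which the busiest VM finishes.
    return max(finish), cost

if __name__ == "__main__":
    print(evaluate_schedule([100, 200, 300], [100, 200], [1.0, 3.0], [0, 1, 1]))
```

A search algorithm would call such a function on each candidate assignment and keep the one with the best trade-off between the two values.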


Over the last decade, big data has grown enormously across information technology. Big data can inform the decisions a company or business needs to make, but it also poses many challenges: because of its sheer size and volume, it is very difficult to store, process and mine. Cloud computing helps here, offering large storage capacity and tremendous processing power. Even so, it is challenging to process large amounts of data frequently in a big-data cloud center across thousands of interconnected servers. As big data grows day by day, the big-data cloud center is forced to improve its Quality of Service (QoS) metrics, such as throughput, latency and response time. Developing an optimal data-processing method is therefore an open research problem. The main aim of this paper is to develop an application that maximizes throughput and minimizes latency and response time. To this end, we have developed an optimization technique that uses the nature-inspired firefly optimization algorithm and k-means clustering (FA-KMeans). The method has been evaluated against state-of-the-art algorithms, and the experimental results show that it provides good throughput and reduces latency and response time.
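
The core of the firefly algorithm is a move in which a dimmer firefly is attracted toward a brighter one, with attractiveness decaying with distance. A minimal sketch of that single step in its standard form follows; how FA-KMeans couples it with k-means is not described in the abstract, and the parameter values here are assumptions.

```python
import math
import random

def firefly_move(xi, xj, beta0=1.0, gamma=1.0, alpha=0.0, rng=random):
    """Move firefly xi toward a brighter firefly xj.

    beta0: attractiveness at zero distance; gamma: light absorption;
    alpha: scale of the random perturbation (0 disables it).
    """
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    beta = beta0 * math.exp(-gamma * r2)  # attractiveness decays with squared distance
    return tuple(a + beta * (b - a) + alpha * (rng.random() - 0.5)
                 for a, b in zip(xi, xj))

if __name__ == "__main__":
    print(firefly_move((0.0, 0.0), (1.0, 0.0)))
```

With `gamma=0` every firefly is fully attracted regardless of distance, while a large `gamma` makes the search almost purely local.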

