MapReduce Paradigm
Recently Published Documents


TOTAL DOCUMENTS: 45 (FIVE YEARS: 15)
H-INDEX: 7 (FIVE YEARS: 2)

2022, Vol 6 (1), pp. 5
Author(s): Giuseppe Di Modica, Orazio Tomarchio

In the past twenty years, we have witnessed an unprecedented production of data worldwide, which has generated a growing demand for computing resources and has stimulated the design of computing paradigms and software tools to obtain insights from such Big Data efficiently and quickly. State-of-the-art parallel computing techniques such as MapReduce guarantee high performance in scenarios where the computing nodes involved are equally sized, clustered via broadband network links, and co-located with the data. Unfortunately, these techniques have proven ineffective in geographically distributed scenarios, i.e., computing contexts where nodes and data are spread across multiple distant data centers. In the literature, researchers have proposed variants of the MapReduce paradigm that are aware of the constraints imposed by those scenarios (such as the imbalance of the nodes' computing power and of the interconnecting links) and enforce smart task scheduling strategies. We have designed a hierarchical computing framework in which a context-aware scheduler orchestrates computing tasks that leverage the vanilla Hadoop framework within each data center taking part in the computation. In this work, after presenting the features of the developed framework, we advocate fragmenting the data in a smart way so that the scheduler distributes the workload more fairly among the computing tasks. To prove the concept, we implemented a software prototype of the framework and ran several experiments on a small-scale testbed. Test results are discussed in the last part of the paper.
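As an illustration of the kind of smart fragmentation the abstract advocates, the sketch below splits a dataset across data centers in proportion to their computing capacity, so that map tasks on unequally sized sites finish at roughly the same time. All names and numbers are hypothetical; this is not the paper's prototype.

```python
# Hypothetical sketch of capacity-proportional data fragmentation across
# data centers; data center names and capacities are illustrative only.

def fragment_data(total_records, capacities):
    """Split a dataset so each data center's share is proportional to
    its relative computing capacity."""
    total_capacity = sum(capacities.values())
    shares = {}
    assigned = 0
    for dc, cap in sorted(capacities.items()):
        share = int(total_records * cap / total_capacity)
        shares[dc] = share
        assigned += share
    # Give any rounding remainder to the most capable data center.
    best = max(capacities, key=capacities.get)
    shares[best] += total_records - assigned
    return shares

if __name__ == "__main__":
    # Three unequally sized data centers (relative capacity units).
    capacities = {"dc-europe": 4.0, "dc-us": 2.0, "dc-asia": 1.0}
    print(fragment_data(7_000_000, capacities))
    # -> {'dc-asia': 1000000, 'dc-europe': 4000000, 'dc-us': 2000000}
```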


Author(s): Imad Sassi, Samir Anter, Abdelkrim Bekkhoucha

Hidden Markov models (HMMs) are machine learning algorithms that have been widely used and have demonstrated their efficiency in many conventional applications. This paper proposes a modified posterior decoding algorithm that solves the HMM decoding problem using the MapReduce paradigm and Spark's resilient distributed dataset (RDD) concept for large-scale data processing. The objective of this work is to improve the performance of HMMs in the face of big data challenges. The proposed algorithm greatly reduces time complexity and provides good results in terms of running time, speedup, and parallelization efficiency for large amounts of data, i.e., large numbers of states and sequences.
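The sketch below shows what distributing posterior decoding over Spark RDDs can look like in principle: each observation sequence becomes one RDD record, and forward-backward posteriors are computed per sequence in parallel. The HMM parameters and sequences are illustrative assumptions, not the authors' modified algorithm.

```python
# A minimal sketch (not the paper's exact algorithm): posterior decoding of
# many observation sequences in parallel with PySpark RDDs. The HMM
# parameters (pi, A, B) and the sequences below are invented placeholders.
import numpy as np
from pyspark import SparkContext

pi = np.array([0.6, 0.4])                         # initial state distribution
A = np.array([[0.7, 0.3], [0.4, 0.6]])            # state transition matrix
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])  # emission matrix

def posterior_decode(obs):
    """Forward-backward posterior decoding of one observation sequence."""
    T, N = len(obs), len(pi)
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):                         # forward pass
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):                # backward pass
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    gamma = alpha * beta                          # unnormalized posteriors
    return gamma.argmax(axis=1).tolist()          # most likely state per step

if __name__ == "__main__":
    sc = SparkContext(appName="hmm-posterior-decoding")
    sequences = [[0, 1, 2, 2], [2, 2, 0], [1, 0, 1, 2, 2]]
    rdd = sc.parallelize(sequences)               # one sequence per record
    paths = rdd.map(posterior_decode).collect()   # decode all in parallel
    print(paths)
    sc.stop()
```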


2021, Vol 5 (3), pp. 38
Author(s): Wei Li, Maolin Tang

This paper identifies four common misconceptions about the scalability of volunteer computing on big data problems. The misconceptions are clarified by analyzing the relationship between scalability and its impact factors, which include the problem size of the big data, the heterogeneity and dynamics of the volunteers, and the overlay structure. The paper proposes optimization strategies to find the optimal overlay for a given big data problem, forming multiple overlays to optimize the performance of the individual steps of the MapReduce paradigm. The optimization aims at the maximum overall performance with a minimum number of volunteers, without overusing resources. The paper demonstrates that simulations over the relevant factors can quickly find the optimization points, and concludes that always welcoming more volunteers is an overuse of available resources, because additional volunteers do not always benefit overall performance. An optimal use of volunteers is possible for a given big data problem even under the dynamics and opportunism of volunteers.
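A toy model of the diminishing-returns effect described above: completion time falls as volunteers are added, but coordination overhead eventually dominates, so a sweep over the overlay size reveals an optimization point. The speed distribution, overhead term, and workload are invented for illustration and have no connection to the paper's simulations.

```python
# Illustrative sketch only: a toy model of why more volunteers do not
# always improve overall performance. All constants are assumptions.
import random

def makespan(num_volunteers, total_work=10_000.0, overhead_per_node=0.05):
    """Estimated completion time of a map phase over heterogeneous
    volunteers: ideal parallel compute time plus a coordination overhead
    that grows with the number of participating nodes."""
    random.seed(42)  # reproducible volunteer speeds
    speeds = [random.uniform(0.5, 1.5) for _ in range(num_volunteers)]
    compute = total_work / sum(speeds)            # ideal parallel time
    coordination = overhead_per_node * num_volunteers ** 1.5
    return compute + coordination

if __name__ == "__main__":
    # Sweep the overlay size and report the cheapest configuration found.
    best = min(range(1, 2001), key=makespan)
    print(f"optimal volunteers: {best}, makespan: {makespan(best):.1f}")
```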


2021, pp. 145-160
Author(s): Douglas E. Comer

2021, Vol 2 (1), pp. 55-60
Author(s): Yusifov S.I., Ragimova N.A., Abdullayev V.H., Khalilov M.E.

The rapid development of information technologies is accelerating the arrival of Industry 4.0, which is why sectors of the economy and science must adapt to these changes. Global changes in geography have led to the emergence of a new scientific discipline called geoinformatics. The paper provides insight into the Smart Geographic Area, its structure, and its main components. To this end, methods were employed for connecting the main components (IIoT, IoE), analyzing data (Big Data, Hadoop), managing processes (CPS), and storing data (Cloud Computing, Fog Computing). As a result of the study, a Smart Geographic Area algorithm based on the MapReduce paradigm was developed.
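As a rough illustration of how a MapReduce step could serve such a Smart Geographic Area algorithm (the abstract does not give its details), the sketch below keys IoT sensor readings by a coarse latitude/longitude grid cell and reduces each cell to an average. All field names and values are hypothetical.

```python
# A hypothetical MapReduce-style aggregation of geographic sensor data;
# the cell size, fields, and readings are invented for this example.
from collections import defaultdict

def map_reading(reading):
    """Map: key each reading by a coarse lat/lon grid cell (0.1 degrees)."""
    lat, lon, temp = reading
    cell = (round(lat, 1), round(lon, 1))
    return cell, temp

def reduce_cell(cell, temps):
    """Reduce: average the temperature readings that fell into one cell."""
    return cell, sum(temps) / len(temps)

if __name__ == "__main__":
    readings = [(40.41, 49.87, 21.5), (40.44, 49.89, 22.1), (40.37, 49.83, 20.8)]
    grouped = defaultdict(list)
    for key, value in map(map_reading, readings):    # map phase
        grouped[key].append(value)                   # shuffle/group by key
    for cell, temps in grouped.items():              # reduce phase
        print(reduce_cell(cell, temps))
```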

