A personalized IPTV channel-recommendation mechanism based on the MapReduce framework

The MapReduce framework of cloud computing offers an effective way to perform massive text categorization. This paper designs a distributed parallel text training algorithm for cloud computing environments, based on multi-class Support Vector Machines (SVM). Map tasks distribute the samples of each class, and Reduce tasks carry out the SVM training itself. Experimental results show that the execution time of text training decreases as the number of Reduce tasks increases. A parallel text classification algorithm, which classifies texts of unknown type, is also designed and implemented on the cloud platform. Experimental results show that classification speed increases with the number of Map tasks.
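
The division of labour described in this abstract can be sketched in a few lines of single-process Python. This is only an illustration under stated assumptions, not the authors' implementation: the one-vs-rest decomposition, the function names (`map_sample`, `shuffle`, `reduce_train`, `train`, `classify`) and the hyperparameters are all hypothetical, and a simple hinge-loss subgradient trainer stands in for whatever SVM solver the paper uses.

```python
from collections import defaultdict

def map_sample(features, label, classes):
    """Map step (assumed one-vs-rest): emit the sample once per class,
    keyed by class, with target +1 for its own class and -1 otherwise."""
    for c in classes:
        yield c, (features, 1 if label == c else -1)

def shuffle(pairs):
    """Group mapper output by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_train(samples, epochs=200, lr=0.1, reg=0.01):
    """Reduce step: train one binary linear classifier with hinge loss.
    A subgradient-descent stand-in for a real SVM solver."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * reg) for wi in w]  # L2 shrinkage
            if margin < 1:  # sample violates the margin: move towards it
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def train(dataset):
    """Run map, shuffle and reduce in sequence; returns {class: (w, b)}."""
    classes = sorted({label for _, label in dataset})
    pairs = [kv for x, label in dataset for kv in map_sample(x, label, classes)]
    return {c: reduce_train(s) for c, s in shuffle(pairs).items()}

def classify(models, x):
    """Pick the class whose binary model gives the highest score."""
    def score(c):
        w, b = models[c]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=score)

# Toy term-count vectors; the labels and values are made up for illustration.
dataset = [([3, 0, 0], "sports"), ([2, 0, 0], "sports"),
           ([0, 3, 0], "tech"), ([0, 2, 0], "tech"),
           ([0, 0, 3], "food"), ([0, 0, 2], "food")]
models = train(dataset)
print(classify(models, [4, 0, 0]))
```

Because each reducer key owns an independent binary problem, adding Reduce tasks lets more of these problems train at the same time, which is consistent with the speed-up the abstract reports.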

Efficient indexing and retrieval of patient information from the big data using MapReduce framework and optimisation

Journal of Information Science ◽

10.1177/01655515211013708 ◽

2021 ◽

pp. 016555152110137

Author(s):

N.R. Gladiss Merlin ◽

Vigilson Prem. M

Keyword(s):

Big Data ◽

Similarity Measure ◽

Patient Information ◽

Complex Data ◽

Mapreduce Framework ◽

Maximum Value ◽

User Query ◽

Indexing And Retrieval ◽

Sine Cosine Algorithm ◽

Disparate Source

Large and complex data have become a valuable resource in biomedical discovery, greatly expanding the scientific resources from which helpful information can be retrieved. However, indexing and retrieving patient information from disparate sources of big data is challenging in biomedical research. In this research, indexing and retrieval are performed using the proposed Jaya-Sine Cosine Algorithm (Jaya–SCA)-based MapReduce framework. Initially, the input big data are distributed to the mappers at random. The average of the data in each mapper is calculated, and these averages are forwarded to the reducer, where they are stored as representative data. Each user query is first matched against the representatives in the reducer and then passed on to the mapper to retrieve the best-matching result. This bilevel matching is based on the distance between the query and the data. The similarity measure is computed using the parametric-enabled similarity measure (PESM), cosine similarity and the proposed Jaya–SCA, which integrates the Jaya algorithm and the SCA. The proposed Jaya–SCA algorithm attained maximum F-measure, recall and precision values of 0.5323, 0.4400 and 0.6867, respectively, on the StatLog Heart Disease dataset.
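
The two-level lookup described in this abstract (representatives first, then the records behind the best representative) can be illustrated with a small single-process sketch. It rests on several assumptions: records are plain numeric feature vectors, plain cosine similarity stands in for the PESM and Jaya–SCA components (which the abstract does not specify in enough detail to reproduce), and the names `cosine`, `build_index` and `retrieve` are hypothetical.

```python
import math
import random

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def build_index(records, num_mappers, seed=0):
    """Assign records to mappers at random, then compute each mapper's
    mean vector, which plays the role of the representative held by
    the reducer."""
    rng = random.Random(seed)
    partitions = [[] for _ in range(num_mappers)]
    for rec in records:
        partitions[rng.randrange(num_mappers)].append(rec)
    partitions = [p for p in partitions if p]  # drop empty mappers
    representatives = [[sum(col) / len(p) for col in zip(*p)]
                       for p in partitions]
    return partitions, representatives

def retrieve(query, partitions, representatives):
    """Level 1: pick the mapper whose representative is most similar to
    the query. Level 2: return the most similar record in that mapper."""
    best = max(range(len(partitions)),
               key=lambda i: cosine(query, representatives[i]))
    return max(partitions[best], key=lambda rec: cosine(query, rec))

# Made-up feature vectors, purely for illustration.
records = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.9, 0.1]]
partitions, representatives = build_index(records, num_mappers=2)
print(retrieve([2.0, 0.1], partitions, representatives))
```

One consequence of this design is worth noting: since only one mapper is searched at the second level, the returned record is the best match within that mapper and is not guaranteed to be the best match over the whole dataset.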

Extended Kalman Filter Based Echo State Network for Time Series Prediction using MapReduce Framework

2013 IEEE 9th International Conference on Mobile Ad-hoc and Sensor Networks ◽

10.1109/msn.2013.61 ◽

2013 ◽

Cited By ~ 4

Author(s):

Chunyang Sheng ◽

Jun Zhao ◽

Henry Leung ◽

Wei Wang

Keyword(s):

Time Series ◽

Kalman Filter ◽

Extended Kalman Filter ◽

Time Series Prediction ◽

Echo State Network ◽

Mapreduce Framework

SimMapReduce: A Simulator for Modeling MapReduce Framework

2011 Fifth FTRA International Conference on Multimedia and Ubiquitous Engineering ◽

10.1109/mue.2011.56 ◽

2011 ◽

Cited By ~ 21

Author(s):

Fei Teng ◽

Lei Yu ◽

Frederic Magoulès

Keyword(s):

Mapreduce Framework

Investigating MapReduce framework extensions for efficient processing of geographically scattered datasets

ACM SIGMETRICS Performance Evaluation Review ◽

10.1145/2160803.2160876 ◽

2011 ◽

Vol 39 (3) ◽

pp. 116-118 ◽

Cited By ~ 7

Author(s):

Hrishikesh Gadre ◽

Ivan Rodero ◽

Manish Parashar

Keyword(s):

Mapreduce Framework ◽

Efficient Processing

CRFs based parallel biomedical named entity recognition algorithm employing MapReduce framework

Cluster Computing ◽

10.1007/s10586-015-0426-z ◽

2015 ◽

Vol 18 (2) ◽

pp. 493-505 ◽

Cited By ~ 18

Author(s):

Zhuo Tang ◽

Lingang Jiang ◽

Li Yang ◽

Kenli Li ◽

Keqin Li

Keyword(s):

Named Entity Recognition ◽

Recognition Algorithm ◽

Entity Recognition ◽

Mapreduce Framework ◽

Named Entity ◽

Biomedical Named Entity Recognition

Moim: A Multi-GPU MapReduce Framework

2013 IEEE 16th International Conference on Computational Science and Engineering ◽

10.1109/cse.2013.190 ◽

2013 ◽

Cited By ~ 3

Author(s):

Mengjun Xie ◽

Kyoung-Don Kang ◽

Can Basaran

Keyword(s):

Mapreduce Framework

One Size Does Not Fit All: Trade-offs between Misuse Probability and Level of Sanitization for Big Data

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1451.0982s1119 ◽

2019 ◽

Vol 8 (2S11) ◽

pp. 3606-3611

Keyword(s):

Big Data ◽

Empirical Study ◽

Cloud Storage ◽

Data Privacy ◽

Public Cloud ◽

Mapreduce Framework ◽

Privacy Concerns ◽

Time Restrictions ◽

Trade Offs ◽

Made In

Big data privacy has gained importance as cloud computing has become a phenomenal success in providing a remote platform for sharing computing resources without geographical or time restrictions. However, privacy concerns about big data outsourced to public cloud storage still exist. Various anonymity and sanitization techniques have emerged to protect big data from privacy attacks. In our prior work, we proposed a misusability-probability-based metric that estimates the probable percentage of misusability. We also designed a system, based on misusability probability, that suggests a level of sanitization before privacy protection is actually applied to big data. In this paper, we further evaluate our misuse-probability-based approach to big data sanitization by defining an algorithm that analyses the trade-offs between misuse probability and level of sanitization. The paper sheds light on the proposed framework and misusability measure and evaluates the framework with an empirical study. The empirical study is carried out in a public cloud environment with Amazon EC2 (compute engine), S3 (storage service) and EMR (MapReduce framework). The experimental results reveal the dynamics of these trade-offs. The insights help in making well-informed decisions when sanitizing big data, so that it is protected without losing the required utility.
