Recent Developments on Security and Reliability in Large-Scale Data Processing with MapReduce

2016 ◽  
Vol 12 (1) ◽  
pp. 49-68 ◽  
Author(s):  
Christian Esposito ◽  
Massimo Ficco

The demand for access to large volumes of data, distributed across hundreds or thousands of machines, has opened new opportunities in commerce, science, and computing applications. MapReduce is a paradigm that offers a programming model and an associated implementation for processing massive datasets in parallel on non-dedicated distributed computing hardware. It has been successfully adopted in several academic and industrial projects for Big Data analytics. However, since such analytics is increasingly demanded in mission-critical applications, security and reliability in MapReduce frameworks are strongly required in order to manage sensitive information and to obtain the right answer at the right time. In this paper, the authors present the main implementation of the MapReduce programming paradigm, provided by Apache under the name Hadoop. They illustrate the security and reliability concerns that arise in a large-scale data processing infrastructure, review the available solutions, and discuss their limitations in supporting security and reliability within MapReduce frameworks. The authors conclude by describing the ongoing evolution of these solutions and the open issues for improvement, which represent challenging research opportunities for academic researchers.
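
To make the programming model concrete, the canonical Hadoop word-count job below shows the two user-supplied functions the paradigm revolves around: a map function that emits (word, 1) pairs and a reduce function that sums them. This is the standard illustrative example, not code from the paper; the class names are the conventional ones.

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map: emit (word, 1) for every token in the input split.
      public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reduce: sum the per-word counts produced by all mappers.
      public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }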

2018 ◽  
Vol 7 (3.8) ◽  
pp. 16
Author(s):  
Md Tahsir Ahmed Munna ◽  
Shaikh Muhammad Allayear ◽  
Mirza Mohtashim Alam ◽  
Sheikh Shah Mohammad Motiur Rahman ◽  
Md Samadur Rahman ◽  
...  

MapReduce has become a popular programming model for processing large-scale data sets in a parallel, distributed fashion on a cluster, and Hadoop MapReduce is especially well suited to big data workloads. In this paper, we modify the Hadoop MapReduce algorithm and implement the modification to reduce processing time.
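
The abstract does not state which part of the algorithm is modified, so the sketch below is purely illustrative: it shows two widely used Hadoop-level changes that reduce processing time, pre-aggregating map output with a combiner and compressing the intermediate data shuffled to the reducers. The driver class name is hypothetical, and the combiner reuses the IntSumReducer from the word-count sketch above.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.SnappyCodec;
    import org.apache.hadoop.mapreduce.Job;

    public class TunedJobDriver {
      public static Job configure(Configuration conf) throws Exception {
        // Compress intermediate map output to shrink the data shuffled across the network
        // (assumes the Snappy codec is available on the cluster).
        conf.setBoolean("mapreduce.map.output.compress", true);
        conf.setClass("mapreduce.map.output.compress.codec", SnappyCodec.class, CompressionCodec.class);

        Job job = Job.getInstance(conf, "tuned job");
        // A combiner pre-aggregates map output on each node before the shuffle,
        // so reducers receive far fewer records; here it reuses the word-count reducer.
        job.setCombinerClass(WordCount.IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        return job;
      }
    }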


2014 ◽  
Vol 509 ◽  
pp. 175-181
Author(s):  
Wu Min Pan ◽  
Li Bai Ha

The term cloud computing has grown in popularity in recent years. Alongside SQL techniques, MapReduce, a programming model for large-scale data processing, has become a hot topic discussed in many studies. Many real-world tasks, such as data processing for search engines, can be implemented in parallel through a simple interface with two functions, Map and Reduce. We compare the performance of the Hadoop implementation of MapReduce with SQL Server through simulations and find that Hadoop can complete the same query faster than SQL Server. Several other factors are also tested to see whether they affect Hadoop's performance; in particular, adding more machines for data processing improves Hadoop's performance, especially on large-scale data sets.
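
To illustrate the kind of comparison reported above, the same aggregate query can be written declaratively for SQL Server and as a Map and Reduce pair for Hadoop; the table name, column layout, and CSV format below are illustrative assumptions, not taken from the study. The last method shows the usual way extra machines are exploited, by raising the number of reduce tasks.

    import java.io.IOException;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // SQL Server:  SELECT region, SUM(amount) FROM sales GROUP BY region;
    // MapReduce :  the map function emits (region, amount), the reduce function sums per region.
    public class GroupBySum {

      public static class SalesMapper extends Mapper<LongWritable, Text, Text, DoubleWritable> {
        @Override
        protected void map(LongWritable key, Text line, Context context)
            throws IOException, InterruptedException {
          String[] fields = line.toString().split(",");   // assumed CSV layout: region,amount
          context.write(new Text(fields[0]), new DoubleWritable(Double.parseDouble(fields[1])));
        }
      }

      public static class SumReducer extends Reducer<Text, DoubleWritable, Text, DoubleWritable> {
        @Override
        protected void reduce(Text region, Iterable<DoubleWritable> amounts, Context context)
            throws IOException, InterruptedException {
          double total = 0.0;
          for (DoubleWritable a : amounts) {
            total += a.get();
          }
          context.write(region, new DoubleWritable(total));
        }
      }

      // More reduce tasks let additional machines share the aggregation work,
      // which is how a larger cluster speeds up the same query.
      public static void parallelize(Job job, int reduceTasks) {
        job.setNumReduceTasks(reduceTasks);
      }
    }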


2014 ◽  
Vol 10 (3) ◽  
pp. 19-35 ◽  
Author(s):  
K. Amshakala ◽  
R. Nedunchezhian ◽  
M. Rajalakshmi

Over the last few years, data have been generated in larger volumes and at faster rates, and there has been remarkable growth in the need for large-scale data processing systems. As data grow in size, data quality is often compromised. Functional dependencies, which represent semantic constraints in data, are important for data quality assessment, but executing functional dependency discovery algorithms on a single computer is hard and laborious for large data sets. MapReduce provides an enabling technology for large-scale data processing, and its open-source Hadoop implementation has given researchers a powerful tool for tackling large-data problems in a distributed manner. The objective of this study is to extract functional dependencies between attributes from large datasets using the MapReduce programming model. Attribute entropy is used to measure inter-attribute correlations and is exploited to discover functional dependencies hidden in the data.
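
A standard information-theoretic reading of this idea is that a functional dependency X -> Y holds exactly when H(X ∪ {Y}) = H(X), i.e., when knowing X leaves no residual uncertainty about Y. The sketch below shows how the per-attribute-set counts needed for these entropies could be gathered with one Hadoop counting job per candidate set; the configuration key, CSV input layout, and driver logic are illustrative assumptions, not the authors' implementation.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // One counting job per candidate attribute set X: the mapper projects each record onto X
    // and emits (projected value, 1); the reducer sums the counts. A driver (not shown) turns
    // the counts into H(X) = -sum p*log(p) and declares X -> Y when H(X ∪ {Y}) equals H(X).
    public class ProjectionCount {

      public static class ProjectMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
        private static final LongWritable ONE = new LongWritable(1);
        private int[] attributeIndexes;   // column positions of the candidate attribute set X

        @Override
        protected void setup(Context context) {
          // The columns of X are passed through the job configuration (key name is illustrative).
          String[] parts = context.getConfiguration().get("fd.candidate.columns").split(",");
          attributeIndexes = new int[parts.length];
          for (int i = 0; i < parts.length; i++) {
            attributeIndexes[i] = Integer.parseInt(parts[i]);
          }
        }

        @Override
        protected void map(LongWritable key, Text record, Context context)
            throws IOException, InterruptedException {
          String[] fields = record.toString().split(",");   // assumed CSV input
          StringBuilder projected = new StringBuilder();
          for (int idx : attributeIndexes) {
            projected.append(fields[idx]).append('\u0001');  // separator between attribute values
          }
          context.write(new Text(projected.toString()), ONE);
        }
      }

      public static class CountReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
        @Override
        protected void reduce(Text value, Iterable<LongWritable> ones, Context context)
            throws IOException, InterruptedException {
          long count = 0;
          for (LongWritable one : ones) {
            count += one.get();
          }
          context.write(value, new LongWritable(count));
        }
      }

      // Entropy of the attribute set, computed from the reducer output counts and the total record count.
      public static double entropy(long[] counts, long total) {
        double h = 0.0;
        for (long c : counts) {
          double p = (double) c / total;
          h -= p * Math.log(p);
        }
        return h;
      }
    }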


2008 ◽  
Vol 25 (5) ◽  
pp. 287-300 ◽  
Author(s):  
B. Martin ◽  
A. Al‐Shabibi ◽  
S.M. Batraneanu ◽  
Ciobotaru ◽  
G.L. Darlea ◽  
...  

2014 ◽  
Vol 26 (6) ◽  
pp. 1316-1331 ◽  
Author(s):  
Gang Chen ◽  
Tianlei Hu ◽  
Dawei Jiang ◽  
Peng Lu ◽  
Kian-Lee Tan ◽  
...  

2018 ◽  
Vol 7 (2.31) ◽  
pp. 240
Author(s):  
S Sujeetha ◽  
Veneesa Ja ◽  
K Vinitha ◽  
R Suvedha

In the existing scenario, a patient has to go to a hospital to take the necessary tests, consult a doctor, and buy prescribed medicines, or else use specific healthcare applications; time is therefore wasted at hospitals and in medical shops, and healthcare applications offer no face-to-face interaction with the doctor. These drawbacks can be addressed by Medimate: an ailment diffusion control system with real-time large-scale data processing. The purpose of Medimate is to establish a teleconference medical system that can be used in remote areas, configured for better diagnosis and medical treatment of rural people. The system is equipped with a heartbeat sensor, temperature sensor, ultrasonic sensor, and load cell to monitor the patient's health parameters, and voice instructions are provided for easier access. An application enabling video and voice communication with the doctor through a camera and headphones is installed at both ends. The doctor examines the patient and prescribes the medicines, and a medical dispenser delivers the medicine to the patient according to the prescription. A QR code is generated by Medimate for each prescription and can be reused for recurring medical conditions in the future. Medical details are updated on the server periodically.
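
As one small, concrete piece of such a system, the QR code attached to a prescription could be produced with the open-source ZXing library, as in the illustrative sketch below; the class and method names are hypothetical and this is not Medimate's actual implementation.

    import java.nio.file.Paths;
    import com.google.zxing.BarcodeFormat;
    import com.google.zxing.client.j2se.MatrixToImageWriter;
    import com.google.zxing.common.BitMatrix;
    import com.google.zxing.qrcode.QRCodeWriter;

    public class PrescriptionQr {
      // Encodes the prescription text as a 300x300 QR code and writes it out as a PNG file.
      public static void writeQr(String prescriptionText, String outputFile) throws Exception {
        BitMatrix matrix = new QRCodeWriter().encode(prescriptionText, BarcodeFormat.QR_CODE, 300, 300);
        MatrixToImageWriter.writeToPath(matrix, "PNG", Paths.get(outputFile));
      }
    }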


2019 ◽  
Vol 12 (12) ◽  
pp. 2290-2299
Author(s):  
Azza Abouzied ◽  
Daniel J. Abadi ◽  
Kamil Bajda-Pawlikowski ◽  
Avi Silberschatz
