A Survey on Security of the Hadoop Framework in the Environment of Bigdata

2021, Vol 2089 (1), pp. 012031
Author(s): Saritha Gattoju, NagaLakshmi Vadlamani

Abstract The world is becoming increasingly digital. Every day, everyone who uses the internet generates a significant amount of data. These data are critical for carrying out day-to-day operations, and they help corporate management achieve its objectives and make the best possible judgments based on the information gathered. Big Data combines many hardware and software solutions to deal with extremely large volumes of data that exceed conventional storage capacity. Hadoop systems are used in a variety of areas, including healthcare, finance, government, insurance, and social media, to provide fast and cost-effective big data solutions. Apache Hadoop is a framework for storing, processing, managing, and distributing large amounts of data over a large number of server nodes, and several solutions work on top of the Apache Hadoop stack to guarantee data security. To get a complete picture of the problem, we conducted an investigation into existing security solutions for Apache Hadoop, focusing on sensitive data stored on a big data platform that employs distributed computing on a cluster of commodity devices. The goal of this paper is to provide knowledge of security and Big Data issues.

2015, Vol 20 (1), pp. 72-80
Author(s): Xinhua Dong, Ruixuan Li, Heng He, Wanwan Zhou, Zhengyuan Xue, ...

2017, Vol 10 (3), pp. 597-602
Author(s): Jyotindra Tiwari, Dr. Mahesh Pawar, Dr. Anjajana Pandey

Big Data is defined by the 3Vs: variety, volume, and velocity. The volume of data is very large, the data exist in a variety of file types, and the data grow very rapidly. Big data storage and processing have always been major issues, and handling big data has become even more challenging in recent years. High-performance techniques have been introduced to handle it, and several frameworks, such as Apache Hadoop, have been introduced to process big data. Apache Hadoop provides map/reduce to process big data, but map/reduce itself can be further accelerated. In this paper, a survey of map/reduce acceleration and energy-efficient computation is performed.
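The map/reduce model mentioned above can be sketched in plain Python. This is an illustrative stand-in for Hadoop's mapper, shuffle, and reducer phases applied to the classic word-count task; it is not the Hadoop API, and the function names are our own:

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit (word, 1) pairs, as a Hadoop mapper would."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group values by key across all mapper outputs."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {key: sum(values) for key, values in groups.items()}

documents = ["big data is huge", "big data grows rapidly"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(mapped))
print(counts["big"])  # 2
```

In real Hadoop, the shuffle step also moves grouped data across the network between cluster nodes, which is one of the main targets of the acceleration work this survey covers.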


2017, Vol 2, pp. 23
Author(s): U. Yashkun, W. Akram, I. A. Memon

An efficient and cost-effective piecewise mathematical model is presented to represent descriptive big data mathematically. Function lines are used as decision boundaries to cast the organization's big data into slope-intercept form, which can be very helpful for a better understanding of discrete data and for obtaining sustainable and accurate results. Based on the boundary limits of data collected from the Federal Board of Revenue, income tax is studied as a function of income. Finally, the reliability of the piecewise function for optimizing the role of strategic management in any organization is investigated. The results show that the slope rate measured within the income boundaries, expressed as a percentage, is in good agreement with that predicted by the organization in descriptive form.
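The piecewise slope-intercept idea can be sketched as a piecewise linear function of income. The bracket boundaries and rates below are hypothetical illustrations, not the Federal Board of Revenue figures used in the paper:

```python
# Hypothetical income-tax brackets: (lower bound, marginal rate above it).
BRACKETS = [(0, 0.00), (600_000, 0.05), (1_200_000, 0.15)]

def tax(income):
    """Piecewise linear tax: within each bracket the tax is a line
    y = m*x + b whose slope m is that bracket's marginal rate."""
    total = 0.0
    for i, (lower, rate) in enumerate(BRACKETS):
        upper = BRACKETS[i + 1][0] if i + 1 < len(BRACKETS) else float("inf")
        if income > lower:
            total += (min(income, upper) - lower) * rate
    return total

print(tax(1_000_000))  # 0.05 * 400_000 = 20000.0
```

Each decision boundary (bracket edge) starts a new line segment, which is exactly the "function lines as decision boundaries" construction the abstract describes.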


2019, Vol 8 (2), pp. 1252-1256

Big data is very practical for real-time application systems. Many of the most widely used real-time applications worldwide operate on unstructured documents. Large numbers of documents are managed and maintained through Hadoop, a popular, leading big data platform, which stores all information in blocks on the Hadoop Distributed File System (HDFS). Irrespective of data size, big data has opened a path to storing and analyzing data that previously consumed considerable time. To overcome this, Hadoop provides a cluster model for computations over large volumes of unstructured data. Three cluster architectures are considered: standalone, single-node cluster, and multi-node cluster. In this paper, the ways in which Hadoop boosts processing speed over large datasets through these cluster architectures are studied and analyzed using text documents from the newsgroup20 dataset, and the challenges of text mining and its applications using Apache Hadoop are identified.
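The block storage mentioned above works by splitting a file into fixed-size chunks that are then distributed across the cluster's DataNodes. A minimal sketch of the splitting step, with an illustrative block size (HDFS itself defaults to 128 MB blocks, not 128 bytes):

```python
def split_into_blocks(data: bytes, block_size: int):
    """Split a byte stream into fixed-size blocks, as HDFS does
    before distributing (and replicating) them across DataNodes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

document = b"x" * 300
blocks = split_into_blocks(document, block_size=128)
print([len(b) for b in blocks])  # [128, 128, 44]
```

Because each block can live on a different node, a multi-node cluster can process all blocks of a large document collection in parallel, which is the source of the speedup the paper measures across its three architectures.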


Author(s): Akansha Gautam, Indranath Chatterjee

With evolving technology, huge amounts of data are being generated everywhere in various forms. The domains driving this growth of data, such as retail, media, banking, healthcare, and education, lead to a very large and complex collection of data popularly coined big data. Handling, managing, and analyzing big data is a complicated process, and using the cloud environment to analyze big data is a recent research trend. Big data analytics can provide cost-effective ways to analyze information quickly, and it helps in decision making and in improving services or products. This paper critically reviews the literature to find current issues and research gaps. The study illustrates the existing solutions and methods provided for big data and its rise in cloud computing technology. Furthermore, it throws light on the open research challenges in this domain and states the scope of future work.


Author(s): Ying Wang, Yiding Liu, Minna Xia

Big data is characterized by multiple sources and heterogeneity. Based on a Hadoop and Spark big data platform, a hybrid analysis of forest fire is built in this study. The platform combines big data analysis and processing technology and draws on research results from related technical fields, such as forest fire monitoring. In this system, Hadoop's HDFS is used to store all kinds of data, the Spark module is used to provide various big data analysis methods, and visualization tools such as ECharts, ArcGIS, and Unity3D are used to visualize the analysis results. Finally, an experiment on forest fire point detection is designed to corroborate the feasibility and effectiveness of the platform and to provide meaningful guidance for follow-up research and for establishing a big data platform for forest fire monitoring and visualized early warning. The experiment has two shortcomings: more data types should be selected, and compatibility would be better if the original data could be converted to XML format. These problems are expected to be solved in follow-up research.
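The Spark side of such a platform chains transformations (filter, map) over distributed records to flag candidate fire points. A pure-Python stand-in for the idea follows; the sensor fields, thresholds, and values are hypothetical, and real PySpark would apply the same chain to an RDD or DataFrame rather than a list:

```python
# Hypothetical sensor records: (station_id, temperature_C, smoke_index).
readings = [
    ("s1", 22.0, 0.1),
    ("s2", 61.5, 0.8),
    ("s3", 58.0, 0.9),
]

# Spark-style chain: filter candidate fire points, then map them to alerts.
# (In PySpark this would be spark_rdd.filter(...).map(...).collect().)
alerts = list(
    map(lambda r: {"station": r[0], "temp": r[1]},
        filter(lambda r: r[1] > 50.0 and r[2] > 0.7, readings))
)
print(len(alerts))  # 2
```

The collected alerts would then feed the visualization layer (ECharts maps, ArcGIS layers) described in the abstract.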

