An Efficient Semantic Ranked Keyword Search of Big Data Using Map Reduce

Big data privacy preservation is one of the most pressing concerns in industry today. Data privacy problems sometimes go unidentified when input data is published to a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing the input dataset in the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. At present, many cloud applications that rely on big data anonymization face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input consists of 45,222 records of adult information with 15 attribute values. Using multidimensional anonymization in the map reduce framework, the proposed TPTDS algorithm is implemented in Hadoop and increases the efficiency of the big data processing system. Experiments with TPTDS on Hadoop in both one-dimensional and multidimensional map reduce frameworks show better results for multidimensional anonymization on the input adult dataset. Datasets are generalized in a top-down manner, and the advantage of the multidimensional framework is reflected in the better IGPL values generated by the algorithm. Anonymization is performed through specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These experimental results are expected to be widely applicable in distributed environments.
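As a rough illustration of the idea of generalizing values along a taxonomy tree and checking an anonymity parameter in a map/reduce style, consider the sketch below. It is not the TPTDS algorithm from the paper: the taxonomy, the attribute values, the function names, and the use of k-anonymity as the anonymity criterion are all assumptions made for illustration, and the IGPL scoring that drives specialization is omitted.

```python
from collections import Counter

# Hypothetical taxonomy tree for one attribute: child -> more general parent.
TAXONOMY = {
    "Bachelors": "University",
    "Masters": "University",
    "9th": "Secondary",
    "10th": "Secondary",
    "University": "Any",
    "Secondary": "Any",
}

# Hypothetical input values (not taken from the adult dataset).
RECORDS = ["Bachelors", "Masters", "9th", "10th", "Bachelors"]


def generalize(value, cut):
    """Climb the taxonomy from `value` until a value in the current cut is reached."""
    while value not in cut:
        value = TAXONOMY[value]  # raises KeyError if the cut does not cover `value`
    return value


def map_phase(records, cut):
    """Emit (generalized value, 1) pairs, as a mapper would."""
    for value in records:
        yield generalize(value, cut), 1


def reduce_phase(pairs):
    """Sum the counts per generalized value, as a reducer would."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return dict(counts)


def is_k_anonymous(counts, k):
    """Assumed criterion: every generalized group holds at least k records."""
    return all(n >= k for n in counts.values())


if __name__ == "__main__":
    # Top-down: start from the most general cut and try a more specific one.
    for cut in ({"Any"}, {"University", "Secondary"}):
        counts = reduce_phase(map_phase(RECORDS, cut))
        print(sorted(cut), counts, is_k_anonymous(counts, 2))
```

In this toy setting, a specialization step would be accepted only if the more specific cut still satisfies the anonymity check; a real implementation would also score candidate specializations before choosing one.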

Addressing big data problem using Hadoop and Map Reduce

2012 Nirma University International Conference on Engineering (NUiCONE) ◽ 10.1109/nuicone.2012.6493198 ◽ 2012 ◽ Cited By ~ 77

Author(s): Aditya B. Patel ◽ Manashvi Birla ◽ Ushma Nair

Keyword(s): Big Data ◽ Map Reduce ◽ Data Problem

Kringing Regressive Map reduce Entropy Feature Extraction based Rocchio Adaptive Boost Ensemble Classifier for Early Disease Diagnosis with Big Data

Dynamic Systems and Applications ◽ 10.46719/dsa20213064 ◽ 2021 ◽ Vol 30 (6)

Author(s): A Kaliappan ◽ D Chitra

Keyword(s): Feature Extraction ◽ Big Data ◽ Disease Diagnosis ◽ Ensemble Classifier ◽ Map Reduce ◽ Early Disease

PIMRS: achieving privacy and integrity-preserving multi-owner ranked-keyword search over encrypted cloud data

Security and Communication Networks ◽ 10.1002/sec.1482 ◽ 2016 ◽ Vol 9 (16) ◽ pp. 3765-3776 ◽ Cited By ~ 1

Author(s): Jinguo Li ◽ Mi Wen ◽ Kejie Lu ◽ Chunhua Gu

Keyword(s): Keyword Search ◽ Cloud Data ◽ Ranked Keyword Search

Statement Generation Based on Big Data for Keyword Search

Machine Learning and Intelligent Communications - Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering ◽ 10.1007/978-3-030-32388-2_41 ◽ 2019 ◽ pp. 477-488

Author(s): Qingqing Liu ◽ Zhengyou Xia

Keyword(s): Big Data ◽ Keyword Search

Analysing Distributed Big Data through Hadoop Map Reduce

International Journal of Computer Applications ◽ 10.5120/ijca2015907156 ◽ 2015 ◽ Vol 129 (15) ◽ pp. 26-31 ◽ Cited By ~ 1

Author(s): Arpit Gupta ◽ Rajiv Pandey ◽ Komal Verma

Keyword(s): Big Data ◽ Map Reduce

A FEASIBILITY STUDY ON BIG DATA INTEGRATION AND ITS METHODOLOGIES FOR HADOOP TECHNIQUES USING MAP REDUCE MODEL

International Journal of Modern Trends in Engineering & Research ◽ 10.21884/ijmter.2016.3072.qxemv ◽ 2016 ◽ Vol 3 (9) ◽ pp. 230-238

Keyword(s): Big Data ◽ Data Integration ◽ Feasibility Study ◽ Map Reduce

Leveraging big-data for business process analytics

The Learning Organization ◽ 10.1108/tlo-05-2014-0023 ◽ 2015 ◽ Vol 22 (4) ◽ pp. 215-228 ◽ Cited By ~ 21

Author(s): Alejandro Vera-Baquero ◽ Ricardo Colomo Palacios ◽ Vladimir Stantchev ◽ Owen Molloy

Keyword(s): Big Data ◽ Process Improvement ◽ Business Process ◽ Business Performance ◽ Business Processes ◽ Heterogeneous Systems ◽ Map Reduce ◽ Heterogeneous Environments ◽ Business Process Improvement ◽ Content Type

Purpose – This paper aims to present a solution that enables organizations to monitor and analyse the performance of their business processes by means of Big Data technology. Business process improvement can drastically influence the profit of corporations and helps them to remain viable. However, traditional Business Intelligence systems are not sufficient to meet today's business needs. They are normally business domain-specific and have not been sufficiently process-aware to support process improvement-type activities, especially on large and complex supply chains, where this entails integrating, monitoring and analysing a vast amount of dispersed, unstructured event logs produced in a variety of heterogeneous environments. This paper tackles this variability by devising different Big-Data-based approaches that aim to gain visibility into process performance. Design/methodology/approach – The authors present a cloud-based solution that leverages Big Data (BD) technology to provide essential insights into business process improvement. The proposed solution is aimed at measuring and improving overall business performance, especially in very large and complex cross-organisational business processes, where this type of visibility is hard to achieve across heterogeneous systems. Findings – Three different BD approaches have been undertaken based on Hadoop and HBase. The first is a map-reduce approach that is suitable for batch processing and offers very high scalability. The second is an alternative solution that integrates the proposed system with Impala. This approach shows significant improvements with respect to map reduce, as it focuses on performing real-time queries over HBase. Finally, the use of secondary indexes has also been proposed, with the aim of enabling immediate access to event instances for correlation at the cost of highly duplicated storage and synchronization issues. This approach has produced remarkable results in two real functional environments presented in the paper. Originality/value – The value of the contribution lies in the comparison and integration of software packages towards an integrated solution that is intended to be adopted by industry. In addition, the authors illustrate the deployment of the architecture in two different settings.
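The trade-off mentioned in the Findings, immediate access to event instances in exchange for duplicated storage and synchronization work, can be sketched with a toy in-memory store. This is only an assumed illustration: the class, field names, and data are hypothetical, and it does not use or reproduce the HBase or Impala APIs or the authors' design.

```python
from collections import defaultdict


class EventStore:
    """Toy stand-in for a row-key store with one secondary index."""

    def __init__(self):
        self.rows = {}                        # primary data: row key -> event dict
        self.by_process = defaultdict(list)   # secondary index: process id -> row keys

    def put(self, row_key, event):
        """Write an event and keep the secondary index in sync (the extra cost)."""
        old = self.rows.get(row_key)
        if old is not None:
            # Overwrite: the stale index entry must be removed first.
            self.by_process[old["process_id"]].remove(row_key)
        self.rows[row_key] = event
        self.by_process[event["process_id"]].append(row_key)

    def scan_by_process(self, process_id):
        """Without the index: examine every row."""
        return [k for k, e in self.rows.items() if e["process_id"] == process_id]

    def lookup_by_process(self, process_id):
        """With the index: go directly to the matching row keys."""
        return list(self.by_process.get(process_id, []))


if __name__ == "__main__":
    store = EventStore()
    store.put("e1", {"process_id": "p1", "state": "start"})
    store.put("e2", {"process_id": "p2", "state": "start"})
    store.put("e3", {"process_id": "p1", "state": "end"})
    print(store.lookup_by_process("p1"))
```

Every write touches two structures, which is where the duplication and synchronization overhead in this sketch comes from; reads for correlation avoid the full scan.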

An Ideal Big Data Architectural Analysis for Medical Image Data Classification or Clustering Using the Map-Reduce Frame Work

Lecture Notes in Electrical Engineering - ICCCE 2020 ◽ 10.1007/978-981-15-7961-5_134 ◽ 2020 ◽ pp. 1481-1494

Author(s): Hemanth Kumar Vasireddi ◽ K. Suganya Devi

Keyword(s): Big Data ◽ Medical Image ◽ Image Data ◽ Data Classification ◽ Map Reduce ◽ Architectural Analysis ◽ Frame Work ◽ Medical Image Data ◽ Image Data Classification

Big Data

Security, Privacy, and Forensics Issues in Big Data - Advances in Information Security, Privacy, and Ethics ◽ 10.4018/978-1-5225-9742-1.ch002 ◽ 2020 ◽ pp. 24-65

Author(s): P. Lalitha Surya Kumari

Keyword(s): Big Data ◽ Life Cycle ◽ Data Collection ◽ Operating Systems ◽ Data Security ◽ Map Reduce ◽ Security Issues ◽ Big Data Applications

This chapter covers the most important aspects of how computing infrastructures should be configured and intelligently managed to meet the most notable security requirements of big data applications. Big data is an area concerned with storing, extracting, and processing large amounts of data, much of which is unstructured. With big data, security functions must work across a heterogeneous mix of hardware, operating systems, and network domains. Conventional security solutions built on a clearly defined security boundary, such as firewalls and demilitarized zones (DMZs), are not effective for big data because it expands with the help of public clouds. The chapter discusses the characteristics, risks, life cycle, and data collection of big data; map reduce components; issues and challenges in big data; the Cloud Security Alliance; approaches to solving security issues; an introduction to cybercrime; and YARN and Hadoop components.
