Novel Methodologies for Processing Structured Big Data Using Hadoop Framework

Author(s):  
Prashant Bhat ◽  
Prajna Hegde
Keyword(s):  
Big Data ◽

2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current Internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner, yet it is argued that Hadoop is not mature enough to deal with current cyberattacks on the data.

Objective: The main objective of the proposed work is to provide a complete security approach, comprising authentication and authorisation for users and Hadoop cluster nodes, and to secure the data at rest as well as in transit.

Methods: The proposed algorithm uses the Kerberos network authentication protocol to authenticate and authorise users and cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) protects data at rest and in transit: a user encrypts a file under their own set of attributes and stores it on the Hadoop Distributed File System, and only intended users with matching attributes can decrypt that file.

Results: The proposed algorithm was implemented with data sets of different sizes, processed both with and without encryption. The results show little difference in processing time: performance was affected in the range of 0.8% to 3.1%, a range that also includes the impact of other factors such as system configuration, the number of parallel jobs running, and the virtual environment.

Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment and is experimentally shown to have little effect on system performance for datasets of different sizes.
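The client-side flow described in the Methods could look roughly like the sketch below. The Kerberos login uses Hadoop's real UserGroupInformation API; the CpAbe interface, the attribute-policy string, and the principal and file paths are hypothetical stand-ins, since the abstract does not name a specific CP-ABE implementation.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class SecureHdfsUpload {

    /** Hypothetical stand-in for a CP-ABE library; a real implementation
     *  would also issue attribute-based secret keys for decryption. */
    interface CpAbe {
        static byte[] encrypt(byte[] plaintext, String policy) {
            throw new UnsupportedOperationException("plug in a CP-ABE library");
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Authenticate this client against the Kerberos KDC before it is
        // allowed to talk to any cluster node.
        UserGroupInformation.loginUserFromKeytab(
                "alice@EXAMPLE.COM", "/etc/security/keytabs/alice.keytab");

        byte[] plaintext = java.nio.file.Files.readAllBytes(
                java.nio.file.Paths.get("report.csv"));

        // Encrypt under an attribute policy so that only users whose keys
        // satisfy the policy can decrypt the file.
        byte[] ciphertext = CpAbe.encrypt(plaintext,
                "department:finance AND role:analyst");

        // Only ciphertext ever reaches HDFS, so the data is protected at
        // rest and, being pre-encrypted, in transit as well.
        try (FileSystem fs = FileSystem.get(conf);
             FSDataOutputStream out = fs.create(new Path("/secure/report.csv.abe"))) {
            out.write(ciphertext);
        }
    }
}
```

Encrypting on the client before the write is what lets a single mechanism cover both data at rest and data in transit, which matches the abstract's claim of a complete security approach.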


Displays ◽  
2021 ◽  
Vol 70 ◽  
pp. 102061
Author(s):  
Amartya Hatua ◽  
Badri Narayan Subudhi ◽  
Veerakumar T. ◽  
Ashish Ghosh

2018 ◽  
Vol 11 (04) ◽  
Author(s):  
Rahul Kumar Chawda ◽  
Ghanshyam Thakur
Keyword(s):  
Big Data ◽  

Author(s):  
Orazio Tomarchio ◽  
Giuseppe Di Modica ◽  
Marco Cavallo ◽  
Carmelo Polito

Advances in communication technologies, along with the birth of new communication paradigms leveraging the power of social networks, have fostered the production of huge amounts of data. Old-fashioned computing paradigms are unfit to handle the dimensions of the data produced daily by countless, worldwide distributed sources of information. So far, MapReduce has kept its promise of speeding up computation over Big Data within a single cluster. This article focuses on scenarios of worldwide distributed Big Data. After pointing out the poor performance of the Hadoop framework when deployed in such scenarios, it proposes a Hierarchical Hadoop Framework (H2F) to cope with the issues that arise when Big Data are scattered over geographically distant data centers. The article highlights the novelty introduced by H2F with respect to other hierarchical approaches. Tests run on a software prototype are also reported to show the performance gain that H2F achieves over a plain Hadoop approach in geographical scenarios.
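The hierarchical idea can be illustrated with a minimal sketch, assuming a two-level scheme in which each data center runs an ordinary local aggregation over its own shard and a top-level step merges the per-site partials. The SiteClient interface and its runLocalCount method are hypothetical placeholders for H2F's actual site-level job submission, which the abstract does not detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HierarchicalAggregation {

    /** Hypothetical proxy for one remote data center: it runs a local
     *  MapReduce-style count and returns only the small partial result,
     *  never the raw data. */
    interface SiteClient {
        Map<String, Long> runLocalCount() throws Exception;
    }

    /** Top level of the hierarchy: fan out to every site in parallel,
     *  then reduce the partial counts into one global result. Only
     *  compact partials cross the slow inter-site links. */
    static Map<String, Long> globalCount(List<SiteClient> sites) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(sites.size());
        try {
            List<Future<Map<String, Long>>> partials = new ArrayList<>();
            for (SiteClient site : sites) {
                partials.add(pool.submit(site::runLocalCount));
            }
            Map<String, Long> global = new HashMap<>();
            for (Future<Map<String, Long>> f : partials) {
                f.get().forEach((k, v) -> global.merge(k, v, Long::sum));
            }
            return global;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        SiteClient eu = () -> Map.of("error", 3L, "ok", 10L);
        SiteClient us = () -> Map.of("error", 1L, "ok", 7L);
        System.out.println(globalCount(List.of(eu, us))); // e.g. {error=4, ok=17}
    }
}
```

This mirrors the design choice the abstract attributes to H2F: push computation to where the data lives and move only aggregated results between geographically distant data centers.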


Array ◽  
2019 ◽  
Vol 1-2 ◽  
pp. 100002 ◽  
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh
