A Study on Onsite–Offshore Data Security Model for Big Data Applications

Author(s):  
T. N. Manjunath ◽  
S. K. Pushpa ◽  
Ravindra S. Hegadi ◽  
R. A. Archana
Author(s):  
P. Lalitha Surya Kumari

This chapter gives information about the most important aspects in how computing infrastructures should be configured and intelligently managed to fulfill the most notably security aspects required by big data applications. Big data is one area where we can store, extract, and process a large amount of data. All these data are very often unstructured. Using big data, security functions are required to work over the heterogeneous composition of diverse hardware, operating systems, and network domains. A clearly defined security boundary like firewalls and demilitarized zones (DMZs), conventional security solutions, are not effective for big data as it expands with the help of public clouds. This chapter discusses the different concepts like characteristics, risks, life cycle, and data collection of big data, map reduce components, issues and challenges in big data, cloud secure alliance, approaches to solve security issues, introduction of cybercrime, YARN, and Hadoop components.


2017 ◽  
Vol 28 (06) ◽  
pp. 661-682
Author(s):  
Rashed Mazumder ◽  
Atsuko Miyaji ◽  
Chunhua Su

Security, privacy and data integrity are the critical issues in Big Data application of IoT-enable environment and cloud-based services. There are many upcoming challenges to establish secure computations for Big Data applications. Authenticated encryption (AE) plays one of the core roles for Big Data’s confidentiality, integrity, and real-time security. However, many proposals exist in the research area of authenticated encryption. Generally, there are two concepts of nonce respect and nonce reuse under the security notion of the AE. However, recent studies show that nonce reuse needs to sacrifice security bound of the AE. In this paper, we consider nonce respect scheme and probabilistic encryption scheme which are more efficient and suitable for big data applications. Both schemes are based on keyed function. Our first scheme (FS) operates in parallel mode whose security is based on nonce respect and supports associated data. Furthermore, it needs less call of functions/block-cipher. On the contrary, our second scheme is based on probabilistic encryption. It is expected to be a light solution because of weaker security model construction. Moreover, both schemes satisfy reasonable privacy security bound.


2018 ◽  
Vol 7 (3.6) ◽  
pp. 234 ◽  
Author(s):  
N Sirisha ◽  
K V.D. Kiran

Big Data has become more popular, as it can provide on-demand, reliable and flexible services to users such as storage and its processing. The data security has become a major issue in the Big data. The open source HDFS software is used to store huge amount of data with high throughput and fault tolerance and Map Reduce is used for its computations and processing. However, it is a significant target in the Hadoop system, security model was not designed and became the major drawback of Hadoop software. In terms of storage, meta data security, sensitive data  and also the data security will be an serious issue in HDFS. With the importance of Hadoop in today's enterprises, there is also an increasing trend in providing a high security features in enterprises. Over recent years, only some level of security in Hadoop such as Kerberos and Transparent Data Encryption(TDE),Encryption techniques, hash techniques are shown for Hadoop. This paper, shows the efforts that are made to present Hadoop Authorization security issues using Apache Sentry in HDFS. 


Author(s):  
Prashant Srivastava ◽  
Niraj Kumar Tiwari ◽  
Ali Abbas

Organizations now have knowledge of big data significance, but new challenges stand up with new inventions. These challenges are not only limited to the three Vs of big data, but also to privacy and security. Attacks on big data system ranges from DDoS to information theft, ransomware to end user level security. So implementing security to big data system is a multiple phase-based ongoing process in which security is imposed from perimeter level to distributed file system security, cloud security to data security, storage to data mining security, and so on. In this chapter, the authors have identified some key vulnerable point related to big data and also proposed a security model.


2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In current scenario of internet, large amounts of data are generated and processed. Hadoop framework is widely used to store and process big data in a highly distributed manner. It is argued that Hadoop Framework is not mature enough to deal with the current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising of authorisation and authentication for the user and the Hadoop cluster nodes and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses Kerberos network authentication protocol for authorisation and authentication and to validate the users and the cluster nodes. The Ciphertext-Policy Attribute- Based Encryption (CP-ABE) is used for data at rest and data in transit. User encrypts the file with their own set of attributes and stores on Hadoop Distributed File System. Only intended users can decrypt that file with matching parameters. Results: The proposed algorithm was implemented with data sets of different sizes. The data was processed with and without encryption. The results show little difference in processing time. The performance was affected in range of 0.8% to 3.1%, which includes impact of other factors also, like system configuration, the number of parallel jobs running and virtual environment. Conclusion: The solutions available for handling the big data security problems faced in Hadoop framework are inefficient or incomplete. A complete security framework is proposed for Hadoop Environment. The solution is experimentally proven to have little effect on the performance of the system for datasets of different sizes.


Author(s):  
Jonatan Enes ◽  
Guillaume Fieni ◽  
Roberto R. Exposito ◽  
Romain Rouvoy ◽  
Juan Tourino

Sign in / Sign up

Export Citation Format

Share Document