Big Data

Author(s):  
Piyush Kumar Shukla ◽  
Madhuvan Dixit

In this chapter, Big Data refers to large-volume, complex, heterogeneous, and irregularly growing data sets drawn from multiple autonomous sources. With the continuing growth of networking sites, the capacity needed to store image data has also become a major issue, and the Big Data concept is expanding across all technical areas and knowledge engineering domains, including the physical, medical, and paramedical sciences. A data-driven method is presented that consists of demand-driven aggregation of information, knowledge mining and analysis, user-interest modeling, and security and privacy considerations.

2021 ◽  
Vol 8 (5) ◽  
pp. 73-83
Author(s):  
Ibrahim A. Atoum ◽  
Ismail M. Keshta

Big data has been used by different companies to deliver simple products and provide enhanced customer insights through predictive technology such as artificial intelligence. Big data is a field that mainly deals with the extraction and systematic analysis of large data sets to help businesses discover trends. Today, many companies use Big Data to facilitate growth in different functional areas and to expand their ability to handle large customer databases. Big data has increased the demand for information management experts, to the point that many software companies are investing heavily in firms that specialize in data management and analytics. Nevertheless, the issue of data protection and privacy remains a threat to big data management. This article presents some of the major concerns surrounding the application and use of Big Data, focusing on the security and privacy challenges of data stored on technological devices. The paper also discusses some of the current studies aimed at addressing security and privacy issues in Big Data.


Author(s):  
Mohammad Hossein Fazel Zarandi ◽  
Reyhaneh Gamasaee

Big data is a now-ubiquitous term for massive data sets whose large, varied, and complex structure makes them difficult to store, analyze, and visualize for further processing or results. The use of Big Data in health is a new and exciting field. A wide range of use cases for Big Data and analytics in healthcare will benefit best-practice development, outcomes analysis, prediction, and surveillance. Consequently, the aim of this chapter is to provide an overview of Big Data in healthcare systems, including two applications of Big Data analysis in healthcare. The first is understanding disease outcomes through analyzing Big Data, and the second is the application of Big Data in the genetic, biological, and molecular fields. Moreover, the characteristics and challenges of healthcare Big Data analysis, as well as the technologies and software used for Big Data analysis, are reviewed.


2021 ◽  
Vol 2021 ◽  
pp. 1-19
Author(s):  
Liangshun Wu ◽  
Hengjin Cai

Big data is a term used for very large data sets. Digital equipment produces vast amounts of images every day, so the need for image encryption is increasingly pronounced, for example, to safeguard the privacy of patients' medical imaging data on a cloud disk. There is an obvious tension between security and privacy on the one hand and the widespread use of big data on the other. Nowadays, the most important engine for providing confidentiality is encryption. However, block ciphering is not suitable for huge data volumes in a real-time environment because of the strong correlation among pixels and high redundancy; stream ciphering is considered a lightweight solution for ciphering high-definition images (i.e., high data volume). Since a stream cipher's encryption algorithm is deterministic, the only option is to make the keystream "look random." This article proves that the probability that the digit 1 appears in the midsection of a Zeckendorf representation is constant, which can be exploited to generate pseudorandom numbers. A novel stream cipher key generator (ZPKG) is then proposed to encrypt high-definition images that need transferring. The experimental results show that the proposed stream ciphering method, whose keystream satisfies Golomb's randomness postulates, is faster than RC4 and LFSR with indistinguishable hardware consumption, and that the method is highly key-sensitive and shows good resistance against noise attacks and statistical attacks.
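As a concrete illustration of the building block involved (a sketch of the underlying number system, not the authors' ZPKG itself), the following Python snippet computes Zeckendorf representations, in which every positive integer is uniquely a sum of non-consecutive Fibonacci numbers, and empirically estimates the frequency of the digit 1 at a fixed middle position. The 12-digit length and sampling range are arbitrary demo choices.

```python
# Zeckendorf representation via the greedy algorithm: repeatedly take
# the largest Fibonacci number not exceeding the remainder. This is a
# sketch of the number system behind ZPKG, not the cipher itself.

def zeckendorf_bits(n: int) -> str:
    """Return the Zeckendorf digits of n, most significant first."""
    if n <= 0:
        raise ValueError("n must be positive")
    fibs = [1, 2]                      # Fibonacci numbers up to n
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    fibs.pop()                         # last one exceeded n
    bits, remainder = [], n
    for f in reversed(fibs):           # greedy: largest Fibonacci first
        if f <= remainder:
            bits.append("1")
            remainder -= f
        else:
            bits.append("0")
    return "".join(bits)

if __name__ == "__main__":
    print(zeckendorf_bits(10))         # 10 = 8 + 2 -> "10010"
    # Empirical check of the paper's claim: the frequency of digit 1
    # at a middle position is (approximately) constant.
    twelve = [zeckendorf_bits(n) for n in range(233, 377)]  # 12-digit reps
    ones = sum(s[6] == "1" for s in twelve)
    print("P(middle digit = 1) ~", ones / len(twelve))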


Author(s):  
Humam Khalid Yaseen ◽  
Ahmed Mahdi Obaid

Big data is a term for massive data sets having a large, varied, and complex structure, with attendant difficulties in storing, analyzing, and visualizing them for further processes or results. The process of examining massive amounts of data to reveal hidden patterns and secret correlations is called big data analytics. This information is useful to companies and organizations, helping them gain richer and deeper insights and an advantage over the competition. For this reason, big data implementations need to be analyzed and executed as accurately as possible. In this paper, we first discuss what big data is and how it is defined according to different sources; secondly, the characteristics of big data and where it should be used; thirdly, the architecture of big data along with the different models of Big Data; fourthly, some potential applications of big data and how it will make the job easier for the machines and users involved; finally, we discuss the future of Big Data.


2018 ◽  
Vol 7 (4.5) ◽  
pp. 689
Author(s):  
Sarada. B ◽  
Vinayaka Murthy. M ◽  
Udaya Rani. V

Nowadays, data is increasing exponentially every day in terms of velocity, variety, and volume; such data is known as Big Data. When a dataset has a small number of dimensions, a limited number of clusters, and few data points, the existing traditional clustering algorithms give the expected results. In the Big Data age, with large-volume data sets, traditional clustering algorithms no longer give the expected results, so a new approach is needed that provides better accuracy and computational time for large-volume data processing. The proposed system architecture is a combination of the Canopy, K-means, and RK sorting algorithms on the MapReduce Hadoop framework. The analysis shows that processing large volumes of data takes less computational time with higher accuracy, and that RK sorting requires neither swapping of elements nor stack space.
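For orientation, here is a minimal single-machine sketch of the canopy-then-K-means idea such an architecture builds on; the MapReduce/Hadoop distribution and the RK sorting step are not reproduced, the loose threshold T1 of full canopy clustering is omitted, and the tight threshold t2 and demo data are illustrative assumptions rather than values from the paper.

```python
# Canopy pass to pick cheap seed centers, then plain Lloyd k-means.
import math
import random

def canopy_centers(points, t2=2.0):
    """One greedy canopy pass: pick a random point as a canopy center,
    remove all points within the tight radius t2, repeat. The centers
    are used to seed k-means (and fix its k)."""
    remaining = list(points)
    centers = []
    while remaining:
        c = remaining.pop(random.randrange(len(remaining)))
        centers.append(c)
        remaining = [p for p in remaining if math.dist(p, c) > t2]
    return centers

def kmeans(points, centers, iters=10):
    """Plain Lloyd iterations starting from the canopy centers."""
    for _ in range(iters):
        buckets = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)),
                    key=lambda j: math.dist(p, centers[j]))
            buckets[i].append(p)
        centers = [tuple(sum(x) / len(b) for x in zip(*b)) if b else c
                   for b, c in zip(buckets, centers)]
    return centers

if __name__ == "__main__":
    random.seed(0)
    pts = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)] + \
          [(random.gauss(8, 1), random.gauss(8, 1)) for _ in range(100)]
    seeds = canopy_centers(pts)
    print("canopies:", len(seeds))
    print("final centers:", kmeans(pts, seeds))
```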


Author(s):  
Shaveta Bhatia

The epoch of big data presents many opportunities for development in the areas of data science, biomedical research, cyber security, and cloud computing, and big data has gained widespread popularity. It also invites many challenges and repercussions for the security and privacy of big data. Various threats and attacks, such as data leakage, third-party access attempts, viruses, and vulnerabilities, stand against the security of big data. This paper discusses these security threats and the approaches for mitigating them in the fields of biomedical research, cyber security, and cloud computing.


2014 ◽  
Author(s):  
Pankaj K. Agarwal ◽  
Thomas Moelhave
Keyword(s):  
Big Data ◽  

2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current internet scenario, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner, yet it is argued that the framework is not mature enough to deal with current cyberattacks on the data.
Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the user and the Hadoop cluster nodes, and to secure the data both at rest and in transit.
Methods: The proposed algorithm uses the Kerberos network authentication protocol to authenticate and authorise the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit: users encrypt a file with their own set of attributes and store it on the Hadoop Distributed File System, and only intended users with matching parameters can decrypt that file.
Results: The proposed algorithm was implemented with data sets of different sizes, processed with and without encryption. The results show little difference in processing time: performance was affected in the range of 0.8% to 3.1%, which also includes the impact of other factors such as system configuration, the number of parallel jobs running, and the virtual environment.
Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment, and the solution is experimentally shown to have little effect on system performance for datasets of different sizes.
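To make the CP-ABE idea concrete, the sketch below models only the access-policy matching that decides whether a key's attribute set can decrypt a ciphertext; the actual pairing-based cryptography is omitted, and the attribute names and policy are hypothetical, not from the paper.

```python
# CP-ABE access structure in miniature: a ciphertext carries a boolean
# policy over attributes; decryption succeeds only if the key's
# attributes satisfy it. Only the policy evaluation is modeled here.
from dataclasses import dataclass

@dataclass
class Policy:
    op: str                  # "AND", "OR", or "ATTR" (leaf)
    attr: str = ""           # attribute name when op == "ATTR"
    children: tuple = ()     # sub-policies for AND/OR nodes

    def satisfied_by(self, attrs: set) -> bool:
        if self.op == "ATTR":
            return self.attr in attrs
        results = [c.satisfied_by(attrs) for c in self.children]
        return all(results) if self.op == "AND" else any(results)

# Hypothetical policy: (role:analyst AND dept:finance) OR role:admin
policy = Policy("OR", children=(
    Policy("AND", children=(Policy("ATTR", "role:analyst"),
                            Policy("ATTR", "dept:finance"))),
    Policy("ATTR", "role:admin"),
))

print(policy.satisfied_by({"role:analyst", "dept:finance"}))  # True
print(policy.satisfied_by({"role:analyst", "dept:hr"}))       # False
```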


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Hossein Ahmadvand ◽  
Fouzhan Foroutan ◽  
Mahmood Fathy

Data variety is one of the most important features of Big Data. It results from aggregating data from multiple sources and from the uneven distribution of data, and it causes high variation in the consumption of processing resources such as CPU. This issue has been overlooked in previous works. To overcome it, the present work uses Dynamic Voltage and Frequency Scaling (DVFS) to reduce the energy consumption of computation, considering two types of deadlines as constraints. Before applying the DVFS technique to the compute nodes, we estimate the processing time and the frequency needed to meet the deadline. In the evaluation phase, we used a set of data sets and applications. The experimental results show that our proposed approach surpasses the other scenarios in processing real datasets: based on the experiments in this paper, DV-DVFS can achieve up to 15% improvement in energy consumption.
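A minimal sketch of the deadline-driven DVFS idea described above: estimate the work as the processing time at the highest frequency, then pick the lowest available frequency that still meets the deadline, since lower frequency and voltage mean lower energy. The frequency levels, the inverse-scaling runtime model, and the example numbers are illustrative assumptions, not the DV-DVFS estimator itself.

```python
# Pick the lowest CPU frequency that still meets a job's deadline.

FREQ_LEVELS_GHZ = [1.2, 1.6, 2.0, 2.4, 2.8]  # assumed P-states

def pick_frequency(time_at_max_s: float, deadline_s: float) -> float:
    """Return the lowest frequency meeting the deadline, assuming a
    CPU-bound job whose runtime scales inversely with frequency."""
    f_max = max(FREQ_LEVELS_GHZ)
    for f in sorted(FREQ_LEVELS_GHZ):          # try slowest first
        estimated_runtime = time_at_max_s * (f_max / f)
        if estimated_runtime <= deadline_s:
            return f
    return f_max  # even the top frequency may miss a hard deadline

# A job estimated at 60 s at 2.8 GHz with a 100 s deadline can run
# at 2.0 GHz and still finish on time, saving energy.
print(pick_frequency(60.0, 100.0))
```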

