Automating Feature Extraction and Feature Selection in Big Data Security Analytics

Author(s):  
Dimitrios Sisiaridis ◽  
Olivier Markowitch
2019 ◽  
Vol 8 (2) ◽  
pp. 4808-4811

Big data is data of large magnitude, that is, of large volume, velocity, and variety. Big data is now growing across many engineering and science disciplines. Because of the rising number of complex targeted threats and the rapid growth of data, analyzing that data has become extremely hard. Today's Big Data security analytics systems rely on untrusted data. As organizations open and extend their data systems, allowing partners, customers, and suppliers to access corporate information in new and dynamic ways, that information becomes even more vulnerable to data theft and misuse. Attackers have become more adept at highly targeted, complex attacks that evade static threat detection measures. Today's attacks, driven by advanced technologies, are often not discovered until after the damage has been done. The challenge is collecting and analyzing Big Data fast enough to contain threats and carry out remediation. In this review paper, we discuss how Big Data is analyzed with Hadoop and why Big Data Security Analytics is essential to mitigate security risks and better protect business data.


2020 ◽  
Vol 25 (6) ◽  
pp. 729-735
Author(s):  
Venkata Rao Maddumala ◽  
Arunkumar R

This paper presents a technique for feature extraction on multimedia, a challenging task when handling big data. Extracting valuable features from high-dimensional datasets pushes the limits of statistical methods and strategies. Conventional techniques generally perform poorly on high-dimensional datasets. Small sample size has always been an issue in statistical tests, and it is aggravated in high-dimensional data, where the number of features equals or exceeds the number of samples. The power of any statistical test is directly proportional to its ability to reject a null hypothesis, and sample size is a major factor in producing the error probabilities needed to draw valid conclusions. Thus one effective way of handling high-dimensional datasets is to reduce their dimensionality through feature selection and extraction so that valid, accurate analysis becomes practical. Clustering is the act of finding hidden or similar structure in data. It is one of the most common techniques for identifying useful features, where a weight is given to each feature without predefining the classes. In any feature selection and extraction procedure, the three main considerations are statistical accuracy, model interpretability, and computational complexity. For any classification model, it is important to ensure that none of these three is compromised. In this manuscript, a Weight Based Feature Extraction Model on Multifaceted Multimedia Big Data (WbFEM-MMB) is proposed, which extracts useful features from videos. The feature extraction strategies use features from discrete cosine methods, and the features are extracted using a pre-trained Convolutional Neural Network (CNN). The proposed method is compared with traditional methods, and the results show that it exhibits better performance and accuracy in extracting features from multifaceted multimedia data.
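As a rough illustration of what a discrete cosine feature can look like, the sketch below computes an unnormalized one-dimensional DCT-II and keeps the first few low-frequency coefficients as a feature vector. It is not the WbFEM-MMB model and does not involve a CNN; the function names `dct2` and `dct_features` and the sample input are hypothetical and chosen only for this example.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II of a 1-D sequence of numbers."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def dct_features(signal, keep):
    """Keep the first `keep` (lowest-frequency) coefficients as features."""
    return dct2(signal)[:keep]


# Hypothetical input: one row of pixel intensities from a video frame.
frame_row = [1.0, 1.0, 1.0, 1.0]
features = dct_features(frame_row, keep=3)
```

For a constant input such as `frame_row`, all of the energy lands in the first (DC) coefficient and the remaining coefficients are zero up to rounding, which is why truncating to low frequencies can reduce dimensionality with little loss for smooth signals.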


Author(s):  
Vivek K. Verma ◽  
Tarun Jain

This is the age of big data, where collecting information is simple and storing it is inexpensive. Unfortunately, as the amount of machine-readable data grows, the ability to understand and make use of it does not keep pace. In content-based image retrieval (CBIR) applications, every database needs its own parameter settings for feature extraction. CBIR is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. However, the vast majority of CBIR systems perform indexing with a set of fixed, pre-specified parameters. This chapter discusses the major machine-learning-based search algorithms and how they relate to image retrieval accuracy. The efficiency of feature selection (FS) using machine learning is compared with that of other search algorithms and assessed for its improvement of the CBIR system.


Big Data consists of large volumes of data sets in various formats, i.e., structured, unstructured, and semi-structured. Big Data requires security because attackers target it daily in different ways. Big Data Security Analytics analyzes Big Data to find various threats and complex attacks. With the number of targeted attacks increasing and the data itself growing rapidly, accurate analysis has become very difficult. Security analytics systems rely on untrusted data, so strong security analytics tools are required to analyze it. Organizations and industries exchange data through networks dynamically, which makes that data more vulnerable to misuse and theft. Attackers have become so advanced that existing security mechanisms fail to identify attacks before damage occurs. At present, collecting and analyzing the various attacks in order to take suitable decisions is a major challenge for security analytics systems. In this research paper, we discuss how the Hadoop tool analyzes Big Data and how Big Data Security Analytics is applied to analyze various threats and secure business data more accurately.


2018 ◽  
Vol 7 (S1) ◽  
pp. 96-100
Author(s):  
Venkata Rao Maddumala ◽  
R. Arunkumar ◽  
S. Arivalagan

With the rapid advancement of Big Data, Big Data technologies have emerged as a key data analysis tool, in which feature extraction and data clustering algorithms are considered an essential component of data analysis. Nonetheless, there has been limited research addressing the challenges across Big Data, so proposing a research agenda is vital to clarify the research challenges of clustering Big Data. By tackling this specific aspect, clustering algorithms in Big Data, this paper examines Big Data technologies related to feature selection and data clustering algorithms and their possible uses. Based on our survey, this paper identifies a set of research challenges that can be used as a research agenda for Big Data clustering research. This research agenda aims at identifying and bridging the research gaps between Big Data feature selection and clustering algorithms.


2020 ◽  
Vol 13 (4) ◽  
pp. 790-797
Author(s):  
Gurjit Singh Bhathal ◽  
Amardeep Singh Dhiman

Background: In the current internet landscape, large amounts of data are generated and processed. The Hadoop framework is widely used to store and process big data in a highly distributed manner. It is argued that the Hadoop framework is not mature enough to deal with current cyberattacks on the data. Objective: The main objective of the proposed work is to provide a complete security approach comprising authorisation and authentication for the user and the Hadoop cluster nodes, and to secure the data at rest as well as in transit. Methods: The proposed algorithm uses the Kerberos network authentication protocol for authorisation and authentication and to validate the users and the cluster nodes. Ciphertext-Policy Attribute-Based Encryption (CP-ABE) is used for data at rest and data in transit. Users encrypt a file with their own set of attributes and store it on the Hadoop Distributed File System. Only intended users can decrypt that file with matching parameters. Results: The proposed algorithm was implemented with data sets of different sizes. The data was processed with and without encryption. The results show little difference in processing time. Performance was affected in the range of 0.8% to 3.1%, which also includes the impact of other factors such as system configuration, the number of parallel jobs running, and the virtual environment. Conclusion: The solutions available for handling the big data security problems faced in the Hadoop framework are inefficient or incomplete. A complete security framework is proposed for the Hadoop environment. The solution is shown experimentally to have little effect on the performance of the system for datasets of different sizes.
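The idea that only users with matching attributes can decrypt a file can be sketched as an access-policy check. The sketch below performs no cryptography and is not the algorithm proposed in the paper: in real CP-ABE the policy is enforced mathematically by the ciphertext, not by an `if` statement. The function `satisfies`, the tuple-based policy format, and the attribute names are all hypothetical and exist only for this illustration.

```python
def satisfies(policy, attributes):
    """Return True if the attribute set satisfies the policy tree.

    A policy is either a single attribute string or a tuple of the form
    ("and" | "or", sub_policy, sub_policy, ...).
    """
    if isinstance(policy, str):
        return policy in attributes
    op, *children = policy
    results = (satisfies(child, attributes) for child in children)
    if op == "and":
        return all(results)
    if op == "or":
        return any(results)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical policy: finance staff who are either managers or auditors.
policy = ("and", "dept:finance", ("or", "role:manager", "role:auditor"))

auditor = {"dept:finance", "role:auditor"}
engineer = {"dept:engineering", "role:manager"}

can_auditor_read = satisfies(policy, auditor)
can_engineer_read = satisfies(policy, engineer)
```

Here the auditor's attributes satisfy the policy and the engineer's do not, which mirrors the access decision that a CP-ABE scheme would enforce at decryption time.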

