An Efficient Classification of Intrusion in Bigdata based on Chaos Game Optimization and Ensemble SVM

Author(s):  
A Ponmalar ◽  
V Dhanakoti

Abstract The growing popularity of the internet and network services has resulted in an increase in data across all fields. Data volumes grow daily and at high speed, which creates daunting issues such as security and storage. Meanwhile, detecting intrusions in big data in an ultra-high-speed environment is a critical task. Several intrusion detection methods have been proposed to classify big data as intrusive or non-intrusive, but optimal classification accuracy has yet to be achieved. Hence, we propose a novel ensemble SVM model in which the ensemble SVM is combined with the Chaos Game Optimization (CGO) algorithm to enhance classification accuracy. Our method also classifies intrusions by type, covering nine attack categories: Exploits, DoS, Backdoor, Generic, Worms, Analysis, Fuzzers, Shellcode, and Reconnaissance. The experimental analysis is carried out on the UNSW-NB15 big data dataset. The performance metrics precision, accuracy, recall, and F-score are analyzed and compared with state-of-the-art works such as BAMS-OIF, SAD, SMLsmBDA, and BDPM. The results show that the proposed work outperforms the existing works in terms of classification accuracy.
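As a rough illustration of the ensemble-SVM idea, the minimal sketch below builds a bagged ensemble of RBF SVMs and tunes C and gamma by a plain random search, which merely stands in for the Chaos Game Optimization described in the abstract; the dataset, search ranges, and ensemble size are assumptions, not the authors' setup.

```python
# Minimal sketch of an ensemble-SVM intrusion classifier.
# A plain random search stands in for the paper's Chaos Game Optimization;
# X, y stand in for preprocessed UNSW-NB15 features and attack labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=40, n_classes=4,
                           n_informative=20, random_state=0)  # stand-in data

best_score, best_params = -np.inf, None
rng = np.random.default_rng(0)
for _ in range(10):                      # CGO would steer this search instead
    C, gamma = 10 ** rng.uniform(-1, 2), 10 ** rng.uniform(-3, 0)
    model = BaggingClassifier(SVC(C=C, gamma=gamma, kernel="rbf"),
                              n_estimators=5, random_state=0)
    score = cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    if score > best_score:
        best_score, best_params = score, (C, gamma)

print(f"best CV accuracy {best_score:.3f} with C={best_params[0]:.3g}, "
      f"gamma={best_params[1]:.3g}")
```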

2021 ◽  
pp. 1-12
Author(s):  
Li Qian

In order to overcome the low classification accuracy of traditional methods, this paper proposes a new classification method for complex-attribute big data based on an iterative fuzzy clustering algorithm. First, principal component analysis and kernel local Fisher discriminant analysis are used to reduce the dimensionality of the complex-attribute big data. A Bloom filter data structure is then introduced to eliminate redundancy in the dimensionality-reduced data. Next, the de-duplicated data are classified in parallel by the iterative fuzzy clustering algorithm, completing the complex-attribute big data classification. Finally, simulation results show that the accuracy, the normalized mutual information index, and Richter's index of the proposed method are close to 1, the classification accuracy is high, and the RDV value is low, indicating that the proposed method achieves high classification effectiveness and fast convergence.
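A compact sketch of that pipeline, assuming a generic numeric dataset: PCA reduces the dimensionality and a hand-written iterative fuzzy c-means loop does the clustering. The kernel local Fisher discriminant analysis and Bloom-filter de-duplication steps are omitted here, and the data, cluster count, and fuzzifier m are illustrative assumptions.

```python
# PCA dimensionality reduction followed by iterative fuzzy c-means clustering.
import numpy as np
from sklearn.decomposition import PCA

def fuzzy_cmeans(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain iterative fuzzy c-means; returns centers and membership matrix."""
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), n_clusters))
    u /= u.sum(axis=1, keepdims=True)            # memberships sum to 1 per sample
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        u = 1.0 / (d ** (2 / (m - 1)))           # standard FCM membership update
        u /= u.sum(axis=1, keepdims=True)
    return centers, u

X = np.random.default_rng(0).random((500, 30))   # stand-in data
X_red = PCA(n_components=5).fit_transform(X)     # dimensionality reduction
centers, memberships = fuzzy_cmeans(X_red, n_clusters=3)
labels = memberships.argmax(axis=1)              # hard labels from fuzzy memberships
```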


2020 ◽  
Vol 7 ◽  
pp. 205566832093858
Author(s):  
Muhammad Raza Ul Islam ◽  
Asim Waris ◽  
Ernest Nlandu Kamavuako ◽  
Shaoping Bai

Introduction While surface electromyography (sEMG) has been widely used for limb motion detection in exoskeleton control, there is increasing interest in using forcemyography (FMG) to detect motion. In this paper, we review the applications of the two motion detection methods, and their performances are experimentally compared in day-to-day classification of forearm motions. The objective is to select a detection method suitable for motion assistance on a daily basis. Methods Motion detection with FMG and sEMG was compared in terms of classification accuracy (CA), repeatability, and training scheme. For both methods, motions were classified with a feed-forward neural network. Repeatability was evaluated on the basis of the change in CA between days and across training schemes. Results The experiments show that day-to-day CA with FMG can reach 84.9%, compared with a CA of 77.8% with sEMG, when the classifiers were trained only on the first day. Moreover, the CA with FMG can reach 86.5%, comparable to a CA of 84.1% with sEMG, if the classifiers were trained daily. Conclusions The results suggest that data recorded from FMG are more repeatable in day-to-day testing; FMG-based methods can therefore be more useful than sEMG-based methods for motion detection in applications where exoskeletons are used as needed on a daily basis.
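The "train once, test on a later day" scheme can be sketched as follows. This is not the authors' pipeline: the feature arrays are random stand-ins for FMG or sEMG feature windows, and the network size and number of motion classes are assumptions.

```python
# Day-to-day evaluation sketch: a feed-forward network trained on day-1
# features and scored on day-2 features.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_day1, y_day1 = rng.random((300, 16)), rng.integers(0, 6, 300)  # 6 forearm motions
X_day2, y_day2 = rng.random((300, 16)), rng.integers(0, 6, 300)

scaler = StandardScaler().fit(X_day1)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(scaler.transform(X_day1), y_day1)

# "Train only on the first day" scheme: classification accuracy on a later day
day2_ca = clf.score(scaler.transform(X_day2), y_day2)
print(f"day-to-day classification accuracy: {day2_ca:.3f}")
```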


2020 ◽  
pp. 1742-1763
Author(s):  
Neha Bansal ◽  
R.K. Singh ◽  
Arun Sharma

This article describes how classification algorithms have emerged as strong meta-learning techniques for accurately and efficiently analyzing the masses of data generated by widespread use of the internet and other sources. In particular, there is a need for mechanisms that organize unstructured data into a structured form. Classification techniques over large transactional databases can deliver the required data to users from large datasets in a more simplified way. With the intention of organizing and clearly representing the current state of classification algorithms for big data, this paper discusses various concepts and algorithms and gives an exhaustive review of existing classification algorithms over big data classification frameworks and other novel frameworks. The paper provides a comprehensive comparison from both a theoretical and an empirical perspective. The effectiveness of the candidate classification algorithms is assessed against a number of criteria, such as implementation technique, data source validation, and scalability.


2017 ◽  
Vol 7 (1.5) ◽  
pp. 97
Author(s):  
T. Surekha ◽  
R. Siva Rama Prasad

The growth of data is enormous in the current scenario of developing information technology, and performing data classification is complex in terms of both time and information extraction. Moreover, there are uncertainties in big data classification associated with unbalanced datasets. To overcome these issues, a novel big data classification method is introduced in this paper. The method, Log Decision Tree and MapReduce Framework (LDT-MRF), uses the Log Decision Tree (LDT) and the MapReduce Framework (MRF) to perform parallel data classification. A novel parameter termed log-entropy is used to select the best feature attribute for data classification, and classification is then performed using the LDT, which enables efficient data classification. Experimentation is carried out using three datasets, namely the Cleveland dataset, the Switzerland dataset, and the Breast Cancer dataset. A comparative analysis using the performance metrics sensitivity, specificity, and accuracy demonstrates the effectiveness of the proposed method. The sensitivity, specificity, and accuracy of the proposed method are 84.7596%, 74.633%, and 80.9088%, respectively, which are higher than those of existing big data classification methods.
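The core idea of scoring attributes for a decision-tree split can be illustrated with a standard entropy-based criterion. The paper's exact log-entropy formula is not reproduced here; an information-gain score is used purely as a stand-in, and the discrete attribute data and labels are synthetic.

```python
# Hedged sketch of entropy-based attribute selection for a decision-tree split
# (information gain stands in for the paper's log-entropy criterion).
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    """Reduction in class entropy after splitting on a discrete feature."""
    total = entropy(labels)
    cond = 0.0
    for v in np.unique(feature):
        mask = feature == v
        cond += mask.mean() * entropy(labels[mask])
    return total - cond

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 5))          # stand-in discrete attributes
y = (X[:, 2] + rng.integers(0, 2, 200)) % 2    # labels correlated with column 2
scores = [information_gain(X[:, j], y) for j in range(X.shape[1])]
print("best attribute:", int(np.argmax(scores)))
```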


Big data analytics plays a vital role in today's environment and is a necessity for researchers, academicians, and industry practitioners. Digital information spreads over the internet at very high speed through Facebook likes and posts, blogs, tweets, articles, news, website clicks, YouTube videos, and more, largely in unstructured form. Every day, billions of people around the globe fetch, upload, and share information through laptops and mobile phones on social media platforms. The data include structured and unstructured information in the form of blogs, Google locations, pictures, videos, voice messages, text, and so on. The question is how to process this huge volume of information, because basic data processing techniques are not enough to handle such heterogeneous, enormous, and exponentially growing data. Online marketing and e-commerce have become very prominent in recent times because all types of businesses depend heavily on online transactions and services provided by sellers. Big data analytics has proven to be a boon for such industries, as it extracts valuable patterns and hidden relationships about potential buyer markets, customer preferences, purchasing behavior, and much other information from complex data sources. The problems specified above can be addressed using the latest available tools. This paper provides detailed information about the latest tools and frameworks used for big data analytics, together with a comparative assessment.


Author(s):  
B. Venkatesh ◽  
J. Anuradha

In microarray data, it is difficult to achieve high classification accuracy because of high dimensionality and irrelevant, noisy data; such data also contain many gene expression values but few samples. To increase the classification accuracy and the processing speed of the model, an optimal subset of features needs to be extracted, which can be achieved by applying feature selection. In this paper, we propose a hybrid ensemble feature selection method with two phases, a filter phase and a wrapper phase. In the filter phase, an ensemble technique aggregates the feature ranks produced by the Relief, minimum redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods; a fuzzy Gaussian membership function ordering is used to aggregate the ranks. In the wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) selects the optimal features, with an RBF-kernel Support Vector Machine (SVM) classifier as the evaluator. The performance of the proposed model is compared with state-of-the-art feature selection methods on five benchmark datasets, using performance metrics such as accuracy, recall, precision, and F1-score. The experimental results show that the proposed method outperforms the other feature selection methods.
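The two-phase structure can be sketched as below. Mutual information and absolute correlation stand in for the Relief/mRMR/FC filters, a simple rank sum stands in for the fuzzy Gaussian aggregation, and a fixed top-k choice stands in for the IBPSO wrapper; the synthetic data and k value are assumptions.

```python
# Filter-phase rank aggregation followed by an RBF-SVM evaluation of the
# selected subset (simplified stand-in for the hybrid ensemble method).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           random_state=0)     # stand-in microarray-like data

mi_rank = np.argsort(np.argsort(-mutual_info_classif(X, y, random_state=0)))
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
corr_rank = np.argsort(np.argsort(-corr))

agg_rank = mi_rank + corr_rank                 # lower aggregated rank = better
top_k = np.argsort(agg_rank)[:10]              # a wrapper search would refine this

score = cross_val_score(SVC(kernel="rbf"), X[:, top_k], y, cv=5).mean()
print(f"CV accuracy with {len(top_k)} selected features: {score:.3f}")
```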


Author(s):  
Xuewu Zhang ◽  
Yansheng Gong ◽  
Chen Qiao ◽  
Wenfeng Jing

Abstract This article focuses on the most common types of high-speed railway malfunctions in overhead contact systems, namely unstressed droppers, foreign-body invasions, and pole number-plate malfunctions, and establishes a deep-network detection model. By fusing the feature maps of the shallow and deep layers of the pretraining network, global and local features of the malfunction area are combined to enhance the network's ability to identify small objects. Further, to share the fully connected layers of the pretraining network and reduce the complexity of the model, Tucker tensor decomposition is used to extract features from the fused feature map; this operation greatly reduces training time. In experiments on images collected on the Lanxin railway line, the proposed multiview Faster R-CNN based on tensor decomposition had a lower miss probability and higher detection accuracy for the three fault types. Compared with the object-detection methods YOLOv3, SSD, and the original Faster R-CNN, the average miss probability of the improved Faster R-CNN model in this paper is decreased by 37.83%, 51.27%, and 43.79%, respectively, and the average detection accuracy is increased by 3.6%, 9.75%, and 5.9%, respectively.
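To illustrate the feature-compression step, the sketch below applies a Tucker-style (truncated HOSVD) decomposition to a fused feature map, shrinking it before it would feed shared fully connected layers. The feature-map shape and target ranks are illustrative assumptions, not values from the paper.

```python
# Compressing a fused CNN feature map with a Tucker-style (HOSVD) decomposition.
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding of a 3-way tensor into a matrix."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def tucker_hosvd(tensor, ranks):
    """Truncated HOSVD: per-mode factor matrices plus a small core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = tensor
    for mode, u in enumerate(factors):        # contract each mode with its factor
        core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1),
                           0, mode)
    return core, factors

fused_map = np.random.default_rng(0).random((256, 14, 14))  # channels x H x W
core, factors = tucker_hosvd(fused_map, ranks=(32, 7, 7))
print(core.shape)   # (32, 7, 7): far fewer values than the original map
```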

