Deep Feature Extraction via Sparse Autoencoder for Intrusion Detection System

2020 ◽  
Author(s):  
Cao Xiaopeng ◽  
Qu Hongyan

Massive network traffic and high-dimensional features degrade detection performance. To improve the efficiency and performance of detection, a whale optimization sparse autoencoder model (WO-SAE) is proposed. First, a sparse autoencoder performs unsupervised training on high-dimensional raw data and extracts low-dimensional features of network traffic. Second, the key parameters of the sparse autoencoder are optimized automatically by the whale optimization algorithm to achieve better feature extraction ability. Finally, a gated recurrent unit is used to classify the time series data. The experimental results show that the proposed model is superior to existing detection algorithms in accuracy, precision, and recall, with an accuracy of 98.69%. The WO-SAE model is a novel approach that reduces the user’s reliance on deep learning expertise.
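The abstract does not say which sparsity term the autoencoder uses. As a minimal sketch, assuming the common KL-divergence penalty on mean hidden activations, the loss below shows the kind of "key parameters" an external optimizer could tune. The names `rho` (target sparsity) and `beta` (penalty weight) and both function names are hypothetical, not taken from the paper.

```python
import math


def kl_sparsity_penalty(mean_activations, rho=0.05):
    """Sum of KL(rho || rho_hat) over hidden units.

    Each entry of mean_activations is a unit's average activation over a
    batch and must lie strictly between 0 and 1 (e.g. sigmoid outputs).
    """
    total = 0.0
    for rho_hat in mean_activations:
        total += rho * math.log(rho / rho_hat)
        total += (1 - rho) * math.log((1 - rho) / (1 - rho_hat))
    return total


def sparse_autoencoder_loss(reconstruction_error, mean_activations,
                            rho=0.05, beta=3.0):
    """Reconstruction error plus the weighted sparsity penalty (assumed form)."""
    return reconstruction_error + beta * kl_sparsity_penalty(mean_activations, rho)
```

The penalty is zero when every hidden unit's mean activation equals `rho` and grows as units become more active, so a search over `rho` and `beta` trades reconstruction quality against sparsity.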

Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4112 ◽  
Author(s):  
Se-Min Lim ◽  
Hyeong-Cheol Oh ◽  
Jaein Kim ◽  
Juwon Lee ◽  
Jooyoung Park

Recently, wearable devices have become a prominent health care application domain by incorporating a growing number of sensors and adopting smart machine learning technologies. One closely related topic is the strategy of combining the wearable device technology with skill assessment, which can be used in wearable device apps for coaching and/or personal training. Particularly pertinent to skill assessment based on high-dimensional time series data from wearable sensors is classifying whether a player is an expert or a beginner, which skills the player is exercising, and extracting some low-dimensional representations useful for coaching. In this paper, we present a deep learning-based coaching assistant method, which can provide useful information in supporting table tennis practice. Our method uses a combination of LSTM (Long short-term memory) with a deep state space model and probabilistic inference. More precisely, we use the expressive power of LSTM when handling high-dimensional time series data, and state space model and probabilistic inference to extract low-dimensional latent representations useful for coaching. Experimental results show that our method can yield promising results for characterizing high-dimensional time series patterns and for providing useful information when working with wearable IMU (Inertial measurement unit) sensors for table tennis coaching.


Author(s):  
Navendu S. Patil ◽  
Joseph P. Cusumano

Detecting bifurcations in noisy and/or high-dimensional physical systems is an important problem in nonlinear dynamics. Near bifurcations, the dynamics of even a high-dimensional system is typically dominated by its behavior on a low-dimensional manifold. Because the system is sensitive to perturbations near bifurcations, the bifurcations can be detected by looking at the apparent deterministic structure generated by the interaction between the noise and the low-dimensional dynamics. We use minimal hidden Markov models built from the noisy time series to quantify this deterministic structure at the period-doubling bifurcations in the two-well forced Duffing oscillator perturbed by noise. The apparent randomness in the system is characterized using the entropy rate of the discrete stochastic process generated by partitioning time series data. We show that as the bifurcation parameter is varied, sharp changes in the statistical complexity and the entropy rate can be used to locate incipient bifurcations.
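To make "the entropy rate of the discrete stochastic process generated by partitioning time series data" concrete, here is a minimal sketch. It does not use the hidden Markov models described above. Instead it assumes a two-cell partition at a fixed threshold and a plain block-entropy difference estimator. The function names, the threshold, and the block length are all hypothetical.

```python
import math
from collections import Counter


def symbolize(series, threshold=0.0):
    """Partition real values into a binary symbol sequence (assumed partition)."""
    return [1 if x > threshold else 0 for x in series]


def block_entropy(symbols, length):
    """Shannon entropy in bits of the empirical distribution of length-L blocks."""
    if length == 0:
        return 0.0
    blocks = [tuple(symbols[i:i + length])
              for i in range(len(symbols) - length + 1)]
    counts = Counter(blocks)
    n = len(blocks)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_rate(symbols, length=3):
    """Estimate the entropy rate as H(L) - H(L - 1), in bits per symbol."""
    return block_entropy(symbols, length) - block_entropy(symbols, length - 1)
```

A strictly periodic symbol sequence gives an estimate near zero, while an unpredictable one gives an estimate near one bit per symbol. Tracking this value as a control parameter changes is one simple way to look for the sharp changes the abstract mentions.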


2014 ◽  
Vol 24 (12) ◽  
pp. 1430033 ◽  
Author(s):  
Huanfei Ma ◽  
Tianshou Zhou ◽  
Kazuyuki Aihara ◽  
Luonan Chen

The prediction of future values of time series is a challenging task in many fields. In particular, making predictions based on short-term data is believed to be difficult. Here, we propose a method to predict a system's low-dimensional dynamics from high-dimensional but short-term data. Intuitively, it can be considered a transformation from the inter-variable information of the observed high-dimensional data into the corresponding low-dimensional but long-term data, and is thereby equivalent to predicting the time series. Technically, this method can be viewed as an inverse implementation of delayed embedding reconstruction. Both methods and algorithms are developed. To demonstrate the effectiveness of the theoretical result, benchmark examples and real-world problems from various fields are studied.
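For readers unfamiliar with the forward operation that the method is said to invert, the following is a minimal sketch of standard delay embedding, which turns one long scalar series into a set of higher-dimensional vectors. It is not the authors' inverse algorithm. The function name `delay_embed` and the parameters `dim` and `lag` are hypothetical.

```python
def delay_embed(series, dim, lag=1):
    """Build delay vectors [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)]."""
    n_vectors = len(series) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and lag")
    return [[series[t + k * lag] for k in range(dim)]
            for t in range(n_vectors)]


# Usage: a 5-point series embedded in 3 dimensions yields 3 delay vectors.
vectors = delay_embed([1, 2, 3, 4, 5], dim=3)
```

Each delay vector contains values from later times, which is why a mapping in the opposite direction, from many simultaneously observed variables to the delay coordinates of one target variable, amounts to a forecast.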


2018 ◽  
Vol 15 (147) ◽  
pp. 20180695 ◽  
Author(s):  
Simone Cenci ◽  
Serguei Saavedra

Biotic interactions are expected to play a major role in shaping the dynamics of ecological systems. Yet, quantifying the effects of biotic interactions has been challenging due to a lack of appropriate methods to extract accurate measurements of interaction parameters from experimental data. One of the main limitations of existing methods is that the parameters inferred from noisy, sparsely sampled, nonlinear data are seldom uniquely identifiable. That is, many different parameters can be compatible with the same dataset and can generalize to independent data equally well. Hence, it is difficult to justify conclusive assertions about the effect of biotic interactions without information about their associated uncertainty. Here, we develop an ensemble method based on model averaging to quantify the uncertainty associated with the effect of biotic interactions on community dynamics from non-equilibrium ecological time-series data. Our method is able to detect the most informative time intervals for each biotic interaction within a multivariate time series and can be easily adapted to different regression schemes. Overall, this novel approach can be used to associate a time-dependent uncertainty with the effect of biotic interactions. Moreover, because we quantify uncertainty with minimal assumptions about the data-generating process, our approach can be applied to any data for which interactions among variables strongly affect the overall dynamics of the system.


Author(s):  
Kamil Faber ◽  
Roberto Corizzo ◽  
Bartlomiej Sniezynski ◽  
Michael Baron ◽  
Nathalie Japkowicz

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Wenmin Li ◽  
Sanqi Sun ◽  
Shuo Zhang ◽  
Hua Zhang ◽  
Yijie Shi

Aim. The purpose of this study is to better detect attack traffic in imbalanced datasets. Deep learning has played an important role in detecting malicious network traffic in recent years. However, it suffers from a seriously imbalanced data distribution: because only a small portion of traffic is malicious while most network traffic is benign, the traffic model skews towards the benign class. This problem motivates the present work. Methods. We propose a cost-sensitive approach to improve HTTP traffic detection performance with imbalanced data, and we also present a character-level abstract feature extraction approach that can provide features with clear decision boundaries. Finally, we design a Spark-based HTTP traffic detection system based on these two approaches. Results. The methods proposed in this paper work well on imbalanced datasets. Compared to other methods, the experimental results indicate that our system achieves a high F1-score. Conclusion. For imbalanced HTTP traffic detection, we confirmed that the feature extraction method and the cost function are very effective. In the future, we may focus on how to use the cost function to further improve detection performance.
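The abstract does not give the cost function itself. One common cost-sensitive choice, shown here purely as an assumed illustration, is a class-weighted binary cross-entropy in which errors on the rare malicious class cost more than errors on the benign class. The function name and the weights `malicious_weight` and `benign_weight` are hypothetical.

```python
import math


def weighted_cross_entropy(labels, probabilities,
                           malicious_weight=10.0, benign_weight=1.0):
    """Mean class-weighted binary cross-entropy.

    labels: 1 for malicious, 0 for benign.
    probabilities: predicted probability that each sample is malicious.
    """
    eps = 1e-12  # clip probabilities to keep log() finite
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1 - eps)
        if y == 1:
            total += -malicious_weight * math.log(p)
        else:
            total += -benign_weight * math.log(1 - p)
    return total / len(labels)
```

With these weights, a malicious request scored at 0.1 is penalized about ten times as heavily as a benign request scored at 0.9, which pushes a model away from simply predicting the majority class.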


2019 ◽  
Vol 2019 ◽  
pp. 1-19
Author(s):  
Mingai Li ◽  
Hongwei Xi ◽  
Xiaoqing Zhu

Due to the nonlinear and high-dimensional characteristics of motor imagery electroencephalography (MI-EEG), it can be challenging to achieve high online accuracy. As a nonlinear dimension reduction method, landmark maximum variance unfolding (L-MVU) can completely retain the nonlinear features of MI-EEG. However, L-MVU still incurs considerable computation costs for out-of-sample data. An incremental version of L-MVU (denoted IL-MVU) is proposed in this paper. The low-dimensional representation of the training data is generated by L-MVU. For each out-of-sample data point, its nearest neighbors are found among the high-dimensional training samples, and the corresponding reconstruction weight matrix is calculated to generate its low-dimensional representation as well. IL-MVU is further combined with the dual-tree complex wavelet transform (DTCWT) to form a hybrid feature extraction method (named IL-MD). IL-MVU is applied to extract the nonlinear features of the specific subband signals, which are reconstructed by DTCWT and show a clear event-related synchronization/event-related desynchronization phenomenon. The average energy features of α and β waves are calculated simultaneously. The two types of features are fused and evaluated by a linear discriminant analysis classifier. Extensive experiments were conducted on two public datasets with 12 subjects. The average recognition accuracies of 10-fold cross-validation are 92.50% on Dataset 3b and 88.13% on Dataset 2b, improvements of at least 1.43% and 3.45%, respectively, over existing methods. The experimental results show that IL-MD can extract more accurate features at relatively lower computational cost, and it also has better feature visualization and self-adaptive characteristics to subjects. The t-test results and Kappa values suggest that the proposed feature extraction method reaches statistical significance and has high consistency in classification.

