A Joint Approach to Detect Malicious URL Based on Attention Mechanism

Author(s):  
Yongfang Peng ◽  
Shengwei Tian ◽  
Long Yu ◽  
Yalong Lv ◽  
Ruijin Wang

To improve the accuracy and automation of malicious Uniform Resource Locator (URL) recognition, a joint approach of convolutional neural network (CNN) and long short-term memory (LSTM) based on the attention mechanism (JCLA) is proposed to identify and detect malicious URLs. First, URL features including texture information, lexical information and host information are extracted, filtered, and pre-processed by encoding. Then, the feature matrices more relevant to the output are chosen according to the attention weights and fed into a parallel processing model called CNN_LSTM, which combines a CNN and an LSTM to extract local features. Next, the extracted local features are merged to compute the global features of the URLs to be detected. Finally, the URLs are classified by a SoftMax classifier using the global features; the model reaches an accuracy of 98.26% in malicious URL recognition. The experimental results show that the proposed JCLA model outperforms both traditional deep learning models and the combined CNN_LSTM model at detecting malicious URLs.
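The lexical and host feature extraction step described above might look roughly like the following sketch. The specific features and the function name `lexical_features` are illustrative assumptions; the abstract does not list the paper's actual feature set, and texture features are not covered here.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few hypothetical lexical/host features for one URL.

    The feature choice is an assumption for illustration only.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        # Count of characters often inspected in URL heuristics.
        "num_special": sum(url.count(ch) for ch in "@-_%=?&"),
        # Number of non-empty path segments.
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        # Crude dotted-quad check: four all-digit groups.
        "has_ip_host": host.count(".") == 3 and host.replace(".", "").isdigit(),
    }


if __name__ == "__main__":
    print(lexical_features("http://192.168.0.1/a/b?id=7"))
```

A numeric feature vector of this kind would still need to be encoded and scaled before being passed to a neural model.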

2020 ◽  
Vol 34 (4) ◽  
pp. 515-520
Author(s):  
Chen Zhang ◽  
Qingxu Li ◽  
Xue Cheng

The convolutional neural network (CNN) and long short-term memory (LSTM) network are adept at extracting local and global features, respectively, and both can achieve excellent classification results. However, the CNN performs poorly in extracting the global contextual information of a text, while the LSTM often overlooks the features hidden between words. For text sentiment classification, this paper combines the CNN with a bidirectional LSTM (BiLSTM) into a parallel hybrid model called CNN_BiLSTM. First, the CNN is adopted to extract the local features of the text quickly. Next, the BiLSTM is employed to obtain the global text features containing contextual semantics. After that, the features extracted by the two neural networks (NNs) are fused and processed by a Softmax classifier for text sentiment classification. To verify its performance, the CNN_BiLSTM was compared experimentally with single NNs such as CNN and LSTM, as well as with other deep learning (DL) NNs. The experimental results show that the proposed parallel hybrid model outperformed the compared methods in F1-score and accuracy. Therefore, our model can solve text sentiment classification tasks effectively, and boasts better practical value than other NNs.
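The fusion and Softmax step of such a parallel model can be sketched as below. This assumes fusion by simple concatenation followed by one linear layer; the abstract does not say how the features are fused, and `fuse_and_classify`, `weights` and `bias` are hypothetical names with made-up values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(cnn_features, bilstm_features, weights, bias):
    """Concatenate two feature vectors, apply a linear layer, then softmax.

    weights holds one row per class, each as long as the fused vector.
    """
    fused = list(cnn_features) + list(bilstm_features)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


if __name__ == "__main__":
    probs = fuse_and_classify(
        cnn_features=[1.0, 0.0],
        bilstm_features=[0.5],
        weights=[[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
        bias=[0.0, 0.0],
    )
    print(probs)
```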


Information ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 243 ◽  
Author(s):  
Zhi-Yuan Zeng ◽  
Jyun-Jie Lin ◽  
Mu-Sheng Chen ◽  
Meng-Hui Chen ◽  
Yan-Qi Lan ◽  
...  

Consumers’ purchase behavior increasingly relies on online reviews. Accordingly, there are more and more deceptive reviews which are harmful to customers. Existing methods to detect spam reviews mainly take the problem as a general text classification task, but they ignore the important features of spam reviews. In this paper, we propose a novel model, which splits a review into three parts: first sentence, middle context, and last sentence, based on the discovery that the first and last sentence express stronger emotion than the middle context. Then, the model uses four independent bidirectional long-short term memory (LSTM) models to encode the beginning, middle, end of a review and the whole review into four document representations. After that, the four representations are integrated into one document representation by a self-attention mechanism layer and an attention mechanism layer. Based on three domain datasets, the results of in-domain and mix-domain experiments show that our proposed method performs better than the compared methods.
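The three-way split of a review into first sentence, middle context and last sentence could be sketched as follows. The regular-expression sentence splitter is a naive assumption for illustration; the abstract does not say how sentences are segmented.

```python
import re


def split_review(text):
    """Return (first_sentence, middle_context, last_sentence).

    Sentences are split naively after '.', '!' or '?' followed by
    whitespace, which is an assumption made for this sketch.
    """
    pieces = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in pieces if s.strip()]
    if not sentences:
        return "", "", ""
    first = sentences[0]
    # With a single sentence there is no separate last sentence.
    last = sentences[-1] if len(sentences) > 1 else ""
    middle = " ".join(sentences[1:-1])
    return first, middle, last


if __name__ == "__main__":
    review = "Great hotel! The room was clean. Breakfast was fine. Would stay again."
    print(split_review(review))
```

Each of the three parts, plus the whole review, would then be encoded by its own model.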


2020 ◽  
Vol 91 (6) ◽  
pp. 3433-3443
Author(s):  
Ryota Otake ◽  
Jun Kurima ◽  
Hiroyuki Goto ◽  
Sumio Sawada

Spatial distribution of seismic intensity plays an important role in emergency response during and immediately after an earthquake. In this study, we propose a deep learning model to predict the seismic intensity based only on the observation records at the seismic stations in a surrounding area. The deep learning model is trained using the observation records at both the input and target stations, and no geological information is used. Once the model is developed, for example, using the data from a temporal seismic array, the model can spatially interpolate the seismic intensity from the sparse layout of the seismic stations. The model consists of long short-term memory cells, which are well-established neural network components for time series analysis. We used seismograms observed from 1996 through 2019 at the Kyoshin Network (K-NET) and Kiban–Kyoshin Network (KiK-net) stations located in the northeastern part of Japan. In our deep learning model, approximately 85% of the validation data is successfully classified into seismic intensity scales, which is better than adopting either the maximum or the weighted average of the input data. We also apply the deep learning model to earthquake early warning (EEW). The model can predict the seismic intensity accurately and provides a long warning time. We conclude that our approach is a possible future solution for increasing the accuracy of EEW.
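The two baselines mentioned above, the maximum and the weighted average of the input-station values, might be sketched like this. Inverse-distance weighting is an assumption; the abstract does not state which weights are used, and the function names are hypothetical.

```python
def max_baseline(intensities):
    """Predict the target intensity as the largest input-station value."""
    return max(intensities)


def weighted_average_baseline(intensities, distances_km, eps=1e-6):
    """Predict the target intensity as an inverse-distance weighted mean.

    Inverse-distance weights are an assumption made for this sketch.
    eps avoids division by zero for a station at distance 0.
    """
    weights = [1.0 / (d + eps) for d in distances_km]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, intensities)) / total


if __name__ == "__main__":
    print(max_baseline([3.0, 4.5, 2.0]))
    print(weighted_average_baseline([2.0, 4.0], [1.0, 100.0]))
```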


2021 ◽  
Vol 9 (4) ◽  
pp. 387
Author(s):  
Yuchao Wang ◽  
Hui Wang ◽  
Dexin Zou ◽  
Huixuan Fu

When ships sail at sea, changes in ship motion attitude are nonlinear and highly random. To address the low accuracy of ship roll angle prediction by traditional prediction algorithms and single neural network models, a ship roll angle prediction method based on a deep learning model combining a bidirectional long short-term memory network (Bi-LSTM) and a temporal pattern attention mechanism (TPA) is proposed. The bidirectional long short-term memory network extracts time features from the forward and reverse directions of the ship roll angle time series, and the temporal pattern attention mechanism extracts, from the deep features of the network's output state, the time patterns that are beneficial to ship roll angle prediction, while ignoring other features that contribute less to the prediction. Experimental results on real ship data show that the proposed Bi-LSTM-TPA combined model achieves a significant reduction in MAPE, MAE, and MSE compared with the LSTM model and the SVM model, which verifies the effectiveness of the proposed algorithm.
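For reference, the three error metrics named above follow their standard definitions and can be computed as in this sketch. The function names and sample values are illustrative only and are not taken from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Undefined when any true value is zero, so such samples must be
    handled separately.
    """
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


if __name__ == "__main__":
    truth = [2.0, 4.0]
    predicted = [1.0, 5.0]
    print(mae(truth, predicted), mse(truth, predicted), mape(truth, predicted))
```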


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7501
Author(s):  
Cunli Mao ◽  
Haoyuan Liang ◽  
Zhengtao Yu ◽  
Yuxin Huang ◽  
Junjun Guo

Finding news about the same case among large volumes of case-involved news is an important basis for public opinion analysis. Existing text clustering methods are usually based on topic models, which use only topic and case information as the global features of documents, so distinguishing between different cases of similar types remains a challenge. The contents of documents contain rich local features. Taking into account the internal features of news, the information of cases and the contributions provided by different topics, we propose a clustering method for case-involved news that combines a topic network with a multi-head attention mechanism. We use case information and topic information to construct a topic network and then extract the global features with a graph convolutional network, thus combining case information and topic information. At the same time, the local features are extracted by the multi-head attention mechanism. Finally, the global and local features are fused by a variational auto-encoder, and the learned latent representations are used for clustering. The experiments show that the proposed method significantly outperforms the state-of-the-art unsupervised clustering methods.
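Multi-head attention runs several scaled dot-product attention heads in parallel. A single head, without the learned projection matrices a full implementation would have, can be sketched in plain Python as below; the inputs are made-up toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """One attention head: softmax(Q K^T / sqrt(d_k)) V on nested lists."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


if __name__ == "__main__":
    out = scaled_dot_product_attention(
        queries=[[1.0, 0.0]],
        keys=[[1.0, 1.0], [1.0, 1.0]],
        values=[[0.0, 2.0], [4.0, 6.0]],
    )
    print(out)
```

With identical keys the weights are uniform, so the output is the mean of the value vectors.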


2021 ◽  
Vol 11 (6) ◽  
pp. 2848
Author(s):  
Pengfei Zhang ◽  
Fenghua Li ◽  
Lidong Du ◽  
Rongjian Zhao ◽  
Xianxiang Chen ◽  
...  

To satisfy the need to accurately monitor emotional stress, this paper explores the effectiveness of the attention mechanism based on the deep learning model CNN (Convolutional Neural Networks)-BiLSTM (Bi-directional Long Short-Term Memory). As different attention mechanisms can cause the framework to focus on different positions of the feature map, this discussion adds attention mechanisms to the CNN layer and the BiLSTM layer separately, and to both the CNN layer and the BiLSTM layer simultaneously, to generate different CNN–BiLSTM networks with attention mechanisms. ECG (electrocardiogram) data from 34 subjects were collected on the server platform created by the Institute of Psychology of the Chinese Academy of Sciences and the researchers. The results show that the average accuracy of CNN–BiLSTM is up to 0.865 without any attention mechanism, while the highest average accuracy of 0.868 is achieved using the CNN–attention–based BiLSTM.


2022 ◽  
Vol 355 ◽  
pp. 02022
Author(s):  
Chenglong Zhang ◽  
Li Yao ◽  
Jinjin Zhang ◽  
Junyong Wu ◽  
Baoguo Shan ◽  
...  

In practice, power demand forecasting is affected by various uncertain factors such as meteorological factors, economic factors, and the diversity of forecasting models, which increase the complexity of forecasting. In response to this problem, and taking into account that the states at different time steps have different effects on the output, the attention mechanism is introduced into the method proposed in this paper to improve the deep learning model. Improved convolutional neural network (CNN) and long short-term memory (LSTM) models that incorporate the attention mechanism are each proposed. Finally, verification on actual examples shows that the proposed method obtains a smaller error and better prediction performance than other models.
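The idea that different time steps contribute differently to the output can be illustrated with attention pooling over a sequence of hidden states. This is a generic sketch, not the paper's model: the dot-product scoring and the `query` vector (which would be learned in practice) are assumptions.

```python
import math


def attention_pool(hidden_states, query):
    """Weight each time step's hidden state by softmax(h_t . query).

    Returns (context_vector, weights). The query vector stands in for a
    learned parameter and is an assumption of this sketch.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return context, weights


if __name__ == "__main__":
    states = [[1.0, 0.0], [0.0, 1.0]]
    print(attention_pool(states, query=[0.0, 0.0]))   # uniform weights
    print(attention_pool(states, query=[10.0, 0.0]))  # first step dominates
```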


2021 ◽  
Vol 12 ◽  
Author(s):  
Mingfeng Jiang ◽  
Jiayan Gu ◽  
Yang Li ◽  
Bo Wei ◽  
Jucheng Zhang ◽  
...  

In recent years, with the development of artificial intelligence, deep learning models have achieved initial success in ECG data analysis, especially the detection of atrial fibrillation. To address the neglect of correlations between contexts and the gradient dispersion found in traditional deep convolutional neural network models, the hybrid attention-based deep learning network (HADLN) method is proposed to implement arrhythmia classification. The HADLN makes full use of the advantages of the residual network (ResNet) and bidirectional long short-term memory (Bi-LSTM) architectures to obtain fusion features containing local and global information, and improves the interpretability of the model through the attention mechanism. The method is trained and verified using the PhysioNet 2017 challenge dataset. Without loss of generality, the ECG signal is classified into four categories: atrial fibrillation, noise, other, and normal signals. By combining the fusion features and the attention mechanism, the learned model shows a great improvement in classification performance and a certain degree of interpretability. The experimental results show that the proposed HADLN method can achieve precision of 0.866, recall of 0.859, accuracy of 0.867, and F1-score of 0.880 on 10-fold cross-validation.
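The reported classification metrics follow standard definitions, which can be computed from a confusion matrix as in the sketch below. The matrix values are made up, and the convention of rows as true classes and columns as predicted classes is an assumption of this example; the abstract does not say how the per-class scores were averaged.

```python
def per_class_metrics(confusion):
    """Return a list of (precision, recall, f1) tuples, one per class.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    metrics = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp
        fn = sum(confusion[c]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


def accuracy(confusion):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


if __name__ == "__main__":
    toy = [[8, 2], [1, 9]]
    print(per_class_metrics(toy))
    print(accuracy(toy))
```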


2019 ◽  
Vol 10 (1) ◽  
pp. 205 ◽  
Author(s):  
Chunjun Zheng ◽  
Chunli Wang ◽  
Ning Jia

Speech emotion recognition is a challenging and widely examined research topic in the field of speech processing. The accuracy of existing models in speech emotion recognition tasks is not high, and the generalization ability is not strong. Since the feature set and model design of effective speech directly affect the accuracy of speech emotion recognition, research on features and models is important. Because emotional expression is often correlated with the global features, local features, and model design of speech, it is often difficult to find a universal solution for effective speech emotion recognition. Based on this, the main research purpose of this paper is to generate general emotion features in speech signals from different angles, and use the ensemble learning model to perform emotion recognition tasks. It is divided into the following aspects: (1) Three expert roles of speech emotion recognition are designed. Expert 1 focuses on three-dimensional feature extraction of local signals; expert 2 focuses on extraction of comprehensive information in local data; and expert 3 emphasizes global features: acoustic feature descriptors (low-level descriptors (LLDs)), high-level statistics functionals (HSFs), and local features and their timing relationships. A single-/multiple-level deep learning model that meets expert characteristics is designed for each expert, including convolutional neural network (CNN), bi-directional long short-term memory (BLSTM), and gated recurrent unit (GRU). Convolutional recurrent neural network (CRNN), based on a combination of an attention mechanism, is used for internal training of experts. (2) By designing an ensemble learning model, each expert can play to its own advantages and evaluate speech emotions from different focuses. (3) Through experiments, the performance of various experts and ensemble learning models in emotion recognition is compared in the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpus and the validity of the proposed model is verified.
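One common way to combine several experts is weighted soft voting over their class probabilities. The sketch below shows that generic technique only; the abstract does not specify which ensemble scheme the paper uses, and `soft_vote` and the probabilities are hypothetical.

```python
def soft_vote(expert_probs, weights=None):
    """Average per-class probabilities across experts and pick the argmax.

    expert_probs: one probability vector per expert.
    weights: optional per-expert weights (equal weights by default).
    Returns (predicted_class_index, combined_probabilities).
    """
    if weights is None:
        weights = [1.0] * len(expert_probs)
    total = sum(weights)
    n_classes = len(expert_probs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


if __name__ == "__main__":
    experts = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
    print(soft_vote(experts))
```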


Author(s):  
Zengyan Hong ◽  
Xiangxiang Zeng ◽  
Leyi Wei ◽  
Xiangrong Liu

Motivation: Identification of enhancer–promoter interactions (EPIs) is of great significance to human development. However, experimental methods to identify EPIs cost too much in terms of time, manpower and money. Therefore, more and more research efforts are focused on developing computational methods to solve this problem. Unfortunately, most existing computational methods require a variety of genomic data, which are not always available, especially for a new cell line. This limits the large-scale practical application of such methods. As an alternative, computational methods using sequences only have great genome-scale application prospects. Results: In this article, we propose a new deep learning method, namely EPIVAN, that enables predicting long-range EPIs using only genomic sequences. To explore the key sequential characteristics, we first use pre-trained DNA vectors to encode enhancers and promoters; afterwards, we use one-dimensional convolution and a gated recurrent unit to extract local and global features; lastly, an attention mechanism is used to boost the contribution of key features, further improving the performance of EPIVAN. Benchmarking comparisons on six cell lines show that EPIVAN performs better than state-of-the-art predictors. Moreover, we build a general model, which has transfer ability and can be used to predict EPIs in various cell lines. Availability and implementation: The source code and data are available at: https://github.com/hzy95/EPIVAN.
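Pre-trained DNA embeddings are typically looked up per k-mer, so a sequence is first cut into overlapping k-mers. The sketch below shows only that tokenization step; the values of `k` and `stride` and the function name are assumptions and are not taken from the EPIVAN source code.

```python
def kmer_tokens(sequence, k=6, stride=1):
    """Split a DNA sequence into k-mers taken every `stride` bases.

    Returns an empty list when the sequence is shorter than k.
    """
    sequence = sequence.upper()
    return [
        sequence[i:i + k]
        for i in range(0, len(sequence) - k + 1, stride)
    ]


if __name__ == "__main__":
    print(kmer_tokens("acgtac", k=3))
    print(kmer_tokens("ACGTAC", k=3, stride=3))
```

Each token would then be mapped to its embedding vector before the convolutional and recurrent layers.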

