Where to Prune: Using LSTM to Guide End-to-end Pruning

Recent years have witnessed the great success of convolutional neural networks (CNNs) in many fields. However, their large model size and computational complexity make CNNs difficult to deploy in some scenarios, such as embedded systems with low computation power. To address this issue, many works prune filters in CNNs to reduce computation. However, they mainly focus on identifying which filters in a layer are unimportant and then prune filters layer by layer or globally. In this paper, we argue that the pruning order is also significant for model pruning. We propose a novel approach to determine which layers should be pruned at each step. First, we use a long short-term memory (LSTM) network to learn the hierarchical characteristics of a network and generate a pruning decision for each layer, which is the main difference from previous works. Next, a channel-based method is adopted to evaluate the importance of filters in a to-be-pruned layer, followed by an accelerated recovery step. Experimental results demonstrate that our approach can reduce FLOPs by 70.1% for VGG and by 47.5% for ResNet-56 with comparable accuracy. The learning results also seem to reveal the sensitivity of each network layer.
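To make the FLOPs figures concrete, the following is a minimal sketch (not the authors' code) of how removing filters reduces the multiply-accumulate count of a stack of convolutions. The function names, layer shapes, and keep ratios are hypothetical and chosen only for illustration; the sketch assumes plain k x k convolutions and ignores biases, pooling, and fully connected layers.

```python
def conv_flops(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate count of one k x k convolution (bias ignored)."""
    return h_out * w_out * c_in * c_out * k * k


def pruned_flops_ratio(layers, keep):
    """Fraction of convolution FLOPs remaining after filter pruning.

    layers: list of (h_out, w_out, c_in, c_out, k) tuples in network order.
    keep:   fraction of filters kept in each layer (1.0 means unpruned).
    Removing filters from layer i also removes the matching input
    channels of layer i + 1, so both layers get cheaper.
    """
    before = 0
    after = 0
    prev_keep = 1.0  # the network input (e.g. RGB channels) is never pruned
    for (h, w, c_in, c_out, k), frac in zip(layers, keep):
        before += conv_flops(h, w, c_in, c_out, k)
        kept_in = round(c_in * prev_keep)
        kept_out = round(c_out * frac)
        after += conv_flops(h, w, kept_in, kept_out, k)
        prev_keep = frac
    return after / before


# Hypothetical two-layer example: prune half of the first layer's filters.
layers = [(32, 32, 3, 64, 3), (32, 32, 64, 64, 3)]
ratio = pruned_flops_ratio(layers, keep=[0.5, 1.0])
print(f"remaining FLOPs: {ratio:.1%}")
```

Because the saving propagates to the next layer's input channels, which layers are pruned, and in what order, changes the total reduction; this is the decision the abstract assigns to the LSTM.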


Robust-LSTM: A Novel Approach to Short-Traffic Flow Prediction Based on Signal Decomposition

10.21203/rs.3.rs-658657/v1 ◽

2021 ◽

Author(s):

Erdem Doğan

Keyword(s):

Traffic Flow ◽

Short Term Memory ◽

Performance Measurement System ◽

Flow Data ◽

Short Term ◽

Term Memory ◽

Traffic Flow Prediction ◽

Flow Prediction ◽

Novel Approach ◽

Long Short Term Memory

Intelligent transport systems need accurate short-term traffic flow forecasts. However, developing a robust short-term traffic flow forecasting approach is a challenge due to the stochastic character of traffic flow. This study proposes a novel approach to the short-term traffic flow prediction task, namely Robust Long Short Term Memory (R-LSTM), based on the Robust Empirical Mode Decomposing (REDM) algorithm and Long Short Term Memory (LSTM). Short-term traffic flow data from the Caltrans Performance Measurement System (PeMS) database were used to train and test the model. The dataset was composed of traffic data collected by 25 traffic detectors on the main lanes of different freeways. The time resolution of the dataset was set to 15 minutes, and the Hampel preprocessing algorithm was applied for outlier elimination. The R-LSTM predictions were compared with state-of-the-art models, using RMSE, MSE, and MAPE as performance criteria. Performance analyses for various periods show that R-LSTM is remarkably successful in all time periods. Moreover, the performance of the developed model is significantly higher, especially during mid-day periods when traffic flow fluctuations are high. These results show that R-LSTM is a strong candidate for short-term traffic flow prediction and can easily adapt to fluctuations in traffic flow. In addition, robust models for short-term predictions can be developed by applying the signal separation method to traffic flow data.
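As an illustration of the preprocessing and evaluation steps named in the abstract, here is a minimal sketch of a Hampel outlier filter and of the RMSE, MSE, and MAPE criteria. It is not the study's implementation: the window size, the 3-sigma threshold, and the sample values are assumptions, and the decomposition and LSTM stages are not shown.

```python
import math
from statistics import median


def hampel(series, half_window=3, n_sigmas=3.0):
    """Replace outliers with the local median (Hampel identifier).

    A point is an outlier when it deviates from the window median by more
    than n_sigmas * 1.4826 * MAD, where MAD is the median absolute
    deviation and 1.4826 scales MAD to a standard deviation for
    normally distributed data.
    """
    cleaned = list(series)
    for i in range(len(series)):
        lo = max(0, i - half_window)
        hi = min(len(series), i + half_window + 1)
        window = series[lo:hi]
        med = median(window)
        mad = median([abs(x - med) for x in window])
        if abs(series[i] - med) > n_sigmas * 1.4826 * mad:
            cleaned[i] = med
    return cleaned


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical 15-minute flow counts with one spurious spike.
flow = [10, 11, 10, 12, 100, 11, 10, 12, 11]
print(hampel(flow))
```

MAPE is undefined when an observed value is zero, which can occur in low-traffic intervals, so such intervals need separate handling.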


Cascading 1D-Convnet Bidirectional Long Short Term Memory Network with Modified COCOB Optimizer: A Novel Approach for Protein Secondary Structure Prediction

Chaos Solitons & Fractals ◽

10.1016/j.chaos.2021.111446 ◽

2021 ◽

Vol 153 ◽

pp. 111446

Author(s):

Pravinkumar M. Sonsare ◽

Gunavathi C

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Short Term Memory ◽

Secondary Structure Prediction ◽

Protein Secondary Structure ◽

Short Term ◽

Term Memory ◽

Novel Approach ◽

Memory Network ◽

Long Short Term Memory


Convolutional Neural Networks with LSTM for Intrusion Detection

10.29007/j35r ◽

2020 ◽

Author(s):

Mostofa Ahsan ◽

Kendall Nygard

Keyword(s):

Intrusion Detection ◽

Hybrid Algorithm ◽

Short Term Memory ◽

High Accuracy ◽

Short Term ◽

Network Infrastructure ◽

Network Intrusion ◽

Novel Approach ◽

Network Intrusions ◽

Long Short Term Memory

A variety of attacks are regularly attempted against network infrastructure. Artificial intelligence algorithms have been applied to network intrusion prevention for more than two decades, and their continued development has made them increasingly effective. Deep learning methods can detect network intrusions with high accuracy and a low false alarm rate. This paper introduces a novel approach that uses a hybrid algorithm of Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM) to provide improved intrusion detection. This bidirectional algorithm showed the highest known accuracy of 99.70% on a standard dataset known as NSL-KDD. The performance of the algorithm is measured using precision, false positive rate, F1 score, and recall, and it was found promising for deployment on live network infrastructure.
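For reference, the evaluation measures listed in the abstract can be computed from a binary confusion matrix as in the sketch below. The function name and the counts are hypothetical and are not results from the paper.

```python
def detection_metrics(tp, fp, fn, tn):
    """Return precision, recall, F1 score, and false positive rate.

    tp: attacks correctly flagged      fp: normal traffic flagged as attack
    fn: attacks missed                 tn: normal traffic correctly passed
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    false_positive_rate = fp / (fp + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_positive_rate": false_positive_rate,
    }


# Hypothetical counts for illustration only.
metrics = detection_metrics(tp=90, fp=10, fn=10, tn=890)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can be misleading when attack and normal traffic are imbalanced, which is why the false positive rate is reported alongside it.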


Conversation Modeling on Reddit Using a Graph-Structured LSTM

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00009 ◽

2018 ◽

Vol 6 ◽

pp. 121-132 ◽

Cited By ~ 18

Author(s):

Victoria Zayats ◽

Mari Ostendorf

Keyword(s):

Social Media ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Threaded Discussions ◽

Novel Approach ◽

Proposed Model ◽

Long Short Term Memory ◽

Bidirectional Lstm ◽

Late Stages

This paper presents a novel approach for modeling threaded discussions on social media using a graph-structured bidirectional LSTM (long short-term memory) that represents both hierarchical and temporal conversation structure. In experiments on the task of predicting the popularity of comments in Reddit discussions, the proposed model outperforms a node-independent architecture for different sets of input features. Analyses show a benefit to the model over the full course of the discussion, improving detection in both early and late stages. Further, the use of language cues with the bidirectional tree state updates helps with identifying controversial comments.


CNN-Based Chinese NER with Lexicon Rethinking

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/692 ◽

2019 ◽

Cited By ~ 5

Author(s):

Tao Gui ◽

Ruotian Ma ◽

Qi Zhang ◽

Lujun Zhao ◽

Yu-Gang Jiang ◽

...

Keyword(s):

Short Term Memory ◽

Named Entity Recognition ◽

Entity Recognition ◽

Great Success ◽

Short Term ◽

Named Entity ◽

Word Level ◽

Long Short Term Memory ◽

High Level ◽

Gpu Parallelism

Character-level Chinese named entity recognition (NER) that applies long short-term memory (LSTM) to incorporate lexicons has achieved great success. However, this method fails to fully exploit GPU parallelism and candidate lexicons can conflict. In this work, we propose a faster alternative to Chinese NER: a convolutional neural network (CNN)-based method that incorporates lexicons using a rethinking mechanism. The proposed method can model all the characters and potential words that match the sentence in parallel. In addition, the rethinking mechanism can address the word conflict by feeding back the high-level features to refine the networks. Experimental results on four datasets show that the proposed method can achieve better performance than both word-level and character-level baseline methods. In addition, the proposed method performs up to 3.21 times faster than state-of-the-art methods, while realizing better performance.


A Novel Approach to Protein Folding Prediction based on Long Short-Term Memory Networks: A Preliminary Investigation and Analysis

2018 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2018.8489514 ◽

2018 ◽

Cited By ~ 1

Author(s):

Leandro Takeshi Hattori ◽

Cesar Manuel Vargas Benitez ◽

Matheus Gutoski ◽

Nelson Marcelo Romero Aquino ◽

Heitor Silverio Lopes

Keyword(s):

Protein Folding ◽

Short Term Memory ◽

Preliminary Investigation ◽

Short Term ◽

Term Memory ◽

Novel Approach ◽

Long Short Term Memory


DLRCNeg: Deep Learning based Reading Comprehension by handling Negation

10.34048/adcom.2019.phdforumpaper.6 ◽

2019 ◽

Author(s):

Felicia Lilian J. ◽

Sundarakantham K ◽

Mercy Shalinie S.

Keyword(s):

Reading Comprehension ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Activation Function ◽

Attention Mechanism ◽

Question Type ◽

Short Term ◽

Novel Approach ◽

Long Short Term Memory

A Question Answering (QA) system for Reading Comprehension (RC) is a computerized approach to retrieving relevant responses to queries posted by users. The underlying concept in developing such a system is to build human-computer interaction. The interactions are in natural language, and we tend to use negation words as part of our expressions. During the pre-processing stage of a Natural Language Processing (NLP) task, these negation words are removed and hence the semantics change. This remains an unsolved problem in QA systems. To maintain the semantics, we propose a novel approach: a hybrid NLP-based Bi-directional Long Short Term Memory (Bi-LSTM) with an attention mechanism. It handles negation words and maintains the semantics of the sentence. We also focus on answering any factoid query (i.e., 'what', 'when', 'where', 'who') raised by the user. For this purpose, an attention mechanism with a softmax activation function obtains superior results, matching the question type and processing the context information effectively. Experiments are performed on the SQuAD dataset for reading comprehension, and the Stanford Negation dataset is used to handle negation in the RC sentences. The accuracy of the system is 93.9% on negation and 87% on the QA task.


A Novel Approach for Detection of Fake News using Long Short Term Memory (LSTM)

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2021/201052021 ◽

2021 ◽

Vol 10 (5) ◽

pp. 3062-3066

Keyword(s):

Neural Network ◽

Deep Learning ◽

Text Mining ◽

Short Term Memory ◽

Online Media ◽

Fake News ◽

Short Term ◽

Term Memory ◽

Novel Approach ◽

Long Short Term Memory

Online media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from online media. On the other hand, it enables the wide spread of "fake news", i.e., low-quality news with intentionally false information. The broad spread of fake news negatively affects individuals and society. Hence, fake news detection in social media has become an emerging research topic that is drawing attention from many researchers. In the past, many authors proposed using text mining techniques and machine learning methods to analyze textual data and help predict the credibility of news. With greater computational capacity and the ability to handle large datasets, deep learning models perform better than traditional text mining techniques and machine learning methods. Typically, a deep learning model such as an LSTM can identify complex patterns in the data. Long short-term memory is a type of recurrent neural network (RNN) used to analyze variable-length sequential data. In our proposed framework, we build a fake news detection model based on an LSTM neural network. Publicly available unstructured news datasets are used to evaluate the performance of the model. The results show the superiority and accuracy of the LSTM model over traditional techniques, specifically CNN, for fake news detection.


Densely connected layer to improve VGGnet-based CRNN for Arabic handwriting text line recognition

International Journal of Hybrid Intelligent Systems ◽

10.3233/his-210009 ◽

2021 ◽

pp. 1-15

Author(s):

Zouhaira Noubigh ◽

Anis Mezghani ◽

Monji Kherallah

Keyword(s):

Neural Networks ◽

Short Term Memory ◽

Text Recognition ◽

Great Success ◽

Arabic Text ◽

Recognition Method ◽

Short Term ◽

Sequence Modeling ◽

Long Short Term Memory ◽

Network Component

In recent years, deep neural networks (DNNs) have achieved great success in sequence modeling. Several deep models have been used to enhance Handwriting Text Recognition (HTR). Among these models, Convolutional Neural Networks (CNNs) and Recurrent Neural Networks, especially Long Short-Term Memory (LSTM) networks, achieve state-of-the-art recognition accuracy. Recognition methods for Arabic text lines have been widely applied in many specific tasks. However, some challenges remain, such as the lack of large, publicly available Arabic text recognition datasets and the characteristics of Arabic script. To address these challenges, we propose an end-to-end recognition method based on convolutional recurrent neural networks (CRNNs), which adds a feature reuse network component to a CRNN. The model is trained and tested on two Arabic text recognition datasets, KHATT and AHTID/MW. The experimental results demonstrate that the proposed method achieves better performance than other methods in the literature.


3D Skeletal Joints-Based Hand Gesture Spotting and Classification

Applied Sciences ◽

10.3390/app11104689 ◽

2021 ◽

Vol 11 (10) ◽

pp. 4689

Author(s):

Ngoc-Hoang Nguyen ◽

Tran-Dac-Thinh Phan ◽

Soo-Hyung Kim ◽

Hyung-Jeong Yang ◽

Guee-Sang Lee

Keyword(s):

Short Term Memory ◽

Hand Gesture ◽

Short Term ◽

Term Memory ◽

Novel Approach ◽

Gesture Classification ◽

Long Short Term Memory ◽

Lstm Network ◽

Public Datasets ◽

Gesture Spotting

This paper presents a novel approach to continuous dynamic hand gesture recognition. Our approach contains two main modules: gesture spotting and gesture classification. First, the gesture spotting module pre-segments a video sequence of continuous gestures into isolated gestures. Second, the gesture classification module identifies the segmented gestures. In the gesture spotting module, the motion of the palm and fingers is fed into a Bidirectional Long Short-Term Memory (Bi-LSTM) network for gesture spotting. In the gesture classification module, three residual 3D Convolutional Neural Networks based on ResNet architectures (3D_ResNet) and one Long Short-Term Memory (LSTM) network are combined to efficiently use multiple data channels such as RGB, optical flow, depth, and the 3D positions of key joints. The promising performance of our approach is demonstrated through experiments on three public datasets: the Chalearn LAP ConGD dataset, 20BN-Jester, and the NVIDIA Dynamic Hand Gesture Dataset. Our approach outperforms the state-of-the-art methods on the Chalearn LAP ConGD dataset.
