Automatic Lip Reading Using Convolution Neural Network and Bidirectional Long Short-term Memory

Author(s):  
Yuanyao Lu ◽  
Jie Yan

Traditional automatic lip-reading systems generally consist of two stages, feature extraction and recognition, yet the handcrafted features are empirical and cannot sufficiently learn the relevance of lip-movement sequences. Recently, deep learning approaches have attracted increasing attention, especially given the significant improvements of the convolutional neural network (CNN) applied to image classification and of long short-term memory (LSTM) used in speech recognition, video processing and text analysis. In this paper, we propose a hybrid neural network architecture, which integrates a CNN and a bidirectional LSTM (BiLSTM), for lip reading. First, we extract key frames from each isolated video clip and use five key points to locate the mouth region. Then, features are extracted from the raw mouth images using an eight-layer CNN. The extracted features have stronger robustness and fault tolerance. Finally, we use a BiLSTM to capture the correlation of sequential information among frame features in two directions, and the softmax function to predict the final recognition result. The proposed method is capable of extracting local features through convolution operations and finding hidden correlations in temporal information from lip image sequences. The evaluation results of lip-reading recognition experiments demonstrate that our proposed method outperforms conventional approaches such as the active contour model (ACM) and the hidden Markov model (HMM).
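The recognition stage described above, running an LSTM over the per-frame CNN features in both directions and concatenating the two hidden states per frame, can be sketched in plain Python. This is a minimal illustration of the mechanism, not the authors' implementation; the dimensions and random weights are assumptions:

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W and b map each gate ('i', 'f', 'o', 'g') to a
    weight matrix over the concatenated [h; x] vector and a bias vector."""
    hx = h + x  # list concatenation = vector concatenation here
    def lin(gate):
        return [sum(wi * v for wi, v in zip(row, hx)) + bg
                for row, bg in zip(W[gate], b[gate])]
    i = [sigmoid(v) for v in lin('i')]    # input gate
    f = [sigmoid(v) for v in lin('f')]    # forget gate
    o = [sigmoid(v) for v in lin('o')]    # output gate
    g = [math.tanh(v) for v in lin('g')]  # candidate cell update
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

def run_lstm(seq, W, b, hidden):
    h, c = [0.0] * hidden, [0.0] * hidden
    states = []
    for x in seq:
        h, c = lstm_step(x, h, c, W, b)
        states.append(h)
    return states

def bilstm(seq, Wf, bf, Wb, bb, hidden):
    """Concatenate forward and time-reversed backward hidden states."""
    fwd = run_lstm(seq, Wf, bf, hidden)
    bwd = list(reversed(run_lstm(list(reversed(seq)), Wb, bb, hidden)))
    return [hf + hb for hf, hb in zip(fwd, bwd)]

random.seed(0)
I, H, T = 4, 3, 5  # per-frame feature dim, hidden dim, number of key frames

def rand_params():
    W = {g: [[random.uniform(-0.5, 0.5) for _ in range(H + I)]
             for _ in range(H)] for g in 'ifog'}
    b = {g: [0.0] * H for g in 'ifog'}
    return W, b

Wf, bf = rand_params()
Wb, bb = rand_params()
frames = [[random.uniform(-1.0, 1.0) for _ in range(I)] for _ in range(T)]
feats = bilstm(frames, Wf, bf, Wb, bb, H)
print(len(feats), len(feats[0]))  # one 2*H-dim vector per frame
```

Each output vector sees the whole clip: the first half summarizes frames up to the current one, the second half summarizes the frames after it, which is what lets the model exploit context in both directions before the softmax layer.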

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5611 ◽  
Author(s):  
Mihail Burduja ◽  
Radu Tudor Ionescu ◽  
Nicolae Verga

In this paper, we present our system for the RSNA Intracranial Hemorrhage Detection challenge, which is based on the RSNA 2019 Brain CT Hemorrhage dataset. The proposed system is built on a lightweight deep neural network architecture composed of a convolutional neural network (CNN) that takes individual CT slices as input, and a Long Short-Term Memory (LSTM) network that takes as input multiple feature embeddings provided by the CNN. For efficient processing, we consider various feature selection methods to produce a subset of useful CNN features for the LSTM. Furthermore, we downscale the CT slices by a factor of 2, which enables us to train the model faster. Although our model is designed to balance speed and accuracy, we report a weighted mean log loss of 0.04989 on the final test set, which places us in the top 30 (top 2%) of 1345 participants. Although our computing infrastructure did not allow it, processing CT slices at their original scale is likely to improve performance. To enable others to reproduce our results, we provide our code as open source. After the challenge, we conducted a subjective intracranial hemorrhage detection assessment with radiologists, indicating that the performance of our deep model is on par with that of doctors specialized in reading CT scans. Another contribution of our work is the integration of Grad-CAM visualizations into our system, providing useful explanations for its predictions. We therefore consider our system a viable option when a fast diagnosis or a second opinion on intracranial hemorrhage detection is needed.
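One simple way to produce a subset of useful CNN features for the LSTM, as the abstract describes, is to keep only the embedding dimensions with the highest variance across slices. The sketch below illustrates that idea; it is an assumed stand-in, not the selection method the authors actually used:

```python
from statistics import pvariance

def top_k_variance_features(embeddings, k):
    """Return indices of the k embedding dimensions with the highest
    variance across slices (a crude proxy for informativeness)."""
    dims = len(embeddings[0])
    var = [pvariance([e[d] for e in embeddings]) for d in range(dims)]
    ranked = sorted(range(dims), key=lambda d: var[d], reverse=True)
    return sorted(ranked[:k])  # keep the original dimension order

def select(embeddings, keep):
    """Project every per-slice embedding onto the retained dimensions."""
    return [[e[d] for d in keep] for e in embeddings]

# Toy example: three slice embeddings of three dimensions each;
# dimension 1 is constant across slices and carries no information.
emb = [[1.0, 0.0, 5.0], [2.0, 0.0, 1.0], [3.0, 0.0, 9.0]]
keep = top_k_variance_features(emb, 2)
print(keep)              # the constant dimension is dropped
print(select(emb, keep))
```

Shrinking the per-slice vectors this way reduces the LSTM's input width, which is what makes the sequence model cheap enough to train quickly on full scans.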


2021 ◽  
Author(s):  
Vipul Sharma ◽  
Mitul Kumar Ahirwal

In this paper, a new cascaded one-dimensional convolutional neural network (1DCNN) and bidirectional long short-term memory (BLSTM) model has been developed for binary and ternary classification of mental workload (MWL). MWL assessment is important for increasing safety and efficiency in Brain-Computer Interface (BCI) systems and in professions where multi-tasking is required. Keeping in mind the necessity of MWL assessment, a two-fold study is presented: first, binary classification is done to classify MWL into Low and High classes; second, ternary classification is applied to classify MWL into Low, Moderate, and High classes. The cascaded 1DCNN-BLSTM deep learning architecture has been developed and tested on the Simultaneous Task EEG Workload (STEW) dataset. Unlike recent research on MWL, no handcrafted feature extraction or engineering is done; rather, end-to-end deep learning is applied to the 14-channel EEG signals for classification. Accuracies exceeding previous state-of-the-art studies have been obtained: with 7-fold cross-validation, 96.77% for binary and 95.36% for ternary classification.
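The front end of such a cascade, a 1-D convolution sliding along each EEG channel followed by a nonlinearity and pooling, can be sketched as follows. This is a minimal single-channel illustration with an assumed fixed kernel; in the paper's model the kernels are learned and there are 14 channels:

```python
def conv1d(signal, kernel, stride=1):
    """Valid 1-D convolution (cross-correlation, as CNN layers compute it)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(0, len(signal) - k + 1, stride)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def max_pool1d(xs, size=2):
    """Non-overlapping max pooling to halve the temporal resolution."""
    return [max(xs[i:i + size]) for i in range(0, len(xs) - size + 1, size)]

# One EEG channel (toy samples) passed through a difference kernel,
# which responds to sharp changes in the signal.
channel = [0.2, 0.9, 0.4, -0.3, 0.8, 0.1]
feature_map = max_pool1d(relu(conv1d(channel, [1.0, -1.0])))
print(feature_map)
```

In the cascade, feature maps like this one, stacked over all channels, would form the sequence that the BLSTM then reads in both directions.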


2020 ◽  
Vol 196 ◽  
pp. 02007
Author(s):  
Vladimir Mochalov ◽  
Anastasia Mochalova

In this paper, the previously obtained results on recognition of ionograms using deep learning are extended to predicting the parameters of the ionosphere. After the ionospheric parameters have been identified on the ionogram in real time using deep learning, we can predict the parameters some time ahead on the basis of the newly obtained data. Examples of predicting the ionospheric parameters using a long short-term memory (LSTM) artificial recurrent neural network architecture are given. The place of the ionospheric-parameter prediction block in the system for analyzing ionospheric data using deep learning methods is shown.
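Feeding an LSTM for this kind of forecasting typically means recasting the measured parameter series as supervised (window, next-value) pairs. A minimal sketch of that framing follows; the lookback and horizon values, and the toy readings, are assumptions, not the authors' settings:

```python
def make_windows(series, lookback, horizon=1):
    """Turn a scalar time series into (input window, target) pairs:
    each window of `lookback` values predicts the value `horizon`
    steps after the window ends."""
    pairs = []
    for i in range(len(series) - lookback - horizon + 1):
        window = series[i:i + lookback]
        target = series[i + lookback + horizon - 1]
        pairs.append((window, target))
    return pairs

# Hypothetical hourly critical-frequency readings (toy numbers, MHz)
fof2 = [5.1, 5.3, 5.6, 5.4, 5.0, 4.8]
for window, target in make_windows(fof2, lookback=3):
    print(window, '->', target)
```

Each (window, target) pair becomes one training example; at prediction time, the most recent window of identified parameters is fed in to produce the value some steps ahead.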


2021 ◽  
Vol 248 ◽  
pp. 01017
Author(s):  
Eugene Yu. Shchetinin ◽  
Leonid Sevastianov

Computer paralinguistic analysis is widely used in security systems, biometric research, call centers and banks. Paralinguistic models estimate different physical properties of the voice, such as pitch, intensity, formants and harmonics, to classify emotions. The main goal is to find features that are robust to outliers while retaining the variety of human voice properties. Moreover, the model must be able to estimate features on a time scale suitable for effective analysis of voice variability. In this paper a paralinguistic model based on a Bidirectional Long Short-Term Memory (BLSTM) neural network is described, which was trained for vocal-based emotion recognition. The main advantage of this network architecture is that each module of the network consists of several interconnected layers, providing the ability to recognize flexible long-term dependencies in data, which is important in the context of vocal analysis. We explain the architecture of the bidirectional neural network model and its main advantages over regular neural networks, and compare the experimental results of the BLSTM network with those of other models.


2017 ◽  
Vol 24 (1) ◽  
pp. 77-90 ◽  
Author(s):  
REKIA KADARI ◽  
YU ZHANG ◽  
WEINAN ZHANG ◽  
TING LIU

Neural network-based approaches have recently produced good performance on natural language tasks such as supertagging. In the supertagging task, a supertag (lexical category) is assigned to each word in an input sequence. Combinatory Categorial Grammar supertagging is a more challenging problem than other sequence-tagging problems, such as part-of-speech (POS) tagging and named entity recognition, due to the large number of lexical categories. Simple recurrent neural networks (RNNs) have been shown to significantly outperform the previous state-of-the-art feed-forward neural networks. On the other hand, it is well known that recurrent networks fail to learn long-range dependencies. In this paper, we introduce a new neural network architecture based on backward and Bidirectional Long Short-Term Memory (BLSTM) networks that has the ability to memorize information over long dependencies and to benefit from both past and future information. State-of-the-art methods focus on previous information, whereas a BLSTM has access to information in both the previous and future directions. Our main findings are that bidirectional networks outperform unidirectional ones, and that Long Short-Term Memory (LSTM) networks are more precise and successful than both unidirectional and bidirectional standard RNNs. Experimental results reveal the effectiveness of our proposed method on both in-domain and out-of-domain datasets. Experiments show improvements of about 1.2 per cent over standard RNNs.
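The per-word classification step that supertagging shares with other sequence-tagging tasks, scoring each word's feature vector against every category and taking the softmax argmax, can be sketched as below. This illustrates the output layer only, with assumed weights and a toy two-tag set far smaller than a real CCG category inventory:

```python
import math

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def tag_sequence(word_feats, W, tags):
    """Assign each word the tag whose weight row scores its features highest."""
    out = []
    for feats in word_feats:
        scores = [sum(w * f for w, f in zip(row, feats)) for row in W]
        probs = softmax(scores)
        out.append(tags[max(range(len(tags)), key=probs.__getitem__)])
    return out

# Toy setup: two categories and two-dimensional word features; each
# weight row simply picks out the matching feature dimension.
tags = ['NP', 'S\\NP']
W = [[1.0, 0.0], [0.0, 1.0]]
feats = [[2.0, 0.5], [0.1, 1.5]]
print(tag_sequence(feats, W, tags))
```

In the paper's architecture the feature vectors would be the BLSTM's concatenated hidden states, so each word's scores reflect context from both directions rather than the word alone.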

