DBDNet: Learning Bi-directional Dynamics for Early Action Prediction

Author(s):  
Guoliang Pang ◽  
Xionghui Wang ◽  
Jian-Fang Hu ◽  
Qing Zhang ◽  
Wei-Shi Zheng

Predicting future actions from observed partial videos is very challenging, as the missing future is uncertain and sometimes has multiple possibilities. To obtain a reliable future estimation, a novel encoder-decoder architecture is proposed that integrates the tasks of synthesizing future motions from observed videos and reconstructing observed motions from synthesized future motions in a unified framework, which can capture the bi-directional dynamics depicted in partial videos along both the temporal (past-to-future) direction and the reverse chronological (future-back-to-past) direction. We then employ a bi-directional long short-term memory (Bi-LSTM) architecture to exploit the learned bi-directional dynamics for predicting early actions. Our experiments on two benchmark action datasets show that learning bi-directional dynamics benefits early action prediction, and our system clearly outperforms state-of-the-art methods.
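A minimal sketch in PyTorch (with assumed module names and feature sizes, not the authors' code) of the bi-directional dynamics idea: an encoder summarizes the observed partial video, one decoder rolls forward to synthesize future motion features, a second decoder rolls back from the synthesized future to reconstruct the past, and a Bi-LSTM over observed plus synthesized features classifies the action.

import torch
import torch.nn as nn

class BiDynamicsPredictor(nn.Module):
    def __init__(self, feat_dim=512, hidden=256, num_classes=60):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.future_decoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.past_decoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.to_feat = nn.Linear(hidden, feat_dim)
        self.cls_rnn = nn.LSTM(feat_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.cls = nn.Linear(2 * hidden, num_classes)

    def forward(self, observed, future_len):
        # observed: (B, T_obs, feat_dim) frame-level features
        _, state = self.encoder(observed)
        step, future = observed[:, -1:, :], []
        for _ in range(future_len):            # past-to-future synthesis
            out, state = self.future_decoder(step, state)
            step = self.to_feat(out)
            future.append(step)
        future = torch.cat(future, dim=1)
        # Future-back-to-past: reconstruct motion from the reversed future
        # (reconstruction length is kept equal to future_len for simplicity).
        past_out, _ = self.past_decoder(torch.flip(future, dims=[1]))
        past_recon = self.to_feat(past_out)
        # Bi-LSTM over observed + synthesized future for early prediction.
        h, _ = self.cls_rnn(torch.cat([observed, future], dim=1))
        return self.cls(h[:, -1]), future, past_recon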

Author(s):  
Xingjian Lai ◽  
Huanyi Shui ◽  
Jun Ni

Throughput bottlenecks define and constrain the productivity of a production line. Predicting future bottlenecks provides great support for decision-making on the factory floor, helping to foresee and formulate appropriate actions before production so as to improve system throughput in a cost-effective manner. Bottleneck prediction remains a challenging task in the literature. The difficulty lies in the complex dynamics of manufacturing systems: multiple factors collaboratively affect bottleneck conditions, such as machine performance, machine degradation, line structure, operator skill level, and product release schedules. These factors impact one another in a nonlinear manner and exhibit long-term temporal dependencies. State-of-the-art research relies on various assumptions that simplify the modeling by reducing the input dimensionality; as a result, those models cannot accurately reflect the complex dynamics of the bottleneck in a manufacturing system. To tackle this problem, this paper proposes a systematic framework for designing a two-layer Long Short-Term Memory (LSTM) network tailored to the dynamic bottleneck prediction problem in multi-job manufacturing systems. This neural-network-based approach takes advantage of historical high-dimensional factory-floor data to predict system bottlenecks dynamically while considering future production planning inputs. The model is demonstrated with data from an automotive underbody assembly line. The results show that the proposed method achieves higher prediction accuracy than current state-of-the-art approaches.
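A minimal sketch, assuming PyTorch and illustrative dimensions, of a two-layer LSTM of the kind described: a window of high-dimensional factory-floor features (machine states, cycle times, buffer levels, planned release schedule) is mapped to logits over candidate bottleneck stations for the next period.

import torch
import torch.nn as nn

class BottleneckLSTM(nn.Module):
    def __init__(self, n_features=128, hidden=64, n_stations=20):
        super().__init__()
        # num_layers=2 gives the stacked two-layer LSTM the paper calls for.
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_stations)

    def forward(self, x):                # x: (batch, time, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])     # logits over candidate stations

model = BottleneckLSTM()
print(model(torch.randn(8, 48, 128)).shape)   # torch.Size([8, 20])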


2020 ◽  
Vol 23 (65) ◽  
pp. 124-135
Author(s):  
Imane Guellil ◽  
Marcelo Mendoza ◽  
Faical Azouaou

This paper presents an analytic study showing that it is entirely possible to analyze the sentiment of an Arabic dialect without constructing any resources. The idea of this work is to use the resources dedicated to a given dialect X for analyzing the sentiment of another dialect Y; the only condition is that X and Y belong to the same category of dialects. We apply this idea to the Algerian dialect, a Maghrebi Arabic dialect that suffers from a lack of available tools and other resources required for automatic sentiment analysis. To do this analysis, we rely on Maghrebi dialect resources: two manually annotated sentiment corpora for the Tunisian and Moroccan dialects, respectively, as well as a large corpus for the Maghrebi dialect. We use a state-of-the-art system and propose a new deep learning architecture for automatically classifying the sentiment of an Arabic dialect (Algerian). Experimental results show an F1-score of up to 83%, achieved by a Multilayer Perceptron (MLP) with the Tunisian corpus and by a Long Short-Term Memory (LSTM) network with the combination of the Tunisian and Moroccan corpora, an improvement of 15% over the closest competitor. Ongoing work aims at manually constructing an annotated sentiment corpus for the Algerian dialect and comparing the results.
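A sketch of the transfer setup in PyTorch (the architecture and sizes are assumptions, not the authors' exact model): a sentiment LSTM is trained on the resource-rich dialects (Tunisian and Moroccan) and evaluated directly on Algerian text, relying on the lexical overlap within the Maghrebi dialect group.

import torch
import torch.nn as nn

class DialectSentimentLSTM(nn.Module):
    def __init__(self, vocab_size=30000, emb=128, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb, padding_idx=0)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)     # positive / negative

    def forward(self, token_ids):           # (batch, seq_len) integer ids
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h[:, -1])

# Train on the Tunisian+Moroccan corpora; Algerian data enters only at test time.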


Symmetry ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1290 ◽  
Author(s):  
Rahman ◽  
Siddiqui

Abstractive text summarization, which generates a summary by paraphrasing a long text, remains a significant open problem for natural language processing. In this paper, we present an abstractive text summarization model, the multi-layered attentional peephole convolutional LSTM (long short-term memory), MAPCoL, that automatically generates a summary from a long text. We optimize the parameters of MAPCoL using central composite design (CCD) in combination with response surface methodology (RSM), which gives the highest accuracy in terms of summary generation. We report the accuracy of MAPCoL on the CNN/DailyMail dataset and perform a comparative analysis against state-of-the-art models in different experimental settings. MAPCoL also outperforms traditional LSTM-based models with respect to the semantic coherence of the output summary.
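MAPCoL builds on peephole LSTM units, whose gates additionally observe the cell state. The standalone cell below illustrates that peephole mechanism only, using the standard peephole equations; it is a hedged sketch, not the full convolutional, attentional MAPCoL model.

import torch
import torch.nn as nn

class PeepholeLSTMCell(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.x2g = nn.Linear(input_size, 4 * hidden_size)
        self.h2g = nn.Linear(hidden_size, 4 * hidden_size, bias=False)
        # Diagonal peephole weights from the cell state to each gate.
        self.p_i = nn.Parameter(torch.zeros(hidden_size))
        self.p_f = nn.Parameter(torch.zeros(hidden_size))
        self.p_o = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h, c):
        gi, gf, gg, go = (self.x2g(x) + self.h2g(h)).chunk(4, dim=-1)
        i = torch.sigmoid(gi + self.p_i * c)      # input gate peeks at c_{t-1}
        f = torch.sigmoid(gf + self.p_f * c)      # forget gate peeks at c_{t-1}
        c_new = f * c + i * torch.tanh(gg)
        o = torch.sigmoid(go + self.p_o * c_new)  # output gate peeks at c_t
        return o * torch.tanh(c_new), c_new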


Author(s):  
Jing Wang ◽  
Yingwei Pan ◽  
Ting Yao ◽  
Jinhui Tang ◽  
Tao Mei

Image paragraph generation is the task of producing a coherent story (usually a paragraph) that describes the visual content of an image. The problem is nevertheless not trivial, especially when there are multiple descriptive and diverse gists to be considered for paragraph generation, which often happens in real images. A valid question is how to encapsulate such gists/topics that are worthy of mention from an image, and then describe the image from one topic to another yet holistically with a coherent structure. In this paper, we present a new design, Convolutional Auto-Encoding (CAE), that employs a purely convolutional and deconvolutional auto-encoding framework for topic modeling on the region-level features of an image. Furthermore, we propose an architecture, CAE plus Long Short-Term Memory (dubbed CAE-LSTM), that integrates the learnt topics in support of paragraph generation in a novel way. Technically, CAE-LSTM capitalizes on a two-level LSTM-based paragraph generation framework with an attention mechanism: the paragraph-level LSTM captures the inter-sentence dependency in a paragraph, while the sentence-level LSTM generates one sentence conditioned on each learnt topic. Extensive experiments are conducted on the Stanford image paragraph dataset, and superior results are reported when comparing to state-of-the-art approaches. More remarkably, CAE-LSTM increases CIDEr performance from 20.93% to 25.15%.
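A structural sketch (assumed shapes and teacher forcing, not the authors' code) of the two-level decoder: a paragraph-level LSTM consumes one learnt CAE topic per sentence, and its state initializes a sentence-level LSTM that generates the words of that sentence.

import torch
import torch.nn as nn

class TwoLevelParagraphDecoder(nn.Module):
    def __init__(self, topic_dim=256, hidden=512, vocab=10000, emb=256):
        super().__init__()
        self.para_lstm = nn.LSTMCell(topic_dim, hidden)
        self.sent_lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.embed = nn.Embedding(vocab, emb)
        self.word_out = nn.Linear(hidden, vocab)

    def forward(self, topics, words):
        # topics: (B, n_sent, topic_dim) from the convolutional auto-encoder
        # words:  (B, n_sent, seq_len) gold word ids for teacher forcing
        B, n_sent, _ = topics.shape
        h = topics.new_zeros(B, self.para_lstm.hidden_size)
        c = torch.zeros_like(h)
        logits = []
        for s in range(n_sent):
            h, c = self.para_lstm(topics[:, s], (h, c))  # inter-sentence state
            init = (h.unsqueeze(0).contiguous(), c.unsqueeze(0).contiguous())
            out, _ = self.sent_lstm(self.embed(words[:, s]), init)
            logits.append(self.word_out(out))
        return torch.stack(logits, dim=1)   # (B, n_sent, seq_len, vocab)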


2021 ◽  
Vol 25 (3) ◽  
pp. 1671-1687
Author(s):  
Andreas Wunsch ◽  
Tanja Liesch ◽  
Stefan Broda

Abstract. The use of shallow artificial neural networks (ANNs) to obtain accurate and reliable groundwater level forecasts, an important tool for sustainable groundwater management, is now well established. However, we observe an increasing shift from conventional shallow ANNs to state-of-the-art deep-learning (DL) techniques, while direct comparisons of their performance are often lacking. Although shallow recurrent networks have clearly proven their suitability, they frequently seem to be excluded from study designs amid the enthusiasm for new DL techniques and their successes in various disciplines. We therefore aim to provide an overview of the groundwater-level predictive ability of shallow conventional recurrent ANNs, namely non-linear autoregressive networks with exogenous input (NARX), and of popular state-of-the-art DL techniques such as long short-term memory (LSTM) and convolutional neural networks (CNNs). We compare performance on both sequence-to-value (seq2val) and sequence-to-sequence (seq2seq) forecasting over a 4-year period, using only a few widely available and easy-to-measure meteorological input parameters, which makes our approach widely applicable. We also investigate the data dependency of the different ANN architectures in terms of time series length. For seq2val forecasts, NARX models perform best on average; however, CNNs are much faster and only slightly worse in terms of accuracy. For seq2seq forecasts, NARX models mostly outperform both DL models and almost reach the speed of CNNs. However, NARX models are the least robust against initialization effects, which can nevertheless be handled easily using ensemble forecasting. We show that shallow neural networks such as NARX should not be neglected in comparison to DL techniques, especially when only small amounts of training data are available, where they can clearly outperform LSTMs and CNNs; with larger datasets, which are rare in the groundwater domain, LSTMs and CNNs might perform substantially better and DL can really demonstrate its strengths.
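The two forecasting framings differ only in how training windows are cut. A small sketch under assumed window sizes: seq2val maps an input window to the next single groundwater level, while seq2seq maps it to a whole sequence of future levels.

import numpy as np

def make_windows(series, n_in=52, n_out=1):
    # n_out=1 yields seq2val targets; n_out>1 yields seq2seq targets.
    X, y = [], []
    for t in range(len(series) - n_in - n_out + 1):
        X.append(series[t:t + n_in])
        y.append(series[t + n_in:t + n_in + n_out])
    return np.asarray(X), np.asarray(y)

weekly_levels = np.sin(np.linspace(0, 20, 500))       # stand-in for observations
X_val, y_val = make_windows(weekly_levels, n_out=1)   # seq2val
X_seq, y_seq = make_windows(weekly_levels, n_out=26)  # seq2seq, half a year
print(X_val.shape, y_val.shape, X_seq.shape, y_seq.shape)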


2021 ◽  
Author(s):  
Naresh Kumar Thapa K ◽  
N. Duraipandian

Abstract. Malicious traffic classification is the initial and primary step for any network-based security system, including behavior-based anomaly detection systems and intrusion detection systems. Existing methods rely on conventional techniques and process the data in a fixed sequence, which may lead to performance issues. Furthermore, conventional techniques require proper annotation to process volumetric data, and relying on data annotation for efficient traffic classification may lead to network loops and bandwidth issues within the network. To address these issues, this paper presents a novel solution from an artificial intelligence perspective: a malicious traffic classification system based on a Long Short-Term Memory (LSTM) model. To validate the efficiency of the proposed model, an experimental setup along with experimental validation is carried out. The experimental results show that the proposed model is better in terms of accuracy and throughput than state-of-the-art models, outperforming them by 5% and reaching 99.5% overall accuracy.
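An illustrative LSTM traffic classifier in PyTorch (the feature layout and sizes are assumptions, not the paper's exact model): each flow is a sequence of per-packet features (size, inter-arrival time, direction, flags) labeled benign or malicious.

import torch
import torch.nn as nn

class TrafficLSTM(nn.Module):
    def __init__(self, n_pkt_features=6, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_pkt_features, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)    # benign vs malicious

    def forward(self, flows):              # (batch, n_packets, n_pkt_features)
        h, _ = self.lstm(flows)
        return self.out(h[:, -1])

print(TrafficLSTM()(torch.randn(16, 100, 6)).shape)  # torch.Size([16, 2])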


2017 ◽  
Vol 24 (1) ◽  
pp. 77-90 ◽  
Author(s):  
REKIA KADARI ◽  
YU ZHANG ◽  
WEINAN ZHANG ◽  
TING LIU

Abstract. Neural-network-based approaches have recently produced good performance on natural language tasks such as supertagging, in which a supertag (lexical category) is assigned to each word in an input sequence. Combinatory Categorial Grammar (CCG) supertagging is more challenging than sequence-tagging problems such as part-of-speech (POS) tagging and named entity recognition because of the large number of lexical categories. Simple Recurrent Neural Networks (RNNs) have been shown to significantly outperform the previous state-of-the-art feed-forward neural networks; on the other hand, it is well known that recurrent networks fail to learn long-range dependencies. In this paper, we introduce a new neural network architecture based on backward and Bidirectional Long Short-Term Memory (BLSTM) networks that can memorize information over long dependencies and benefit from both past and future information. State-of-the-art methods focus on previous information, whereas a BLSTM has access to information in both the previous and future directions. Our main findings are that bidirectional networks outperform unidirectional ones, and that Long Short-Term Memory (LSTM) networks are more precise and successful than both unidirectional and bidirectional standard RNNs. Experimental results reveal the effectiveness of our proposed method on both in-domain and out-of-domain datasets, with improvements of about 1.2 per cent over a standard RNN.
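A minimal BLSTM sequence-tagger sketch for CCG supertagging (illustrative sizes; the tag inventory of several hundred lexical categories is an assumption based on typical CCG tag sets): each position is scored using both its left and right context.

import torch
import torch.nn as nn

class BLSTMSupertagger(nn.Module):
    def __init__(self, vocab=50000, emb=100, hidden=256, n_supertags=425):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb, padding_idx=0)
        self.blstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_supertags)  # left + right context

    def forward(self, token_ids):          # (batch, seq_len)
        h, _ = self.blstm(self.embed(token_ids))
        return self.out(h)                 # per-token supertag logits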


Author(s):  
Sheena Christabel Pravin ◽  
M. Palanivelan

In this paper, the Deep Long Short-Term Memory Autoencoder (DLAE), a regularized deep learning model, is proposed for the automatic severity assessment of phonological deviations, which are crucial stuttering markers in children. This automatic, noninvasive severity assessment plays a paramount role in early diagnosis, progress inference, and post-care for patients with this specific speech disorder. The proposed model implements a multi-layered autoencoder in the encoder-decoder architecture of the Long Short-Term Memory (LSTM) model, with hierarchically appended hidden layers and hidden units, and has a definite advantage over baseline autoencoders. During the training phase, the DLAE reconstructs the phonological features in an unsupervised fashion, and the latent bottleneck features are extracted from the encoder. The trained and regularized DLAE model with dropout is then used to predict the severity of the phonological deviation with higher precision and classification accuracy than the baseline models.
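A hedged sketch of a multi-layered LSTM autoencoder in the encoder-decoder layout described (layer sizes, feature dimension, and dropout rate are assumptions): the encoder compresses a phonological feature sequence into a latent bottleneck vector, and the decoder reconstructs the sequence from it.

import torch
import torch.nn as nn

class LSTMAutoencoder(nn.Module):
    def __init__(self, feat_dim=39, h1=128, h2=64, bottleneck=16, p=0.3):
        super().__init__()
        self.enc1 = nn.LSTM(feat_dim, h1, batch_first=True)
        self.enc2 = nn.LSTM(h1, h2, batch_first=True)
        self.to_z = nn.Linear(h2, bottleneck)
        self.dec = nn.LSTM(bottleneck, h1, batch_first=True)
        self.recon = nn.Linear(h1, feat_dim)
        self.drop = nn.Dropout(p)          # dropout regularization

    def forward(self, x):                  # x: (batch, frames, feat_dim)
        h, _ = self.enc1(x)
        h, _ = self.enc2(self.drop(h))
        z = self.to_z(h[:, -1])            # latent bottleneck features
        zseq = z.unsqueeze(1).repeat(1, x.size(1), 1)  # repeat z over time
        d, _ = self.dec(zseq)
        # Reconstruction plus features for a downstream severity classifier.
        return self.recon(d), z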


2018 ◽  
Vol 28 (06) ◽  
pp. 1750061 ◽  
Author(s):  
Jianyong Wang ◽  
Lei Zhang ◽  
Yuanyuan Chen ◽  
Zhang Yi

Connections play a crucial role in neural network (NN) learning because they determine how information flows in NNs. Suitable connection mechanisms may greatly enlarge the learning capability and reduce the negative effect of gradient problems. In this paper, a new delay connection is proposed for the Long Short-Term Memory (LSTM) unit to develop a more sophisticated recurrent unit, called Delay Connected LSTM (DCLSTM). The proposed delay connection brings two main merits to DCLSTM while introducing no extra parameters. First, it allows the output of the DCLSTM unit to retain information across delayed time steps, which is absent in the LSTM unit. Second, it helps to bridge error signals to previous time steps, allowing them to be back-propagated across several layers without vanishing too quickly. To evaluate the proposed delay connections, DCLSTM models with and without peephole connections were compared with four state-of-the-art recurrent models on two sequence classification tasks; the DCLSTM model outperformed the other models with higher accuracy and F1 score. Furthermore, networks with multiple stacked DCLSTM layers and with the standard LSTM layer were evaluated on Penn Treebank (PTB) language modeling, where the DCLSTM model achieved lower perplexity (PPL) and bits-per-character (BPC) than the standard LSTM model. The experiments demonstrate that the learning of DCLSTM models is more stable and efficient.
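The abstract does not fully specify the delay connection, so the sketch below is only one plausible, parameter-free reading: the cell state from several steps earlier is added into the current cell state, giving error signals a shortcut across time. This is an assumption for illustration, not the paper's definition.

import torch
import torch.nn as nn

class DelayLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, delay=3):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.delay = delay

    def forward(self, x):                  # x: (batch, time, input_size)
        B, T, _ = x.shape
        h = x.new_zeros(B, self.cell.hidden_size)
        c = torch.zeros_like(h)
        cells, outs = [], []
        for t in range(T):
            h, c = self.cell(x[:, t], (h, c))
            if t >= self.delay:            # hypothetical skip from c_{t-delay}
                c = c + cells[t - self.delay]
            cells.append(c)
            outs.append(h)
        return torch.stack(outs, dim=1)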

