An empirical study on temporal modeling for online action detection

Author(s):  
Wen Wang ◽  
Xiaojiang Peng ◽  
Yu Qiao ◽  
Jian Cheng

Abstract
Online action detection (OAD) is a practical yet challenging task that has attracted increasing attention in recent years. A typical OAD system consists of three modules: a frame-level feature extractor, usually based on pre-trained deep Convolutional Neural Networks (CNNs); a temporal modeling module; and an action classifier. Among them, the temporal modeling module is crucial, as it aggregates discriminative information from historical and current features. Although many temporal modeling methods have been developed for OAD and related topics, their effects have not been fairly investigated on OAD. This paper provides an empirical study of temporal modeling for OAD covering four meta types of temporal modeling methods, i.e., temporal pooling, temporal convolution, recurrent neural networks, and temporal attention, and uncovers good practices for building a state-of-the-art OAD system. Many of these methods are explored in OAD for the first time and are extensively evaluated with various hyper-parameters. Furthermore, based on our empirical study, we present several hybrid temporal modeling methods. Our best networks, i.e., the hybridization of DCC, LSTM and M-NL, and the hybridization of DCC and M-NL, outperform previously published results by sizable margins on the THUMOS-14 dataset (48.6% vs. 47.2%) and the TVSeries dataset (84.3% vs. 83.7%).
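To make the hybrid designs concrete, here is a minimal PyTorch sketch of one plausible hybrid temporal module, assuming DCC denotes a dilated causal convolution and M-NL a non-local (self-attention) style block; the layer names, sizes, and single-head attention are illustrative assumptions, not the authors' exact architecture.

```python
# A minimal sketch of a hybrid temporal modeling module for OAD, assuming
# DCC = dilated causal convolution and M-NL = a non-local (self-attention)
# block; all hyper-parameters here are illustrative.
import torch
import torch.nn as nn

class HybridTemporalModule(nn.Module):
    def __init__(self, feat_dim=2048, hidden=512, num_classes=21, dilation=2):
        super().__init__()
        # Causal padding: pad only on the left so no future frames leak in.
        self.pad = nn.ConstantPad1d((dilation * 2, 0), 0.0)
        self.dcc = nn.Conv1d(feat_dim, hidden, kernel_size=3, dilation=dilation)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        # Non-local block realized as single-head self-attention over time.
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.cls = nn.Linear(hidden, num_classes)

    def forward(self, x):                  # x: [B, T, feat_dim] frame features
        h = self.dcc(self.pad(x.transpose(1, 2))).transpose(1, 2)  # [B, T, hidden]
        h, _ = self.lstm(h)                # recurrent temporal aggregation
        h, _ = self.attn(h, h, h)          # attend over the historical window
        return self.cls(h[:, -1])          # logits for the current frame

logits = HybridTemporalModule()(torch.randn(2, 16, 2048))  # [2, num_classes]
```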

2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated increased performance over non-augmented and conventionally SMILES-randomization-augmented data when used for training baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as "attentional gain": an enhancement in the pattern recognition capabilities of the underlying network with respect to molecular motifs.
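As a rough illustration of how Levenshtein-based pairing could work, the sketch below enumerates randomized SMILES for a reactant and keeps the variant closest in edit distance to the product SMILES. This is one plausible reading of the idea, not the authors' exact procedure; the function names and sampling budget are ours.

```python
# A rough sketch of Levenshtein-guided training-pair creation: among randomized
# reactant SMILES, keep the variant most string-similar to the product SMILES.
# Illustrative reading only, not the paper's exact algorithm.
from rdkit import Chem

def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def augment_pair(reactant_smiles: str, product_smiles: str, n_samples: int = 20):
    mol = Chem.MolFromSmiles(reactant_smiles)
    # doRandom=True makes RDKit emit a random atom ordering each call.
    variants = {Chem.MolToSmiles(mol, doRandom=True) for _ in range(n_samples)}
    best = min(variants, key=lambda s: levenshtein(s, product_smiles))
    return best, product_smiles
```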


2021 ◽  
Vol 7 ◽  
pp. e495
Author(s):  
Saleh Albahli ◽  
Hafiz Tayyab Rauf ◽  
Abdulelah Algosaibi ◽  
Valentina Emilia Balas

Artificial intelligence (AI) has played a significant role in image analysis and feature extraction, applied to detect and diagnose a wide range of chest-related diseases. Although several researchers have used current state-of-the-art approaches and produced impressive chest-related clinical outcomes, techniques that detect only one type of disease while leaving the rest unidentified offer limited benefit. Attempts to identify multiple chest-related diseases have been hampered by insufficient and imbalanced data. This research contributes to the healthcare industry and the research community by proposing synthetic data augmentation in three deep Convolutional Neural Network (CNN) architectures for the detection of 14 chest-related diseases. The employed models are DenseNet121, InceptionResNetV2, and ResNet152V2; after training and validation, an average ROC-AUC score of 0.80 was obtained, competitive with previous models trained for multi-class classification to detect anomalies in X-ray images. This research illustrates how the proposed approach uses state-of-the-art deep neural networks to classify 14 chest-related diseases with better accuracy.
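For orientation, here is a minimal Keras sketch of one of the three backbones (DenseNet121) with a 14-output multi-label sigmoid head; the input resolution, optimizer, and metric wiring are illustrative assumptions, not the paper's exact training configuration.

```python
# A minimal sketch of a DenseNet121 backbone adapted for 14-way multi-label
# chest-disease classification; settings are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.DenseNet121(
    include_top=False, weights="imagenet",
    input_shape=(224, 224, 3), pooling="avg")
# One independent sigmoid score per disease (multi-label, not softmax).
outputs = tf.keras.layers.Dense(14, activation="sigmoid")(base.output)
model = tf.keras.Model(base.input, outputs)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="roc_auc")])
```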


2021 ◽  
Author(s):  
Bojian Yin ◽  
Federico Corradi ◽  
Sander M. Bohté

Abstract
Inspired by more detailed modeling of biological neurons, spiking neural networks (SNNs) have been investigated both as more biologically plausible and potentially more powerful models of neural computation, and with the aim of capturing biological neurons' energy efficiency; the performance of such networks, however, has remained lacking compared to classical artificial neural networks (ANNs). Here, we demonstrate how a novel surrogate gradient combined with recurrent networks of tunable and adaptive spiking neurons yields state-of-the-art performance for SNNs on challenging time-domain benchmarks such as speech and gesture recognition. This also exceeds the performance of standard classical recurrent neural networks (RNNs) and approaches that of the best modern ANNs. As these SNNs exhibit sparse spiking, we show that they are theoretically one to three orders of magnitude more computationally efficient than RNNs with comparable performance. Together, this positions SNNs as an attractive solution for AI hardware implementations.
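The core trick behind surrogate-gradient training can be sketched in a few lines of PyTorch: the forward pass emits hard spikes, while the backward pass substitutes a smooth pseudo-derivative for the non-differentiable threshold. The fast-sigmoid surrogate and plain LIF update below are illustrative stand-ins, not the paper's tunable adaptive neuron or its particular surrogate.

```python
# A minimal surrogate-gradient sketch: hard spikes forward, smooth gradient
# backward. The fast-sigmoid surrogate and LIF neuron here are illustrative.
import torch

class SurrogateSpike(torch.autograd.Function):
    @staticmethod
    def forward(ctx, membrane):
        ctx.save_for_backward(membrane)
        return (membrane > 0).float()            # hard threshold spike

    @staticmethod
    def backward(ctx, grad_out):
        (membrane,) = ctx.saved_tensors
        # Fast-sigmoid pseudo-derivative in place of the Dirac delta.
        return grad_out / (1.0 + 10.0 * membrane.abs()) ** 2

def lif_step(v, x, spike_prev, tau=0.9, threshold=1.0):
    # Leaky integrate-and-fire update with reset by subtraction.
    v = tau * v + x - threshold * spike_prev
    return v, SurrogateSpike.apply(v - threshold)
```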


2019 ◽  
Vol 35 (2) ◽  
pp. 147-166 ◽  
Author(s):  
Hong-Hai Phan-Vu ◽  
Viet Trung Tran ◽  
Van Nam Nguyen ◽  
Hoang Vu Dang ◽  
Phan Thuan Do

Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English-French or English-Chinese. However, for English-Vietnamese, the shortage of parallel corpora and expensive hyper-parameter search present practical challenges to neural-based approaches. This paper highlights our efforts to improve English-Vietnamese translation in two directions: (1) building the largest open Vietnamese-English corpus to date, and (2) extensive experiments with the latest neural models to achieve the highest BLEU scores. Our experiments provide practical examples of effectively employing different neural machine translation models with low-resource language pairs.
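Since BLEU is the headline metric in these experiments, here is a small sketch of corpus-level BLEU scoring with the sacrebleu library; the example sentences are invented for illustration.

```python
# A minimal corpus-level BLEU computation with sacrebleu; example data is
# made up for illustration.
import sacrebleu

hypotheses = ["the cat sits on the mat"]
references = [["the cat is sitting on the mat"]]  # one list per reference stream
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")
```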


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Md Zahangir Alom ◽  
Paheding Sidike ◽  
Mahmudul Hasan ◽  
Tarek M. Taha ◽  
Vijayan K. Asari

In spite of advances in object recognition technology, handwritten Bangla character recognition (HBCR) remains largely unsolved due to the presence of many ambiguous handwritten characters and excessively cursive Bangla handwriting. Even many advanced existing methods do not achieve satisfactory performance on HBCR in practice. In this paper, a set of state-of-the-art deep convolutional neural networks (DCNNs) is discussed and their performance on HBCR is systematically evaluated. The main advantage of DCNN approaches is that they can extract discriminative features from raw data and represent them with a high degree of invariance to object distortions. The experimental results show the superior performance of DCNN models compared with other popular object recognition approaches, which implies that DCNNs can be good candidates for building an automatic HBCR system for practical applications.


2019 ◽  
Vol 36 (2) ◽  
pp. 470-477 ◽  
Author(s):  
Badri Adhikari

Abstract
Motivation: Exciting new opportunities have arisen to solve the protein contact prediction problem from the progress in neural networks and the availability of a large number of homologous sequences through high-throughput sequencing. In this work, we study how deep convolutional neural networks (ConvNets) may be best designed and developed to solve this long-standing problem.
Results: With publicly available datasets, we designed and trained various ConvNet architectures. We tested several recent deep learning techniques including wide residual networks, dropouts and dilated convolutions. We studied the improvements in the precision of medium-range and long-range contacts, and compared the performance of our best architectures with the ones used in existing state-of-the-art methods. The proposed ConvNet architectures predict contacts with significantly more precision than the architectures used in several state-of-the-art methods. When trained using the DeepCov dataset consisting of 3456 proteins and tested on the PSICOV dataset of 150 proteins, our architectures achieve up to 15% higher precision when L/2 long-range contacts are evaluated. Similarly, when trained using the DNCON2 dataset consisting of 1426 proteins and tested on 84 protein domains in the CASP12 dataset, our single network achieves 4.8% higher precision than the ensembled DNCON2 method when top L long-range contacts are evaluated.
Availability and implementation: DEEPCON is available at https://github.com/badriadhikari/DEEPCON/.
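The top-L/k long-range precision used in these comparisons can be computed as in the following sketch, which ranks predicted contact probabilities for residue pairs separated by at least 24 positions and checks the top L/k pairs against the true contact map; the function name and the 24-residue separation threshold follow common practice and are our assumptions.

```python
# A sketch of top-L/k long-range contact precision: rank predicted contact
# probabilities for pairs |i - j| >= 24 and measure how many of the top L/k
# pairs are true contacts. Names and defaults are illustrative.
import numpy as np

def top_lk_precision(pred, truth, k=2, min_sep=24):
    L = pred.shape[0]
    iu, ju = np.triu_indices(L, k=min_sep)            # long-range pairs only
    order = np.argsort(pred[iu, ju])[::-1][: L // k]  # top L/k scores
    return truth[iu[order], ju[order]].mean()         # fraction of true contacts
```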


2020 ◽  
Vol 34 (04) ◽  
pp. 4626-4633 ◽  
Author(s):  
Jin Li ◽  
Xianglong Liu ◽  
Zhuofan Zong ◽  
Wanru Zhao ◽  
Mingyuan Zhang ◽  
...  

The recent advances in 3D Convolutional Neural Networks (3D CNNs) have shown promising performance for untrimmed video action detection, employing the popular detection framework that relies heavily on temporal action proposal generation as the input to the action detector and localization regressor. In practice, the proposals usually exhibit strong intra and inter relations, mainly stemming from the temporal and spatial variations in video actions. However, most existing 3D CNNs ignore these relations and thus suffer from redundant proposals that degrade detection performance and efficiency. To address this problem, we propose graph attention based proposal 3D ConvNets (AGCN-P-3DCNNs) for video action detection. Specifically, our proposed graph attention is composed of an intra-attention based GCN and an inter-attention based GCN. We use intra attention to learn the intra long-range dependencies inside each action proposal and update the node matrix of the intra-attention based GCN, and use inter attention to learn the inter dependencies between different action proposals as the adjacency matrix of the inter-attention based GCN. Afterwards, we fuse intra and inter attention to model intra long-range dependencies and inter dependencies simultaneously. Another contribution is a simple and effective frame-wise classifier, which enhances the feature representation capabilities of the backbone model. Experiments on two proposal 3D ConvNets based models (P-C3D and P-ResNet) and two popular action detection benchmarks (THUMOS 2014, ActivityNet v1.3) demonstrate the state-of-the-art performance achieved by our method. In particular, P-C3D embedded with our module achieves an average mAP improvement of 3.7% on the THUMOS 2014 dataset compared to the original model.
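To illustrate the inter-attention idea, the sketch below builds a learned adjacency matrix from pairwise attention between proposal features and uses it for a GCN-style update; the dimensions, single attention head, and omission of the intra branch and fusion step are all simplifications of the described AGCN-P-3DCNNs, not its exact design.

```python
# A condensed sketch of inter-attention graph convolution over action
# proposals: pairwise attention scores form the adjacency, which weights a
# GCN-style feature update. Dimensions are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterAttentionGCN(nn.Module):
    def __init__(self, dim=512):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.update = nn.Linear(dim, dim)

    def forward(self, proposals):                     # [N, dim] proposal features
        # Attention between every pair of proposals forms the adjacency matrix.
        adj = F.softmax(self.q(proposals) @ self.k(proposals).T
                        / proposals.shape[-1] ** 0.5, dim=-1)    # [N, N]
        return F.relu(self.update(adj @ proposals))   # GCN-style aggregation

features = torch.randn(32, 512)                       # 32 proposals from a video
refined = InterAttentionGCN()(features)
```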

