Attention-Aware Sampling via Deep Reinforcement Learning for Action Recognition

Deep learning based methods have achieved remarkable progress in action recognition. Existing works mainly focus on designing novel deep architectures for learning video representations. Most methods treat sampled frames equally and average all the frame-level predictions at the testing stage. However, within a video, discriminative actions may occur sparsely in a few frames, while most other frames are irrelevant to the ground truth and may even lead to a wrong prediction. We therefore argue that selecting relevant frames is a further key to improving existing deep learning based action recognition. In this paper, we propose an attention-aware sampling method for action recognition, which aims to discard the irrelevant and misleading frames and preserve the most discriminative ones. We formulate the process of mining key frames from videos as a Markov decision process and train the attention agent through deep reinforcement learning without extra labels. The agent takes features and predictions from the baseline model as input and generates importance scores for all frames. Moreover, our approach is extensible and can be applied to different existing deep learning based action recognition models. We achieve very competitive action recognition performance on two widely used action recognition datasets.
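
The contrast between averaging all frame-level predictions and averaging only the frames with high importance scores can be illustrated with a small sketch. This is not the paper's implementation: the function names, the toy probabilities, the hand-written importance scores (standing in for the output of a trained attention agent), and the fixed `keep` budget are all assumptions made for illustration.

```python
from typing import List, Sequence


def average_predictions(frame_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average per-frame class probabilities into one video-level prediction."""
    num_frames = len(frame_probs)
    num_classes = len(frame_probs[0])
    return [
        sum(frame[c] for frame in frame_probs) / num_frames
        for c in range(num_classes)
    ]


def select_top_frames(importance: Sequence[float], keep: int) -> List[int]:
    """Return the indices of the `keep` highest-scoring frames, in temporal order."""
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return sorted(ranked[:keep])


def attention_aware_prediction(
    frame_probs: Sequence[Sequence[float]],
    importance: Sequence[float],
    keep: int,
) -> List[float]:
    """Average predictions over the selected frames only (hypothetical helper)."""
    kept = select_top_frames(importance, keep)
    return average_predictions([frame_probs[i] for i in kept])


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


# Toy video: 5 frames, 2 classes, ground-truth class 0.
# Frames 1, 2 and 4 are "misleading": the baseline favours class 1 on them.
frame_probs = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.3, 0.7],
    [0.8, 0.2],
    [0.1, 0.9],
]
# Assumed importance scores; in the paper these would come from the agent.
importance = [0.95, 0.10, 0.20, 0.90, 0.05]

uniform = average_predictions(frame_probs)
selected = attention_aware_prediction(frame_probs, importance, keep=2)
```

In this toy case `argmax(uniform)` is class 1 (wrong), while `argmax(selected)` is class 0, because only frames 0 and 3 contribute to the selected average.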

A MULTI-AGENT APPROACH TO POMDPS USING OFF-POLICY REINFORCEMENT LEARNING AND GENETIC ALGORITHMS

International Journal of Computing ◽

10.47839/ijc.19.3.1887 ◽

2020 ◽

pp. 377-386

Author(s):

Samuel Obadan ◽

Zenghui Wang

Keyword(s):

Genetic Algorithm ◽

Reinforcement Learning ◽

Learning Algorithm ◽

Feedforward Neural Networks ◽

Ground Truth ◽

Estimation Accuracy ◽

Offline Learning ◽

Markov Decision ◽

Multi Agent ◽

The Impact

This paper introduces novel concepts for accelerating learning in an off-policy reinforcement learning algorithm for Partially Observable Markov Decision Processes (POMDPs) by leveraging a multi-agent framework. Reinforcement learning (RL) is an elegant but considerably slow approach to learning in an unknown environment. Although action-value learning (Q-learning) is faster than state-value learning, the rate of convergence to an optimal policy or maximum cumulative reward remains a constraint. Consequently, in an attempt to optimize the learning phase of an RL problem within a POMDP environment, we present two multi-agent learning paradigms: multi-agent off-policy reinforcement learning and a genetic algorithm (GA) approach for multi-agent offline learning using feedforward neural networks. At the end of training (episodes for reinforcement learning and epochs for the genetic algorithm), we compare the convergence rates of both algorithms with respect to creating the underlying MDPs for POMDP problems. Finally, we demonstrate the impact of layered resampling of a Monte Carlo particle filter on improving belief state estimation accuracy with respect to ground truth within POMDP domains. Initial empirical results suggest practicable solutions.
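
To make the particle-filter step concrete, here is a minimal sketch of one belief update with resampling for a one-dimensional state. The abstract does not describe how "layered resampling" works, so this sketch uses ordinary systematic resampling instead. The Gaussian observation model, the function names, and every numeric value are assumptions chosen for illustration.

```python
import math
import random
from typing import List, Sequence


def systematic_resample(
    particles: Sequence[float], weights: Sequence[float], rng: random.Random
) -> List[float]:
    """Draw len(particles) particles with probability proportional to weight."""
    n = len(particles)
    total = sum(weights)
    if total == 0.0:
        # Degenerate case: every weight underflowed; keep the particles as-is.
        return list(particles)
    normalized = [w / total for w in weights]

    # One random offset, then n evenly spaced positions in [0, 1).
    start = rng.random() / n
    resampled = []
    cumulative = normalized[0]
    j = 0
    for i in range(n):
        position = start + i / n
        # Advance to the particle whose cumulative weight covers this position.
        while position > cumulative and j < n - 1:
            j += 1
            cumulative += normalized[j]
        resampled.append(particles[j])
    return resampled


def belief_update(
    particles: Sequence[float],
    observation: float,
    noise_std: float,
    rng: random.Random,
) -> List[float]:
    """Weight particles by an assumed Gaussian likelihood, then resample."""
    weights = [
        math.exp(-0.5 * ((observation - p) / noise_std) ** 2) for p in particles
    ]
    return systematic_resample(particles, weights, rng)


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


rng = random.Random(0)          # fixed seed so the sketch is repeatable
true_state = 7.0                # hypothetical ground-truth state
prior = [rng.uniform(0.0, 10.0) for _ in range(500)]  # uninformed belief
posterior = belief_update(prior, observation=7.2, noise_std=1.0, rng=rng)

error_before = abs(mean(prior) - true_state)
error_after = abs(mean(posterior) - true_state)
```

After the update, the particle cloud concentrates around the observation, so the belief mean should sit much closer to the assumed true state than the uniform prior's mean did.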

Deep Learning for Human Action Recognition with Convolution Neural Network

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit206466 ◽

2020 ◽

pp. 376-380

Author(s):

S. Karthickkumar ◽

K. Kumar

Keyword(s):

Neural Network ◽

Health Care ◽

Deep Learning ◽

Action Recognition ◽

Human Activities ◽

Recognition Performance ◽

Human Action Recognition ◽

Human Action ◽

Two Dimensional ◽

Consumer Behavior Analysis

In recent years, deep learning for human action recognition has become one of the most popular research topics. It has a variety of applications, such as surveillance, health care, consumer behavior analysis, and robotics. In this paper, we propose a two-dimensional (2D) convolutional neural network for recognizing human activities. The WISDM dataset is used to train and test the model. It covers activities such as sitting, standing, downstairs, upstairs, and running. Our 2D-CNN based method achieves 93.17% accuracy in human activity recognition.

Edge-Sensitive Left Ventricle Segmentation Using Deep Reinforcement Learning

Sensors ◽

10.3390/s21072375 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2375

Author(s):

Jingjing Xiong ◽

Lai-Man Po ◽

Kwok Wai Cheung ◽

Pengfei Xian ◽

Yuzhi Zhao ◽

...

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Left Ventricle ◽

Autonomous Driving ◽

Learning Methods ◽

Proposed Model ◽

Markov Decision ◽

Edge Points ◽

Cardiac Diagnosis ◽

Ventricle Segmentation

Deep reinforcement learning (DRL) has been utilized in numerous computer vision tasks, such as object detection, autonomous driving, etc. However, relatively few DRL methods have been proposed in the area of image segmentation, particularly in left ventricle segmentation. Reinforcement learning-based methods in earlier works often rely on learning proper thresholds to perform segmentation, and the segmentation results are inaccurate due to the sensitivity of the threshold. To tackle this problem, a novel DRL agent is designed to imitate the human process to perform LV segmentation. For this purpose, we formulate the segmentation problem as a Markov decision process and innovatively optimize it through DRL. The proposed DRL agent consists of two neural networks, i.e., First-P-Net and Next-P-Net. The First-P-Net locates the initial edge point, and the Next-P-Net locates the remaining edge points successively and ultimately obtains a closed segmentation result. The experimental results show that the proposed model has outperformed the previous reinforcement learning methods and achieved comparable performances compared with deep learning baselines on two widely used LV endocardium segmentation datasets, namely Automated Cardiac Diagnosis Challenge (ACDC) 2017 dataset, and Sunnybrook 2009 dataset. Moreover, the proposed model achieves higher F-measure accuracy compared with deep learning methods when training with a very limited number of samples.

ASNet: Auto-Augmented Siamese Neural Network for Action Recognition

Sensors ◽

10.3390/s21144720 ◽

2021 ◽

Vol 21 (14) ◽

pp. 4720

Author(s):

Yujia Zhang ◽

Lai-Man Po ◽

Jingjing Xiong ◽

Yasar Abbas Ur Rehman ◽

Kwok-Wai Cheung

Keyword(s):

Neural Network ◽

Action Recognition ◽

Data Augmentation ◽

Recognition Performance ◽

Human Action Recognition ◽

Human Action ◽

Deep Convolutional Neural Networks ◽

Learning Agent ◽

Markov Decision ◽

The Impact

Human action recognition methods in videos based on deep convolutional neural networks usually use random cropping or its variants for data augmentation. However, this traditional data augmentation approach may generate many non-informative samples (video patches covering only a small part of the foreground or only the background) that are not related to a specific action. These samples can be regarded as noisy samples with incorrect labels, which reduces the overall action recognition performance. In this paper, we attempt to mitigate the impact of noisy samples by proposing an Auto-augmented Siamese Neural Network (ASNet). In this framework, we propose backpropagating salient patches and randomly cropped samples in the same iteration to perform gradient compensation to alleviate the adverse gradient effects of non-informative samples. Salient patches refer to the samples containing critical information for human action recognition. The generation of salient patches is formulated as a Markov decision process, and a reinforcement learning agent called SPA (Salient Patch Agent) is introduced to extract patches in a weakly supervised manner without extra labels. Extensive experiments were conducted on two well-known datasets UCF-101 and HMDB-51 to verify the effectiveness of the proposed SPA and ASNet.

Deep Learning for Human Action Recognition Survey

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i10.323328 ◽

2018 ◽

Vol 6 (10) ◽

pp. 323-328

Author(s):

K. Kiruba ◽

D. Shiloah Elizabeth ◽

C Sunil Retmin Raj

Keyword(s):

Deep Learning ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action

Author response for "Deep learning and reinforcement learning approach on microgrid"

10.1002/2050-7038.12531/v2/response1 ◽

2020 ◽

Author(s):

Kumar Chandrasekaran ◽

Prabaakaran Kandasamy ◽

Srividhya Ramanathan

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Author Response ◽

Learning Approach

2D and 3D Palmprint and Palm Vein Recognition Based on Neural Architecture Search

International Journal of Automation and Computing ◽

10.1007/s11633-021-1292-1 ◽

2021 ◽

Author(s):

Wei Jia ◽

Wei Xia ◽

Yang Zhao ◽

Hai Min ◽

Yan-Xiang Chen

Keyword(s):

Deep Learning ◽

Recognition Performance ◽

Research Direction ◽

Palmprint Recognition ◽

Neural Architecture ◽

Development Direction ◽

Vein Recognition ◽

Palm Vein ◽

2D And 3D ◽

Important Research Direction

Palmprint recognition and palm vein recognition are two emerging biometrics technologies. In the past two decades, many traditional methods have been proposed for palmprint recognition and palm vein recognition and have achieved impressive results. In recent years, in the field of artificial intelligence, deep learning has gradually become the mainstream recognition technology because of its excellent recognition performance. Some researchers have tried to use convolutional neural networks (CNNs) for palmprint recognition and palm vein recognition. However, the architectures of these CNNs have mostly been developed manually by human experts, which is a time-consuming and error-prone process. In order to overcome some shortcomings of manually designed CNN, neural architecture search (NAS) technology has become an important research direction of deep learning. The significance of NAS is to solve the deep learning model’s parameter adjustment problem, which is a cross-study combining optimization and machine learning. NAS technology represents the future development direction of deep learning. However, up to now, NAS technology has not been well studied for palmprint recognition and palm vein recognition. In this paper, in order to investigate the problem of NAS-based 2D and 3D palmprint recognition and palm vein recognition in-depth, we conduct a performance evaluation of twenty representative NAS methods on five 2D palmprint databases, two palm vein databases, and one 3D palmprint database. Experimental results show that some NAS methods can achieve promising recognition results. Remarkably, among different evaluated NAS methods, ProxylessNAS achieves the best recognition performance.

Inverse reinforcement learning in contextual MDPs

Machine Learning ◽

10.1007/s10994-021-05984-x ◽

2021 ◽

Author(s):

Stav Belogolovsky ◽

Philip Korsunsky ◽

Shie Mannor ◽

Chen Tessler ◽

Tom Zahavy

Keyword(s):

Reinforcement Learning ◽

Optimization Problem ◽

Decision Processes ◽

Inverse Reinforcement Learning ◽

Convex Optimization Problem ◽

Reward Function ◽

Dynamic Treatment Regime ◽

Markov Decision ◽

Dynamic Treatment ◽

Recorded Data

We consider the task of Inverse Reinforcement Learning in Contextual Markov Decision Processes (MDPs). In this setting, contexts, which define the reward and transition kernel, are sampled from a distribution. In addition, although the reward is a function of the context, it is not provided to the agent. Instead, the agent observes demonstrations from an optimal policy. The goal is to learn the reward mapping, such that the agent will act optimally even when encountering previously unseen contexts, also known as zero-shot transfer. We formulate this problem as a non-differential convex optimization problem and propose a novel algorithm to compute its subgradients. Based on this scheme, we analyze several methods both theoretically, where we compare the sample complexity and scalability, and empirically. Most importantly, we show both theoretically and empirically that our algorithms perform zero-shot transfer (generalize to new and unseen contexts). Specifically, we present empirical experiments in a dynamic treatment regime, where the goal is to learn a reward function which explains the behavior of expert physicians based on recorded data of them treating patients diagnosed with sepsis.

Spectroscopic and deep learning-based approaches to identify and quantify cerebral microhemorrhages

Scientific Reports ◽

10.1038/s41598-021-88236-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Christian Crouzet ◽

Gwangjin Jeong ◽

Rachel H. Chae ◽

Krystal T. LoPresti ◽

Cody E. Dunn ◽

...

Keyword(s):

Deep Learning ◽

Prussian Blue ◽

Processing Speed ◽

Digital Pathology ◽

Ground Truth ◽

Individual Variability ◽

RGB Images ◽

Cerebral Microhemorrhages ◽

Phasor Analysis ◽

Better Than

Cerebral microhemorrhages (CMHs) are associated with cerebrovascular disease, cognitive impairment, and normal aging. One method to study CMHs is to analyze histological sections (5–40 μm) stained with Prussian blue. Currently, users manually and subjectively identify and quantify Prussian blue-stained regions of interest, which is prone to inter-individual variability and can lead to significant delays in data analysis. To improve this labor-intensive process, we developed and compared three digital pathology approaches to identify and quantify CMHs from Prussian blue-stained brain sections: (1) ratiometric analysis of RGB pixel values, (2) phasor analysis of RGB images, and (3) deep learning using a mask region-based convolutional neural network. We applied these approaches to a preclinical mouse model of inflammation-induced CMHs. One hundred CMHs were imaged using a 20× objective and RGB color camera. To determine the ground truth, four users independently annotated Prussian blue-labeled CMHs. The deep learning and ratiometric approaches performed better than the phasor analysis approach compared to the ground truth. The deep learning approach had the most precision of the three methods. The ratiometric approach has the most versatility and maintained accuracy, albeit with less precision. Our data suggest that implementing these methods to analyze CMH images can drastically increase the processing speed while maintaining precision and accuracy.

Deep Learning for Human Action Recognition

2021 6th International Conference for Convergence in Technology (I2CT) ◽

10.1109/i2ct51068.2021.9418080 ◽

2021 ◽

Author(s):

R. U. Shekokar ◽

S. N. Kale

Keyword(s):

Deep Learning ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action
