SqueezeNet and Fusion Network-Based Accurate Fast Fully Convolutional Network for Hand Detection and Gesture Recognition

Existing hand detection methods usually follow a multi-stage pipeline with high computation cost, i.e., feature extraction, region proposal, bounding box regression, and additional layers for rotated region detection. In this paper, we propose a new Scale Invariant Fully Convolutional Network (SIFCN), trained end to end, to detect hands efficiently. Specifically, we merge the feature maps from high to low layers iteratively, which handles hands of different scales better and with less time overhead than simply concatenating them. Moreover, we develop the Complementary Weighted Fusion (CWF) block to make full use of the distinctive features among multiple layers to achieve scale invariance. To handle rotated hands, we introduce a rotation map that removes the need for complex rotation and derotation layers. In addition, we design a multi-scale loss scheme that significantly accelerates training by adding supervision to the intermediate layers of the network. Compared with state-of-the-art methods, our algorithm achieves comparable accuracy while running 4.23 times faster on the VIVA dataset, and achieves better average precision on the Oxford hand detection dataset at a speed of 62.5 fps.
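The iterative merging of feature maps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the CWF block, so the `fuse` function below assumes a simple complementary weighting (`w` for the deeper map, `1 - w` for the shallower one), and the names `upsample2x`, `fuse`, and `iterative_merge` are hypothetical. Feature maps are plain 2D lists, and each level is assumed to be exactly twice the size of the previous one.

```python
def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D map given as a list of rows."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def fuse(deep, shallow, w=0.5):
    """Assumed complementary weighting: w * deep + (1 - w) * shallow."""
    return [
        [w * d + (1.0 - w) * s for d, s in zip(deep_row, shallow_row)]
        for deep_row, shallow_row in zip(deep, shallow)
    ]


def iterative_merge(pyramid, w=0.5):
    """Merge maps one level at a time, from the deepest (smallest) map
    to the shallowest (largest), instead of concatenating all levels."""
    merged = pyramid[0]
    for shallower in pyramid[1:]:
        merged = fuse(upsample2x(merged), shallower, w)
    return merged


# Toy pyramid: 1x1, 2x2, and 4x4 maps with constant values.
pyramid = [
    [[4.0]],
    [[2.0, 2.0], [2.0, 2.0]],
    [[1.0] * 4 for _ in range(4)],
]
result = iterative_merge(pyramid)
```

Each step only touches two maps of the same resolution, which is one plausible reading of why iterative merging would cost less than concatenating every level at full resolution.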

Proposing Gesture Recognition Algorithm Using Two-Stream Convolutional Network and LSTM

2020 IEEE Eighth International Conference on Communications and Electronics (ICCE), 2021
DOI: 10.1109/icce48956.2021.9352147

Author(s): Phat Nguyen Huu, Tien Luong Ngoc, Quang Tran Minh

Keyword(s): Gesture Recognition, Recognition Algorithm, Convolutional Network

Seismic Images Interpretation to Discover Salt Domes Using Deep Fully Convolutional Network

Journal of Physics Conference Series, 2021, Vol 1818 (1), pp. 012006
DOI: 10.1088/1742-6596/1818/1/012006

Author(s): Shms Aldeen S. Al-Duri, Amel H. Abbas

Keyword(s): Convolutional Network, Fully Convolutional Network, Salt Domes, Seismic Images

Downlink Channel State Information Limited Feedback Using Fully Convolutional Network

2021 IEEE Wireless Communications and Networking Conference (WCNC), 2021
DOI: 10.1109/wcnc49053.2021.9417350

Author(s): Guanghui Fan, Zhengran He, Jinlong Sun, Guan Gui, Haris Gacanin, ...

Keyword(s): Channel State Information, Limited Feedback, Channel State, Convolutional Network, State Information, Fully Convolutional Network

MaskNet: A Fully-Convolutional Network to Estimate Inlier Points

2020 International Conference on 3D Vision (3DV), 2020
DOI: 10.1109/3dv50981.2020.00113

Author(s): Vinit Sarode, Animesh Dhagat, Rangaprasad Arun Srivatsan, Nicolas Zevallos, Simon Lucey, ...

Keyword(s): Convolutional Network, Fully Convolutional Network

Deep Multi-Stage Approach For Emotional Body Gesture Recognition In Job Interview

The Computer Journal, 2021
DOI: 10.1093/comjnl/bxab011

Author(s): Intissar Khalifa, Ridha Ejbali, Raimondo Schettini, Mourad Zaied

Keyword(s): Gesture Recognition, Affective Computing, Psychological State, Sources Of Information, Emotion Classification, Convolutional Network, Job Interview, Multi Stage, Representation Technique, The Subject

Abstract: Affective computing is a key research topic in artificial intelligence that is applied to psychology and machines. It consists of the estimation and measurement of human emotions. A person's body language is one of the most significant sources of information during a job interview, and it reflects a deep psychological state that is often missing from other data sources. In our work, we combine the two tasks of pose estimation and emotion classification for emotional body gesture recognition, and propose a deep multi-stage architecture that can handle both. Our deep pose decoding method detects and tracks the candidate's skeleton in a video using a combination of a depthwise convolutional network and a detection-based method for 2D pose reconstruction. Moreover, we propose a representation technique based on the superposition of skeletons that generates, for each video sequence, a single image synthesizing the different poses of the subject. We call this image the 'history pose image', and it is used as input to a convolutional neural network model based on the Visual Geometry Group architecture. We demonstrate the effectiveness of our method in comparison with other state-of-the-art methods on the standard Common Objects in Context keypoint dataset and the Face and Body gesture video database.
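The superposition of skeletons into a single image can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the abstract does not say how frames are combined, so this sketch assumes each frame's skeleton is drawn with an intensity that grows with time and that overlapping pixels keep the maximum. The names `draw_line` and `history_pose_image` are hypothetical, and joint coordinates are assumed to be integers that already lie inside the canvas.

```python
def draw_line(canvas, p0, p1, value):
    """Rasterize a straight segment by linear interpolation, keeping the
    maximum intensity where strokes overlap."""
    (x0, y0), (x1, y1) = p0, p1
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        x = round(x0 + (x1 - x0) * i / steps)
        y = round(y0 + (y1 - y0) * i / steps)
        canvas[y][x] = max(canvas[y][x], value)


def history_pose_image(frames, bones, width, height):
    """Superimpose the skeletons of all frames onto one canvas.

    frames: list of joint lists, one per video frame, each joint an (x, y) pair.
    bones:  list of (joint_index_a, joint_index_b) pairs to connect.
    """
    canvas = [[0.0] * width for _ in range(height)]
    for t, joints in enumerate(frames):
        value = (t + 1) / len(frames)  # assumption: later frames are brighter
        for a, b in bones:
            draw_line(canvas, joints[a], joints[b], value)
    return canvas


# Two frames of a one-bone "skeleton" moving down a 4x3 canvas.
frames = [
    [(0, 0), (3, 0)],
    [(0, 2), (3, 2)],
]
image = history_pose_image(frames, bones=[(0, 1)], width=4, height=3)
```

The resulting 2D array could then be resized and passed to an image classifier in place of the raw video, which matches the role the abstract assigns to the history pose image.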
