Chinese Traffic Police Gesture Recognition in Complex Scene

Author(s):  
Fan Guo ◽  
Zixing Cai ◽  
Jin Tang

2021 ◽  
Vol 11 (24) ◽  
pp. 11951
Author(s):  
Kang Liu ◽  
Ying Zheng ◽  
Junyi Yang ◽  
Hong Bao ◽  
Haoming Zeng

For an automated driving system to be robust, it needs to recognize not only fixed signals such as traffic signs and traffic lights, but also gestures used by traffic police. With the aim to achieve this requirement, this paper proposes a new gesture recognition technology based on a graph convolutional network (GCN) according to an analysis of the characteristics of gestures used by Chinese traffic police. To begin, we used a spatial–temporal graph convolutional network (ST-GCN) as a base network while introducing the attention mechanism, which enhanced the effective features of gestures used by traffic police and balanced the information distribution of skeleton joints in the spatial dimension. Next, to solve the problem of the former graph structure only representing the physical structure of the human body, which cannot capture the potential effective features, this paper proposes an adaptive graph structure (AGS) model to explore the hidden feature between traffic police gesture nodes and a temporal attention mechanism (TAS) to extract features in the temporal dimension. In this paper, we established a traffic police gesture dataset, which contained 20,480 videos in total, and an ablation study was carried out to verify the effectiveness of the method we proposed. The experiment results show that the proposed method improves the accuracy of traffic police gesture recognition to a certain degree; the top-1 is 87.72%, and the top-3 is 95.26%. In addition, to validate the method’s generalization ability, we also carried out an experiment on the Kinetics–Skeleton dataset in this paper; the results show that the proposed method is better than some of the existing action-recognition algorithms.





2019 ◽  
Vol 2019 ◽  
pp. 1-7 ◽  
Author(s):  
Peng Liu ◽  
Xiangxiang Li ◽  
Haiting Cui ◽  
Shanshan Li ◽  
Yafei Yuan

Hand gesture recognition is an intuitive and effective way for humans to interact with a computer due to its high processing speed and recognition accuracy. This paper proposes a novel approach to identify hand gestures in complex scenes by the Single-Shot Multibox Detector (SSD) deep learning algorithm with 19 layers of a neural network. A benchmark database with gestures is used, and general hand gestures in the complex scene are chosen as the processing objects. A real-time hand gesture recognition system based on the SSD algorithm is constructed and tested. The experimental results show that the algorithm quickly identifies humans’ hands and accurately distinguishes different types of gestures. Furthermore, the maximum accuracy is 99.2%, which is significantly important for human-computer interaction application.



2016 ◽  
Vol 76 (6) ◽  
pp. 8915-8936 ◽  
Author(s):  
Fan Guo ◽  
Jin Tang ◽  
Xile Wang


Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 551
Author(s):  
Xin Xiong ◽  
Haoyuan Wu ◽  
Weidong Min ◽  
Jianqiang Xu ◽  
Qiyan Fu ◽  
...  

Traffic police gesture recognition is important in automatic driving. Most existing traffic police gesture recognition methods extract pixel-level features from RGB images which are uninterpretable because of a lack of gesture skeleton features and may result in inaccurate recognition due to background noise. Existing deep learning methods are not suitable for handling gesture skeleton features because they ignore the inevitable connection between skeleton joint coordinate information and gestures. To alleviate the aforementioned issues, a traffic police gesture recognition method based on a gesture skeleton extractor (GSE) and a multichannel dilated graph convolution network (MD-GCN) is proposed. To extract discriminative and interpretable gesture skeleton coordinate information, a GSE is proposed to extract skeleton coordinate information and remove redundant skeleton joints and bones. In the gesture discrimination stage, GSE-based features are introduced into the proposed MD-GCN. The MD-GCN constructs a graph convolution with a multichannel dilated to enlarge the receptive field, which extracts body topological and spatiotemporal action features from skeleton coordinates. Comparison experiments with state-of-the-art methods were conducted on a public dataset. The results show that the proposed method achieves an accuracy rate of 98.95%, which is the best and at least 6% higher than that of the other methods.



Author(s):  
Zhijie Fang ◽  
Wuqiang Zhang ◽  
Zijie Guo ◽  
Rong Zhi ◽  
Baofeng Wang ◽  
...  


Sign in / Sign up

Export Citation Format

Share Document