Traffic Police Gesture Recognition Based on Gesture Skeleton Extractor and Multichannel Dilated Graph Convolution Network

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 551
Author(s):  
Xin Xiong ◽  
Haoyuan Wu ◽  
Weidong Min ◽  
Jianqiang Xu ◽  
Qiyan Fu ◽  
...  

Traffic police gesture recognition is important in automatic driving. Most existing traffic police gesture recognition methods extract pixel-level features from RGB images, which are hard to interpret because they lack gesture skeleton features, and may produce inaccurate recognition due to background noise. Existing deep learning methods are also ill-suited to handling gesture skeleton features because they ignore the intrinsic connection between skeleton joint coordinates and gestures. To alleviate these issues, a traffic police gesture recognition method based on a gesture skeleton extractor (GSE) and a multichannel dilated graph convolution network (MD-GCN) is proposed. To obtain discriminative and interpretable gesture skeleton coordinate information, the GSE extracts skeleton coordinates and removes redundant skeleton joints and bones. In the gesture discrimination stage, the GSE-based features are fed into the proposed MD-GCN, which constructs graph convolutions with multichannel dilation to enlarge the receptive field and extracts body-topology and spatiotemporal action features from the skeleton coordinates. Comparison experiments with state-of-the-art methods were conducted on a public dataset. The results show that the proposed method achieves an accuracy of 98.95%, the best among the compared methods and at least 6% higher than the others.
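The abstract gives no implementation details, so as a rough, hedged illustration of the core operation behind skeleton-based graph convolution (each joint aggregates features over its neighbours in the body graph), here is a minimal sketch. The joint list, bone edges, and toy coordinates are illustrative only, not the paper's; the paper's MD-GCN additionally uses multichannel dilation to reach multi-hop neighbours.

```python
# Minimal sketch of one graph-convolution step over a skeleton graph.
# Joints, edges, and feature values are illustrative only.
joints = ["head", "neck", "l_hand", "r_hand", "hip"]
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]  # bones connecting joints

# Build adjacency with self-loops.
n = len(joints)
adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1.0

def graph_conv(features):
    """Average each joint's feature vector with its neighbours' (degree-normalized)."""
    out = []
    for i in range(n):
        deg = sum(adj[i])
        out.append([
            sum(adj[i][j] * features[j][k] for j in range(n)) / deg
            for k in range(len(features[0]))
        ])
    return out

# 2-D coordinates per joint (x, y) as toy input features.
coords = [[0.0, 2.0], [0.0, 1.0], [-1.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
smoothed = graph_conv(coords)
```

A dilated variant would additionally aggregate over k-hop neighbours (powers of the adjacency matrix) to enlarge the receptive field, which is the idea the MD-GCN's multichannel dilation builds on.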

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Wenkang Chen ◽  
Shenglian Lu ◽  
Binghao Liu ◽  
Guo Li ◽  
Tingting Qian

Real-time detection of fruits in orchard environments is one of the crucial techniques for many precision agriculture applications, including yield estimation and automatic harvesting. Because of complex conditions, such as different growth periods and occlusion among leaves and fruits, detecting fruits in natural environments is a considerable challenge. A rapid citrus recognition method that improves the state-of-the-art You Only Look Once version 4 (YOLOv4) detector is proposed in this paper. A Kinect V2 camera was used to collect RGB images of citrus trees. The Canopy algorithm and the K-Means++ algorithm were then used to automatically select the number and size of the prior (anchor) boxes from these RGB images. An improved YOLOv4 network structure was proposed to better detect smaller citrus against complex backgrounds. Finally, the trained network model underwent sparse training, unimportant channels or network layers were pruned, and the parameters of the pruned model were fine-tuned to restore some of the recognition accuracy. The experimental results show that the improved YOLOv4 detector works well for detecting citrus at different growth periods in a natural environment, with an average increase in accuracy of 3.15% (from 92.89% to 96.04%). This result is superior to the original YOLOv4, YOLOv3, and Faster R-CNN. The average detection time of the model is 0.06 s per frame at 1920 × 1080 resolution. The proposed method is suitable for rapid detection of the type and location of citrus in natural environments and can be applied to citrus picking and yield evaluation in actual orchards.
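The abstract mentions using Canopy and K-Means++ to pick the number and sizes of the prior (anchor) boxes from the training annotations. As a hedged sketch of the clustering step only: the box dimensions below are made up, and a deterministic farthest-point seeding stands in for K-Means++'s randomized seeding (the paper's exact pipeline is not reproduced here).

```python
# Sketch: cluster ground-truth box sizes (w, h) to pick YOLO anchor sizes.
# Box data and cluster count are illustrative, not from the paper.
boxes = [(12, 15), (14, 18), (40, 52), (45, 60), (90, 110), (95, 120)]
k = 3

def dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

# Greedy farthest-point seeding (a deterministic stand-in for K-Means++).
centers = [boxes[0]]
while len(centers) < k:
    centers.append(max(boxes, key=lambda p: min(dist(p, c) for c in centers)))

# Standard Lloyd iterations: assign each box to its nearest center,
# then move each center to the mean of its cluster.
for _ in range(10):
    clusters = [[] for _ in range(k)]
    for p in boxes:
        clusters[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
    centers = [
        (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
        for c in clusters if c
    ]

anchors = sorted(centers)  # small, medium, large anchor sizes
```

In practice, YOLO anchor selection often uses an IoU-based distance instead of squared Euclidean distance, so box scale does not dominate the clustering.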


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Vanita Jain ◽  
Qiming Wu ◽  
Shivam Grover ◽  
Kshitij Sidana ◽  
Gopal Chaudhary ◽  
...  

In this paper, we present a method for generating a bird’s eye video from egocentric RGB videos. Working with egocentric views is tricky since such a view is highly warped and prone to occlusions. A bird’s eye view, on the other hand, has consistent scaling in at least the two dimensions it shows. Moreover, most state-of-the-art systems for tasks such as path prediction are built for bird’s eye views of the subjects. We present a deep learning-based approach that transfers egocentric RGB images captured from the dashcam of a car to a bird’s eye view. This is a view-translation task, and we perform two experiments: the first uses an image-to-image translation method, and the other uses video-to-video translation. We compare the results of our work with homographic transformation; our SSIM values are better by margins of 77% and 14.4%, and the RMSE errors are lower by 40% and 14.6%, for image-to-image translation and video-to-video translation, respectively. We also visually show the efficacy and limitations of each method, with helpful insights for future research. Compared to previous works that use homography and LIDAR for 3D point clouds, our work is more generalizable and does not require any expensive equipment.
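The homographic-transformation baseline the authors compare against maps ground-plane points between views with a 3×3 projective matrix. A minimal sketch of applying such a mapping to a pixel follows; the matrix values are made up for illustration, not calibrated from any camera in the paper.

```python
# Sketch of the homography baseline: map a pixel through a 3x3 projective
# transform H in homogeneous coordinates. Matrix values are illustrative.
H = [
    [1.0, 0.2,   5.0],
    [0.0, 1.5,  -3.0],
    [0.0, 0.001, 1.0],
]

def warp_point(x, y):
    """Apply H to (x, y, 1) and de-homogenize by the third coordinate."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

bx, by = warp_point(100.0, 50.0)
```

A homography is exact only for points on a single plane (e.g. the road surface), which is one reason learned view translation can outperform it on scenes with vertical structure.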


Author(s):  
Ryo Izuta ◽  
Kazuya Murao ◽  
Tsutomu Terada ◽  
Masahiko Tsukamoto

Purpose – This paper aims to propose a gesture recognition method that works at an early stage of the gesture. An accelerometer is installed in most current mobile phones, such as iPhones, Android-powered devices and video game controllers for the Wii or PS3, which enables easy and intuitive operations. Therefore, many gesture-based user interfaces that use accelerometers are expected to appear in the future. Gesture recognition systems with an accelerometer generally have to construct models from the user’s gesture data before use and recognize unknown gestures by comparing them with the models. Because the recognition process generally starts only after the gesture has finished, the recognition result and its feedback are delayed, which may cause users to retry gestures and degrades the interface usability. Design/methodology/approach – The simplest way to achieve early recognition is to start it at a fixed time after a gesture starts. However, accuracy would decrease if a gesture in its early stage were similar to others. Moreover, the recognition timing would be capped by the length of the shortest gesture, which may be too early for longer gestures; conversely, a later recognition timing would exceed the length of the shorter gestures. In addition, a proper length of training data has to be found, as the full length of the training data does not fit the input data until partway through. To recognize gestures at an early stage, both a proper recognition timing and a proper length of training data have to be decided. This paper proposes an early-stage gesture recognition method that sequentially calculates the distance between the input and the training data. The proposed method outputs the recognition result only when one candidate has a sufficiently stronger likelihood than the other candidates, so that similar incorrect gestures are not output.
Findings – The proposed method was experimentally evaluated on 27 kinds of gestures, and it was confirmed that the recognition process finished 1,000 msec before the end of the gestures on average without deteriorating accuracy. Gestures were recognized at an early stage of motion, which would improve interface usability and reduce the number of incorrect operations such as retried gestures. Moreover, a gesture-based photo viewer was implemented as a useful application of the proposed method; the early gesture recognition system was used in a live unscripted performance, and its effectiveness was confirmed. Originality/value – Gesture recognition methods with accelerometers generally learn a given user’s gesture data before the system is used and then recognize unknown gestures by comparing them with the training data. The recognition process starts after a gesture has finished, and therefore any interaction or feedback depending on the recognition result is delayed. For example, an image on a smartphone screen rotates a few seconds after the device has been tilted, which may cause the user to retry tilting the smartphone even if the first tilt was correctly recognized. Although many studies on gesture recognition using accelerometers have been done, to the best of the authors’ knowledge, none of them has taken these potential delays in output into consideration.
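The idea described above, i.e. sequentially accumulating distance against each template and committing as soon as one candidate clearly leads, can be sketched as follows. This is a hedged toy illustration, not the paper's algorithm: the templates, the single-axis samples, and the margin threshold are all made up.

```python
# Sketch of margin-based early gesture recognition: accumulate the
# per-sample distance between the input prefix and each template, and
# output as soon as one candidate clearly leads all the others.
# Templates, input stream, and margin are illustrative values.
templates = {
    "circle": [1, 2, 3, 4, 5, 6],
    "swipe":  [1, 2, 9, 9, 9, 9],
}
MARGIN = 5.0  # required lead of best over second-best before committing

def recognize_early(stream):
    scores = {name: 0.0 for name in templates}
    for t, sample in enumerate(stream):
        for name, tmpl in templates.items():
            if t < len(tmpl):
                scores[name] += abs(sample - tmpl[t])
        ranked = sorted(scores.items(), key=lambda kv: kv[1])
        if ranked[1][1] - ranked[0][1] >= MARGIN:
            return ranked[0][0], t + 1  # gesture name, samples consumed
    # Fall back to the best candidate if no early decision was possible.
    return min(scores, key=scores.get), len(stream)

label, used = recognize_early([1, 2, 3, 4, 5, 6])
```

Here the decision fires after 3 of 6 samples, mirroring the paper's goal of answering well before the gesture ends; a real system would use accelerometer vectors and an elastic distance rather than this per-sample absolute difference.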


Author(s):  
Megha Chhabra ◽  
Manoj Kumar Shukla ◽  
Kiran Kumar Ravulakollu

Latent fingerprints are unintentional finger-skin impressions left as ridge patterns at crime scenes. A major challenge in latent fingerprint forensics is the poor quality of the image lifted from the crime scene. Forensic investigators are in constant search of novel, effective technologies to capture and process low-quality images. The accuracy of the results depends on the quality of the image captured at the outset, the metrics used to assess that quality, and the level of enhancement subsequently required. Low-quality scanners, unstructured background noise, poor ridge quality and overlapping structured noise result in the detection of false minutiae and hence reduce the recognition rate. Traditionally, image segmentation and enhancement are done partly manually with the help of highly skilled experts. With automated systems, images of widely varying quality can be investigated faster. This survey amplifies the comparative study of the various segmentation techniques available for latent fingerprint forensics.
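One classical family of segmentation techniques such surveys cover is block-wise variance thresholding: ridge regions show high gray-level variance, while smooth background does not. The sketch below is a generic illustration of that idea, not a method attributed to this survey; the tiny image, block size, and threshold are made up.

```python
# Sketch of classical block-variance segmentation: blocks whose gray-level
# variance exceeds a threshold are kept as fingerprint foreground.
# The tiny image, block size, and threshold are illustrative.
image = [
    [10, 200, 12, 12],
    [190, 15, 11, 13],
    [12, 12, 12, 12],
    [12, 12, 12, 12],
]
BLOCK, THRESH = 2, 100.0

def block_variance(rows):
    """Population variance of all pixel values in a block."""
    vals = [v for row in rows for v in row]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

foreground = []  # (row, col) of each block kept as foreground
for by in range(0, len(image), BLOCK):
    for bx in range(0, len(image[0]), BLOCK):
        block = [row[bx:bx + BLOCK] for row in image[by:by + BLOCK]]
        if block_variance(block) > THRESH:
            foreground.append((by, bx))
```

Latent prints defeat this simple scheme precisely because structured background noise can also have high variance, which is why the survey's more sophisticated techniques exist.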


2020 ◽  
Vol 29 (6) ◽  
pp. 1153-1164
Author(s):  
Qianyi Xu ◽  
Guihe Qin ◽  
Minghui Sun ◽  
Jie Yan ◽  
Huiming Jiang ◽  
...  

1999 ◽  
Vol 18 (3-4) ◽  
pp. 265-273
Author(s):  
Giovanni B. Garibotto

The paper provides an overview of advanced robotic technologies within the context of postal automation services. The main functional requirements of the application are briefly reviewed, as well as the state of the art and new emerging solutions. Image processing and pattern recognition have always played a fundamental role in address interpretation and mail sorting, and the new challenging objective is now off-line handwritten cursive recognition, in order to handle all kinds of addresses in a uniform way. On the other hand, advanced electromechanical and robotic solutions are extremely important for solving the problems of mail storage, transportation and distribution, as well as material handling and logistics. Finally, a short description of new postal automation services is given, considering the emerging services of hybrid mail and paper-to-electronic conversion.


Author(s):  
Alexander Diederich ◽  
Christophe Bastien ◽  
Karthikeyan Ekambaram ◽  
Alexis Wilson

The introduction of automated L5 driving technologies will revolutionise the design of vehicle interiors and seating configurations, improving occupant comfort and experience. It is foreseen that pre-crash emergency braking and swerving manoeuvres will affect occupant posture, which could lead to an interaction with a deploying airbag. This research addresses the urgent safety need of defining the occupant’s kinematics envelope during that pre-crash phase, considering rotated seat arrangements and different seatbelt configurations. The research used two different sets of volunteer tests experiencing L5 vehicle manoeuvres, based in the first instance on 22 50th-percentile fit males wearing a lap belt (OM4IS), while the other dataset is based on 87 volunteers with a BMI range of 19 to 67 kg/m2 wearing a 3-point belt (UMTRI). Unique biomechanical kinematics corridors were then defined, as a function of belt configuration and vehicle manoeuvre, to calibrate an Active Human Model (AHM) using a multi-objective optimisation coupled with a CORrelation and Analysis (CORA) rating. The research improved the AHM’s omnidirectional kinematics response over the current state of the art in a generic lap-belted environment. The AHM was then tested in a rotated seating arrangement under extreme braking, highlighting that the maximum lateral and frontal motions are comparable, independent of the belt system, while the asymmetry of the 3-point belt increased the occupant’s motion towards the seatbelt buckle. It was observed that the frontal occupant kinematics decrease by 200 mm compared to a lap-belted configuration. This improved omnidirectional AHM is a first step towards designing safer future L5 vehicle interiors.


2021 ◽  
Vol 18 (4) ◽  
pp. 1-22
Author(s):  
Jerzy Proficz

Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI implementations and exploits an auxiliary background thread for early data exchange from faster processes to accelerate the performed all-gather operation. The other algorithm, Background Sorted Linear synchronized tree with Broadcast (BSLB), is built upon the already existing PAP-aware gather algorithm, Background Sorted Linear Synchronized tree (BSLS), followed by a regular broadcast distributing the gathered data to all participating processes. The background of the imbalanced-PAP subject is described, along with the PAP monitoring and evaluation topics. An experimental evaluation of the algorithms based on a proposed mini-benchmark is presented. The mini-benchmark was run over 2,000 times on a typical HPC cluster architecture with homogeneous compute nodes. The obtained results are analyzed across different PATs, data sizes, and process counts, showing that the proposed optimization works well for various configurations, is scalable, and can significantly reduce the all-gather elapsed time, in our case by up to a factor of 1.9, i.e., 47%, compared with the best state-of-the-art solution.
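The baseline that BDR builds on, the classic ring all-gather, works as follows: each of P processes starts with one chunk, and in each of P − 1 rounds every process forwards the chunk it most recently received to its right neighbour. A pure-Python simulation of the data movement (not an MPI implementation, and without BDR's background thread):

```python
# Simulation of the classic ring all-gather: P processes, each starting
# with its own chunk; after P - 1 rounds every process holds all chunks.
P = 4
buffers = [{rank: f"chunk{rank}"} for rank in range(P)]
sending = list(range(P))  # chunk id each rank forwards in the next round

for _ in range(P - 1):
    # Each rank r receives from its left neighbour (r - 1) mod P.
    incoming = [(r, sending[(r - 1) % P]) for r in range(P)]
    for r, chunk_id in incoming:
        buffers[r][chunk_id] = f"chunk{chunk_id}"
    # What was just received is what gets forwarded next round.
    sending = [chunk_id for _, chunk_id in incoming]

complete = all(len(buf) == P for buf in buffers)
```

The ring keeps every link busy with equal-sized messages, which is why it is a common MPI default; its weakness under imbalanced arrival patterns, i.e. the whole pipeline stalling behind the latest-arriving process, is exactly what the proposed background-thread variant targets.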


2021 ◽  
Vol 13 (12) ◽  
pp. 2417
Author(s):  
Savvas Karatsiolis ◽  
Andreas Kamilaris ◽  
Ian Cole

Estimating the height of buildings and vegetation in single aerial images is a challenging problem. A task-focused Deep Learning (DL) model that combines architectural features from successful DL models (U-NET and Residual Networks) and learns the mapping from a single aerial image to a normalized Digital Surface Model (nDSM) is proposed. The model was trained on aerial images whose corresponding DSM and Digital Terrain Model (DTM) were available and was then used to infer the nDSM of images with no elevation information. The model was evaluated on a dataset covering a large area of Manchester, UK, as well as the 2018 IEEE GRSS Data Fusion Contest LiDAR dataset. The results suggest that the proposed DL architecture is suitable for the task and surpasses other state-of-the-art DL approaches by a large margin.
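The regression target here, the normalized DSM, is by definition the surface model minus the terrain model (nDSM = DSM − DTM), i.e. the per-pixel height of objects above bare ground. A minimal sketch of deriving the training target, with made-up elevation grids:

```python
# nDSM = DSM - DTM: per-pixel height of buildings/vegetation above ground.
# Elevation grids (metres) are illustrative.
dsm = [
    [52.0, 58.5],   # surface elevations (includes buildings, trees)
    [51.0, 50.5],
]
dtm = [
    [50.0, 50.5],   # bare-earth terrain elevations
    [50.8, 50.5],
]

ndsm = [
    [s - t for s, t in zip(srow, trow)]
    for srow, trow in zip(dsm, dtm)
]
```

Normalizing away the terrain is what lets the model predict object heights directly, so the network does not have to learn the absolute ground elevation of every region.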

