Human Motion Posture Detection Algorithm Using Deep Reinforcement Learning

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Limin Qi ◽  
Yong Han

To address the serious loss of detail and low detection definition of traditional human motion posture detection algorithms, a human motion posture detection algorithm using deep reinforcement learning is proposed. Firstly, the perception ability of deep learning is used to match human motion feature points and obtain human motion posture features. Secondly, the human motion image is normalized, the color histogram distribution of the human motion posture is taken as the antigen, the image is searched for regions close to the motion posture, and the candidate regions are taken as antibodies. Feature extraction of the human motion posture is realized by calculating the affinity between the antigen and the antibodies. Finally, using the training characteristics of the deep learning network and the reinforcement learning network, the change information of the human motion posture is obtained, completing the design of the detection algorithm. The results show that when the image resolution is 384 × 256 px, the motion pose contour detection accuracy of this algorithm is 87%. When the image size is 30 MB, the recognition time of this method is only 0.8 s. When the number of iterations is 500, the capture rate of human motion posture details reaches 98.5%. This shows that the proposed algorithm improves the definition of the human motion posture contour, raises the detail capture rate, reduces the loss of detail, and delivers better overall performance.
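The antigen and antibody matching step described above can be illustrated with a minimal sketch. The abstract does not say which affinity measure is used, so this example assumes the Bhattacharyya coefficient between normalized color histograms; the function names and the toy histograms are hypothetical.

```python
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total == 0:
        raise ValueError("empty histogram")
    return [h / total for h in hist]


def affinity(antigen, antibody):
    """Bhattacharyya coefficient in [0, 1]; 1 means identical distributions.

    Assumption: the paper's actual affinity measure is not given in the abstract.
    """
    p, q = normalize(antigen), normalize(antibody)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def best_candidate(antigen, candidates):
    """Return the index of the candidate region with the highest affinity,
    together with all affinity scores."""
    scores = [affinity(antigen, c) for c in candidates]
    return max(range(len(scores)), key=scores.__getitem__), scores


# Toy 4-bin color histograms (made-up values).
target_hist = [8, 4, 2, 2]                       # antigen: the posture histogram
candidate_hists = [[1, 1, 6, 8], [16, 8, 4, 4], [4, 4, 4, 4]]  # antibodies
best_index, scores = best_candidate(target_hist, candidate_hists)
```

Here the second candidate is a scaled copy of the target histogram, so it receives the highest affinity regardless of region size.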

2021 ◽  
Vol 13 (10) ◽  
pp. 1909
Author(s):  
Jiahuan Jiang ◽  
Xiongjun Fu ◽  
Rui Qin ◽  
Xiaoyan Wang ◽  
Zhifeng Ma

Synthetic Aperture Radar (SAR) has become one of the important technical means of marine monitoring in remote sensing because of its all-day, all-weather capability. Monitoring ships in national territorial waters supports maritime law enforcement, maritime traffic control, and national maritime security, so ship detection has long been a research focus. As the field has moved from traditional detection methods to deep learning-based methods, most research has relied on growing Graphics Processing Unit (GPU) computing power to propose ever more complex and computationally intensive strategies, while methods transplanted from optical image detection have ignored the low signal-to-noise ratio, low resolution, single-channel format, and other characteristics that follow from the SAR imaging principle. In pursuing detection accuracy while neglecting detection speed and the end application, almost all algorithms depend on powerful clustered desktop GPUs and cannot be deployed on the front line of marine monitoring to cope with changing conditions. To address these issues, this paper proposes a multi-channel fusion SAR image processing method that makes full use of image information and the network’s ability to extract features; the model architecture and training are based on the latest You Only Look Once version 4 (YOLO-V4) deep learning framework. The YOLO-V4-light network was tailored for real-time use and deployment, significantly reducing the model size, detection time, number of computational parameters, and memory consumption, and the network was refined for three-channel images to compensate for the accuracy lost to light-weighting. The test experiments were completed entirely on a portable computer and achieved an Average Precision (AP) of 90.37% on the SAR Ship Detection Dataset (SSDD), simplifying the model while staying ahead of most existing methods. The YOLO-V4-light ship detection algorithm proposed in this paper has great practical value for maritime safety monitoring and emergency rescue.


Author(s):  
Zhaoliang He ◽  
Hongshan Li ◽  
Zhi Wang ◽  
Shutao Xia ◽  
Wenwu Zhu

With the growth of computer vision-based applications, an explosive number of images has been uploaded to cloud servers that host online computer vision algorithms, usually in the form of deep learning models. JPEG has been used as the de facto compression and encapsulation method for images. However, the standard JPEG configuration does not always perform well for images that are to be processed by a deep learning model. For example, the standard JPEG quality level leads to 50% size overhead (compared with the best quality level selection) on ImageNet under the same inference accuracy in popular computer vision models (e.g., InceptionNet and ResNet). Even knowing this, designing a better JPEG configuration for online computer vision-based services remains extremely challenging. First, cloud-based computer vision models are usually a black box to end-users; thus, it is challenging to design a JPEG configuration without knowing their model structures. Second, the “optimal” JPEG configuration is not fixed; instead, it is determined by confounding factors, including the characteristics of the input images and the model, the expected accuracy and image size, and so forth. In this article, we propose a reinforcement learning (RL)-based adaptive JPEG configuration framework, AdaCompress. In particular, we design an edge (i.e., user-side) RL agent that learns the optimal compression quality level to achieve an expected inference accuracy and upload image size, only from the online inference results, without knowing details of the model structures. Furthermore, we design an explore-exploit mechanism that lets the framework switch agents quickly when it detects performance degradation, mainly due to changes in the input (e.g., images captured in daytime versus at night). Our evaluation experiments using real-world online computer vision-based APIs from Amazon Rekognition, Face++, and Baidu Vision show that our approach outperforms existing baselines by reducing the size of images by one-half to one-third while the overall classification accuracy only decreases slightly. Meanwhile, AdaCompress adaptively re-trains or re-loads the RL agent promptly to maintain the performance.
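The idea of learning a compression quality level purely from online feedback can be sketched with a much simpler learner than the one in the paper. The example below swaps in an epsilon-greedy multi-armed bandit, which is not AdaCompress's agent; the simulated service table, the reward weighting, and all names are hypothetical and exist only to make the sketch runnable.

```python
import random

# Hypothetical stand-in for a black-box cloud vision API: maps a JPEG quality
# level to (inference accuracy, upload size relative to the largest file).
# The numbers are made up for illustration.
SIMULATED_SERVICE = {
    15: (0.70, 0.15),
    35: (0.88, 0.35),
    55: (0.93, 0.55),
    75: (0.95, 0.75),
    95: (0.95, 0.95),
}


def reward(quality, size_weight=0.5):
    """Trade accuracy against upload size (the weighting is an assumption)."""
    accuracy, size = SIMULATED_SERVICE[quality]
    return accuracy - size_weight * size


class QualityBandit:
    """Epsilon-greedy bandit over quality levels; sees rewards only."""

    def __init__(self, levels, epsilon=0.1, seed=0):
        self.levels = list(levels)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {q: 0 for q in self.levels}
        self.values = {q: 0.0 for q in self.levels}

    def best(self):
        """Quality level with the highest estimated reward so far."""
        return max(self.levels, key=lambda q: self.values[q])

    def choose(self):
        # Try every level once, then explore with probability epsilon.
        untried = [q for q in self.levels if self.counts[q] == 0]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.levels)
        return self.best()

    def update(self, quality, observed_reward):
        # Incremental running mean of the rewards seen for this level.
        self.counts[quality] += 1
        self.values[quality] += (
            observed_reward - self.values[quality]
        ) / self.counts[quality]


agent = QualityBandit(SIMULATED_SERVICE)
for _ in range(200):
    q = agent.choose()
    agent.update(q, reward(q))
best_quality = agent.best()
```

With this made-up table, quality 35 gives the best trade-off, and the agent finds it without ever inspecting the model behind the service. Detecting input drift and switching agents, as the abstract describes, is outside the scope of this sketch.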


2019 ◽  
Vol 2019 ◽  
pp. 1-9 ◽  
Author(s):  
Hai Wang ◽  
Xinyu Lou ◽  
Yingfeng Cai ◽  
Yicheng Li ◽  
Long Chen

Vehicle detection is one of the most important environment perception tasks for autonomous vehicles. Traditional vision-based vehicle detection methods are not accurate enough, especially for small and occluded targets, while light detection and ranging (lidar)-based methods are good at detecting obstacles but are time-consuming and have a low classification rate for different target types. To overcome these shortcomings and make full use of both the depth information of lidar and the obstacle classification ability of vision, this work proposes a real-time vehicle detection algorithm that fuses vision and lidar point cloud information. Firstly, the obstacles are detected by the grid projection method using the lidar point cloud information. Then, the obstacles are mapped to the image to get several separated regions of interest (ROIs). After that, the ROIs are expanded based on the dynamic threshold and merged to generate the final ROI. Finally, a deep learning method named You Only Look Once (YOLO) is applied on the ROI to detect vehicles. The experimental results on the KITTI dataset demonstrate that the proposed algorithm has high detection accuracy and good real-time performance. Compared with the detection method based only on YOLO, the mean average precision (mAP) is increased by 17%.
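The expand-and-merge step for ROIs can be sketched as follows. The abstract does not define the dynamic threshold, so this example assumes a margin proportional to each box's size, and it assumes the merge is the bounding box of all expanded ROIs; the function names, the ratio, and the image size are hypothetical.

```python
def expand(box, ratio, width, height):
    """Grow a box (x1, y1, x2, y2) by `ratio` of its size on every side,
    clipped to the image bounds. The proportional margin is an assumption."""
    x1, y1, x2, y2 = box
    dx = (x2 - x1) * ratio
    dy = (y2 - y1) * ratio
    return (
        max(0, x1 - dx),
        max(0, y1 - dy),
        min(width, x2 + dx),
        min(height, y2 + dy),
    )


def merge(boxes):
    """Bounding box that encloses every input box."""
    xs1, ys1, xs2, ys2 = zip(*boxes)
    return (min(xs1), min(ys1), max(xs2), max(ys2))


# Two separated ROIs projected from lidar obstacles (made-up pixel values).
rois = [(100, 100, 200, 180), (300, 120, 400, 200)]
expanded = [expand(b, ratio=0.1, width=1242, height=375) for b in rois]
final_roi = merge(expanded)
```

The detector would then run only on `final_roi` instead of on the full frame.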


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yiran Feng ◽  
Xueheng Tao ◽  
Eung-Joo Lee

In view of the current absence of any deep learning algorithm for shellfish identification in real contexts, an improved Faster R-CNN-based detection algorithm is proposed in this paper. It achieves multiobject recognition and localization through a second-order detection network and replaces the original feature extraction module with DenseNet, which can fuse multilevel feature information, increase network depth, and avoid vanishing gradients. Meanwhile, the proposal merging strategy is improved with Soft-NMS, in which an attenuation function replaces the conventional NMS algorithm, thereby avoiding missed detection of adjacent or overlapping objects and improving detection accuracy when multiple objects are present. By constructing a real-context shellfish dataset and conducting experimental tests on the production line of a vision recognition seafood sorting robot, we were able to detect the features of shellfish in different scenarios, and the detection accuracy was improved by nearly 4% compared to the original detection model. This provides favorable technical support for future quality sorting of seafood using the improved Faster R-CNN-based approach.
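A minimal sketch of Soft-NMS shows how an attenuation function keeps adjacent or overlapping objects that hard NMS would discard. The abstract does not specify the attenuation function, so this example assumes the commonly used Gaussian decay; `sigma`, the score threshold, and the toy boxes are illustrative values.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    deleting them. Returns (box, score) pairs in selection order."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        # Select the highest-scoring box that is still in play.
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        decayed = []
        for other, s in remaining:
            # Attenuate by overlap with the selected box (assumed Gaussian).
            s *= math.exp(-(iou(box, other) ** 2) / sigma)
            if s >= score_threshold:
                decayed.append((other, s))
        remaining = decayed
    return kept


# Two heavily overlapping boxes and one isolated box (made-up values).
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this toy case the second box overlaps the first with an IoU of about 0.82. Hard NMS with a 0.5 threshold would drop it, whereas Soft-NMS keeps it with a reduced score.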


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Weidong Zhao ◽  
Feng Chen ◽  
Hancheng Huang ◽  
Dan Li ◽  
Wei Cheng

In recent years, the continuous development of deep learning has drawn more and more scholars to research on target detection algorithms. The detection and recognition of small and complex targets, however, remains an open problem. To address the shortcomings of deep learning detection algorithms on small and complex defect targets, this article presents an improved target detection algorithm for steel surface defect detection. Steel surface defects seriously affect the quality of steel. Because most current detection algorithms achieve low accuracy on the NEU-DET dataset, we verify a machine vision-based steel surface defect detection algorithm on this dataset. A series of improvements are made to the traditional Faster R-CNN algorithm, such as reconstructing its network structure. Because the targets are small, we train the network with multiscale fusion. Because the targets have complex features, we replace part of the conventional convolution network with a deformable convolution network. The experimental results show that the network model trained by the proposed method has good detection performance, with a mean average precision of 0.752, which is 0.128 higher than that of the original algorithm. The average precision for crazing, inclusion, patches, pitted surface, rolled-in scale, and scratches is 0.501, 0.791, 0.792, 0.874, 0.649, and 0.905, respectively. The detection method identifies small target defects on the steel surface effectively and can provide a reference for the automatic detection of steel defects.
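The reported mean average precision can be checked against the six per-class values given in the abstract. The short calculation below uses only those figures; the baseline value is not stated in the abstract and is derived here from the reported 0.128 improvement.

```python
# Per-class average precision as reported in the abstract.
per_class_ap = {
    "crazing": 0.501,
    "inclusion": 0.791,
    "patches": 0.792,
    "pitted surface": 0.874,
    "rolled-in scale": 0.649,
    "scratches": 0.905,
}

# Mean average precision is the unweighted mean over classes.
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)

# Implied mAP of the original algorithm (derived, not reported directly).
implied_baseline = mean_ap - 0.128

print(round(mean_ap, 3))           # 0.752
print(round(implied_baseline, 3))  # 0.624
```

The per-class figures average to the stated 0.752, and crazing (0.501) is clearly the hardest class.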


2021 ◽  
Author(s):  
Zhenyu Wang ◽  
Senrong Ji ◽  
Duokun Yin

Recently, using image sensing devices to analyze air quality has attracted much attention from researchers. To keep factory smoke under real-time, universal social supervision, this paper proposes an efficient smoke detection algorithm that runs on mobile platforms and is based on image analysis techniques. Because smoke images in real scenes vary widely, they are difficult for existing object detection methods. To this end, we introduce the two-stage smoke detection (TSSD) algorithm based on a lightweight framework, in which prior knowledge and contextual information are modeled in a relation-guided module to reduce the smoke search space, which mitigates the shortcomings of the single-stage method. Experimental results show that the TSSD algorithm robustly improves the detection accuracy of the single-stage method and is compatible with inputs of different image resolutions. Compared with various state-of-the-art detection methods, the mean AP of the TSSD model reaches 59.24%, even surpassing the Faster R-CNN detection model. In addition, the detection time of the proposed model reaches 50 ms per image (20 FPS), which meets real-time requirements, and the model can be deployed on mobile terminals. This model can be widely used in scenes that require smoke detection, offering great potential for practical environmental applications.


Agriculture ◽  
2020 ◽  
Vol 10 (5) ◽  
pp. 160
Author(s):  
Ting Yuan ◽  
Lin Lv ◽  
Fan Zhang ◽  
Jun Fu ◽  
Jin Gao ◽  
...  

The detection of cherry tomatoes in greenhouse scenes is of great significance for robotic harvesting. This paper presents a deep learning-based method for cherry tomato detection that reduces the influence of illumination, growth differences, and occlusion. Given the greenhouse operating environment and the accuracy of deep learning, the Single Shot multi-box Detector (SSD) was selected because of its excellent anti-interference ability and its ability to learn from datasets. The first step is to build datasets covering various greenhouse conditions. According to the characteristics of cherry tomatoes, image samples with illumination changes, image rotation, and noise enhancement were used to expand the datasets. The training datasets were then used to train and construct the network model. To study the effect of the base network and the network input size, one contrast experiment compared the base networks VGG16, MobileNet, and Inception V2, and another compared network input sizes of 300 × 300 pixels and 512 × 512 pixels. Analysis of the experimental results shows that Inception V2 is the best base network, with an average precision of 98.85% in the greenhouse environment. Compared with other detection methods, this method shows substantial improvement in cherry tomato detection.


Sensors ◽  
2019 ◽  
Vol 19 (14) ◽  
pp. 3166 ◽  
Author(s):  
Cao ◽  
Song ◽  
Song ◽  
Xiao ◽  
Peng

Lane detection is an important foundation in the development of intelligent vehicles. To address problems such as the low detection accuracy of traditional methods and the poor real-time performance of deep learning-based methodologies, a lane detection algorithm for intelligent vehicles in complex road conditions and dynamic environments was proposed. Firstly, the distorted image was corrected and the superposition threshold algorithm was used for edge detection; an aerial view of the lane was then obtained via region of interest extraction and inverse perspective transformation. Secondly, the random sample consensus algorithm was adopted to fit the curves of lane lines based on the third-order B-spline curve model, and fitting evaluation and curvature radius calculation were then carried out on the curve. Lastly, simulation test experiments for the lane detection algorithm were performed using road driving video under complex road conditions and the Tusimple dataset. The experimental results show that the average detection accuracy on the road driving video reached 98.49%, with an average processing time of 21.5 ms. The average detection accuracy on the Tusimple dataset reached 98.42%, with an average processing time of 22.2 ms. Compared with traditional methods and deep learning-based methodologies, this lane detection algorithm had excellent accuracy and real-time performance, high detection efficiency, and strong anti-interference ability. The recognition accuracy and average processing time were significantly improved. The proposed algorithm is crucial in promoting the technological level of intelligent vehicle driving assistance and conducive to the further improvement of the driving safety of intelligent vehicles.
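The curvature radius calculation mentioned above can be sketched with the standard formula R = (1 + x'(y)^2)^(3/2) / |x''(y)|. As a simplification, this example assumes the fitted lane line is a single cubic polynomial x(y) in the aerial view rather than the paper's third-order B-spline; the coefficients and the function name are hypothetical.

```python
def curvature_radius(coeffs, y):
    """Radius of curvature of x(y) = a*y**3 + b*y**2 + c*y + d at position y.

    Assumption: a plain cubic polynomial stands in for the B-spline segment.
    Returns infinity where the curve is locally straight.
    """
    a, b, c, _d = coeffs
    first = 3 * a * y ** 2 + 2 * b * y + c   # dx/dy
    second = 6 * a * y + 2 * b               # d2x/dy2
    if second == 0:
        return float("inf")
    return (1 + first ** 2) ** 1.5 / abs(second)


# Gentle curve x = 0.001 * y**2, evaluated at the vehicle position y = 0
# (made-up coefficients, in metres).
radius_curved = curvature_radius((0.0, 0.001, 0.0, 0.0), y=0.0)

# Straight lane x = 0.2 * y + 1.5 has no curvature.
radius_straight = curvature_radius((0.0, 0.0, 0.2, 1.5), y=10.0)
```

For the parabola the radius at the origin is 1 / (2 × 0.001), that is, 500 m, while the straight lane yields an infinite radius.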


2021 ◽  
Vol 2021 ◽  
pp. 1-6
Author(s):  
Yi Lv ◽  
Zhengbo Yin ◽  
Zhezhou Yu

In order to improve the accuracy of remote sensing image target detection, this paper proposes DFS, a remote sensing image target detection algorithm based on deep learning. Firstly, a dimension clustering module, a loss function, and sliding window segmentation detection are designed. The data set used in the experiment comes from Google Earth and contains six types of objects: airplanes, boats, warehouses, large ships, bridges, and ports. The training set, verification set, and test set contain 73490 images, 22722 images, and 2138 images, respectively. It is assumed that the number of detected positive samples and negative samples is A and B, respectively, and the number of undetected positive samples and negative samples is C and D, respectively. The precision-recall curves of DFS for the six types of targets show that DFS has the best detection effect for bridges and the worst for boats. The main reason is that a bridge is relatively large and clearly distinguished from the background in the image, so the detection difficulty is low. A boat, however, is very small and easily blends into the background, so it is difficult to detect. Compared with YOLOv2, the mAP of DFS is 12.82% higher, the detection accuracy is 13% higher, and the recall rate is 1% lower. In terms of the number of detected targets, DFS produces far fewer false positives (FPs) than YOLOv2, so the false positive rate is greatly reduced. In addition, the average IOU of DFS is 11.84% higher than that of YOLOv2. For small target detection and large remote sensing image detection, the DFS algorithm has obvious advantages.
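The counts A, B, C, and D defined above map naturally onto precision and recall. The sketch below assumes that A counts true positives, B false positives (negatives that were detected), C false negatives (positives that were missed), and D true negatives; the abstract does not spell out this mapping, and the example counts are made up.

```python
def precision(a, b):
    """Share of detections that are real targets: A / (A + B)."""
    return a / (a + b) if (a + b) else 0.0


def recall(a, c):
    """Share of real targets that were detected: A / (A + C)."""
    return a / (a + c) if (a + c) else 0.0


# Hypothetical counts for one object class.
A, B, C = 80, 20, 10
class_precision = precision(A, B)
class_recall = recall(A, C)
```

Under this reading, fewer false positives (a smaller B) raise precision without affecting recall, which is consistent with the abstract's report of higher accuracy alongside slightly lower recall.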


2021 ◽  
Vol 13 (21) ◽  
pp. 4377
Author(s):  
Long Sun ◽  
Jie Chen ◽  
Dazheng Feng ◽  
Mengdao Xing

Unmanned aerial vehicles (UAVs) are one of the main means of information warfare, used in battlefield cruises, reconnaissance, and military strikes. Rapid detection and accurate recognition of key targets in UAV images are the basis of subsequent military tasks. UAV images have high resolution and small target sizes, and practical applications often require fast detection. Existing algorithms are not able to achieve an effective trade-off between detection accuracy and speed. Therefore, this paper proposes a parallel ensemble deep learning framework for UAV video multi-target detection, which is a global and local joint detection strategy. It combines a deep learning target detection algorithm with template matching to make full use of image information. It also integrates multi-process and multi-threading mechanisms to speed up processing. Experiments show that the system has high detection accuracy for targets with focal lengths varying from one to ten times. At the same time, detection results for moving UAV video are displayed stably and in real time.

