A Multiperson Pose Estimation Method Using Depthwise Separable Convolutions and Feature Pyramid Network

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Qidong Du

Multiperson pose estimation suffers from slow detection, low accuracy in detecting key points, and inaccurate localization of the boundaries of heavily occluded people. A multiperson pose estimation method using depthwise separable convolutions and a feature pyramid network is proposed. First, a YOLOv3 object detection model based on depthwise separable convolutions is used to speed up the human body detector. Then, building on the improved feature pyramid network, a multiscale supervision module and a multiscale regression module are added to assist training and to address the detection of difficult key points. Finally, an improved soft-argmax method is used to further eliminate redundant poses and improve the accuracy of pose boundary localization. Experimental results show that the proposed model achieves an AP of 73.4% on the 2017 COCO test-dev dataset and 86.24% PCKh@0.5 on the MPII dataset.
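The abstract above mentions soft-argmax without giving details. As a rough illustration only, the sketch below shows the standard (unimproved) soft-argmax: a differentiable estimate of a key point location, computed as the softmax-weighted average of heatmap coordinates. The function name `soft_argmax_2d` and the sharpness parameter `beta` are assumptions made for this example and do not come from the paper.

```python
import math


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the softmax-weighted mean (x, y) of a 2D heatmap.

    heatmap: list of rows, each a list of floats.
    beta: hypothetical sharpness factor; larger values approach a hard argmax.
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(w * c for row in weights for c, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# A single sharp peak at column 2, row 1.
sharp = [[0.0, 0.0, 0.0], [0.0, 0.0, 10.0], [0.0, 0.0, 0.0]]
x, y = soft_argmax_2d(sharp, beta=10.0)
```

Unlike a hard argmax, the result is a sub-pixel coordinate that varies smoothly with the heatmap values, which is why the operation is usable inside a trained network.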

2021 ◽  
Vol 11 (9) ◽  
pp. 4241
Author(s):  
Jiahua Wu ◽  
Hyo Jong Lee

In bottom-up multi-person pose estimation, grouping joint candidates into the appropriately structured corresponding instance of a person is challenging. In this paper, a new bottom-up method, the Partitioned CenterPose (PCP) Network, is proposed to better cluster the detected joints. To achieve this goal, we propose a novel approach called Partition Pose Representation (PPR) which integrates the instance of a person and its body joints based on joint offset. PPR leverages information about the center of the human body and the offsets between that center point and the positions of the body’s joints to encode human poses accurately. To enhance the relationships between body joints, we divide the human body into five parts, and then, we generate a sub-PPR for each part. Based on this PPR, the PCP Network can detect people and their body joints simultaneously, then group all body joints according to joint offset. Moreover, an improved l1 loss is designed to more accurately measure joint offset. Using the COCO keypoints and CrowdPose datasets for testing, it was found that the performance of the proposed method is on par with that of existing state-of-the-art bottom-up methods in terms of accuracy and speed.
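To make the offset-based grouping idea concrete, here is a minimal sketch under simplifying assumptions. Each detected person center carries a predicted offset per joint type, and each joint candidate is assigned to the person whose expected joint position (center plus offset) lies nearest. The data layout and the name `group_joints` are hypothetical; the PCP Network's actual representation and its five-part sub-PPR split are not reproduced here.

```python
import math


def group_joints(centers, offsets, candidates):
    """Assign joint candidates to people using center-to-joint offsets.

    centers: list of (x, y) person centers.
    offsets: one dict per person mapping joint name -> (dx, dy).
    candidates: dict mapping joint name -> list of detected (x, y) points.
    Returns one dict per person mapping joint name -> chosen (x, y).
    """
    poses = []
    for (cx, cy), person_offsets in zip(centers, offsets):
        pose = {}
        for joint, (dx, dy) in person_offsets.items():
            expected = (cx + dx, cy + dy)
            found = candidates.get(joint, [])
            if found:
                # Pick the detected joint closest to where this person expects it.
                pose[joint] = min(found, key=lambda p: math.dist(p, expected))
        poses.append(pose)
    return poses


centers = [(10.0, 10.0), (50.0, 10.0)]
offsets = [
    {"head": (0.0, -5.0), "left_hand": (-4.0, 0.0)},
    {"head": (0.0, -5.0), "left_hand": (-4.0, 0.0)},
]
candidates = {
    "head": [(50.5, 5.2), (9.8, 4.9)],
    "left_hand": [(6.1, 10.0), (46.0, 9.7)],
}
poses = group_joints(centers, offsets, candidates)
```

This greedy nearest-match rule can assign one candidate to two people in crowded scenes; a real system would need a conflict-resolution step.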


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1235
Author(s):  
Yang Yang ◽  
Hongmin Deng

In order to make the classification and regression of single-stage detectors more accurate, an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3) is proposed based on the You-Only-Look-Once (YOLO) in this paper. Firstly, a better cascading model with learnable semantic fusion between a feature extraction network and a feature pyramid network is designed to improve detection accuracy using a global context block. Secondly, the information to be retained is screened by combining three different scaling feature maps together. Finally, a global self-attention mechanism is used to highlight the useful information of feature maps while suppressing irrelevant information. Experiments show that our GC-YOLOv3 reaches a maximum of 55.5 object detection mean Average Precision (mAP)@0.5 on Common Objects in Context (COCO) 2017 test-dev and that the mAP is 5.1% higher than that of the YOLOv3 algorithm on Pascal Visual Object Classes (PASCAL VOC) 2007 test set. Therefore, experiments indicate that the proposed GC-YOLOv3 model exhibits optimal performance on the PASCAL VOC and COCO datasets.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Zhenbo Lu ◽  
Wei Zhou ◽  
Shixiang Zhang ◽  
Chen Wang

Quick and accurate crash detection is important for saving lives and improved traffic incident management. In this paper, a feature fusion-based deep learning framework was developed for video-based urban traffic crash detection task, aiming at achieving a balance between detection speed and accuracy with limited computing resource. In this framework, a residual neural network (ResNet) combined with attention modules was proposed to extract crash-related appearance features from urban traffic videos (i.e., a crash appearance feature extractor), which were further fed to a spatiotemporal feature fusion model, Conv-LSTM (Convolutional Long Short-Term Memory), to simultaneously capture appearance (static) and motion (dynamic) crash features. The proposed model was trained by a set of video clips covering 330 crash and 342 noncrash events. In general, the proposed model achieved an accuracy of 87.78% on the testing dataset and an acceptable detection speed (FPS > 30 with GTX 1060). Thanks to the attention module, the proposed model can capture the localized appearance features (e.g., vehicle damage and pedestrian fallen-off) of crashes better than conventional convolutional neural networks. The Conv-LSTM module outperformed conventional LSTM in terms of capturing motion features of crashes, such as the roadway congestion and pedestrians gathering after crashes. Compared to traditional motion-based crash detection model, the proposed model achieved higher detection accuracy. Moreover, it could detect crashes much faster than other feature fusion-based models (e.g., C3D). The results show that the proposed model is a promising video-based urban traffic crash detection algorithm that could be used in practice in the future.


Smart Cities ◽  
2020 ◽  
Vol 4 (1) ◽  
pp. 1-16
Author(s):  
Haoran Niu ◽  
Olufemi A. Omitaomu ◽  
Qing C. Cao

Event detection is a key challenge in the analysis of power grid frequency disturbances. Accurate detection of events is crucial for situational awareness of the power system. In this paper, we study the problem of event detection in power grid frequency disturbance analysis using synchrophasor data streams. Current event detection approaches for the power grid rely on a single detection algorithm. This study integrates some of the existing detection algorithms using the concept of a machine committee to develop improved detection approaches for grid disturbance analysis. Specifically, we propose two algorithms: an Event Detection Machine Committee (EDMC) algorithm and a Change-Point Detection Machine Committee (CPDMC) algorithm. Both algorithms use a parallel architecture to fuse the detection knowledge of their individual methods to arrive at an overall output. The EDMC algorithm combines five individual event detection methods, while the CPDMC algorithm combines two change-point detection methods. Each method performs the detection task separately. The overall output of each algorithm is then computed using a voting strategy. The proposed algorithms are evaluated using three case studies of actual power grid disturbances. Compared with the individual results of the various detection methods, we found that the EDMC algorithm is a better fit for analyzing synchrophasor data, improves the detection accuracy, and is suitable for practical scenarios.
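The committee idea can be sketched as a simple vote over independent detectors. The abstract does not specify the voting rule, so the example below assumes a plain majority vote; `committee_vote` and `min_votes` are hypothetical names used only for illustration.

```python
def committee_vote(detections, min_votes=None):
    """Fuse per-detector binary outputs by voting.

    detections: list of equal-length sequences, one per detector, where a
        truthy value at index t means that detector flagged an event at t.
    min_votes: votes required to declare an event (default: strict majority).
    """
    n_detectors = len(detections)
    if min_votes is None:
        min_votes = n_detectors // 2 + 1
    length = len(detections[0])
    return [
        sum(bool(d[t]) for d in detections) >= min_votes
        for t in range(length)
    ]


# Five hypothetical detectors, four time steps.
outputs = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
]
fused = committee_vote(outputs)
```

With five members, a strict majority needs three votes, so an isolated false alarm from a single detector is outvoted.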


2016 ◽  
Vol 2016 ◽  
pp. 1-9
Author(s):  
Jun Liu ◽  
Shuyu Chen ◽  
Zhen Zhou ◽  
Tianshu Wu

Virtual machines (VM) on a Cloud platform can be influenced by a variety of factors which can lead to decreased performance and downtime, affecting the reliability of the Cloud platform. Traditional anomaly detection algorithms and strategies for Cloud platforms have some flaws in their accuracy of detection, detection speed, and adaptability. In this paper, a dynamic and adaptive anomaly detection algorithm based on Self-Organizing Maps (SOM) for virtual machines is proposed. A unified modeling method based on SOM to detect the machine performance within the detection region is presented, which avoids the cost of modeling a single virtual machine and enhances the detection speed and reliability of large-scale virtual machines in Cloud platform. The important parameters that affect the modeling speed are optimized in the SOM process to significantly improve the accuracy of the SOM modeling and therefore the anomaly detection accuracy of the virtual machine.
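As a rough illustration of SOM-based anomaly scoring, the sketch below trains a tiny one-dimensional SOM on normal performance samples and scores a new sample by its distance to the best matching unit. All names, the two-metric feature layout, and the hyperparameters are assumptions made for this example; the paper's unified modeling and parameter optimization are not reproduced.

```python
import math
import random


def train_som(samples, n_units=4, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D SOM and return its unit weight vectors."""
    rng = random.Random(seed)
    units = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays from 1 toward 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-3)
        for x in samples:
            # Best matching unit: the unit closest to the sample.
            bmu = min(range(n_units), key=lambda i: math.dist(units[i], x))
            for i in range(n_units):
                # Gaussian neighborhood on the 1-D unit index.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(len(x)):
                    units[i][d] += lr * h * (x[d] - units[i][d])
    return units


def anomaly_score(units, x):
    """Distance from x to its best matching unit; larger means more unusual."""
    return min(math.dist(u, x) for u in units)


# Hypothetical normal samples: (CPU utilization, memory utilization).
rng = random.Random(1)
normal = [(rng.uniform(0.2, 0.4), rng.uniform(0.3, 0.5)) for _ in range(100)]
units = train_som(normal)
typical = anomaly_score(units, (0.3, 0.4))
unusual = anomaly_score(units, (0.95, 0.95))
```

A threshold on the score would then separate normal from anomalous samples; how that threshold is chosen is left open here.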


2021 ◽  
Vol 922 (1) ◽  
pp. 012001
Author(s):  
O M Lawal ◽  
Z Huamin ◽  
Z Fan

Abstract. A fruit detection algorithm, as an integral part of a harvesting robot, is expected to be robust, accurate, and fast in the face of environmental factors such as occlusion by stems and leaves, uneven illumination, overlapping fruit, and many more. For this reason, this paper explored and compared ablation studies on the proposed YOLOFruit, YOLOv4, and YOLOv5 detection algorithms. The final selected YOLOFruit algorithm used a ResNet43 backbone with a Combined activation function for feature extraction, Spatial Pyramid Pooling Network (SPPNet) for detection accuracy, Feature Pyramid Network (FPN) for feature pyramids, Distance Intersection Over Union-Non Maximum Suppression (DIoU-NMS) for detection efficiency and accuracy, and Complete Intersection Over Union (CIoU) loss for faster and better performance. The results showed that the average detection accuracy of YOLOFruit, at 86.2%, is 1 percentage point higher than YOLOv4 at 85.2% and 4.3 percentage points higher than YOLOv5 at 81.9%, while the detection time of YOLOFruit, at 11.9 ms, is shorter than that of YOLOv4 at 16.6 ms but longer than that of YOLOv5 at 2.7 ms. Hence, the YOLOFruit detection algorithm shows strong promise for better generalization and real-time fruit detection.
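For readers unfamiliar with DIoU-NMS, the following is a minimal sketch of the general technique, not of YOLOFruit's implementation. DIoU subtracts from the IoU a penalty equal to the squared distance between box centers divided by the squared diagonal of the smallest enclosing box, and greedy NMS then suppresses boxes whose DIoU with a higher-scoring kept box exceeds a threshold. The box format `(x1, y1, x2, y2)` and the threshold value are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def diou(a, b):
    """IoU minus the normalized squared distance between box centers."""
    center_sq = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    diag_sq = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    penalty = center_sq / diag_sq if diag_sq > 0 else 0.0
    return iou(a, b) - penalty


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS using DIoU; returns indices of kept boxes, best first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(diou(boxes[i], boxes[k]) <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
```

Because of the center-distance penalty, two overlapping boxes with distant centers are less likely to suppress each other than under plain IoU, which can help with overlapping fruit.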


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1678
Author(s):  
Lei Pang ◽  
Hui Liu ◽  
Yang Chen ◽  
Jungang Miao

The detection of objects concealed under people’s clothing is a very challenging task with crucial applications for security. When screening the human body for metal contraband, the concealed targets are usually small and must be detected within a few seconds. Focusing on weapon detection, this paper proposes a real-time method for detecting concealed metallic weapons on the human body in passive millimeter wave (PMMW) imagery, based on the You Only Look Once (YOLO) algorithm, YOLOv3, and a small sample dataset. The experimental results from YOLOv3-13, YOLOv3-53, and the Single Shot MultiBox Detector (SSD) algorithm, SSD-VGG16, are compared using the same PMMW dataset. From the perspective of detection accuracy, detection speed, and computational resources, the results show that the YOLOv3-53 model, with a detection speed of 36 frames per second (FPS) and a mean average precision (mAP) of 95% on a GPU-1080Ti computer, is more effective and feasible for the real-time detection of weapon contraband on the human body in PMMW images, even with small sample data.


2014 ◽  
Vol 614 ◽  
pp. 317-320
Author(s):  
Wei Xing Zhu ◽  
Lei Yang ◽  
Liang Tang

To detect and identify abnormal respiration in pigs and provide real-time warning, this paper proposes a machine vision method that uses an area feature operator to detect porcine respiratory frequency. The paper is structured as follows. First, videos of a pig standing in the piggery are captured by camera and transmitted to a computer. Then, on the MATLAB platform, the changing abdominal area of the pig is extracted. Second, the adaptive, multi-resolution properties of wavelet analysis are used to remove burrs from the area signal. Next, a peak point detection algorithm obtains the pig’s respiratory frequency over the monitoring period, which is finally converted into a breath rate. The experimental results show that the detection accuracy of respiratory frequency is higher than 92% for pigs with abnormal breathing.
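The last step of that pipeline, turning peaks in the abdominal-area signal into a breath rate, can be sketched as follows. This is an illustration in Python rather than the authors' MATLAB code, it skips the wavelet denoising step, and the function name, the above-mean peak rule, and the synthetic signal are all assumptions.

```python
import math


def breaths_per_minute(area_signal, sample_rate_hz):
    """Estimate breath rate by counting local maxima of the area signal.

    A sample counts as a peak if it is strictly greater than both
    neighbors and lies above the signal mean (a crude guard against
    small ripples; a real signal would need denoising first).
    """
    mean = sum(area_signal) / len(area_signal)
    peaks = [
        i for i in range(1, len(area_signal) - 1)
        if area_signal[i - 1] < area_signal[i] > area_signal[i + 1]
        and area_signal[i] > mean
    ]
    duration_s = len(area_signal) / sample_rate_hz
    return len(peaks) / duration_s * 60.0, peaks


# Synthetic abdominal area: 0.5 Hz breathing sampled at 10 Hz for 60 s.
fs = 10
signal = [100 + 5 * math.sin(2 * math.pi * 0.5 * i / fs) for i in range(60 * fs)]
rate, peaks = breaths_per_minute(signal, fs)
```

Each inhale-exhale cycle produces one peak in the area signal, so peaks per second multiplied by 60 gives breaths per minute.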


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Ying Miao ◽  
Danyang Shao ◽  
Zhimin Yan

In this paper, we analyze the location-following processing of the image by successive approximation with the need for directed privacy. To solve the detection problem of moving the human body in the dynamic background, the motion target detection module integrates the two ideas of feature information detection and human body model segmentation detection and combines the deep learning framework to complete the detection of the human body by detecting the feature points of key parts of the human body. The detection of human key points depends on the human pose estimation algorithm, so the research in this paper is based on the bottom-up model in the multiperson pose estimation method; firstly, all the human key points in the image are detected by feature extraction through the convolutional neural network, and then the accurate labelling of human key points is achieved by using the heat map and offset fusion optimization method in the feature point confidence map prediction, and finally, the human body detection results are obtained. In the study of the correlation algorithm, this paper combines the HOG feature extraction of the KCF algorithm and the scale filter of the DSST algorithm to form a fusion correlation filter based on the principle study of the MOSSE correlation filter. The algorithm solves the problems of lack of scale estimation of KCF algorithm and low real-time rate of DSST algorithm and improves the tracking accuracy while ensuring the real-time performance of the algorithm.
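One common way to combine a heat map with offsets, shown here only as a hedged sketch, is to take the highest-scoring heat map cell as a coarse location and refine it with the offset predicted at that cell. The abstract does not describe the fusion in detail, so the function name, the `stride` parameter, and the data layout below are assumptions rather than the authors' method.

```python
def decode_keypoint(heatmap, offsets, stride=8):
    """Decode one key point from a heat map plus per-cell offsets.

    heatmap: list of rows of confidence values.
    offsets: same grid shape; offsets[r][c] is (dx, dy) in image pixels.
    stride: hypothetical downsampling factor between image and heat map.
    Returns (x, y, confidence) in image coordinates.
    """
    best_r, best_c = 0, 0
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > heatmap[best_r][best_c]:
                best_r, best_c = r, c
    dx, dy = offsets[best_r][best_c]
    # Coarse cell position scaled to image pixels, refined by the offset.
    return best_c * stride + dx, best_r * stride + dy, heatmap[best_r][best_c]


heatmap = [[0.1, 0.2, 0.1], [0.1, 0.9, 0.3]]
offsets = [
    [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (3.0, -2.0), (0.0, 0.0)],
]
x, y, confidence = decode_keypoint(heatmap, offsets)
```

The offset recovers the precision lost when the heat map is predicted at a lower resolution than the input image.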


2022 ◽  
Vol 2022 ◽  
pp. 1-12
Author(s):  
Dongmei Shi ◽  
Hongyu Tang

Deep learning theory is widely used in face recognition. Driven by the needs of classroom attendance and monitoring of students’ learning status, this article analyzes the regression-based YOLO (You Only Look Once) face recognition algorithms. To address missed detection of small targets in the YOLOv3 network structure, an improved YOLOv3 algorithm based on Bayesian optimization is proposed. The algorithm uses depthwise separable convolution instead of conventional convolution to improve the Darknet-53 base network, reducing the network's computation and parameter count. A multiscale feature pyramid is built, and an attention guidance module is designed to strengthen multiscale fusion so that targets of different sizes can be detected. The loss function is improved to address the imbalance between positive and negative samples and the imbalance between easy and hard samples. A Bayesian function is adopted to optimize the classifier and improve classification efficiency and accuracy, ensuring the accuracy of small target detection. Five groups of comparative experiments are carried out on the public COCO and VOC2012 datasets and on self-built datasets. The experimental results show that the proposed improved YOLOv3 model effectively improves the detection accuracy for multiple faces and small targets. Compared with the traditional YOLOv3 model, the mean mAP of the target is improved by more than 1.2%.
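The claim that swapping conventional convolution for a separable one reduces parameters can be checked with simple arithmetic. The sketch below counts weights for one layer (biases ignored); the layer sizes are arbitrary example values, not taken from Darknet-53 or the paper.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one k x k x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Example layer: 3 x 3 kernel, 256 input channels, 512 output channels.
standard = standard_conv_params(3, 256, 512)
separable = depthwise_separable_params(3, 256, 512)
ratio = separable / standard
```

For these sizes the standard layer has 1,179,648 weights and the separable pair 133,376, a ratio of about 0.113, which matches the general formula 1/c_out + 1/k^2.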

