Head Detection Based on DR Feature Extraction Network and Mixed Dilated Convolution Module

Junwen Liu; Yongjun Zhang; Jianbin Xie; Yan Wei; Zewei Wang; Mengjia Niu

doi:10.3390/electronics10131565

Electronics ◽

10.3390/electronics10131565 ◽

2021 ◽

Vol 10 (13) ◽

pp. 1565

Author(s):

Junwen Liu ◽

Yongjun Zhang ◽

Jianbin Xie ◽

Yan Wei ◽

Zewei Wang ◽

...

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Transmission Rate ◽

Pedestrian Detection ◽

Human Head ◽

Detection Rates ◽

Translational Invariance ◽

Dilated Convolution ◽

Head Detection ◽

Small Targets

Pedestrian detection in complex scenes suffers from occlusion, in particular occlusion between pedestrians. It is well known that, compared with the variability of the human body, the shape of the human head and shoulders changes little and is highly stable. Therefore, head detection is an important research area within pedestrian detection. The translational invariance of convolutional neural networks makes it possible to design a deep network that still recognizes a target effectively even when its appearance and location change. However, the problems of scale invariance and high missed-detection rates for small targets remain. In this paper, a feature extraction network, DR-Net, based on Darknet-53 is proposed to improve the information transmission rate between convolutional layers and to extract more semantic information. In addition, an MDC (mixed dilated convolution) module, which combines dilated convolutions with different sampling rates, is embedded to improve the detection rate of small targets. We evaluated our method on three publicly available datasets and achieved excellent results. The AP (Average Precision) values on the Brainwash, HollywoodHeads, and SCUT-HEAD datasets reached 92.1%, 84.8%, and 90%, respectively.
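The idea of applying dilated convolutions at several sampling rates can be illustrated with a minimal one-dimensional sketch. This is not the authors' MDC module: the rates (1, 2, 5), the function names, and the choice to return each rate's response separately are assumptions made for illustration, since the abstract does not say how the branches are fused.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span of input covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Valid-mode 1D dilated convolution (cross-correlation form, no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def mixed_dilated_responses(signal, kernel, dilations=(1, 2, 5)):
    """Apply the same kernel at several (hypothetical) dilation rates.

    Returns a dict mapping each rate to its response; how to fuse the
    branches is left open because the source does not specify it.
    """
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    for d in (1, 2, 5):
        print(d, effective_kernel_size(3, d))
```

A larger dilation rate widens the span a fixed-size kernel covers without adding weights, which is why mixing rates gives a layer access to context at several scales.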

Pedestrian detection algorithm based on improved multi-scale feature fusion

Journal of Physics Conference Series ◽

10.1088/1742-6596/2078/1/012008 ◽

2021 ◽

Vol 2078 (1) ◽

pp. 012008

Author(s):

Hui Liu ◽

Keyang Cheng

Keyword(s):

Clustering Algorithm ◽

Feature Fusion ◽

Pedestrian Detection ◽

Detection Algorithm ◽

Data Sets ◽

False Detection ◽

Scale Feature ◽

Multi Scale ◽

Dilated Convolution ◽

Small Targets

To address false detections and missed detections of small and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, because PANet, the multi-scale feature fusion module of YOLOv4, does not consider the interaction between scales, PANet is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion. Then, dilated convolution is introduced to reduce the information loss that occurs during downsampling. Finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified to detect a single category. Experimental results show that, on the INRIA and WiderPerson data sets under different congestion conditions, the improved algorithm reaches an AP of 96.83% and 59.67%, respectively. Compared with the YOLOv4 model, this is an improvement of 2.41% and 1.03%, respectively. False detections and missed detections of small and occluded targets are significantly reduced.
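The anchor-box redesign step can be sketched as follows. The abstract only states that K-means is used; the IoU-based similarity, the mean update, the function names, and the toy boxes below are assumptions, following the approach commonly used with YOLO-family detectors rather than the authors' exact procedure.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes, assuming they share a top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=50, seed=0):
    """Cluster (width, height) pairs into k anchors using IoU as similarity."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda j: iou_wh(box, anchors[j]))
            clusters[best].append(box)
        new_anchors = []
        for j, members in enumerate(clusters):
            if members:
                mean_w = sum(b[0] for b in members) / len(members)
                mean_h = sum(b[1] for b in members) / len(members)
                new_anchors.append((mean_w, mean_h))
            else:
                new_anchors.append(anchors[j])  # keep an empty cluster's anchor
        if new_anchors == anchors:
            break
        anchors = new_anchors
    # Sort by area so the smallest anchor comes first.
    return sorted(anchors, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    toy_boxes = [(10, 20), (12, 22), (11, 21), (50, 100), (52, 98), (48, 102)]
    print(kmeans_anchors(toy_boxes, k=2))
```

Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which matters when small pedestrians are the targets of interest.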

New End-to-End Strategy Based on DeepLabv3+ Semantic Segmentation for Human Head Detection

Sensors ◽

10.3390/s21175848 ◽

2021 ◽

Vol 21 (17) ◽

pp. 5848

Author(s):

Mohamed Chouai ◽

Petr Dolezel ◽

Dominik Stursa ◽

Zdenek Nemec

Keyword(s):

Detection System ◽

Pedestrian Detection ◽

Human Head ◽

Semantic Segmentation ◽

Person Detection ◽

Head Detection ◽

Bounding Boxes ◽

Global Accuracy ◽

Private Datasets ◽

Safety Systems

In the field of computer vision, object detection consists of automatically finding objects in images and reporting their positions. The most common fields of application are safety systems (pedestrian detection, identification of behavior) and control systems. Another important application is head/person detection, which is a primary input for road safety, rescue, surveillance, etc. In this study, we developed a new approach based on two parallel DeepLabv3+ networks to improve the performance of the person detection system. To implement our semantic segmentation model, we established a working methodology with two types of ground truths extracted from the bounding boxes given by the original ground truths. The approach was applied to our two private datasets as well as to a public dataset. To show the performance of the proposed system, a comparative analysis was carried out against two state-of-the-art deep learning semantic segmentation models: SegNet and U-Net. With a global accuracy of 99.14%, the results demonstrate that the developed strategy could be an efficient way to build a deep neural network model for semantic segmentation. This strategy can be used not only for human head detection but also in several other semantic segmentation applications.
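As a rough illustration of deriving a segmentation ground truth from bounding boxes and scoring it, the sketch below rasterises boxes into a filled binary mask and computes global (pixel) accuracy. The abstract does not describe its two ground-truth types, so the filled-rectangle mask, the box format, and the function names are hypothetical.

```python
def boxes_to_mask(height, width, boxes):
    """Rasterise (x_min, y_min, x_max, y_max) boxes into a binary mask.

    The max coordinates are treated as exclusive and boxes are clipped
    to the image bounds.
    """
    mask = [[0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max in boxes:
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                mask[y][x] = 1
    return mask


def global_accuracy(predicted, target):
    """Fraction of pixels whose predicted label equals the target label."""
    total = sum(len(row) for row in target)
    correct = sum(p == t
                  for pred_row, target_row in zip(predicted, target)
                  for p, t in zip(pred_row, target_row))
    return correct / total


if __name__ == "__main__":
    target = boxes_to_mask(4, 4, [(1, 1, 3, 3)])
    background_only = [[0] * 4 for _ in range(4)]
    print(global_accuracy(background_only, target))
```

Global accuracy counts background pixels as well, so a model that predicts mostly background can still score highly when heads occupy a small part of the image.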

Simulation of Human Ear Recognition Sound Direction Based on Convolutional Neural Network

Journal of Intelligent Systems ◽

10.1515/jisys-2019-0250 ◽

2020 ◽

Vol 30 (1) ◽

pp. 209-223

Author(s):

Zhuhe Wang ◽

Nan Li ◽

Tao Wu ◽

Haoxuan Zhang ◽

Tao Feng

Keyword(s):

Neural Network ◽

Neural Networks ◽

Feature Selection ◽

Convolutional Neural Network ◽

Probability Distributions ◽

Human Head ◽

Original Data ◽

Translational Invariance ◽

Sound Direction ◽

Human Ear

In recent years, more and more researchers have applied convolutional neural networks to the study of sound signals. The main reason is the translational invariance of convolution in time and space, which makes it possible to cope with the diversity of sound signals. However, sound direction recognition still faces problems such as overly large microphone arrays and feature selection. This paper proposes a sound direction recognition method that uses a simulated human head with a microphone at each ear. Theoretically, two microphones cannot distinguish between the front and rear directions. However, when we use the original data of the two channels as the input of the convolutional neural network, the resolution effect can exceed 0.9. For comparison, we also used the delay feature (GCC) for sound direction recognition. Finally, we conducted experiments that used probability distributions to identify more directions.
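The delay feature used for comparison can be illustrated with a simplified sketch. It uses plain time-domain cross-correlation rather than a weighted generalized cross-correlation (GCC), and the function name, the sign convention, and the toy signals are assumptions for illustration only.

```python
def cross_correlation_delay(left, right, max_lag):
    """Estimate the delay (in samples) of `right` relative to `left`.

    Tries every lag in [-max_lag, max_lag] and returns the one with the
    largest unweighted cross-correlation. A positive result means the
    sound reached the right channel later than the left channel.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


if __name__ == "__main__":
    left_channel = [0, 0, 1, 2, 1, 0, 0, 0]
    right_channel = [0, 0, 0, 0, 1, 2, 1, 0]  # same pulse, two samples later
    print(cross_correlation_delay(left_channel, right_channel, max_lag=3))
```

A single inter-channel delay is the same for a source in front of and behind the head, which is consistent with the abstract's remark that two microphones cannot, in theory, separate those directions.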

ECG signal feature extraction and classification using wavelet transform and neural network

PsycEXTRA Dataset ◽

10.1037/e605272012-004 ◽

2011 ◽

Author(s):

Thomas Sri Widodo

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Wavelet Transform ◽

Ecg Signal

Feature Extraction And Classification For ECG Signals Processing Based On Stationary Multiwavelet Transform And Artificial Neural Network

10.36541/0231-000-029-008 ◽

2018 ◽

pp. 85

Author(s):

زهراء خضير طه

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Feature Extraction ◽

Ecg Signals ◽

Multiwavelet Transform ◽

Artificial Neural

Improving feature extraction in keystroke dynamics using optimization techniques and neural network

International Conference on Sustainable Energy and Intelligent Systems (SEISCON 2011) ◽

10.1049/cp.2011.0493 ◽

2011 ◽

Cited By ~ 5

Author(s):

M. Akila ◽

S.S. Kumar

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Optimization Techniques ◽

Keystroke Dynamics

Lightweight convolutional neural network-based pedestrian detection and re-identification in multiple scenarios

Machine Vision and Applications ◽

10.1007/s00138-021-01169-7 ◽

2021 ◽

Vol 32 (2) ◽

Author(s):

Xiao Ke ◽

Xinru Lin ◽

Liyun Qin

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Pedestrian Detection ◽

Multiple Scenarios

Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9413453 ◽

2021 ◽

Author(s):

Chao-Han Huck Yang ◽

Jun Qi ◽

Samuel Yen-Chi Chen ◽

Pin-Yu Chen ◽

Sabato Marco Siniscalchi ◽

...

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Speech Recognition ◽

Convolutional Neural Network ◽

Automatic Speech Recognition

Application of Deep Learning in Fault Diagnosis of Rotating Machinery

Processes ◽

10.3390/pr9060919 ◽

2021 ◽

Vol 9 (6) ◽

pp. 919

Author(s):

Wanlu Jiang ◽

Chenyang Wang ◽

Jiayun Zou ◽

Shuqing Zhang

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Fault Diagnosis ◽

Rotating Machinery ◽

Generative Adversarial Networks ◽

Extraction Ability ◽

One Dimensional ◽

Adversarial Networks ◽

Diagnosis Model ◽

The One

The field of mechanical fault diagnosis has entered the era of “big data”. However, existing diagnostic algorithms, which rely on manual feature extraction and expert knowledge, have poor extraction ability and lack self-adaptability when faced with massive data. In the fault diagnosis of rotating machinery, equipment faults occur only occasionally, so the proportion of fault samples is small, the samples are imbalanced, and available data are scarce, which leads to low accuracy in intelligent diagnosis models trained to identify the equipment state. To solve these problems, an end-to-end diagnosis model is first proposed: an intelligent fault diagnosis method based on a one-dimensional convolutional neural network (1D-CNN), in which the original vibration signal is input directly into the model for identification. Then, by combining the convolutional neural network with generative adversarial networks, a data expansion method based on one-dimensional deep convolutional generative adversarial networks (1D-DCGAN) is constructed to generate additional samples for the under-represented fault classes and to build a balanced data set. Meanwhile, to address the difficulty of optimizing the network, gradient penalty and Wasserstein distance are introduced. Tests on a bearing database and a hydraulic pump show that the one-dimensional convolution operation has strong feature extraction ability for vibration signals. The proposed method is very accurate for fault diagnosis of the two kinds of equipment, and high-quality expansion of the original data can be achieved.
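For reference, the combination of Wasserstein distance and gradient penalty is usually written as the critic loss below. This is the widely used WGAN-GP formulation, given here as an assumption about what the abstract refers to; the symbols ($D$ for the critic, $\mathbb{P}_r$ and $\mathbb{P}_g$ for the real and generated distributions, $\lambda$ for the penalty weight) are not taken from the source.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim \mathbb{P}_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim \mathbb{P}_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim \mathbb{P}_{\hat{x}}}
      \left[ \left( \left\lVert \nabla_{\hat{x}} D(\hat{x}) \right\rVert_2 - 1 \right)^2 \right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The first two terms estimate the Wasserstein distance between generated and real samples, and the last term pushes the critic's gradient norm toward 1 on points interpolated between them, which tends to stabilize training.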

Deep convolutional neural networks for human movement detection using wireless signals

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189629 ◽

2021 ◽

pp. 1-10

Author(s):

Chien-Cheng Lee ◽

Zhongjian Gao ◽

Xiu-Chi Huang

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Convolutional Neural Network ◽

Detection System ◽

Deep Convolutional Neural Network ◽

Human Detection ◽

Two Dimensional ◽

Dimensional Matrix ◽

State Classification ◽

Propagation Paths

This paper proposes a Wi-Fi-based indoor human detection system using a deep convolutional neural network. The system detects different human states in various situations, including different environments and propagation paths. The main advantage of the system is that it requires no overhead cameras and no mounted sensors. The system captures useful amplitude information from the channel state information and converts it into an image-like two-dimensional matrix. Next, the two-dimensional matrix is used as the input to a deep convolutional neural network (CNN) to distinguish human states. In this work, a deep residual network (ResNet) architecture is used to perform human state classification with hierarchical topological feature extraction. Several combinations of datasets for different environments and propagation paths are used in this study. ResNet’s powerful inference simplifies feature extraction and improves the accuracy of human state classification. The experimental results show that the fine-tuned ResNet-18 model performs well in indoor human detection across three states: no people present, people standing still, and people moving. Compared with traditional machine learning using handcrafted features, this method is simple and effective.
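The conversion from channel state information to an image-like matrix can be sketched as follows. The layout (one row per packet, one column per subcarrier), the min-max scaling to [0, 1], and the function name are assumptions; the abstract does not give the authors' exact preprocessing.

```python
def csi_to_image(csi_packets):
    """Turn complex CSI samples into an image-like 2D amplitude matrix.

    csi_packets: list of packets, each a list of complex values with one
    entry per subcarrier. Returns a matrix of the same shape whose
    entries are amplitudes min-max scaled to the range [0, 1].
    """
    amplitudes = [[abs(value) for value in packet] for packet in csi_packets]
    flat = [a for row in amplitudes for a in row]
    lowest, highest = min(flat), max(flat)
    value_range = highest - lowest
    if value_range == 0:
        # Constant input: avoid division by zero and return all zeros.
        return [[0.0 for _ in row] for row in amplitudes]
    return [[(a - lowest) / value_range for a in row] for row in amplitudes]


if __name__ == "__main__":
    packets = [[3 + 4j, 0j], [6 + 8j, 5j]]
    print(csi_to_image(packets))
```

Keeping only the amplitude discards the phase of each sample, and scaling to a fixed range gives the matrix the same value range as a normalized grayscale image before it is passed to a CNN.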