DETECTION OF A HUMAN HEAD ON A LOW-QUALITY IMAGE AND ITS SOFTWARE IMPLEMENTATION

Author(s):  
D. Yudin ◽  
A. Ivanov ◽  
M. Shchendrygin

Abstract. The paper addresses the detection in two-dimensional images of not only the face but the whole head of a person, regardless of its orientation towards the observer. The task is further complicated by the fact that the image arriving at the input of the recognition algorithm may be noisy or captured in low-light conditions. The minimum size of a person's head to be detected in an image is 10 × 10 pixels. In the course of development, a dataset was prepared containing over 1000 labelled images of classrooms at BSTU n.a. V.G. Shukhov. The markup was carried out using a segmentation software tool developed by the authors. Three convolutional neural network architectures were trained for the human head detection task: a fully convolutional network (FCN) with clustering, Faster R-CNN and Mask R-CNN. The third architecture works more than ten times slower than the first, but it produces almost no false positives and achieves head-detection precision and recall above 90% on both the test and training samples. Faster R-CNN is less accurate than Mask R-CNN but gives fewer false positives than the FCN with clustering. Based on Mask R-CNN, the authors have developed software for human head detection in low-quality images. It is a two-level web service with client and server modules, used to detect and count people in rooms. The software works with IP cameras, which ensures its scalability across different practical computer vision applications.
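The post-processing step such a counting service might apply can be sketched as filtering raw detector boxes by the stated 10 × 10 px minimum head size. This is an illustrative sketch only; the function name, box format, and the 0.5 confidence threshold are assumptions, not the paper's implementation:

```python
def count_heads(detections, min_size=10, conf_threshold=0.5):
    """Count detector boxes that plausibly correspond to heads.

    detections: list of (x1, y1, x2, y2, score) tuples, e.g. from a
    Mask R-CNN head. min_size reflects the paper's 10x10 px minimum;
    conf_threshold is an illustrative assumption.
    """
    kept = 0
    for x1, y1, x2, y2, score in detections:
        w, h = x2 - x1, y2 - y1
        if w >= min_size and h >= min_size and score >= conf_threshold:
            kept += 1
    return kept

detections = [
    (10, 10, 40, 45, 0.95),    # plausible head
    (100, 50, 107, 58, 0.90),  # too small (7x8 px)
    (200, 80, 230, 110, 0.30), # low confidence
]
print(count_heads(detections))  # 1
```

In a client/server deployment like the one described, the client would post a camera frame and the server would run the detector and return this count.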

Author(s):  
Hussein Mohammed ◽  
Volker Märgner ◽  
Giovanni Ciotti

Abstract. Automatic pattern detection has become increasingly important for scholars in the humanities as the number of manuscripts that have been digitised has grown. Most of the state-of-the-art methods used for pattern detection depend on the availability of a large number of training samples, which are typically not available in the humanities as they involve tedious manual annotation by researchers (e.g. marking the location and size of words, drawings, seals and so on). This makes the applicability of such methods very limited within the field of manuscript research. We propose a learning-free approach based on a state-of-the-art Naïve Bayes Nearest-Neighbour classifier for the task of pattern detection in manuscript images. The method has already been successfully applied to an actual research question from South Asian studies about palm-leaf manuscripts. Furthermore, state-of-the-art results have been achieved on two extremely challenging datasets, namely the AMADI_LontarSet dataset of handwriting on palm leaves for word-spotting and the DocExplore dataset of medieval manuscripts for pattern detection. A performance analysis is provided as well in order to facilitate later comparisons by other researchers. Finally, an easy-to-use implementation of the proposed method is developed as a software tool and made freely available.
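The core of a Naïve Bayes Nearest-Neighbour (NBNN) classifier, the learning-free technique the abstract builds on, can be sketched in a few lines: a query image's local descriptors are compared against each class's descriptor pool, and the class with the smallest summed descriptor-to-nearest-neighbour distance wins. The data and class names below are synthetic and illustrative:

```python
import numpy as np

def nbnn_classify(query_descriptors, class_descriptors):
    """NBNN decision rule: pick the class minimising the sum of squared
    distances from each query descriptor to its nearest neighbour in
    that class's descriptor pool.

    query_descriptors: (n, d) array of local descriptors for one image.
    class_descriptors: dict mapping class name -> (m, d) descriptor array.
    """
    best_class, best_cost = None, np.inf
    for name, descs in class_descriptors.items():
        # pairwise distances: every query descriptor vs. every class descriptor
        dists = np.linalg.norm(
            query_descriptors[:, None, :] - descs[None, :, :], axis=2)
        cost = np.sum(np.min(dists, axis=1) ** 2)
        if cost < best_cost:
            best_class, best_cost = name, cost
    return best_class

# Illustrative two-class toy example (not manuscript data):
classes = {
    "seal": np.array([[0.0, 0.0], [0.2, 0.1]]),
    "word": np.array([[5.0, 5.0], [5.1, 4.9]]),
}
query = np.array([[4.8, 5.2], [5.0, 5.0]])
print(nbnn_classify(query, classes))  # word
```

No training phase is needed, which is exactly why NBNN suits the annotation-scarce humanities setting described above.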


2007 ◽  
Vol 6 (s2) ◽  
pp. S427-S444 ◽  
Author(s):  
Sergiu Dascalu ◽  
Sermsak Buntha ◽  
Daniela Saru ◽  
Narayan Debnath

Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1565
Author(s):  
Junwen Liu ◽  
Yongjun Zhang ◽  
Jianbin Xie ◽  
Yan Wei ◽  
Zewei Wang ◽  
...  

Pedestrian detection in complex scenes suffers from occlusion issues, such as occlusions between pedestrians. Compared with the highly variable human body, the shape of the head and shoulders changes little and is highly stable, which makes head detection an important research area within pedestrian detection. The translation invariance of convolutional neural networks means that a target can still be recognized effectively even when its appearance and location change. However, the problems of scale invariance and high miss rates for small targets remain. In this paper, a feature extraction network, DR-Net, based on Darknet-53 is proposed to improve the information transmission rate between convolutional layers and to extract richer semantic information. In addition, an MDC (mixed dilated convolution) module combining dilated convolutions with different sampling rates is embedded to improve the detection rate of small targets. We evaluated our method on three publicly available datasets and achieved excellent results: the AP (Average Precision) reached 92.1% on the Brainwash dataset, 84.8% on the HollywoodHeads dataset, and 90% on the SCUT-HEAD dataset.
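The idea behind mixing dilated convolutions with different sampling rates can be sketched with a plain NumPy implementation: each branch applies the same-size kernel over an enlarged receptive field, and the branches are combined. This is a single-channel, correlation-style (no kernel flip, as in deep-learning frameworks) sketch of the concept, not the paper's MDC module:

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """'Same'-padded single-channel 2-D convolution with a dilated kernel.

    A k x k kernel with dilation `rate` covers an effective receptive
    field of rate*(k-1)+1 pixels while keeping k*k weights.
    """
    k = kernel.shape[0]
    eff = rate * (k - 1) + 1   # effective receptive field size
    pad = eff // 2
    xp = np.pad(x, pad)
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            # sample the input with stride `rate` inside the receptive field
            patch = xp[i:i + eff:rate, j:j + eff:rate]
            out[i, j] = np.sum(patch * kernel)
    return out

def mixed_dilated_conv(x, kernels, rates):
    """MDC-style idea: parallel dilated convolutions, summed together,
    so small and large contexts contribute at once."""
    return sum(dilated_conv2d(x, k, r) for k, r in zip(kernels, rates))
```

In a real network these branches would be learned `Conv2d` layers with different `dilation` values; the sketch only shows how the sampling rate widens context without adding weights.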


Author(s):  
Tong Wu ◽  
Nikolas Martelaro ◽  
Simon Stent ◽  
Jorge Ortiz ◽  
Wendy Ju

This paper examines sensor fusion techniques for modeling opportunities for proactive speech-based in-car interfaces. We leverage the Is Now a Good Time (INAGT) dataset, which consists of automotive, physiological, and visual data collected from drivers who self-annotated responses to the question "Is now a good time?," indicating the opportunity to receive non-driving information during a 50-minute drive. We augment this original driver-annotated data with third-party annotations of perceived safety, in order to explore potential driver overconfidence. We show that fusing automotive, physiological, and visual data allows us to predict driver labels of availability, achieving a 0.874 F1-score by extracting statistically relevant features and training with our proposed deep neural network, PazNet. Using the same data and network, we achieve a 0.891 F1-score for predicting third-party labeled safe moments. We train these models to avoid false positives (determinations that it is a good time to interrupt when it is not), since false positives may cause driver distraction or service deactivation by the driver. Our analyses show that conservative models still leave many moments for interaction, and that most inopportune moments are short. This work lays a foundation for using sensor fusion models to predict when proactive speech systems should engage with drivers.
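The conservative operating point described above, trading missed opportunities for few false positives, can be sketched as threshold selection on held-out predictions. This is an illustrative sketch of the general idea, not the paper's PazNet or its training procedure; the function name and precision target are assumptions:

```python
import numpy as np

def pick_conservative_threshold(y_true, y_prob, min_precision=0.95):
    """Return the lowest decision threshold whose precision on held-out
    data meets min_precision, i.e. the most permissive operating point
    that still keeps false positives rare.

    Returns 1.0 if no candidate threshold satisfies the constraint.
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    best = 1.0
    for t in np.unique(y_prob):
        pred = y_prob >= t
        tp = np.sum(pred & (y_true == 1))
        fp = np.sum(pred & (y_true == 0))
        if tp + fp > 0 and tp / (tp + fp) >= min_precision:
            best = min(best, t)
    return best
```

A system tuned this way interrupts the driver less often, but when it does, it is rarely wrong, which matches the paper's stated design goal.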


2018 ◽  
Vol 10 (8) ◽  
pp. 1277 ◽  
Author(s):  
Mikhail Urbazaev ◽  
Felix Cremer ◽  
Mirco Migliavacca ◽  
Markus Reichstein ◽  
Christiane Schmullius ◽  
...  

Information on the spatial distribution of forest structure parameters (e.g., aboveground biomass, vegetation height) is crucial for assessing terrestrial carbon stocks and emissions. In this study, we sought to assess the potential and merit of multi-temporal dual-polarised L-band observations for vegetation height estimation in tropical deciduous and evergreen forests of Mexico. We estimated vegetation height using dual-polarised L-band observations and a machine learning approach. We used airborne LiDAR-based vegetation height for model training and for result validation. We split LiDAR-based vegetation height into training and test data using two different approaches, i.e., considering and ignoring spatial autocorrelation between training and test data. Our results indicate that ignoring spatial autocorrelation leads to an overly optimistic assessment of the model's predictive performance. Accordingly, spatial splitting of the reference data should be preferred in order to provide realistic retrieval accuracies. Moreover, the model's predictive performance increases with the number of spatial predictors and training samples, but saturates at a specific level (i.e., at 12 dual-polarised L-band backscatter measurements and at around 20% of all training samples). In consideration of spatial autocorrelation between training and test data, we determined an optimal number of L-band observations and training samples as a trade-off between retrieval accuracy and data collection effort. In summary, our study demonstrates the merit of multi-temporal ScanSAR L-band observations for estimation of vegetation height at a larger scale and provides a workflow for robust predictions of this parameter.
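The spatial split the authors recommend can be sketched as block-wise partitioning: whole spatial blocks are assigned to either the training or the test set, so nearby (autocorrelated) pixels never end up on both sides. The function name and block logic below are illustrative, not the study's exact procedure:

```python
import numpy as np

def spatial_block_split(coords, block_size, test_fraction=0.2, seed=0):
    """Split samples into train/test masks by whole spatial blocks.

    coords: (n, 2) array of x/y sample coordinates.
    Entire blocks go to the test set, so training and test samples
    never share a block; a purely random split would intermix them.
    """
    blocks = (coords // block_size).astype(int)
    # one integer id per block
    block_ids = blocks[:, 0] * (blocks[:, 1].max() + 1) + blocks[:, 1]
    unique = np.unique(block_ids)
    rng = np.random.default_rng(seed)
    n_test = max(1, int(len(unique) * test_fraction))
    test_blocks = rng.choice(unique, size=n_test, replace=False)
    test_mask = np.isin(block_ids, test_blocks)
    return ~test_mask, test_mask
```

Accuracy measured on such a held-out set is lower but more honest than on a random split, which is exactly the effect the abstract reports.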


2014 ◽  
Vol 701-702 ◽  
pp. 433-436
Author(s):  
Pei Pei Duan ◽  
Hui Li ◽  
Qi Li

High range resolution profile (HRRP) samples are numerous and sparse, yet few radar target recognition algorithms based on HRRPs exploit this sparseness. A new radar target recognition algorithm using a fast sparse decomposition method is presented here. The algorithm proceeds in three major steps. First, the Gabor redundant dictionary is partitioned according to its atom characteristics to reduce atom storage. Then, the matching pursuit algorithm is accelerated with a genetic algorithm and fast cross-correlation calculation, speeding up the decomposition of training samples and generating the taxonomic dictionaries. Finally, the reconstruction errors of the testing samples are used to recognize different radar targets. Simulations show that the method is robust to noise and achieves a high recognition rate.
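The baseline that the second step accelerates is plain matching pursuit: greedily pick the dictionary atom most correlated with the residual, subtract its contribution, and repeat. A minimal sketch, assuming unit-norm atoms as dictionary columns (the genetic-algorithm and fast cross-correlation speed-ups are not shown):

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter=10):
    """Greedy sparse decomposition of `signal` over `dictionary`.

    dictionary: (d, m) array whose columns are unit-norm atoms.
    Returns the sparse coefficient vector and the final residual.
    """
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        corr = dictionary.T @ residual          # correlation with every atom
        k = np.argmax(np.abs(corr))             # best-matching atom
        coeffs[k] += corr[k]                    # accumulate its coefficient
        residual -= corr[k] * dictionary[:, k]  # remove its contribution
    return coeffs, residual
```

Recognition then compares the reconstruction error `||signal - dictionary @ coeffs||` under each class-specific (taxonomic) dictionary and picks the class with the smallest error.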


2013 ◽  
Vol 631-632 ◽  
pp. 1303-1308
Author(s):  
He Jin Yuan

A novel human action recognition algorithm based on key postures is proposed in this paper. In the method, the mesh features of each image in a human action sequence are first calculated; the key postures are then generated from these mesh features through the k-medoids clustering algorithm, and each motion sequence is represented as a vector of key postures, where each component is the number of occurrences of the corresponding posture in the action. For recognition, an observed action is first converted into a key-posture vector; the correlation coefficients with the training samples are then calculated, and the action that best matches the observed sequence is chosen as the final category. Experiments on the Weizmann dataset demonstrate that our method is effective for human action recognition, with an average recognition accuracy exceeding 90%.
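The representation and matching steps can be sketched as follows: each frame is assigned to its nearest key posture, the occurrence counts form the sequence's vector, and classification picks the training action with the highest correlation coefficient. The feature values below are synthetic and the function names illustrative:

```python
import numpy as np

def posture_histogram(frames, key_postures):
    """Map each frame's mesh feature to its nearest key posture and
    count occurrences, giving the sequence's key-posture vector."""
    dists = np.linalg.norm(
        frames[:, None, :] - key_postures[None, :, :], axis=2)
    labels = np.argmin(dists, axis=1)
    return np.bincount(labels, minlength=len(key_postures))

def classify_by_correlation(query_vec, training_vecs, labels):
    """Choose the training action whose key-posture vector has the
    highest correlation coefficient with the query vector."""
    corrs = [np.corrcoef(query_vec, v)[0, 1] for v in training_vecs]
    return labels[int(np.argmax(corrs))]
```

In the paper the key postures themselves come from k-medoids clustering of the training frames; here they are simply given as inputs.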

