A scale-aware YOLO model for pedestrian detection

10.36227/techrxiv.13049129.v1 ◽

2020 ◽

Author(s):

Xingyi Yang ◽

Yonghu Wang ◽

Robert Laganiere

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Real Time ◽

State Of The Art ◽

Pedestrian Detection ◽

Small Scale ◽

Robust Detection ◽

Public Datasets ◽

Traditional Approaches ◽

New Framework

Pedestrian detection is considered one of the most challenging problems in computer vision, as it combines classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been shown to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) is regarded as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework that improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.
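The abstract does not say how the outputs of the two sub-networks are combined. As a minimal sketch of one common way to do it, the example below pools the boxes from both branches and applies greedy non-maximum suppression. The function names, the box layout `(x1, y1, x2, y2, score)`, and the IoU threshold of 0.5 are assumptions for illustration, not the authors' method.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2, confidence score).
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_detections(
    small_scale: List[Box],
    large_scale: List[Box],
    iou_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> List[Box]:
    """Pool detections from both branches and keep the best non-overlapping ones."""
    candidates = sorted(small_scale + large_scale, key=lambda b: b[4], reverse=True)
    kept: List[Box] = []
    for box in candidates:
        # Keep a box only if it does not overlap strongly with a higher-scoring one.
        if all(iou(box, k) <= iou_threshold for k in kept):
            kept.append(box)
    return kept


small = [(10.0, 10.0, 20.0, 30.0, 0.9)]
large = [(11.0, 10.0, 21.0, 30.0, 0.6), (100.0, 50.0, 160.0, 200.0, 0.8)]
merged = merge_detections(small, large)
```

Here the two near-identical boxes collapse into the higher-scoring one, while the distant box from the other branch is kept.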

A scale-aware YOLO model for pedestrian detection

10.36227/techrxiv.13049129 ◽

2020 ◽

Author(s):

Xingyi Yang ◽

Yong Wang ◽

Robert Laganiere

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Real Time ◽

State Of The Art ◽

Pedestrian Detection ◽

Small Scale ◽

Robust Detection ◽

Public Datasets ◽

Traditional Approaches ◽

New Framework

Pedestrian detection is considered one of the most challenging problems in computer vision, as it combines classification and localization within a scene. Recently, convolutional neural networks (CNNs) have been shown to achieve superior detection results compared to traditional approaches. Although YOLOv3 (an improved You Only Look Once model) is regarded as one of the state-of-the-art methods in CNN-based object detection, it remains very challenging to leverage this method for real-time pedestrian detection. In this paper, we propose a new framework called SA YOLOv3, a scale-aware You Only Look Once framework that improves on YOLOv3 in detecting small-scale pedestrian instances in real time. Our network introduces two sub-networks which detect pedestrians of different scales. Outputs from the sub-networks are then combined to generate robust detection results. Experimental results show that the proposed SA YOLOv3 framework outperforms YOLOv3 on public datasets and runs at an average of 11 fps on a GPU.

Chemical ripening and contaminations detection using neural networks-based image features and spectrometric signatures

Machine Graphics and Vision ◽

10.22630/mgv.2021.30.1.2 ◽

2021 ◽

Vol 30 (1) ◽

pp. 23-43

Author(s):

Roopalakshmi R

Keyword(s):

Neural Networks ◽

Computer Vision ◽

State Of The Art ◽

Calcium Carbide ◽

Image Features ◽

The State ◽

Hazardous Chemicals ◽

Confidence Levels ◽

Chemical Ripening ◽

New Framework

In this pandemic-prone era, health is of utmost concern for everyone, and eating good-quality fruit is essential for sound health. Unfortunately, it is now quite difficult to obtain naturally ripened fruits, because many fruits are ripened with hazardous chemicals such as calcium carbide. However, most state-of-the-art techniques focus primarily on identifying chemically ripened fruits with computer vision-based approaches, which are less effective at quantifying the chemical contamination present in the sample fruits. To solve these issues, a new framework for chemical ripening and contamination detection is presented, which employs both visual and IR spectrometric signatures in two different stages. Experiments conducted on both the GUI tool and hardware-based setups demonstrate the efficiency of the proposed framework in terms of detection confidence levels and the percentage of chemicals present in the sample fruit.

Random Forest with Adaptive Local Template for Pedestrian Detection

Mathematical Problems in Engineering ◽

10.1155/2015/767423 ◽

2015 ◽

Vol 2015 ◽

pp. 1-11 ◽

Cited By ~ 2

Author(s):

Tao Xiang ◽

Tao Li ◽

Mao Ye ◽

Zijian Liu

Keyword(s):

Computer Vision ◽

Random Forest ◽

Classification Accuracy ◽

Template Matching ◽

Detection Method ◽

State Of The Art ◽

Pedestrian Detection ◽

Sliding Window ◽

Experimental Results ◽

Training Samples

Pedestrian detection with large intraclass variations is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a few local templates with different sizes and different locations in positive exemplars. Then, we build a Random Forest whose splitting functions are optimized by maximizing the class purity obtained when matching the local templates to the training samples. To improve the classification accuracy, we adopt a boosting-like algorithm to update the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are the splitting functions based on local template matching with adaptive size and location, and the iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.

Comparative Performance Analysis of Neural Network Real-Time Object Detections in Different Implementations

EPJ Web of Conferences ◽

10.1051/epjconf/202022602020 ◽

2020 ◽

Vol 226 ◽

pp. 02020

Author(s):

Alexey V. Stadnik ◽

Pavel S. Sazhin ◽

Slavomir Hnatic

Keyword(s):

Neural Network ◽

Neural Networks ◽

Computer Vision ◽

Performance Analysis ◽

Object Detection ◽

Real Time ◽

Network Architecture ◽

Neural Network Architecture ◽

Comparative Performance

The performance of neural networks is one of the most important topics in the field of computer vision. In this work, we analyze the speed of object detection using the well-known YOLOv3 neural network architecture in different frameworks under different hardware requirements. The results allow us to draw preliminary qualitative conclusions about the feasibility of various hardware scenarios for solving tasks in real-time environments.

Vision-Based Hand Tracking and Gesture Recognition for Augmented Assembly System

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.392-394.1030 ◽

2008 ◽

Vol 392-394 ◽

pp. 1030-1036 ◽

Cited By ~ 3

Author(s):

Yue Ming Wu ◽

Han Wu He ◽

J. Sun ◽

T. Ru ◽

De Tao Zheng

Keyword(s):

Computer Vision ◽

Real Time ◽

Gesture Recognition ◽

Hand Tracking ◽

Assembly System ◽

Time Performance ◽

Traditional Approaches

A real-time hand tracking and gesture recognition approach that can deal with dynamic backgrounds is presented. The approach is based on computer vision. It segments the hand from dynamic backgrounds using color-based and appearance-based methods. Then, it locates the hand according to a marker on the hand. Finally, it recognizes the gesture based on geometric constraints of the hand. In comparison with traditional approaches, it provides good real-time performance, is easy to implement, does not require a stationary camera, and is not sensitive to intensity differences, because its gesture recognition does not depend on templates. Moreover, an augmented assembly system using the presented approach is described. The experimental results of the augmented assembly system demonstrate the effectiveness and robustness of our approach.

Investigation of optimal configurations of a convolutional neural network for the identification of objects in real-time

Information Technology and Nanotechnology ◽

10.18287/1613-0073-2019-2416-417-423 ◽

2019 ◽

pp. 417-423

Author(s):

M A Isayev ◽

D A Savelyev

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Real Time ◽

State Of The Art ◽

Average Precision ◽

The Core ◽

Particular Solution ◽

Optimal Configurations

This paper compares different convolutional neural networks, which are at the core of the most current solutions in the computer vision area. The study benchmarks these state-of-the-art solutions on criteria such as mAP (mean average precision) and FPS (frames per second) to assess their suitability for real-time use. Conclusions are drawn on the best convolutional neural network model and the deep learning methods used in each particular solution.
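To make the two benchmark criteria concrete, the sketch below computes average precision for one class from ranked detections (using the all-point interpolated precision-recall curve), averages it over classes to get mAP, and derives FPS from a frame count and elapsed time. The function names and the input layout are assumptions for illustration; the paper's exact evaluation protocol is not given in the abstract.

```python
from typing import List, Tuple


def average_precision(
    detections: List[Tuple[float, bool]],  # (confidence, is_true_positive)
    num_ground_truth: int,
) -> float:
    """Area under the interpolated precision-recall curve for one class."""
    if num_ground_truth <= 0 or not detections:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Make precision monotonically non-increasing from right to left.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP is the plain mean of the per-class average precisions."""
    return sum(per_class_ap) / len(per_class_ap)


def frames_per_second(num_frames: int, elapsed_seconds: float) -> float:
    """Throughput of a detector over a timed run."""
    return num_frames / elapsed_seconds


ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_ground_truth=2)
```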

Group-Wise Dynamic Dropout Based on Latent Semantic Variations

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6782 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11229-11236

Author(s):

Zhiwei Ke ◽

Zhiwei Wen ◽

Weicheng Xie ◽

Yi Wang ◽

Linlin Shen

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Semantic Information ◽

State Of The Art ◽

Classification Performance ◽

Network Robustness ◽

Feature Detectors ◽

Data Points ◽

Adversarial Examples ◽

Public Datasets

Dropout regularization has been widely used in various deep neural networks to combat overfitting. It works by training a network to be more robust on information-degraded data points for better generalization. Conventional dropout and variants are often applied to individual hidden units in a layer to break up co-adaptations of feature detectors. In this paper, we propose an adaptive dropout to reduce the co-adaptations in a group-wise manner by coarse semantic information to improve feature discriminability. In particular, we showed that adjusting the dropout probability based on local feature densities can not only improve the classification performance significantly but also enhance the network robustness against adversarial examples in some cases. The proposed approach was evaluated in comparison with the baseline and several state-of-the-art adaptive dropouts over four public datasets of Fashion-MNIST, CIFAR-10, CIFAR-100 and SVHN.
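For reference, the conventional per-unit dropout that the abstract contrasts with can be sketched as follows. This is standard inverted dropout, not the group-wise adaptive method proposed in the paper; the function name and the list-based activations are assumptions chosen to keep the example dependency-free.

```python
import random
from typing import List


def inverted_dropout(
    activations: List[float], drop_prob: float, rng: random.Random
) -> List[float]:
    """Zero each unit independently with probability drop_prob.

    Surviving units are scaled by 1 / (1 - drop_prob) so the expected
    activation is unchanged and no rescaling is needed at inference time.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_prob)
    return [a * scale if rng.random() >= drop_prob else 0.0 for a in activations]


dropped = inverted_dropout([1.0] * 4000, 0.5, random.Random(0))
```

An adaptive variant would replace the single `drop_prob` with a value chosen per unit or per group; how that value is derived is specific to each method.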

Friendly Farmer

International Journal of Advanced Research in Science, Communication and Technology ◽

10.48175/ijarsct-1414 ◽

2021 ◽

pp. 488-491

Author(s):

Ritwik Chavhan ◽

Kadir Sheikh ◽

Rishikesh Bondade ◽

Swaraj Dhanulkar ◽

Aniket Ninave ◽

...

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Food Security ◽

Convolutional Neural Networks ◽

Image Recognition ◽

Associate Degree ◽

Plant Disease ◽

State Of The Art ◽

Smallholder Farmers ◽

Definite Diagnosis

Plant disease is an ongoing challenge for smallholder farmers, threatening income and food security. The recent revolution in smartphone penetration and computer vision models has created an opportunity for image classification in agriculture. The project focuses on providing information about the pesticide/insecticide, and the quantity of it, to be used for an unhealthy crop. The user, who is the farmer, takes a picture of the crop and uploads it to the server via the Android application. After uploading the image, the farmer gets a unique ID displayed on the application screen. The farmer must make a note of that ID, since it is needed later to retrieve the message after a minute. The uploaded image is then processed by Convolutional Neural Networks. Convolutional Neural Networks (CNNs) are considered state-of-the-art in image recognition and offer the ability to provide a prompt and definite diagnosis. The result, consisting of the disease name and the affected area, is then retrieved. This result is then uploaded into the message table on the server. The farmer can then retrieve the complete information in a presentable format by entering the unique ID received in the application.

Convolutional Neural Networks Inference Memory Optimization with Receptive Field-Based Input Tiling

10.21203/rs.3.rs-743636/v1 ◽

2021 ◽

Author(s):

Weihao Zhuang ◽

Tristan Hascoet ◽

Xunquan Chen ◽

Ryoichi Takashima ◽

Tetsuya Takiguchi ◽

...

Keyword(s):

Neural Networks ◽

Computer Vision ◽

Convolutional Neural Networks ◽

Language Processing ◽

State Of The Art ◽

Input Image ◽

Memory Consumption ◽

Excellent Performance ◽

Conceptual Approach ◽

Recent Developments

Currently, deep learning plays an indispensable role in many fields, including computer vision, natural language processing, and speech recognition. Convolutional Neural Networks (CNNs) have demonstrated excellent performance in computer vision tasks thanks to their powerful feature extraction capability. However, as the larger models have shown higher accuracy, recent developments have led to state-of-the-art CNN models with increasing resource consumption. This paper investigates a conceptual approach to reduce the memory consumption of CNN inference. Our method consists of processing the input image in a sequence of carefully designed tiles within the lower subnetwork of the CNN, so as to minimize its peak memory consumption, while keeping the end-to-end computation unchanged. This method introduces a trade-off between memory consumption and computations, which is particularly suitable for high-resolution inputs. Our experimental results show that MobileNetV2 memory consumption can be reduced by up to 5.3 times with our proposed method. For ResNet50, one of the most commonly used CNN models in computer vision tasks, memory can be optimized by up to 2.3 times.
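The idea of receptive-field-based tiling can be illustrated with a toy one-dimensional example. The sketch below stacks two unpadded convolutions (implemented as cross-correlations, as in most deep learning libraries) and shows that processing the input in overlapping tiles gives the same output as processing it whole. The 1D setting, the function names, and the tile size are assumptions for illustration; the paper itself targets 2D CNNs such as MobileNetV2 and ResNet50.

```python
from typing import List


def conv1d_valid(signal: List[int], kernel: List[int]) -> List[int]:
    """Unpadded 1D cross-correlation; output length is len(signal) - len(kernel) + 1."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def two_layer(signal: List[int], k1: List[int], k2: List[int]) -> List[int]:
    """Reference: run both layers over the whole input at once."""
    return conv1d_valid(conv1d_valid(signal, k1), k2)


def two_layer_tiled(
    signal: List[int], k1: List[int], k2: List[int], tile_size: int
) -> List[int]:
    """Run both layers tile by tile, holding only one tile's activations at a time."""
    # Each output element depends on halo + 1 consecutive inputs (its receptive field).
    halo = (len(k1) - 1) + (len(k2) - 1)
    out_len = len(signal) - halo
    output: List[int] = []
    for start in range(0, out_len, tile_size):
        end = min(start + tile_size, out_len)
        # Input slice covering the receptive field of outputs [start, end).
        tile = signal[start:end + halo]
        output.extend(two_layer(tile, k1, k2))
    return output


signal = list(range(20))
k1, k2 = [1, 0, -1], [1, 2, 1]
full = two_layer(signal, k1, k2)
tiled = two_layer_tiled(signal, k1, k2, tile_size=4)
```

Neighbouring tiles overlap by `halo` input elements, so some intermediate values are recomputed. That recomputation is the extra cost traded for the smaller peak memory.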
