OSDDY: embedded system-based object surveillance detection system with small drone using deep YOLO

Abstract: Computer vision is an interdisciplinary domain that includes object detection. Object detection plays a vital part in assisting surveillance, vehicle detection and pose estimation. In this work, we propose a novel deep you only look once (deep YOLO V3) approach to detect multiple objects. This approach looks at the entire frame during the training and test phases. It follows a regression-based technique that uses a probabilistic model to locate objects. We construct 106 convolution layers followed by 2 fully connected layers, with an 812 × 812 × 3 input size, to detect small drones. We pre-train the convolution layers for classification at half the resolution and then double the resolution for detection. The number of filters of each layer is set to 16. The number of filters of the last scale layer is more than 16 to improve small object detection. This construction uses up-sampling techniques to improve undesired spectral images in the existing signal and to rescale the features in specific locations. The results indicate that up-sampling helps detect small objects, as it increases the sampling rate. This YOLO architecture is preferred because it requires less memory and computation than one with a larger number of filters. The proposed system is designed and trained to detect a single class, drone, and object detection and tracking are performed with the embedded system-based deep YOLO. The proposed YOLO approach predicts multiple bounding boxes per grid cell with better accuracy. The proposed model has been trained with a large number of small drones in different conditions, such as open fields and marine environments with complex backgrounds.
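The up-sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which up-sampling method the authors use, so nearest-neighbour interpolation is an assumption here, and the function name `upsample_nearest` is hypothetical. The sketch only shows how a coarse feature map is rescaled to a finer grid, which gives small objects more cells to fall into.

```python
def upsample_nearest(feature_map, factor=2):
    """Nearest-neighbour up-sampling of a 2-D feature map (a list of rows).

    Assumption: nearest-neighbour interpolation; the abstract does not
    name the up-sampling method that the authors actually use.
    """
    upsampled = []
    for row in feature_map:
        # Repeat every value `factor` times along the width.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row `factor` times along the height.
        for _ in range(factor):
            upsampled.append(list(wide_row))
    return upsampled


coarse = [[1, 2],
          [3, 4]]
for row in upsample_nearest(coarse):
    print(row)
```

With the default factor of 2, the 2 × 2 input becomes a 4 × 4 grid in which each original value covers a 2 × 2 patch.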

Design and Implementation of Spurs Detection System Based on OpenCV

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.345.217 ◽

2011 ◽

Vol 345 ◽

pp. 217-222

Author(s):

Peng He ◽

Lian Peng Wang ◽

Na Wang ◽

Gang Xu

Keyword(s):

Computed Tomography ◽

Embedded System ◽

Detection System ◽

Experimental Results ◽

Design And Implementation ◽

Function Expansion ◽

The Embedded System ◽

Improved Function ◽

Small Bone ◽

Improved Algorithm

To detect small bone spurs conveniently and accurately, a portable spur detection system is designed. In view of the reproducibility characteristic of spurs, the system applies an improved algorithm based on OpenCV, which was successfully ported to an embedded system. The experimental results indicate that the system can precisely detect small spurs that are difficult to discover with the naked eye by making full use of two computed tomography images taken at different times. The functions of the spur detection system need further improvement to support more applications; such function expansion is easy to realize on this system.

Pest Animal's Detection, and Habitat Identification in Low-resolution Airborne Thermal Imagery

10.20944/preprints202009.0480.v1 ◽

2020 ◽

Author(s):

Anwaar Ulhaq

Keyword(s):

Invasive Species ◽

High Resolution ◽

Object Detection ◽

Visual Analysis ◽

Detection System ◽

Detection Algorithm ◽

Probability Of Detection ◽

Small Object ◽

Pest Species ◽

Thermal Imagery

Invasive species are significant threats to global agriculture and food security, being major causes of crop loss. An operative biosecurity policy requires full automation of the detection and habitat identification of potential pests and pathogens. Thermal imaging cameras mounted on Unmanned Aerial Vehicles (UAVs) can observe and detect pest animals and their habitats, and estimate their population size around the clock. However, their effectiveness is limited by the manual detection of cryptic species in hours of captured flight videos, failure in habitat disclosure and the requirement for expensive high-resolution cameras. Therefore, the cost and efficiency trade-off often restricts the use of these systems. In this paper, we present an invasive animal species detection system that exploits the cost-effectiveness of consumer-level cameras while harnessing the power of transfer learning and an optimised small object detection algorithm. Our proposed optimised object detection algorithm, named Optimised YOLO (OYOLO), enhances YOLO (You Only Look Once) by improving its training and structure for remote detection of elusive targets. Our system, trained on the massive data collected from New South Wales and Western Australia, can detect invasive species (rabbits, kangaroos and pigs) in real time with a higher probability of detection (85–100%) than manual detection. This work will enhance the visual analysis of pest species while performing well on low-, medium- and high-resolution thermal imagery, and will be equally accessible to all stakeholders and end-users in Australia via a public cloud.

An Evaluation of Deep Learning Methods for Small Object Detection

Journal of Electrical and Computer Engineering ◽

10.1155/2020/3189691 ◽

2020 ◽

Vol 2020 ◽

pp. 1-18 ◽

Cited By ~ 2

Author(s):

Nhat-Duy Nguyen ◽

Tien Do ◽

Thanh Duc Ngo ◽

Duy-Dinh Le

Keyword(s):

Deep Learning ◽

Object Detection ◽

State Of The Art ◽

Rapid Development ◽

Empirical Evaluation ◽

Grid Cell ◽

Small Object ◽

Feature Maps ◽

Comparative Results ◽

Small Object Detection

Small object detection is an interesting topic in computer vision. With the rapid development of deep learning, it has drawn the attention of several researchers, who have proposed innovative approaches. These innovations include region proposals, divided grid cells, multiscale feature maps, and new loss functions. As a result, the performance of object detection has recently improved significantly. However, most state-of-the-art detectors, in both one-stage and two-stage approaches, have struggled with detecting small objects. In this study, we evaluate current state-of-the-art deep learning models from both approaches, such as Fast RCNN, Faster RCNN, RetinaNet, and YOLOv3. We provide a profound assessment of the advantages and limitations of the models. Specifically, we run models with different backbones on different datasets with multiscale objects to find out which types of objects suit each model and backbone. Extensive empirical evaluation was conducted on 2 standard datasets, namely, a small object dataset and a filtered dataset from PASCAL VOC 2007. Finally, comparative results and analyses are presented.

Learning Adjustable Reduced Downsampling Network for Small Object Detection in Urban Environments

Remote Sensing ◽

10.3390/rs13183608 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3608

Author(s):

Huijie Zhang ◽

Li An ◽

Vena W. Chu ◽

Douglas A. Stow ◽

Xiaobai Liu ◽

...

Keyword(s):

Object Detection ◽

Network Architecture ◽

Sample Selection ◽

Urban Environments ◽

Statistical Characteristics ◽

Small Object ◽

Feature Representations ◽

Training Samples ◽

Bounding Boxes ◽

Small Object Detection

Detecting small objects (e.g., manhole covers, license plates, and roadside milestones) in urban images is a long-standing challenge mainly due to the scale of small object and background clutter. Although convolution neural network (CNN)-based methods have made significant progress and achieved impressive results in generic object detection, the problem of small object detection remains unsolved. To address this challenge, in this study we developed an end-to-end network architecture that has three significant characteristics compared to previous works. First, we designed a backbone network module, namely Reduced Downsampling Network (RD-Net), to extract informative feature representations with high spatial resolutions and preserve local information for small objects. Second, we introduced an Adjustable Sample Selection (ADSS) module which frees the Intersection-over-Union (IoU) threshold hyperparameters and defines positive and negative training samples based on statistical characteristics between generated anchors and ground reference bounding boxes. Third, we incorporated the generalized Intersection-over-Union (GIoU) loss for bounding box regression, which efficiently bridges the gap between distance-based optimization loss and area-based evaluation metrics. We demonstrated the effectiveness of our method by performing extensive experiments on the public Urban Element Detection (UED) dataset acquired by Mobile Mapping Systems (MMS). The Average Precision (AP) of the proposed method was 81.71%, representing an improvement of 1.2% compared with the popular detection framework Faster R-CNN.
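The generalized Intersection-over-Union (GIoU) loss named in the abstract can be sketched in a few lines. This is a plain-Python illustration of the standard GIoU definition, not the authors' implementation. The `(x1, y1, x2, y2)` corner format and the function names are assumptions, and both boxes are assumed to have positive area.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumption: both boxes have positive area, so the divisions are safe.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Overlap rectangle; width or height is clamped to zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Area of the smallest box that encloses both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    return iou - (hull - union) / hull


def giou_loss(predicted, target):
    """Regression loss: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(predicted, target)


print(giou_loss((0, 0, 1, 1), (0, 0, 1, 1)))
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, GIoU keeps decreasing as the boxes move apart, so the loss still provides a useful signal when a predicted box misses its target entirely.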

Objects Detection Using Sensors Data Fusion in Autonomous Driving Scenarios

Electronics ◽

10.3390/electronics10232903 ◽

2021 ◽

Vol 10 (23) ◽

pp. 2903

Author(s):

Razvan Bocu ◽

Dorin Bocu ◽

Maksim Iavich

Keyword(s):

Object Detection ◽

Detection System ◽

Autonomous Driving ◽

3D Object ◽

3D Objects ◽

Autonomous Cars ◽

Bounding Boxes ◽

Objects Detection ◽

3D Object Detection ◽

Fine Tune

The relatively complex task of detecting 3D objects is essential in the realm of autonomous driving. The related algorithmic processes generally produce an output that consists of a series of 3D bounding boxes that are placed around specific objects of interest. The related scientific literature usually suggests that the data that are generated by different sensors or data acquisition devices are combined in order to work around inherent limitations that are determined by the consideration of singular devices. Nevertheless, there are practical issues that cannot be addressed reliably and efficiently through this strategy, such as the limited field-of-view, and the low-point density of acquired data. This paper reports a contribution that analyzes the possibility of efficiently and effectively using 3D object detection in a cooperative fashion. The evaluation of the described approach is performed through the consideration of driving data that is collected through a partnership with several car manufacturers. Considering their real-world relevance, two driving contexts are analyzed: a roundabout, and a T-junction. The evaluation shows that cooperative perception is able to isolate more than 90% of the 3D entities, as compared to approximately 25% in the case when singular sensing devices are used. The experimental setup that generated the data that this paper describes, and the related 3D object detection system, are currently actively used by the respective car manufacturers’ research groups in order to fine tune and improve their autonomous cars’ driving modules.

Deep learning for small object detection in images

10.32469/10355/79470 ◽

2020 ◽

Author(s):

Yang Liu

Keyword(s):

Deep Learning ◽

Object Detection ◽

State Of The Art ◽

Aerial Imagery ◽

Small Object ◽

Learning Models ◽

Learning Methods ◽

Bounding Boxes ◽

Small Object Detection ◽

Instance Segmentation

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the rapid development of deep learning in computer vision, especially deep convolutional neural networks (CNNs), significant advances have been made in recent years on object recognition and detection in images. Highly accurate detection results have been achieved for large objects, whereas detection accuracy on small objects remains low. This dissertation focuses on investigating deep learning methods for small object detection in images and proposing new methods with improved performance. First, we conducted a comprehensive review of existing deep learning methods for small object detection, in which we summarized and categorized major techniques and models, identified major challenges, and listed some future research directions. Existing techniques were categorized into using contextual information, combining multiple feature maps, creating sufficient positive examples, and balancing foreground and background examples. Methods developed in four related areas, generic object detection, face detection, object detection in aerial imagery, and segmentation, were summarized and compared. In addition, the performances of several leading deep learning methods for small object detection, including YOLOv3, Faster R-CNN, and SSD, were evaluated based on three large benchmark image datasets of small objects. Experimental results showed that Faster R-CNN performed the best, while YOLOv3 was a close second. Furthermore, a new deep learning method, called Retina-context Net, was proposed and outperformed state-of-the-art one-stage deep learning models, including SSD, YOLOv3 and RetinaNet, on the COCO and SUN benchmark datasets. Secondly, we created a new dataset for bird detection, called Little Birds in Aerial Imagery (LBAI), from real-life aerial imagery. LBAI contains birds with sizes ranging from 10 by 10 pixels to 40 by 40 pixels.
We adapted and applied several state-of-the-art deep learning models to LBAI, including object detection models such as YOLOv2, SSH, and Tiny Face, and instance segmentation models such as U-Net and Mask R-CNN. Our empirical results illustrated the strength and weakness of these methods, showing that SSH performed the best for easy cases, whereas Tiny Face performed the best for hard cases with cluttered backgrounds. Among small instance segmentation methods, U-Net achieved slightly better performance than Mask R-CNN. Thirdly, we proposed a new graph neural network-based object detection algorithm, called GODM, to take the spatial information of candidate objects into consideration in small object detection. Instead of detecting small objects independently as the existing deep learning methods do, GODM treats the candidate bounding boxes generated by existing object detectors as nodes and creates edges based on the spatial or semantic relationship between the candidate bounding boxes. GODM contains four major components: node feature generation, graph generation, node class labelling, and graph convolutional neural network model. Several graph generation methods were proposed. Experimental results on the LBDA dataset show that GODM outperformed existing state-of-the-art object detector Faster R-CNN significantly, up to 12% better in accuracy. Finally, we proposed a new computer vision-based grass analysis using machine learning. To deal with the variation of lighting condition, a two-stage segmentation strategy is proposed for grass coverage computation based on a blackboard background. On a real world dataset we collected from natural environments, the proposed method was robust to varying environments, lighting, and colors. For grass detection and coverage computation, the error rate was just 3%.
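The graph generation step described for GODM (candidate bounding boxes as nodes, edges from spatial relationships) can be illustrated with a small sketch. The abstract says several graph generation methods were proposed but gives no details, so the rule used here (connect two boxes when their centres lie within a distance threshold) is only one hypothetical choice. The names `box_center`, `build_spatial_graph` and `max_distance` are illustrative.

```python
import math


def box_center(box):
    """Centre point of a box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def build_spatial_graph(boxes, max_distance):
    """Build an adjacency list with one node per candidate box.

    Assumption: an edge joins two boxes whose centres are at most
    `max_distance` apart. This is a hypothetical spatial rule, not
    necessarily one of the methods used in the dissertation.
    """
    centers = [box_center(box) for box in boxes]
    adjacency = {index: [] for index in range(len(boxes))}
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if math.dist(centers[i], centers[j]) <= max_distance:
                # Undirected edge: record it on both endpoints.
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


candidates = [(0, 0, 10, 10), (12, 0, 22, 10), (100, 100, 110, 110)]
print(build_spatial_graph(candidates, max_distance=20))
```

In this example the first two boxes are neighbours and the third is isolated. A graph neural network would then classify each node using its own features together with those of its neighbours.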

An Approach to Improve SSD through Skip Connection of Multiscale Feature Maps

Computational Intelligence and Neuroscience ◽

10.1155/2020/2936920 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Xiaoguo Zhang ◽

Ye Gao ◽

Fei Ye ◽

Qihan Liu ◽

Kaixin Zhang

Keyword(s):

Object Detection ◽

Real Time ◽

Semantic Information ◽

Feature Fusion ◽

Poor Performance ◽

Detection Performance ◽

Single Shot ◽

Small Object ◽

Feature Maps ◽

Input Size

SSD (Single Shot MultiBox Detector) is one of the best object detection algorithms and provides highly accurate object detection in real time. However, SSD performs relatively poorly on small object detection because its shallow prediction layer, which is responsible for detecting small objects, lacks sufficient semantic information. To overcome this problem, this paper proposes SKIPSSD, an improved SSD with a novel skip connection of multiscale feature maps, which enhances the semantic information and the details of the prediction layers by fusing high-level and low-level feature maps through skip connections. For the fusion itself, we design two feature fusion modules and multiple fusion strategies to improve the SSD detector's sensitivity and perception ability. Experimental results on the PASCAL VOC2007 test set demonstrate that SKIPSSD significantly improves detection performance and outperforms many state-of-the-art object detectors. With an input size of 300 × 300, SKIPSSD achieves 79.0% mAP (mean average precision) at 38.7 FPS (frames per second) on a single 1080 GPU, 1.8% higher than the mAP of SSD while still keeping real-time detection speed.
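To make the mAP figure concrete, the sketch below computes average precision for one class with the 11-point interpolation defined by the PASCAL VOC2007 protocol; mAP is the mean of this value over all classes. The abstract does not state how its mAP was computed, so treating it as 11-point AP is an assumption. The sketch also assumes each detection has already been matched against the ground truth and marked as a true or false positive, and the function name is hypothetical.

```python
def average_precision_11pt(detections, num_ground_truth):
    """11-point interpolated average precision for a single class.

    detections: list of (confidence, is_true_positive) pairs.
    Assumption: matching detections to ground-truth boxes has already
    been done, so each detection carries a true/false positive flag.
    """
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda item: item[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_true_positive) in enumerate(ranked, start=1):
        if is_true_positive:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Average the best precision reachable at recall >= 0.0, 0.1, ..., 1.0.
    total = 0.0
    for step in range(11):
        threshold = step / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11


detections = [(0.9, True), (0.8, False), (0.7, True)]
print(average_precision_11pt(detections, num_ground_truth=2))
```

Because the false positive is ranked between the two true positives, precision at full recall drops to 2/3, which pulls the average below 1.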

A Practice for Object Detection Using YOLO Algorithm

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit217249 ◽

2021 ◽

pp. 268-272

Author(s):

Dr. Suwarna Gothane

Keyword(s):

Neural Network ◽

Object Detection ◽

Grid Cell ◽

Confidence Score ◽

Autonomous Driving ◽

Time Frame ◽

Input Image ◽

Small Time ◽

Bounding Boxes ◽

Over Time

When we look at images or videos, we can locate and identify objects of interest within moments. Passing this ability on to computers is object detection: locating an object and identifying it. Object detection has found applications in a wide variety of domains such as video surveillance, image retrieval systems, autonomous driving vehicles and many more. Various algorithms can be used for object detection, but we focus on the YOLOv3 algorithm. YOLO stands for "You Only Look Once". The YOLO model is very accurate and allows us to detect the objects present in a frame. YOLO follows a completely different approach: instead of selecting some regions, it applies a neural network to the entire image to predict bounding boxes and their probabilities. YOLO is a single deep convolutional neural network that splits the input image into a set of grid cells, so unlike image classification or face detection, each grid cell in the YOLO algorithm has an associated output vector that tells us whether an object exists in that grid cell, the class of that object, and the predicted bounding box for that object. The model here is progressive, so it learns more over time, increasing its prediction accuracy. The model makes many predictions in one frame and keeps the most accurate prediction, discarding the others. The predictions are made randomly, so if the model considers that there is an object in the frame that occupies only a few pixels, it will take that into consideration as well. To put it more precisely, the model creates bounding boxes around everything in the frame, makes a prediction for each box and picks the one with the highest confidence score. All this is done in a short time frame, which shows why this model is well suited to real-time use.
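Keeping the most confident box and discarding overlapping ones is commonly done with greedy non-maximum suppression. The sketch below is a generic illustration of that technique, not code from the paper. The `(x1, y1, x2, y2, confidence)` tuple layout, the function names and the 0.5 overlap threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over detections given as (x1, y1, x2, y2, confidence).

    Assumption: the 0.5 default threshold is illustrative; the abstract
    does not give the value used in the paper.
    """
    remaining = sorted(detections, key=lambda det: det[4], reverse=True)
    kept = []
    while remaining:
        # The most confident remaining box is always kept.
        best = remaining.pop(0)
        kept.append(best)
        # Discard boxes that overlap the kept box too strongly.
        remaining = [det for det in remaining
                     if iou(best[:4], det[:4]) < iou_threshold]
    return kept


detections = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # heavy overlap with the first box
    (20, 20, 30, 30, 0.7),  # separate object
]
print(non_max_suppression(detections))
```

Here the second box is suppressed because it overlaps the higher-scoring first box, while the third box survives because it covers a different region.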
