GLE-Net: A Global and Local Ensemble Network for Aerial Object Detection

Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Abstract: Recent advances in camera-equipped drone applications have increased the demand for deep-learning-based visual object detection algorithms for aerial images. A single deep learning model has several limitations in accuracy. Inspired by the observation that ensemble learning can significantly improve a model's generalization ability in machine learning, we introduce a novel integration strategy that combines the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to increase the quality of predictions by considering global weights for the different models and adjusting local weights for the bounding boxes. Specifically, the global module assigns different weights to the models. In the local module, we group the bounding boxes that correspond to the same object into a cluster. Each cluster generates a final predicted box, and the highest score in the cluster is assigned as the score of that box. Experiments on the VisDrone2019 benchmark show promising performance of GLE-Net compared with the baseline network.
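The cluster-and-fuse idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how boxes are grouped or how a cluster produces its final box. The sketch assumes greedy grouping by IoU against each cluster's highest-scoring box, a hypothetical threshold `iou_thr`, and a final box computed as an average weighted by (model weight × box score). Only the rule that the cluster's highest score becomes the final score comes from the abstract. The names `fuse_detections` and `model_weights` are illustrative.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float, int]       # (box, score, index of the source model)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(
    dets: Sequence[Detection],
    model_weights: Sequence[float],
    iou_thr: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Tuple[Box, float]]:
    """Group overlapping boxes into clusters and emit one box per cluster."""
    # Visit boxes from most to least confident so each cluster is seeded
    # by its highest-scoring member.
    ordered = sorted(dets, key=lambda d: d[1], reverse=True)
    clusters: List[List[Detection]] = []
    for det in ordered:
        for cluster in clusters:
            if iou(det[0], cluster[0][0]) >= iou_thr:
                cluster.append(det)
                break
        else:
            clusters.append([det])

    fused: List[Tuple[Box, float]] = []
    for cluster in clusters:
        # Assumed local weight: global model weight times the box score.
        weights = [model_weights[m] * s for _, s, m in cluster]
        total = sum(weights)
        if total <= 0:  # fall back to a plain average
            weights = [1.0] * len(cluster)
            total = float(len(cluster))
        box = tuple(
            sum(w * b[i] for w, (b, _, _) in zip(weights, cluster)) / total
            for i in range(4)
        )
        # From the abstract: the final score is the highest score in the cluster.
        score = max(s for _, s, _ in cluster)
        fused.append((box, score))
    return fused
```

For example, two overlapping boxes from different models plus one isolated box yield two fused boxes, and the first keeps the higher of the two overlapping scores.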

2020 ◽  
Vol 71 (7) ◽  
pp. 868-880
Author(s):  
Nguyen Hong-Quan ◽  
Nguyen Thuy-Binh ◽  
Tran Duc-Long ◽  
Le Thi-Lan

Along with the strong development of camera networks, video analysis systems have become more and more popular and have been applied in various practical applications. In this paper, we focus on the person re-identification (person ReID) task, which is a crucial step in video analysis systems. The purpose of person ReID is to associate multiple images of a given person moving through a non-overlapping camera network. Many efforts have been devoted to person ReID. However, most studies on person ReID deal only with well-aligned bounding boxes that are detected manually and considered perfect inputs for person ReID. In fact, when building a fully automated person ReID system, the quality of the two preceding steps, person detection and tracking, may have a strong effect on person ReID performance. The contribution of this paper is two-fold. First, a unified framework for person ReID based on deep learning models is proposed. This framework couples a deep neural network for person detection with a deep-learning-based tracking method. In addition, features extracted from an improved ResNet architecture are proposed for person representation to achieve higher ReID accuracy. Second, our self-built dataset is introduced and employed to evaluate all three steps of the fully automated person ReID framework.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2834
Author(s):  
Billur Kazaz ◽  
Subhadipto Poddar ◽  
Saeed Arabi ◽  
Michael A. Perez ◽  
Anuj Sharma ◽  
...  

Construction activities typically create large amounts of ground disturbance, which can lead to increased rates of soil erosion. Construction stormwater practices are used on active jobsites to protect downstream waterbodies from offsite sediment transport. Federal and state regulations require routine pollution prevention inspections to ensure that temporary stormwater practices are in place and performing as intended. This study addresses the existing challenges and limitations in construction stormwater inspections and presents a unique approach for performing unmanned aerial system (UAS)-based inspections. Deep learning-based object detection principles were applied to identify and locate practices installed on active construction sites. The system integrates a post-processing stage that clusters the detection results. The developed framework consists of data preparation with aerial inspections, model training, validation of the model, and testing for accuracy. The model was developed from 800 aerial images and was used to detect four different types of construction stormwater practices with a mean average precision (mAP) of 100% and minimal false positive detections. Results indicate that object detection could be implemented on UAS-acquired imagery as a novel approach to construction stormwater inspections and provide accurate results for site plan comparisons by rapidly detecting the quantity and location of field-installed stormwater practices.


2021 ◽  
Vol 23 (11) ◽  
pp. 159-165
Author(s):  
JAYANTH DWIJESH H P ◽  
SANDEEP S V ◽  
RASHMI S ◽  
...  

In today’s world, accurate and fast information is vital for safe aircraft landings. The purpose of an EMAS (Engineered Materials Arresting System) is to stop an aeroplane that overruns the runway, with no human injury and minimal damage to the aircraft. Although various algorithms for object detection analysis have been developed, only a few researchers have examined image analysis as a landing aid. In one system, image intensity edges are used to detect the sides of a runway in an image sequence, allowing the runway’s three-dimensional position and orientation to be estimated. A fuzzy network system is used to improve object detection and extraction from aerial images. In another system, multi-scale, multiplatform imagery is used to combine physiologically and geometrically inspired algorithms for recognizing objects in hyperspectral and/or multispectral (HS/MS) imagery. However, the similarity of runways, buildings, highways, and other objects when viewed from above is a disadvantage of these methods. We propose a new method for detecting and tracking the runway based on pattern matching and texture analysis of digital images captured by aircraft cameras. Edge detection techniques are used to recognize runways in aerial images. The algorithms employed in this paper are the Hough transform, the Canny filter, and the Sobel filter, which result in efficient detection.
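As a rough illustration of the Sobel step named in this abstract, the following pure-Python sketch computes the gradient magnitude of a small grayscale image. It is not the paper's pipeline: the Canny and Hough stages are omitted, the nested-list image type and the function name `sobel_magnitude` are assumptions made for illustration, and a real system would normally rely on an image-processing library instead.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img: Image) -> Image:
    """Return the Sobel gradient magnitude; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


# A vertical step edge: dark on the left, bright on the right.
step = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = sobel_magnitude(step)
```

On the step image, the response is nonzero only in the two columns next to the intensity jump, which is the kind of edge a runway boundary would produce in an aerial frame.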


2021 ◽  
Vol 23 (06) ◽  
pp. 47-57
Author(s):  
Aditya Kulkarni ◽  
Manali Munot ◽  
Sai Salunkhe ◽  
Shubham Mhaske ◽  
...  

With the development of technologies from serial to parallel computing, GPUs, AI, and deep learning models, a series of tools for processing complex images has been developed. The main focus of this research is to compare various algorithms (pre-trained models) and their contributions to processing complex images in terms of performance, accuracy, time, and limitations. The pre-trained models we use are CNN, R-CNN, R-FCN, and YOLO. These models are Python-based and use libraries such as TensorFlow and OpenCV, along with free image databases (Microsoft COCO and PASCAL VOC 2007/2012). They aim not only at object detection but also at building bounding boxes around the appropriate locations. Through this review, we gain a clearer view of these models and their performance, and a good idea of which models are ideal for various situations.


2021 ◽  
Vol 2113 (1) ◽  
pp. 012045
Author(s):  
Chunlei Zhou ◽  
Xiangzhou Chen ◽  
Wenli Liu ◽  
Tianyu Dong ◽  
Huang Yun

Abstract: With the number of traction substations increasing year by year, manual inspections are gradually being replaced by unattended inspections. Target detection algorithms based on deep learning are increasingly used in intelligent inspections of power equipment. However, in practical applications, because the targets to be detected are small, the accuracy of the deep learning model decreases when the shooting angle is inclined and the lighting conditions are poor. This is because the algorithm’s robustness is low, and the detection ability of the model is seriously affected when the angle or illumination differs greatly from that of the training samples. To address this, the feature fusion part of the YOLOv3 algorithm, the selection of the loss function, and the size of the anchor boxes are improved, and the improved ASFF fusion method is used to classify various images of power equipment. Actual measurements and repeated experiments show that the proposed method can be effectively applied to image recognition of various power equipment, improve robustness, and greatly improve the image recognition efficiency for power equipment.


2021 ◽  
Author(s):  
Sujata Butte ◽  
Aleksandar Vakanski ◽  
Kasia Duellman ◽  
Haotian Wang ◽  
Amin Mirkouei

Author(s):  
Vibhavari B Rao

Today's crime rates can put civilians' lives in danger. While consistent efforts are being made to reduce crime, there is also a dire need for a smart and proactive surveillance system. Our project implements a smart surveillance system that alerts the authorities in real time when a crime is being committed. During armed robberies and hostage situations, the police often cannot reach the scene in time to intervene, owing to the lag in communication between informants at the crime scene and the police. We propose an object detection model that uses deep learning algorithms to detect weapons such as pistols, knives, and rifles in video surveillance footage and, in turn, sends real-time alerts to the authorities. A number of object detection algorithms are being developed, each evaluated using the mAP performance metric. Implementing Faster R-CNN with a ResNet-101 architecture, we found the mAP score to be about 91%. However, the downside is the excessive training and inference time it incurs. On the other hand, the YOLOv5 architecture resulted in a model that performed very well in terms of speed. Its training speed was found to be 0.012 s/image but, as expected, its accuracy was not as high as that of Faster R-CNN. With good computer hardware, it can run at about 40 fps. Thus, there is a tradeoff between speed and accuracy, and it is important to strike a balance. We use transfer learning to improve accuracy by training the model on our custom dataset. This project can be deployed on any generic CCTV camera by setting up a live RTSP (Real Time Streaming Protocol) stream and streaming the footage to a laptop or desktop where the deep learning model is running.


2020 ◽  
Vol 12 (9) ◽  
pp. 1435
Author(s):  
Chengyuan Li ◽  
Bin Luo ◽  
Hailong Hong ◽  
Xin Su ◽  
Yajun Wang ◽  
...  

Unlike object detection in natural images, object detection in optical remote sensing images is a challenging task because of diverse meteorological conditions, complex backgrounds, varied orientations, scale variations, and so on. In this paper, to address this issue, we propose a novel object detection network (the global-local saliency constraint network, GLS-Net) that makes full use of global semantic information and achieves more accurate oriented bounding boxes. More precisely, to improve the quality of the region proposals and bounding boxes, we first propose a saliency pyramid, which combines a saliency algorithm with a feature pyramid network, to reduce the impact of complex backgrounds. Based on the saliency pyramid, we then propose a global attention module branch to enhance the semantic connection between the target and the global scene. A fast feature fusion strategy is also used to combine the local object information based on the saliency pyramid with the global semantic information optimized by the attention mechanism. Finally, we use an angle-sensitive intersection over union (IoU) method to obtain a more accurate five-parameter representation of the oriented bounding boxes. Experiments with a publicly available object detection dataset for aerial images demonstrate that the proposed GLS-Net achieves state-of-the-art detection performance.
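To make the five-parameter representation concrete, the sketch below converts an oriented box into its four corner points. The abstract does not define the parameters, so the sketch assumes the common convention (cx, cy, w, h, θ), with θ in radians measured as a rotation about the box centre. The helper names `obb_to_corners` and `polygon_area` are illustrative, and the paper's angle-sensitive IoU itself is not reproduced here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def obb_to_corners(cx: float, cy: float, w: float, h: float, theta: float) -> List[Point]:
    """Corners of an oriented box given as (centre x, centre y, width, height, angle).

    Assumed convention: theta is in radians and rotates the box about its centre.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, listed in order around the perimeter.
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta, then translate to the centre.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def polygon_area(pts: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    area = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


corners = obb_to_corners(0.0, 0.0, 4.0, 2.0, math.pi / 2)
```

Rotation leaves the area equal to w × h, so two boxes with the same centre and size but different angles differ only in overlap. That is why an IoU measure for oriented boxes has to account for the angle.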

