Potato crop stress identification in aerial images using deep learning-based object detection

2021 ◽  
Author(s):  
Sujata Butte ◽  
Aleksandar Vakanski ◽  
Kasia Duellman ◽  
Haotian Wang ◽  
Amin Mirkouei
Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2834
Author(s):  
Billur Kazaz ◽  
Subhadipto Poddar ◽  
Saeed Arabi ◽  
Michael A. Perez ◽  
Anuj Sharma ◽  
...  

Construction activities typically create large amounts of ground disturbance, which can lead to increased rates of soil erosion. Construction stormwater practices are used on active jobsites to protect downstream waterbodies from offsite sediment transport. Federal and state regulations require routine pollution prevention inspections to ensure that temporary stormwater practices are in place and performing as intended. This study addresses the existing challenges and limitations of construction stormwater inspections and presents a unique approach for performing unmanned aerial system (UAS)-based inspections. Deep learning-based object detection principles were applied to identify and locate practices installed on active construction sites. The system integrates a post-processing stage that clusters the detection results. The developed framework consists of data preparation with aerial inspections, model training, validation of the model, and testing for accuracy. The model was trained on 800 aerial images and was used to detect four different types of construction stormwater practices at 100% mean average precision (mAP) with minimal false positive detections. Results indicate that object detection could be implemented on UAS-acquired imagery as a novel approach to construction stormwater inspections and provide accurate results for site plan comparisons by rapidly detecting the quantity and location of field-installed stormwater practices.
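The mAP figure reported above rests on matching predicted boxes to ground-truth boxes by intersection-over-union (IoU). A minimal sketch of that matching step, assuming axis-aligned (x1, y1, x2, y2) boxes and a conventional 0.5 IoU threshold (the function names are illustrative, not from the paper):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_detections(preds, truths, iou_thresh=0.5):
    """Greedily match predictions to ground truth; returns (TP, FP, FN)."""
    unmatched = list(truths)
    tp = 0
    for p in preds:
        # Pick the still-unmatched ground-truth box with the best overlap.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= iou_thresh:
            tp += 1
            unmatched.remove(best)
    return tp, len(preds) - tp, len(unmatched)
```

Precision and recall per practice type follow directly from these counts, and averaging the resulting average precisions over the four practice classes gives the mAP.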


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 583 ◽  
Author(s):  
Khang Nguyen ◽  
Nhut T. Huynh ◽  
Phat C. Nguyen ◽  
Khanh-Duy Nguyen ◽  
Nguyen D. Vo ◽  
...  

Unmanned aircraft systems, or drones, enable us to record or capture many scenes from a bird's-eye view, and they have rapidly been deployed across a wide range of practical domains, e.g., agriculture, aerial photography, fast delivery, and surveillance. Object detection is one of the core steps in understanding videos collected from drones. However, this task is very challenging due to the unconstrained viewpoints and low resolution of captured videos. While modern deep-learning object detectors have recently achieved great success on general benchmarks, e.g., PASCAL-VOC and MS-COCO, the robustness of these detectors on aerial images captured by drones is not well studied. In this paper, we present an evaluation of state-of-the-art deep-learning detectors, including Faster R-CNN (Faster Regional CNN), R-FCN (Region-based Fully Convolutional Networks), SNIPER (Scale Normalization for Image Pyramids with Efficient Resampling), Single-Shot Detector (SSD), YOLO (You Only Look Once), RetinaNet, and CenterNet, for object detection in videos captured by drones. We conduct experiments on the VisDrone2019 dataset, which contains 96 videos with 39,988 annotated frames, and provide insights into efficient object detectors for aerial images.


AI ◽  
2020 ◽  
Vol 1 (2) ◽  
pp. 166-179 ◽  
Author(s):  
Ziyang Tang ◽  
Xiang Liu ◽  
Hanlin Chen ◽  
Joseph Hupy ◽  
Baijian Yang

Unmanned Aerial Systems, hereafter referred to as UAS, are of great use in hazard events such as wildfires due to their ability to provide high-resolution video imagery over areas deemed too dangerous for manned aircraft and ground crews. This aerial perspective allows for identification of ground-based hazards such as spot fires and fire lines, and for communicating this information to firefighting crews. Current technology relies on visual interpretation of UAS imagery, with little to no computer-assisted automatic detection. With the help of big labeled data and the significant increase in computing power, deep learning has seen great success on object detection with fixed patterns, such as people and vehicles. However, little has been done for objects with amorphous and irregular shapes, such as spot fires. Additional challenges arise when data are collected via UAS as high-resolution aerial images or videos; a viable solution must provide reasonable accuracy with low delays. In this paper, we examined 4K (3840 × 2160) videos collected by UAS from a controlled burn and created a set of labeled video sets to be shared for public use. We introduce a coarse-to-fine framework to auto-detect wildfires that are sparse, small, and irregularly shaped. The coarse detector adaptively selects the sub-regions that are likely to contain the objects of interest, while the fine detector processes only those sub-regions, rather than the entire 4K frame, for further scrutiny. The proposed two-phase learning therefore greatly reduces time overhead while maintaining high accuracy. Compared against the real-time one-stage object detection backbone YOLOv3, the proposed method improved the mean average precision (mAP) from 0.29 to 0.67, with an average inference speed of 7.44 frames per second. Limitations and future work are discussed with regard to the design and the experimental results.
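The coarse-to-fine idea described above can be sketched as follows: a cheap coarse pass flags candidate regions, and the fine detector runs only on fixed-size crops around them instead of the whole 4K frame. The two model callables, the tile size, and the score threshold are assumptions for illustration, not the paper's actual configuration.

```python
def coarse_to_fine(frame, coarse_model, fine_model, tile=640, score_thresh=0.3):
    """Detect objects by refining coarse hits on fixed-size crops.

    Both models are callables returning (x0, y0, x1, y1, score) tuples;
    fine results are shifted from crop coordinates back to frame coordinates.
    """
    h, w = frame.shape[:2]
    detections = []
    for x0, y0, x1, y1, score in coarse_model(frame):
        if score < score_thresh:
            continue  # Ignore low-confidence coarse hits.
        # Center a tile-sized crop on the coarse hit, clamped to the frame.
        cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
        tx0 = max(0, min(cx - tile // 2, w - tile))
        ty0 = max(0, min(cy - tile // 2, h - tile))
        crop = frame[ty0:ty0 + tile, tx0:tx0 + tile]
        # The fine detector sees only the crop; shift its boxes back.
        for fx0, fy0, fx1, fy1, fscore in fine_model(crop):
            detections.append((fx0 + tx0, fy0 + ty0, fx1 + tx0, fy1 + ty0, fscore))
    return detections
```

The speedup comes from the fine detector touching only a handful of 640-pixel tiles per frame rather than all of the 3840 × 2160 pixels.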


Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Recent advances in camera-equipped drone applications have increased the demand for visual object detection algorithms with deep learning for aerial images. A single deep learning model has several limitations in accuracy. Inspired by the fact that ensemble learning can significantly improve a model's generalization ability, we introduce a novel integration strategy to combine the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to increase the quality of predictions by considering global weights for different models and adjusting local weights for bounding boxes. Specifically, the global module assigns different weights to models. In the local module, we group the bounding boxes corresponding to the same object into a cluster. Each cluster generates a final predicted box and assigns the highest score in the cluster as the score of that box. Experiments on the VisDrone2019 benchmark show promising performance of GLE-Net compared with the baseline network.
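A small sketch of NMS-free fusion in the spirit of GLE-Net's local module: boxes from several detectors are grouped into clusters by IoU overlap, each cluster's coordinates are averaged (weighted here by model weight × box score), and the highest score in the cluster becomes the fused box's score. The weighting and clustering details are illustrative assumptions, not the paper's exact formulation.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def fuse_detections(model_outputs, model_weights, iou_thresh=0.55):
    """Fuse per-model lists of (box, score) without non-maximum suppression."""
    # Global step: scale each box's score by its model's global weight.
    scored = [(box, score * w)
              for boxes, w in zip(model_outputs, model_weights)
              for box, score in boxes]
    # Local step: greedily cluster boxes that overlap the same object.
    clusters = []
    for box, ws in sorted(scored, key=lambda t: -t[1]):
        for cluster in clusters:
            if iou(box, cluster[0][0]) >= iou_thresh:
                cluster.append((box, ws))
                break
        else:
            clusters.append([(box, ws)])
    # Each cluster yields one box: weighted-average coordinates, max score.
    fused = []
    for cluster in clusters:
        total = sum(ws for _, ws in cluster)
        coords = tuple(sum(b[i] * ws for b, ws in cluster) / total
                       for i in range(4))
        fused.append((coords, max(ws for _, ws in cluster)))
    return fused
```

Unlike NMS, no box is discarded: overlapping predictions from different models reinforce one another instead of suppressing each other.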


2021 ◽  
Author(s):  
Mirela Beloiu ◽  
Dimitris Poursanidis ◽  
Samuel Hoffmann ◽  
Nektarios Chrysoulakis ◽  
Carl Beierkuhnlein

<p>Recent advances in deep learning techniques for object detection and the availability of high-resolution images facilitate the analysis of both temporal and spatial vegetation patterns in remote areas. High-resolution satellite imagery has been used successfully to detect trees in small areas with homogeneous rather than heterogeneous forests, in which single tree species have a strong contrast compared to their neighbors and the landscape. However, no research to date has detected trees at the treeline in the remote and complex heterogeneous landscape of Greece using deep learning methods. We integrated high-resolution aerial images, climate data, and topographical characteristics to study treeline dynamics over 70 years in the Samaria National Park on the Mediterranean island of Crete, Greece. We combined mapping techniques with deep learning approaches to detect and analyze spatio-temporal dynamics in treeline position and tree density. We used visual image interpretation to detect single trees on high-resolution aerial imagery from 1945, 2008, and 2015. Using the RGB aerial images from 2008 and 2015, we tested a Convolutional Neural Network (CNN) object-detection approach (SSD) and a CNN-based segmentation technique (U-Net). Based on the mapping and deep learning approaches, we did not detect a shift in treeline elevation over the last 70 years, despite warming, although tree density has increased. However, we show that the CNN approach accurately detects and maps tree position and density at the treeline. We also reveal that treeline elevation on Crete varies with topography, decreasing from the southern to the northern study sites. We explain these differences between study sites by the long-term interaction between topographical characteristics and meteorological factors. The study highlights the feasibility of using deep learning and high-resolution imagery as a promising technique for monitoring forests in remote areas.</p>


2017 ◽  
Author(s):  
Lars W. Sommer ◽  
Tobias Schuchert ◽  
Jürgen Beyerer

2019 ◽  
Author(s):  
Peng Sun

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT REQUEST OF AUTHOR.] With the widespread usage of many different types of sensors in recent years, large amounts of diverse and complex sensor data have been generated and analyzed to extract useful information. This dissertation focuses on two types of data: aerial images and physiological sensor data. Several new methods based on deep learning techniques have been proposed to advance the state of the art in analyzing these data. For aerial images, a new method for designing effective loss functions for training deep neural networks for object detection, called adaptive salience biased loss (ASBL), has been proposed. In addition, several state-of-the-art deep neural network models for object detection, including RetinaNet, U-Net, and YOLO, have been adapted and modified to achieve improved performance on a new set of real-world aerial images for bird detection. For physiological sensor data, a deep learning method for alcohol usage detection, called Deep ADA, has been proposed to improve the automatic detection of alcohol usage (ADA) system, which is a statistical data analysis pipeline that detects drinking episodes based on wearable physiological sensor data collected from real subjects. Object detection in aerial images remains a challenging problem due to low image resolutions, complex backgrounds, and variations in the sizes and orientations of objects. The new ASBL method has been designed for training deep neural network object detectors to achieve improved performance. ASBL can be implemented at the image level (image-based ASBL) or at the anchor level (anchor-based ASBL). The method computes saliency information for input images and for anchors generated by deep neural network object detectors, and weights training examples and anchors differently based on their corresponding saliency measurements.
It gives complex images and difficult targets more weight during training. In our experiments using two of the largest public benchmark data sets of aerial images, DOTA and NWPU VHR-10, the existing RetinaNet was trained using ASBL to generate a one-stage detector, ASBL-RetinaNet. ASBL-RetinaNet significantly outperformed the original RetinaNet by 3.61 mAP and 12.5 mAP on the two data sets, respectively. In addition, ASBL-RetinaNet outperformed 10 other state-of-the-art object detection methods. To improve bird detection in aerial images, the Little Birds in Aerial Imagery (LBAI) dataset has been created from real-life aerial imagery data. LBAI contains various flocks and species of birds that are small in size, ranging from 10 by 10 pixels to 40 by 40 pixels. The dataset was labeled and further divided into two subsets, Easy and Hard, based on the complexity of the background. We have applied and improved some of the best deep learning models on LBAI images, including object detection techniques such as YOLOv3, SSD, and RetinaNet, and semantic segmentation techniques such as U-Net and Mask R-CNN. Experimental results show that RetinaNet performed the best overall, outperforming the other models by 1.4 and 4.9 F1 points on the Easy and Hard LBAI subsets, respectively. For physiological sensor data analysis, Deep ADA has been developed to extract features from physiological signals and predict the alcohol usage of real subjects in their daily lives. Features are extracted using Convolutional Neural Networks without any human intervention. A large amount of unlabeled data has been used in an unsupervised learning manner to improve the quality of the learned features. The method outperformed traditional feature extraction methods by up to 19% in accuracy.
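The image-level weighting idea can be sketched as follows; the linear weighting curve, the `bias` parameter, and the saliency score in [0, 1] are illustrative assumptions, not the dissertation's actual ASBL formulation.

```python
def saliency_weight(saliency_score, bias=2.0):
    """Map a per-image saliency score in [0, 1] to a loss weight, so that
    complex (high-saliency) images contribute more to training."""
    return 1.0 + bias * saliency_score

def weighted_detection_loss(per_image_losses, saliency_scores, bias=2.0):
    """Weight each image's detection loss by its saliency before averaging."""
    weighted = [loss * saliency_weight(s, bias)
                for loss, s in zip(per_image_losses, saliency_scores)]
    return sum(weighted) / len(weighted)
```

Anchor-based ASBL would apply the same idea one level down, weighting individual anchors by the saliency of the region they cover rather than weighting whole images.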


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches has been developed since the 1990s. In recent years, saliency detection has become one of the most actively studied topics in the theory of Convolutional Neural Networks (CNNs). Many original solutions using CNNs have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis conducted through human eye tracking and digital image processing. Results: The survey reflects recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos has become possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. We also present a short description of public image and video datasets with annotated salient objects or events, as well as the metrics commonly used to evaluate results. Practical relevance: This survey is a contribution to the study of rapidly developing deep learning methods for saliency detection in images and videos.


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1718
Author(s):  
Chien-Hsing Chou ◽  
Yu-Sheng Su ◽  
Che-Ju Hsu ◽  
Kong-Chang Lee ◽  
Ping-Hsuan Han

In this study, we designed a four-dimensional (4D) audiovisual entertainment system called Sense. This system comprises a scene recognition system and hardware modules that provide haptic sensations for users when they watch movies and animations at home. In the scene recognition system, we used Google Cloud Vision to detect common scene elements in a video, such as fire, explosions, wind, and rain, and to further determine whether the scene depicts hot weather, rain, or snow. Additionally, for animated videos, we applied deep learning with a single shot multibox detector to detect whether the animated video contained scenes of fire-related objects. The hardware module was designed to deliver six types of haptic sensations, arranged with line symmetry, for a better user experience. Based on the object detection results from the scene recognition system, the system generates the corresponding haptic sensations. The system integrates deep learning, auditory signals, and haptic sensations to provide an enhanced viewing experience.
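The dispatch from recognized scene elements to haptic output can be sketched as a simple lookup; the label strings and channel names below are assumptions for illustration, not the actual Sense configuration.

```python
# Hypothetical mapping from recognized scene elements to haptic channels.
HAPTIC_MAP = {
    "fire": "heat",
    "explosion": "vibration",
    "wind": "fan",
    "rain": "water",
    "snow": "cold",
    "hot_weather": "heat",
}

def haptics_for(detected_labels):
    """Return the set of haptic channels to activate for a frame's detections.

    Unknown labels (objects with no haptic counterpart) are ignored.
    """
    return {HAPTIC_MAP[label] for label in detected_labels if label in HAPTIC_MAP}
```

Using a set means repeated detections of the same element in one frame trigger each haptic channel only once.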

