Individual Sick Fir Tree (Abies mariesii) Identification in Insect Infested Forests by Means of UAV Images and Deep Learning

2021 ◽  
Vol 13 (2) ◽  
pp. 260
Author(s):  
Ha Trang Nguyen ◽  
Maximo Larry Lopez Caceres ◽  
Koma Moritake ◽  
Sarah Kentsch ◽  
Hase Shu ◽  
...  

Insect outbreaks are a recurrent natural phenomenon in forest ecosystems that is expected to increase due to climate change. Recent advances in Unmanned Aerial Vehicles (UAVs) and Deep Learning (DL) networks provide us with tools to monitor them. In this study we used nine orthomosaics and normalized Digital Surface Models (nDSM) to detect and classify healthy and sick Maries fir trees as well as deciduous trees. The study aims to automatically classify treetops by means of a novel computer vision treetop detection algorithm and the adaptation of existing DL architectures. For detection alone, the accuracy was 85.70%. For detection and classification together, we correctly detected and classified 78.59% of all tree classes (39.64% for sick fir). With data augmentation, however, the detection/classification rate for the sick fir class rose to 73.01%, at the cost of the overall accuracy across all tree classes, which dropped to 63.57%. The implementation of UAV, computer vision and DL techniques contributes to the development of a new approach for evaluating the impact of insect outbreaks in forests.
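As a rough illustration of the class-rebalancing idea described above (augmenting and oversampling the under-represented sick-fir class), the following Python sketch uses torchvision and a weighted sampler; the directory layout, class names, and transform choices are illustrative assumptions and not taken from the paper.

```python
# Hypothetical sketch: oversample the rare "sick fir" class with simple
# geometric augmentations before training a tree-crown classifier.
# Paths, class names and augmentation choices are illustrative assumptions.
import torchvision.transforms as T
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, WeightedRandomSampler

augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomVerticalFlip(p=0.5),      # crowns viewed from above have no fixed "up"
    T.RandomRotation(degrees=90),
    T.ColorJitter(brightness=0.2, contrast=0.2),
    T.Resize((224, 224)),
    T.ToTensor(),
])

dataset = ImageFolder("treetop_crops/", transform=augment)  # healthy_fir / sick_fir / deciduous
# Weight samples inversely to class frequency so sick-fir crops are drawn more often.
counts = [0] * len(dataset.classes)
for _, label in dataset.samples:
    counts[label] += 1
weights = [1.0 / counts[label] for _, label in dataset.samples]
sampler = WeightedRandomSampler(weights, num_samples=len(weights), replacement=True)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)
```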


2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Malte Seemann ◽  
Lennart Bargsten ◽  
Alexander Schlaefer

Deep learning methods produce promising results when applied to a wide range of medical imaging tasks, including segmentation of the artery lumen in computed tomography angiography (CTA) data. However, to perform well, neural networks have to be trained on large amounts of high-quality annotated data. In the realm of medical imaging, annotations are not only quite scarce but also often not entirely reliable. To tackle both challenges, we developed a two-step approach for generating realistic synthetic CTA data for the purpose of data augmentation. In the first step, moderately realistic images are generated in a purely numerical fashion. In the second step, these images are improved by applying neural domain adaptation. We evaluated the impact of synthetic data on lumen segmentation via convolutional neural networks (CNNs) by comparing the resulting performances. Improvements of up to 5% in terms of the Dice coefficient and 20% for the Hausdorff distance represent a proof of concept that the proposed augmentation procedure can be used to enhance deep learning-based segmentation of the artery lumen in CTA images.
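For reference, the two reported metrics can be computed for binary segmentation masks as in the sketch below; the array names and the use of SciPy's directed Hausdorff routine are illustrative choices, not necessarily those of the study.

```python
# Illustrative sketch of the two reported metrics for binary segmentation masks.
# `pred` and `truth` are boolean numpy arrays of identical shape (assumptions).
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred: np.ndarray, truth: np.ndarray) -> float:
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum() + 1e-8)

def hausdorff_distance(pred: np.ndarray, truth: np.ndarray) -> float:
    # Symmetric Hausdorff distance between the two foreground point sets.
    p = np.argwhere(pred)
    t = np.argwhere(truth)
    return max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])
```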


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose Laser-based measurement techniques offer various advantages over conventional measurement techniques, such as being non-destructive and non-contact, fast, and capable of long measuring distances. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors to accomplish automatic measurement. To solve this problem, this paper proposes a novel method. Design/methodology/approach We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors, realizing automatic recognition against different backgrounds. Compared with two other deep learning algorithms, experiments show that the recognition rate of Mask RCNN is better, especially for small-scale targets. Based on this, an ellipse detection algorithm is introduced to obtain the ellipses of retroreflectors from the recognized target areas. The center coordinates of the retroreflectors in the camera coordinate system are then obtained using a mathematical method. Findings To verify the accuracy of this method, an experiment was carried out: the distance between two retroreflectors with a known separation of 1,000.109 mm was measured, with a root-mean-square error of 2.596 mm, meeting the requirements of the coarse location of retroreflectors. Research limitations/implications The research limitations/implications are as follows: (i) As the data set only has 200 pictures, although we have used some data augmentation methods such as rotating, mirroring and cropping, there is still room for improvement in the generalization ability of detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light. Originality/value The originality/value of the article lies in being able to obtain the center coordinates of multiple retroreflectors automatically, even against a cluttered background; being able to recognize retroreflectors of different sizes, especially small targets; meeting the recognition requirement of multiple targets in a large field of view; and obtaining 3D centers of targets by monocular model-based vision.
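A rough sketch of the detect-then-fit pipeline described above is given below, using torchvision's off-the-shelf Mask R-CNN (with a ResNet-50 + FPN backbone, whereas the paper trains a ResNet-101 + FPN backbone on its own data) and OpenCV ellipse fitting; the file name, confidence threshold, and mask post-processing are assumptions.

```python
# Hypothetical sketch: detect a target with Mask R-CNN, then fit an ellipse
# to its mask contour. Backbone, threshold and path are assumptions; the paper
# uses a ResNet-101 + FPN backbone trained on its own retroreflector data.
import cv2
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = cv2.cvtColor(cv2.imread("retroreflector.jpg"), cv2.COLOR_BGR2RGB)
with torch.no_grad():
    output = model([to_tensor(image)])[0]

for mask, score in zip(output["masks"], output["scores"]):
    if score < 0.7:          # confidence threshold (assumption)
        continue
    binary = (mask[0].numpy() > 0.5).astype("uint8")
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contour = max(contours, key=cv2.contourArea)
    if len(contour) >= 5:    # cv2.fitEllipse needs at least 5 points
        (cx, cy), (major, minor), angle = cv2.fitEllipse(contour)
        print(f"ellipse center in pixels: ({cx:.1f}, {cy:.1f})")
```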


Author(s):  
Du Chunqi ◽  
Shinobu Hasegawa

In computer vision and computer graphics, 3D reconstruction is the process of capturing the shapes and appearances of real objects. 3D models can be constructed either by active methods, which use high-quality scanning equipment, or by passive methods, which learn from a dataset. However, both of these methods only aim to construct the 3D models, without showing which factors affect their generation. Therefore, the goal of this research is to apply deep learning to automatically generate 3D models and to find the latent variables that affect the reconstruction process. Existing research shows that GANs can be trained on little data using two networks, called the Generator and the Discriminator. The Generator produces synthetic data, and the Discriminator distinguishes the Generator's output from real data. Existing research also shows that InfoGAN can maximize the mutual information between latent variables and observations. In our approach, we generate 3D models based on InfoGAN and design two constraints, a shape constraint and a parameter constraint. The shape constraint uses data augmentation to restrict the synthetic data to the models' profiles. At the same time, the parameter constraint is employed to find the relationship between the 3D models and the corresponding latent variables. Furthermore, our approach takes on the challenge of building a 3D model generation architecture on InfoGAN. Finally, in the process of generation, we may discover how the latent variables that influence the 3D models contribute to the whole network.
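To make the InfoGAN-style setup above concrete, the sketch below assembles the generator input from unstructured noise plus structured latent codes (one categorical, two continuous) whose mutual information with the output an auxiliary network would later maximise; all dimensions are illustrative assumptions rather than the configuration used in this work.

```python
# Illustrative InfoGAN-style latent input: incompressible noise z plus
# structured codes c (one categorical, two continuous). An auxiliary Q network
# would predict c from generated samples so that the mutual information
# between c and the output can be maximised. All sizes are assumptions.
import torch
import torch.nn.functional as F

batch, noise_dim, n_categories, n_continuous = 16, 62, 10, 2

z = torch.randn(batch, noise_dim)                                   # unstructured noise
categorical = F.one_hot(torch.randint(0, n_categories, (batch,)),
                        num_classes=n_categories).float()           # discrete code
continuous = torch.rand(batch, n_continuous) * 2 - 1                # codes in [-1, 1]

latent = torch.cat([z, categorical, continuous], dim=1)             # generator input
print(latent.shape)  # torch.Size([16, 74])
```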


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Hassan Raza Bukhari ◽  
Rafia Mumtaz ◽  
Salman Inayat ◽  
Uferah Shafi ◽  
Ihsan Ul Haq ◽  
...  

Information ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 125 ◽  
Author(s):  
Alexander Buslaev ◽  
Vladimir I. Iglovikov ◽  
Eugene Khvedchenya ◽  
Alex Parinov ◽  
Mikhail Druzhinin ◽  
...  

Data augmentation is a commonly used technique for increasing both the size and the diversity of labeled training sets by leveraging input transformations that preserve the corresponding output labels. In computer vision, image augmentations have become a common implicit regularization technique to combat overfitting in deep learning models and are ubiquitously used to improve performance. While most deep learning frameworks implement basic image transformations, the list is typically limited to variations of flipping, rotating, scaling, and cropping. Moreover, image processing speed varies across existing image augmentation libraries. We present Albumentations, a fast and flexible open-source library for image augmentation that provides a wide variety of image transform operations and also serves as an easy-to-use wrapper around other augmentation libraries. We discuss the design principles that drove the implementation of Albumentations and give an overview of its key features and distinct capabilities. Finally, we provide examples of image augmentations for different computer vision tasks and demonstrate that Albumentations is faster than other commonly used image augmentation tools on most image transform operations.
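As a brief illustration of the kind of pipeline the library exposes (the specific transforms, probabilities, and file name below are arbitrary example choices):

```python
# Minimal example of composing an Albumentations pipeline; the specific
# transforms, probabilities and image path are illustrative choices.
import albumentations as A
import cv2

transform = A.Compose([
    A.RandomCrop(width=256, height=256),
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])

image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
augmented = transform(image=image)["image"]   # augmented numpy array, same HWC layout
```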


2021 ◽  
Vol 14 (4) ◽  
pp. 1-17
Author(s):  
Dilawar Ali ◽  
Steven Verstockt ◽  
Nico Van De Weghe

Rephotography is the process of recapturing a photograph of a location from the same perspective from which it was captured earlier. A rephotographed image is the best way to visualize and study the social changes of a location over time. Traditionally, only expert artists and photographers have been capable of producing a rephotograph of a specific location. Manual editing or human visual judgment, typically used to generate rephotographs, requires considerable precision and effort and is not always accurate. In the era of computer science and deep learning, computer vision techniques make it easier and faster to perform precise operations on an image. Many methodologies have been proposed for rephotography, but none of them is fully automatic. Some of these techniques require manual user input or multiple images of the same location with 3D point-cloud data, while others only offer suggestions to the user for performing rephotography. In historical records and archives, often only a single 2D image of a given location is available. Computational rephotography is challenging when only one image of a location, captured at a different time, is available, because it is difficult to recover the accurate perspective of a single 2D historical image. Moreover, building rephotography requires maintaining alignments and regular shapes. The features of a building may change over time, and in most cases it is not possible to use a feature detection algorithm to detect the key features. In this research paper, we propose a methodology to rephotograph house images by combining deep learning and traditional computer vision techniques. The purpose of this research is to rephotograph an image of the past based on a single image. This research will be helpful not only for computer scientists but also for history and cultural heritage scholars studying the social changes of a location during a specific time period, and it will allow users to go back in time to see how a specific place looked in the past. We achieve good, fully automatic rephotography results based on façade segmentation using only a single image.


2021 ◽  
Author(s):  
Lama Alsudias ◽  
Paul Rayson

BACKGROUND Twitter is a real-time messaging platform widely used by people and organisations to share information on many topics. It could potentially be useful to analyse tweets for infectious disease monitoring purposes, in order to reduce reporting lag time and to provide an independent, complementary source of data compared with traditional approaches. However, such analysis is currently not possible in the Arabic-speaking world due to the lack of basic building blocks for research. OBJECTIVE We collect around 4,000 Arabic tweets related to COVID-19 and influenza. We clean and label the tweets relative to the Arabic Infectious Diseases Ontology, which includes non-standard terminology, 11 core concepts, and 21 relations. The aim of this study is to analyse Arabic tweets to estimate their usefulness for health surveillance, understand the impact of informal terms on the analysis, show the effect of deep learning methods on the classification process, and identify the locations where the infection is spreading. METHODS We apply the multi-label classification techniques Binary Relevance, Classifier Chains, Label Powerset, Adapted Algorithm (MLkNN), NBSVM, BERT, and AraBERT to identify infected people. We also use Named Entity Recognition to predict the affected locations. RESULTS We achieve an F1-score of up to 88% in the influenza case study and 94% in the COVID-19 one. Adapting for non-standard terminology and informal language helps to improve accuracy by as much as 15%, with an average improvement of 8%. Deep learning methods achieve a Hamming loss of around 5% during classification. Our geo-location detection algorithm predicts the location of users from tweet content with an average accuracy of 54%. CONCLUSIONS This study provides two Arabic social media datasets for monitoring tweets related to influenza and COVID-19. It demonstrates the importance of including informal terms, which are regularly used by social media users, in the analysis. It also shows that BERT achieves good results when used with new terms in COVID-19 tweets. Finally, tweet content may contain useful information for determining the location of disease spread.
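A minimal sketch of the problem-transformation baselines named above (Binary Relevance, Classifier Chains, Label Powerset) using scikit-multilearn with a naive Bayes base classifier; the tweet texts and label matrix are tiny placeholders, not the study's data or its best-performing setup.

```python
# Hedged sketch of the problem-transformation multi-label baselines, using
# scikit-multilearn. X is a TF-IDF matrix and Y a binary label-indicator
# matrix; both are small placeholders for illustration only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import hamming_loss
from skmultilearn.problem_transform import BinaryRelevance, ClassifierChain, LabelPowerset

tweets = ["patient reports fever and cough", "flu spreading in the city"]  # placeholders
Y_train = np.array([[1, 0, 0], [0, 1, 1]])                                  # placeholder labels

X_train = TfidfVectorizer().fit_transform(tweets)

for Model in (BinaryRelevance, ClassifierChain, LabelPowerset):
    clf = Model(classifier=MultinomialNB())
    clf.fit(X_train, Y_train)
    pred = clf.predict(X_train)
    print(Model.__name__, "Hamming loss:", hamming_loss(Y_train, pred.toarray()))
```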


2020 ◽  
Vol 28 (1) ◽  
pp. 81-96
Author(s):  
José Miguel Buenaposada ◽  
Luis Baumela

In recent years we have witnessed significant progress in the performance of object detection in images. This advance stems from the use of rich discriminative features produced by deep models and the adoption of new training techniques. Although these techniques have been used extensively in mainstream deep learning-based models, analyzing their impact on alternative, computationally more efficient, ensemble-based approaches remains an open issue. In this paper we evaluate the impact of adopting data augmentation, bounding box refinement and multi-scale processing in the context of multi-class Boosting-based object detection. Our experiments show that the use of these training advancements significantly improves object detection performance.
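As one concrete example of the detection-oriented augmentations evaluated above, a horizontal flip must transform the ground-truth bounding boxes together with the image; the sketch below assumes boxes in (x_min, y_min, x_max, y_max) pixel coordinates, which is an illustrative assumption rather than the paper's exact setup.

```python
# Illustrative detection augmentation: flip an image horizontally and update
# its bounding boxes accordingly. The (x_min, y_min, x_max, y_max) pixel
# box format is an assumption for this sketch.
import numpy as np

def hflip_with_boxes(image: np.ndarray, boxes: np.ndarray):
    """image: HxWxC array; boxes: Nx4 array of (x_min, y_min, x_max, y_max)."""
    width = image.shape[1]
    flipped = image[:, ::-1, :].copy()
    new_boxes = boxes.copy().astype(float)
    new_boxes[:, 0] = width - boxes[:, 2]   # new x_min comes from the old x_max
    new_boxes[:, 2] = width - boxes[:, 0]   # new x_max comes from the old x_min
    return flipped, new_boxes

img = np.zeros((100, 200, 3), dtype=np.uint8)
bxs = np.array([[10, 20, 60, 80]], dtype=float)
flipped_img, flipped_bxs = hflip_with_boxes(img, bxs)
print(flipped_bxs)   # [[140.  20. 190.  80.]]
```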


Symmetry ◽  
2021 ◽  
Vol 13 (8) ◽  
pp. 1497
Author(s):  
Harold Achicanoy ◽  
Deisy Chaves ◽  
Maria Trujillo

Deep learning applications in computer vision require large volumes of representative data to obtain state-of-the-art results, due to the massive number of parameters to optimise in deep models. However, in industrial applications data are limited and have asymmetric distributions due to rare cases, legal restrictions, and high image-acquisition costs. Data augmentation based on deep learning generative adversarial networks, such as StyleGAN, has arisen as a way to create training data with symmetric distributions that may improve the generalisation capability of the built models. StyleGAN generates highly realistic images in a variety of domains as a data augmentation strategy but requires a large amount of data to build image generators. Thus, transfer learning in conjunction with generative models is used to build models from small datasets. However, there are no reports on the impact of the pre-trained generative models used for transfer learning. In this paper, we evaluate a StyleGAN generative model with transfer learning on different application domains (training with paintings, portraits, Pokémon, bedrooms, and cats) to generate target images with different levels of content variability: bean seeds (low variability), faces of subjects between 5 and 19 years old (medium variability), and charcoal (high variability). We used the first version of StyleGAN due to the large number of publicly available pre-trained models. The Fréchet Inception Distance was used to evaluate the quality of the synthetic images. We found that StyleGAN with transfer learning produced good-quality images, making it an alternative for generating realistic synthetic images in the evaluated domains.
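For reference, the Fréchet Inception Distance compares Gaussian statistics of Inception features from real and generated images; the sketch below implements that formula on pre-extracted feature arrays (feature extraction itself is omitted, and the arrays are random placeholders).

```python
# Illustrative FID computation from pre-extracted Inception feature vectors:
# FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^(1/2)).
# `real_feats` and `fake_feats` are placeholder (n_samples, n_features) arrays;
# real Inception pool features are typically 2048-dimensional.
import numpy as np
from scipy import linalg

def frechet_inception_distance(real_feats: np.ndarray, fake_feats: np.ndarray) -> float:
    mu_r, mu_g = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(fake_feats, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_g, disp=False)
    if np.iscomplexobj(covmean):          # numerical noise can introduce tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r + cov_g - 2.0 * covmean))

real_feats = np.random.randn(500, 64)        # placeholder feature arrays
fake_feats = np.random.randn(500, 64) + 0.1
print(frechet_inception_distance(real_feats, fake_feats))
```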

