Tree species detection and identification from UAV imagery to support tropical forest monitoring

Author(s):  
Loïc Dutrieux ◽  
Radhouene Azzabi ◽  
Sébastien Bauwens ◽  
Ulrich Gaël Bouka Dipelet ◽  
Olivier Chenoz ◽  
...  

As part of a project aiming to support FSC-certified logging concessions in their forest inventory and management tasks, we collected aerial imagery over 9000 ha of tropical forest in Northern Congo using long-range Unmanned Aerial Vehicles (UAVs). Once processed into orthomosaics, the aerial imagery is combined with reference training samples to train a deep learning object detection model (Faster R-CNN) capable of detecting trees and predicting their species. The remoteness and diversity of these forests make both data acquisition and the generation of a training dataset challenging. Unlike natural images containing common objects such as cars, bicycles, cats and dogs, there is no easy way to create a training dataset of tree species from overhead imagery of tropical forests. The first reason is that a human operator cannot as easily recognize and label the objects. The second reason is that the polymorphism of tree species, phenological variations and the uncertainty associated with visual recognition make the exhaustive labeling of all instances of each class very difficult, yet such exhaustive labeling is required to successfully train any object detection model. To overcome these challenges we built an interactive, ergonomic interface that allows a human operator to work in a spatial context, guided by the approximate geographic locations of already inventoried trees. We solved the issue of non-exhaustive instance labeling by building synthetic images, which allow full control of the training data. In addition to these developments related to training data generation, we will present details of the UAV missions, modelling results on synthetic images, and preliminary results of model transfer to aerial imagery.
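The synthetic-image idea can be sketched as follows: pre-cut crown crops of known species are pasted onto a background canvas, and because every pasted instance is recorded, the resulting labels are exhaustive by construction. This is an illustrative sketch, not the authors' pipeline; the crop representation and function names are assumptions.

```python
import random

def synthesize_scene(canvas_size, crops, n_objects, seed=0):
    """Paste pre-cut crown crops onto a canvas, recording a bounding box
    for every pasted instance so the labels are exhaustive by design."""
    rng = random.Random(seed)
    w, h = canvas_size
    annotations = []
    for _ in range(n_objects):
        species, cw, ch = rng.choice(crops)   # (label, crop width, crop height)
        x = rng.randrange(0, w - cw)          # top-left corner kept inside canvas
        y = rng.randrange(0, h - ch)
        annotations.append({"label": species, "bbox": (x, y, cw, ch)})
    return annotations
```

In a full implementation the pixel data of each crop would be blended onto a forest-canopy background at the sampled position; the point here is that every instance in the synthetic scene carries a label, which real orthomosaics cannot guarantee.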

2020 ◽  
Vol 13 (1) ◽  
pp. 23
Author(s):  
Wei Zhao ◽  
William Yamada ◽  
Tianxin Li ◽  
Matthew Digman ◽  
Troy Runge

In recent years, precision agriculture has been researched as a promising means to increase crop production with fewer inputs and meet the growing demand for agricultural products. Computer vision-based crop detection with unmanned aerial vehicle (UAV)-acquired images is a critical tool for precision agriculture. However, object detection using deep learning algorithms relies on a significant amount of manually prelabeled training data as ground truth. Field object detection, such as bale detection, is especially difficult because of (1) long-period image acquisition under different illumination conditions and seasons; (2) limited existing prelabeled data; and (3) few pretrained models and little prior research to serve as references. This work increases bale detection accuracy from limited data collection and labeling by building an innovative algorithm pipeline. First, an object detection model is trained using 243 images captured under good illumination conditions in fall from croplands. In addition, domain adaptation (DA), a kind of transfer learning, is applied to synthesize training data under diverse environmental conditions with automatic labels. Finally, the object detection model is optimized with the synthesized datasets. The case study shows the proposed method improves bale detection performance, including the recall, mean average precision (mAP), and F measure (F1 score), from averages of 0.59, 0.7, and 0.7 (object detection alone) to averages of 0.93, 0.94, and 0.89 (object detection + DA), respectively. This approach could easily be scaled to many other crop field objects and will significantly contribute to precision agriculture.
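Why synthesized data comes with automatic labels can be illustrated with a minimal sketch: a photometric transform emulates a different illumination condition while leaving geometry untouched, so existing bounding boxes carry over unchanged. The paper uses learned domain adaptation; this simple relighting stand-in only demonstrates the label-transfer principle.

```python
def relight(image, gain, bias):
    """Pixel-wise illumination change; geometry is untouched, so any
    bounding-box labels remain valid for the transformed image."""
    return [[max(0, min(255, int(p * gain + bias))) for p in row] for row in image]

def synthesize_domain(samples, conditions):
    """Expand (image, boxes) pairs across simulated lighting conditions,
    reusing the original boxes as automatic labels."""
    return [(relight(img, g, b), boxes)
            for img, boxes in samples
            for g, b in conditions]
```

Each synthesized image inherits its source's annotations for free, which is the property that lets the pipeline grow a diverse training set from only 243 manually labeled images.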


2020 ◽  
Vol 10 (6) ◽  
pp. 2104
Author(s):  
Michał Tomaszewski ◽  
Paweł Michalski ◽  
Jakub Osuchowski

This article presents an analysis of the effectiveness of object detection in digital images using a limited quantity of input data. The possibility of using a limited learning dataset was achieved by developing a detailed scenario of the task, which strictly defined the operating conditions of the detector in the considered case of a convolutional neural network. The described solution utilizes known deep neural network architectures in the learning and object detection process. The article compares detection results from the most popular deep neural networks while maintaining a limited training set composed of a specific number of images selected from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines, and the object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The most beneficial results were obtained for two convolutional neural networks: the faster region-based convolutional neural network (Faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model attained a worse AP result; however, the number of input samples has a significantly lower influence on its results than on those of other CNN models, which, in the authors' assessment, is a desirable feature in the case of a limited training set.


2021 ◽  
Vol 33 (5) ◽  
pp. 83-104
Author(s):  
Aleksandr Igorevich Getman ◽  
Maxim Nikolaevich Goryunov ◽  
Andrey Georgievich Matskevich ◽  
Dmitry Aleksandrovich Rybolovlev

The paper discusses the training of models for detecting computer attacks using machine learning methods. The results of an analysis of publicly available training datasets, and of tools for analyzing network traffic and extracting features of network sessions, are presented in turn. The drawbacks of existing tools, and possible errors in the datasets formed with their help, are noted. It is concluded that collecting one's own training data is necessary in the absence of guarantees of the reliability of public datasets, given the limited applicability of pre-trained models to networks whose characteristics differ from those of the network in which the training traffic was collected. A practical approach to generating training data for computer attack detection models is proposed. The proposed solutions have been tested to evaluate the quality of model training on the collected data and the quality of attack detection in a real network infrastructure.
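The feature-extraction step the paper analyzes typically aggregates raw packets into per-session statistics that a classifier can consume. A minimal sketch of that idea, with an illustrative record schema and feature set rather than any specific tool's output:

```python
from collections import defaultdict

def session_features(packets):
    """Aggregate raw packet records into per-session feature vectors
    (duration, packet count, byte counts per direction)."""
    sessions = defaultdict(lambda: {"times": [], "pkts": 0,
                                    "fwd_bytes": 0, "bwd_bytes": 0})
    for p in packets:
        a, b = (p["src"], p["sport"]), (p["dst"], p["dport"])
        key = tuple(sorted([a, b])) + (p["proto"],)  # direction-independent session key
        s = sessions[key]
        s["times"].append(p["time"])
        s["pkts"] += 1
        # canonical "forward" direction: the lexicographically smaller endpoint
        s["fwd_bytes" if a <= b else "bwd_bytes"] += p["len"]
    return {k: {"duration": max(s["times"]) - min(s["times"]),
                "pkts": s["pkts"],
                "fwd_bytes": s["fwd_bytes"],
                "bwd_bytes": s["bwd_bytes"]}
            for k, s in sessions.items()}
```

Errors at exactly this stage (wrong session keying, miscounted directions, truncated captures) are the kind of dataset flaw the authors warn about in public corpora.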


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3594
Author(s):  
Hwiwon Lee ◽  
Sekyoung Youm

As many as 40% to 50% of patients do not adhere to long-term medications for managing chronic conditions such as diabetes or hypertension. Limited opportunity for medication monitoring is a major problem from the perspective of health professionals. The availability of prompt medication error reports can enable health professionals to provide immediate interventions for patients, and it can enable clinical researchers to modify experiments easily and predict health levels based on medication compliance. This study proposes a method in which videos of patients taking medications are recorded using a camera image sensor integrated into a wearable device. The collected data are used as a training dataset for the latest convolutional neural network (CNN) techniques. As the artificial intelligence (AI) algorithm to analyze medication behavior, we constructed an object detection model (Model 1) using the faster region-based CNN technique and a second model that uses the combined feature values to perform action recognition (Model 2). In total, 50,000 images were collected from 89 participants, and labeling was performed on the different data categories to train the algorithm. The newly developed combination of the object detection model (Model 1) and the action recognition model (Model 2) achieved an accuracy of 92.7%, which is high for medication behavior recognition. This study is expected to enable rapid intervention by providers treating patients through rapid reporting of medication errors.
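The two-model structure can be sketched as a simple pipeline: the detector (Model 1) produces per-frame object features, and sliding windows of those combined features feed the action-recognition model (Model 2). Function names and the window length are illustrative assumptions, not the paper's API.

```python
def medication_pipeline(frames, detector, action_model, window=8):
    """Two-stage sketch: Model 1 (detector) yields per-frame features;
    overlapping windows of those features feed Model 2 (action recognition)."""
    feats = [detector(f) for f in frames]          # Model 1, applied per frame
    return [action_model(feats[i:i + window])      # Model 2, applied per window
            for i in range(len(feats) - window + 1)]
```

The design point is the decoupling: the detector can be retrained on new medication containers without touching the temporal model, and vice versa.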


2020 ◽  
Vol 12 (6) ◽  
pp. 1014
Author(s):  
Jingchao Jiang ◽  
Cheng-Zhi Qin ◽  
Juan Yu ◽  
Changxiu Cheng ◽  
Junzhi Liu ◽  
...  

Reference objects in video images can be used to indicate urban waterlogging depths, and detecting them is the key step in obtaining waterlogging depths from video images. Object detection models with convolutional neural networks (CNNs) have been utilized to detect reference objects, but these models require a large number of labeled images as training data to ensure applicability at a city scale. However, it is hard to collect a sufficient number of urban flooding images containing valuable reference objects, and manually labeling images is time-consuming and expensive. To solve this problem, we present a method to synthesize training images. Firstly, original images containing reference objects and original images with water surfaces are collected from open data sources, and the reference objects and water surfaces are cropped from these images. Secondly, the reference objects and water surfaces are further enriched via data augmentation techniques to ensure diversity. Finally, the enriched reference objects and water surfaces are combined to generate a synthetic image dataset with annotations, which is then used to train a CNN-based object detection model. The waterlogging depths are calculated based on the reference objects detected by the trained model. A real video dataset and an artificial image dataset are used to evaluate the effectiveness of the proposed method. The results show that a detection model trained on the synthetic image dataset can effectively detect reference objects in images and achieve acceptable accuracy for waterlogging depths based on the detected reference objects. The proposed method has the potential to monitor waterlogging depths at a city scale.
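One plausible form of the depth calculation, assuming the real-world height of each reference object class is known: the submerged portion of the object in pixels, divided by the object's pixels-per-metre scale, gives the water depth. A sketch under that assumption (the abstract does not spell out the exact formula used):

```python
def waterlogging_depth(full_height_m, full_height_px, visible_height_px):
    """Depth from a detected reference object of known real-world height:
    submerged pixel height divided by the object's pixels-per-metre scale."""
    px_per_m = full_height_px / full_height_m
    return (full_height_px - visible_height_px) / px_per_m
```

For example, a 1 m bollard rendered 200 px tall with only 150 px visible above the water line implies a depth of 0.25 m.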


Author(s):  
Elly Josephat Ligate ◽  
Can Chen ◽  
Chengzhen Wu

Aim: Estimates of carbon in tropical coastal forests are needed to support conservation and forest monitoring strategies. This study aimed at quantifying carbon stocks in the regenerating tree species of intact forest sites (IFS), sites disturbed by crop agriculture (ADS) and sites disturbed by livestock grazing (LDS), to understand the importance of coastal trees in carbon stocking as part of mitigating climate change impacts. Methodology: Thirty-three independent measurements of tree carbon stocks were carried out on 33 tree families found in the coastal zone of Tanzania. The vegetation was inventoried by means of a floristic survey of the woody component across the intact, crop agriculture and livestock disturbed land-use sites. Biomass was then estimated using existing allometric equations for tropical forests, after which the above-ground stored carbon was quantified for the sampled tree species found in each land use. Results: The results showed significant variations (p ≤ .05) in carbon stock values across species and land uses. The average carbon (kg/ha) stored in regenerated adult trees was 1200 in IFS, 600 in ADS and 400 in LDS. Saplings held 0.43 in LDS, 0.07 in ADS and 0.01 in IFS, while seedlings averaged 0.41 in IFS, 0.22 in ADS and 0.05 in LDS. Conclusion: These findings show that crop agriculture affects the regeneration potential of trees, biomass accumulation and carbon stocks more strongly than livestock grazing does. To restore the carbon storage potential of coastal tropical forests, crop agriculture must be discouraged, while livestock grazing can be integrated into forest management. Further studies are required to gauge acceptable levels of integration of any anthropogenic activities, so that the natural capacity of coastal tropical forests to regenerate and stock carbon is not compromised further.
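The abstract does not name the specific allometric equations used. As one widely used choice for tropical forests, the Chave et al. (2014) pantropical model estimates above-ground biomass from wood density, diameter at breast height and tree height, and carbon is commonly taken as roughly 47% of dry biomass; whether this study used that model or fraction is an assumption here.

```python
def aboveground_biomass_kg(wood_density, dbh_cm, height_m):
    """Chave et al. (2014) pantropical allometry:
    AGB = 0.0673 * (rho * D^2 * H)^0.976,
    with rho in g/cm^3, D (diameter at breast height) in cm, H in m."""
    return 0.0673 * (wood_density * dbh_cm ** 2 * height_m) ** 0.976

def carbon_stock_kg(agb_kg, carbon_fraction=0.47):
    """Carbon is commonly estimated as ~47% of dry above-ground biomass."""
    return agb_kg * carbon_fraction
```

Per-plot stocks (as in the kg/ha figures above) follow by summing tree-level carbon and dividing by plot area.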


Sensors ◽  
2018 ◽  
Vol 19 (1) ◽  
pp. 4
Author(s):  
Álvaro García-Martín ◽  
Juan SanMiguel ◽  
José Martínez

Applying people detectors to unseen data is challenging, since pattern distributions, such as viewpoints, motion, poses, backgrounds, occlusions and people sizes, may differ significantly from those of the training dataset. In this paper, we propose a coarse-to-fine framework to adapt people detectors frame by frame during runtime classification, without requiring any additional manually labeled ground truth apart from the offline training of the detection model. The adaptation makes use of the mutual information between multiple detectors, i.e., similarities and dissimilarities of the detectors estimated by pairwise correlation of their outputs. Globally, the proposed adaptation discriminates between relevant instants in a video sequence, i.e., identifies the representative frames for an adaptation of the system. Locally, it identifies the best configuration (i.e., detection threshold) of each detector under analysis by maximizing the mutual information. The proposed coarse-to-fine approach does not require retraining the detectors for each new scenario and uses standard people detector outputs, i.e., bounding boxes. The experimental results demonstrate that the proposed approach outperforms state-of-the-art detectors whose optimal threshold configurations are determined and fixed beforehand from offline training data.
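The local adaptation step can be illustrated with a simplified stand-in for the mutual-information criterion: sweep a detector's score threshold and keep the value whose surviving boxes agree most with a peer detector's output, counting confirmed boxes for and unconfirmed boxes against. The scoring rule here is an illustrative simplification, not the paper's exact measure.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def best_threshold(scored_boxes, peer_boxes, thresholds, match_iou=0.5):
    """Pick the score threshold whose surviving boxes agree most with a
    peer detector: matched boxes count for, unmatched boxes against."""
    def agreement(t):
        kept = [b for b, s in scored_boxes if s >= t]
        hits = sum(any(iou(b, p) >= match_iou for p in peer_boxes) for b in kept)
        return hits - (len(kept) - hits)
    return max(thresholds, key=agreement)
```

Because the criterion only compares detector outputs with each other, no manually labeled ground truth is needed at runtime, which is the framework's key property.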


Electronics ◽  
2019 ◽  
Vol 8 (3) ◽  
pp. 329 ◽  
Author(s):  
Yong Li ◽  
Guofeng Tong ◽  
Huashuai Gao ◽  
Yuebin Wang ◽  
Liqiang Zhang ◽  
...  

Panoramic images have a wide range of applications in many fields thanks to their ability to capture all-round information. Object detection based on panoramic images has certain advantages for environment perception due to the characteristics of panoramic images, e.g., a larger field of view. In recent years, deep learning methods have achieved remarkable results in image classification and object detection, but their performance depends on large amounts of training data, so a good training dataset is a prerequisite for achieving strong recognition results. We therefore construct a benchmark named Pano-RSOD for panoramic road scene object detection. Pano-RSOD contains vehicles, pedestrians, traffic signs and guiding arrows, labelled by bounding boxes in the images. Unlike traditional object detection datasets, Pano-RSOD contains more objects per image, and its high-resolution images offer 360-degree environmental perception, more annotations, more small objects and diverse road scenes. State-of-the-art deep learning algorithms trained on Pano-RSOD demonstrate that it is a useful benchmark, providing a better panoramic training dataset for object detection tasks, especially for small and deformed objects.


2020 ◽  
Vol 2020 (16) ◽  
pp. 203-1-203-6
Author(s):  
Astrid Unger ◽  
Margrit Gelautz ◽  
Florian Seitner

With the growing demand for robust object detection algorithms in self-driving systems, it is important to consider the varying lighting and weather conditions in which cars operate all year round. The goal of our work is to gain a deeper understanding of meaningful strategies for selecting and merging training data from currently available databases and self-annotated videos in the context of automotive night scenes. We retrain an existing convolutional neural network (YOLOv3) to study the influence of different training dataset combinations on the final object detection results in nighttime and low-visibility traffic scenes. Our evaluation shows that a suitable selection of training data from the GTSRD, VIPER, and BDD databases, in conjunction with self-recorded night scenes, can achieve an mAP of 63.5% for ten object classes, an improvement of 16.7% over the performance of the original YOLOv3 network on the same test set.


2021 ◽  
Author(s):  
Kaung Myat Naing ◽  
Veerayuth Kittichai ◽  
Teerawat Tongloy ◽  
Santhad Chuwongin ◽  
Siridech Boonsang

This study evaluates the performance of Acute Myeloid Leukaemia (AML) blast cell detection models in microscopic examination images for faster diagnosis and disease monitoring. You Only Look Once (YOLO), a popular deep learning algorithm developed for object detection, is among the most successful state-of-the-art approaches for real-time object detection systems. We employ four versions of the YOLO algorithm, YOLOv3, YOLOv3-Tiny, YOLOv2 and YOLOv2-Tiny, for the detection of 15 classes of AML blood cells in examination images. We used the publicly available dataset from The Cancer Imaging Archive (TCIA), which consists of 18,365 expert-labelled single-cell images. Data augmentation techniques were additionally applied to enhance and balance the training images in the dataset. The overall results indicate that all four YOLO variants achieve outstanding performance of more than 92% in precision and sensitivity, with YOLOv3 performing most reliably. Consistently, the AUC values for the four YOLO models are 0.969 (YOLOv3), 0.967 (YOLOv3-Tiny), 0.963 (YOLOv2), and 0.948 (YOLOv2-Tiny). Furthermore, we compare the best model's performance between an approach that uses the entire training dataset without data augmentation and one that uses image division with data augmentation. Remarkably, using only 33.51 percent of the training data, the model trained with image partitioning and data augmentation achieved prediction outcomes similar to those obtained with the complete training dataset. This work potentially provides a beneficial rapid digital tool for the screening and evaluation of numerous haematological disorders.
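The image-division step can be sketched as tiling each large image into fixed-size training patches; tile size and stride here are illustrative assumptions, as the abstract does not give the partitioning parameters.

```python
def tile_origins(width, height, tile, stride):
    """Enumerate (x, y, w, h) patches that partition a large image into
    fixed-size training tiles, with overlap controlled by the stride."""
    xs = range(0, max(width - tile, 0) + 1, stride)
    ys = range(0, max(height - tile, 0) + 1, stride)
    return [(x, y, tile, tile) for y in ys for x in xs]
```

Combined with augmentation, each source image then yields several training samples, which is consistent with reaching full-dataset performance from a third of the data.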

