scholarly journals DEAL: Difficulty-Aware Active Learning for Semantic Segmentation

Author(s):  
Shuai Xie ◽  
Zunlei Feng ◽  
Ying Chen ◽  
Songtao Sun ◽  
Chao Ma ◽  
...  
Author(s):  
M. Kölle ◽  
V. Walter ◽  
S. Schmohl ◽  
U. Soergel

Abstract. Automated semantic interpretation of 3D point clouds is crucial for many tasks in the domain of geospatial data analysis. For this purpose, labeled training data is required, which has often to be provided manually by experts. One approach to minimize effort in terms of costs of human interaction is Active Learning (AL). The aim is to process only the subset of an unlabeled dataset that is particularly helpful with respect to class separation. Here a machine identifies informative instances which are then labeled by humans, thereby increasing the performance of the machine. In order to completely avoid involvement of an expert, this time-consuming annotation can be resolved via crowdsourcing. Therefore, we propose an approach combining AL with paid crowdsourcing. Although incorporating human interaction, our method can run fully automatically, so that only an unlabeled dataset and a fixed financial budget for the payment of the crowdworkers need to be provided. We conduct multiple iteration steps of the AL process on the ISPRS Vaihingen 3D Semantic Labeling benchmark dataset (V3D) and especially evaluate the performance of the crowd when labeling 3D points. We prove our concept by using labels derived from our crowd-based AL method for classifying the test dataset. The analysis outlines that by labeling only 0:4% of the training dataset by the crowd and spending less than 145 $, both our trained Random Forest and sparse 3D CNN classifier differ in Overall Accuracy by less than 3 percentage points compared to the same classifiers trained on the complete V3D training set.


2021 ◽  
Author(s):  
Benjamin Kellenberger ◽  
Devis Tuia ◽  
Dan Morris

<p>Ecological research like wildlife censuses increasingly relies on data on the scale of Terabytes. For example, modern camera trap datasets contain millions of images that require prohibitive amounts of manual labour to be annotated with species, bounding boxes, and the like. Machine learning, especially deep learning [3], could greatly accelerate this task through automated predictions, but involves expansive coding and expert knowledge.</p><p>In this abstract we present AIDE, the Annotation Interface for Data-driven Ecology [2]. In a first instance, AIDE is a web-based annotation suite for image labelling with support for concurrent access and scalability, up to the cloud. In a second instance, it tightly integrates deep learning models into the annotation process through active learning [7], where models learn from user-provided labels and in turn select the most relevant images for review from the large pool of unlabelled ones (Fig. 1). The result is a system where users only need to label what is required, which saves time and decreases errors due to fatigue.</p><p><img src="https://contentmanager.copernicus.org/fileStorageProxy.php?f=gnp.0402be60f60062057601161/sdaolpUECMynit/12UGE&app=m&a=0&c=131251398e575ac9974634bd0861fadc&ct=x&pn=gnp.elif&d=1" alt=""></p><p><em>Fig. 1: AIDE offers concurrent web image labelling support and uses annotations and deep learning models in an active learning loop.</em></p><p>AIDE includes a comprehensive set of built-in models, such as ResNet [1] for image classification, Faster R-CNN [5] and RetinaNet [4] for object detection, and U-Net [6] for semantic segmentation. All models can be customised and used without having to write a single line of code. Furthermore, AIDE accepts any third-party model with minimal implementation requirements. To complete the package, AIDE offers both user annotation and model prediction evaluation, access control, customisable model training, and more, all through the web browser.</p><p>AIDE is fully open source and available under https://github.com/microsoft/aerial_wildlife_detection.</p><p> </p><p><strong>References</strong></p>


Author(s):  
M. Kölle ◽  
D. Laupheimer ◽  
V. Walter ◽  
N. Haala ◽  
U. Soergel

Abstract. Semantic interpretation of multi-modal datasets is of great importance in many domains of geospatial data analysis. However, when training models for automated semantic segmentation, labeled training data is required and in case of multi-modality for each representation form of the scene. To completely avoid the time-consuming and cost-intensive involvement of an expert in the annotation procedure, we propose an Active Learning (AL) pipeline where a Random Forest classifier selects a subset of points sufficient for training and where necessary labels are received from the crowd. In this AL loop, we aim on coupled semantic segmentation of an Airborne Laser Scanning (ALS) point cloud and the corresponding 3D textured mesh generated from LiDAR data and imagery in a hybrid manner. Within this work we pursue two main objectives: i) We evaluate the performance of the AL pipeline applied to an ultra-high resolution ALS point cloud and a derived textured mesh (both benchmark datasets are available at https://ifpwww.ifp.uni-stuttgart.de/benchmark/hessigheim/default.aspx). ii) We investigate the capabilities of the crowd regarding interpretation of 3D geodata and observed that the crowd performs about 3 percentage points better when labeling meshes compared to point clouds. We additionally demonstrate that labels received solely by the crowd can power a machine learning system only differing in Overall Accuracy by less than 2 percentage points for the point cloud and less than 3 percentage points for the mesh, compared to using the completely labeled training pool. For deriving this sparse training set, we ask the crowd to label 0.25 % of available training points, resulting in costs of 190 $.


Sign in / Sign up

Export Citation Format

Share Document