Improving Remote Sensing Multiple Classification by Data and Ensemble Selection

2021 ◽  
Vol 87 (11) ◽  
pp. 841-852
Author(s):  
S. Boukir ◽  
L. Guo ◽  
N. Chehata

In this article, margin theory is exploited to design better ensemble classifiers for remote sensing data. A semi-supervised version of the ensemble margin is at the core of this work. Some major challenges in ensemble learning are investigated using this paradigm in the difficult context of land cover classification: selecting the most informative instances to form an appropriate training set, and selecting the best ensemble members. The main contribution of this work lies in the explicit use of the ensemble margin as a decision method to select training data and base classifiers in an ensemble learning framework. The selection of training data is achieved through an innovative iterative guided bagging algorithm exploiting low-margin instances. The overall classification accuracy is improved by up to 3%, with more dramatic improvement in per-class accuracy (up to 12%). The selection of ensemble base classifiers is achieved by an ordering-based ensemble-selection algorithm relying on an original margin-based criterion that also targets low-margin instances. This method reduces the complexity (ensemble size under 30) but maintains performance.
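As a rough illustration of the margin idea, the sketch below computes a label-free ensemble margin for each instance and picks the lowest-margin ones. The abstract does not give the exact semi-supervised definition used by the authors, so the formula here (votes for the top class minus votes for the runner-up, divided by ensemble size) is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def unsupervised_margin(votes):
    """Margin of one instance, given the class labels voted by each ensemble member.

    Assumed definition (not taken from the abstract):
    (votes for top class - votes for runner-up class) / ensemble size.
    """
    counts = Counter(votes).most_common()
    top = counts[0][1]
    second = counts[1][1] if len(counts) > 1 else 0
    return (top - second) / len(votes)


def lowest_margin_instances(all_votes, k):
    """Indices of the k instances with the smallest margin (the hardest ones)."""
    margins = [unsupervised_margin(v) for v in all_votes]
    return sorted(range(len(all_votes)), key=lambda i: margins[i])[:k]


votes_per_instance = [
    ["urban", "urban", "urban", "urban"],   # unanimous: margin 1.0
    ["urban", "crop", "urban", "crop"],     # tie: margin 0.0
    ["urban", "urban", "urban", "crop"],    # margin 0.5
]
hardest = lowest_margin_instances(votes_per_instance, 2)
```

Instances near 0 are those on which the ensemble disagrees most, which is the kind of instance a margin-guided selection would favor.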

2017 ◽  
Vol 21 (9) ◽  
pp. 4747-4765 ◽  
Author(s):  
Clara Linés ◽  
Micha Werner ◽  
Wim Bastiaanssen

Abstract. The implementation of drought management plans helps reduce the wide range of adverse impacts caused by water shortage. A crucial element in developing drought management plans is the selection of appropriate indicators and their associated thresholds to detect drought events and monitor their evolution. Drought indicators should be able to detect emerging drought processes that will lead to impacts, with sufficient anticipation to allow measures to be undertaken effectively. However, in the selection of appropriate drought indicators, the connection to the final impacts is often disregarded. This paper explores the utility of remotely sensed data sets to detect early stages of drought at the river basin scale and determine how much time can be gained to inform operational land and water management practices. Six different remote sensing data sets with different spectral origins and measurement frequencies are considered, complemented by a group of classical in situ hydrologic indicators. Their predictive power to detect past drought events is tested in the Ebro Basin. Qualitative (binary information based on media records) and quantitative (crop yields) data of drought events and impacts spanning a period of 12 years are used as a benchmark in the analysis. Results show that early signs of drought impacts can be detected up to 6 months before impacts are reported in newspapers, with the best correlation–anticipation relationships for the standard precipitation index (SPI), the normalised difference vegetation index (NDVI) and evapotranspiration (ET). Soil moisture (SM) and land surface temperature (LST) also offer good anticipation but with weaker correlations, while gross primary production (GPP) presents moderate positive correlations only for some of the rain-fed areas. 
Although classical hydrological information from water levels and water flows provided better anticipation than remote sensing indicators in most of the areas, correlations were found to be weaker. The indicators behave consistently with respect to the different levels of crop yield in rain-fed areas among the analysed years, with SPI, NDVI and ET again providing the stronger correlations. Overall, the results confirm the ability of remote sensing products to anticipate reported drought impacts; these products therefore appear to be a useful source of information to support drought management decisions.
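The correlation–anticipation relationship described above can be sketched as a lagged correlation: the indicator at month t is compared with the impact record at month t + lead, for several lead times. This is only a minimal illustration under assumed inputs; the series, the monthly time step, and the function names are hypothetical and do not reproduce the paper's procedure.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_correlations(indicator, impact, max_lead):
    """Correlation between indicator[t] and impact[t + lead] for lead = 0..max_lead."""
    result = {}
    for lead in range(max_lead + 1):
        x = indicator[: len(indicator) - lead]
        y = impact[lead:]
        result[lead] = pearson(x, y)
    return result


# Made-up series: the impact record repeats the indicator two months later.
indicator = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
impact = [0, 0] + indicator[:-2]
correlations = lagged_correlations(indicator, impact, max_lead=3)
best_lead = max(correlations, key=correlations.get)
```

The lead time with the strongest correlation gives a simple estimate of how far ahead the indicator anticipates the impact.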


2021 ◽  
Author(s):  
Octavian Dumitru ◽  
Gottfried Schwarz ◽  
Mihai Datcu ◽  
Dongyang Ao ◽  
Zhongling Huang ◽  
...  

In recent years, much progress has been made with machine learning algorithms. Typical application fields of machine learning include many technical and commercial applications as well as Earth science analyses, where indirect and distorted detector data most often have to be converted to well-calibrated scientific data that are a prerequisite for a correct understanding of the desired physical quantities and their relationships.

However, the provision of sufficient calibrated data is not enough for the testing, training, and routine processing of most machine learning applications. In principle, one also needs a clear strategy for the selection of necessary and useful training data and an easily understandable quality control of the finally desired parameters.

At first glance, one could guess that this problem could be solved by a careful selection of representative test data covering many typical cases as well as some counterexamples. These test data can then be used for the training of the internal parameters of a machine learning application. At second glance, however, many researchers found that simply stacking up plain examples is not the best choice for many scientific applications.

To get improved machine learning results, we concentrated on the analysis of satellite images depicting the Earth’s surface under various conditions such as the selected instrument type, spectral bands, and spatial resolution. In our case, such data are routinely provided by the freely accessible European Sentinel satellite products (e.g., Sentinel-1 and Sentinel-2). Our basic work then included investigations of how some additional processing steps, to be linked with the selected training data, can provide better machine learning results.

To this end, we analysed and compared three different approaches to identify machine learning strategies for the joint selection and processing of training data for our Earth observation images:

- One can optimize the training data selection by adapting the data selection to the specific instrument, target, and application characteristics [1].
- As an alternative, one can dynamically generate new training parameters by Generative Adversarial Networks. This is comparable to the role of a sparring partner in boxing [2].
- One can also use a hybrid semi-supervised approach for Synthetic Aperture Radar images with limited labelled data. The method is split into polarimetric scattering classification, topic modelling for scattering labels, unsupervised constraint learning, and supervised label prediction with constraints [3].

We applied these strategies in the ExtremeEarth sea-ice monitoring project (http://earthanalytics.eu/). As a result, we can demonstrate for which application cases these three strategies will provide a promising alternative to a simple conventional selection of available training data.

[1] C.O. Dumitru et al., "Understanding Satellite Images: A Data Mining Module for Sentinel Images", Big Earth Data, 2020, 4(4), pp. 367-408.
[2] D. Ao et al., "Dialectical GAN for SAR Image Translation: From Sentinel-1 to TerraSAR-X", Remote Sensing, 2018, 10(10), pp. 1-23.
[3] Z. Huang et al., "HDEC-TFA: An Unsupervised Learning Approach for Discovering Physical Scattering Properties of Single-Polarized SAR Images", IEEE Transactions on Geoscience and Remote Sensing, 2020, pp. 1-18.


PeerJ ◽  
2019 ◽  
Vol 6 ◽  
pp. e6227 ◽  
Author(s):  
Michele Dalponte ◽  
Lorenzo Frizzera ◽  
Damiano Gianelle

An international data science challenge, called National Ecological Observatory Network—National Institute of Standards and Technology data science evaluation, was set up in autumn 2017 with the goal of improving the use of remote sensing data in ecological applications. The competition was divided into three tasks: (1) individual tree crown (ITC) delineation, for identifying the location and size of individual trees; (2) alignment between field surveyed trees and ITCs delineated on remote sensing data; and (3) tree species classification. In this paper, the methods and results of team Fondazione Edmund Mach (FEM) are presented. The ITC delineation (Task 1 of the challenge) was done using a region growing method applied to a near-infrared band of the hyperspectral images. The parameters of the delineation algorithm were optimized in a supervised way on the basis of the Jaccard score, using the training set provided by the organizers. The alignment (Task 2) between the delineated ITCs and the field surveyed trees was done using the Euclidean distance computed on the position, the height, and the crown radius of the ITCs and the field surveyed trees. The classification (Task 3) was performed using a support vector machine classifier applied to a selection of the hyperspectral bands and the canopy height model. The selection of the bands was done using the sequential forward floating selection method and the Jeffries Matusita distance. The results of the three tasks were very promising: team FEM ranked first in the data science competition in Tasks 1 and 2, and second in Task 3. The Jaccard score of the delineated crowns was 0.3402, and the results showed that the proposed approach delineated both small and large crowns. The alignment was correctly done for all the test samples. 
The classification results were good (overall accuracy of 88.1%, kappa accuracy of 75.7%, and mean class accuracy of 61.5%), although the accuracy was biased toward the most represented species.
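For reference, the Jaccard score used to evaluate the delineation is the ratio of the intersection to the union of two regions. The sketch below assumes crowns are represented as sets of pixel coordinates, which is an illustrative choice rather than the challenge's actual data format; the function name and the convention for two empty crowns are also assumptions.

```python
def jaccard(crown_a, crown_b):
    """Jaccard score of two crowns given as collections of (row, col) pixel coordinates."""
    a, b = set(crown_a), set(crown_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention when both crowns are empty
    return len(a & b) / len(union)


# Two 2x2 crowns that overlap in one column of two pixels.
delineated = {(0, 0), (0, 1), (1, 0), (1, 1)}
reference = {(0, 1), (1, 1), (0, 2), (1, 2)}
score = jaccard(delineated, reference)  # 2 shared pixels out of 6 in the union
```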


2012 ◽  
Vol 546-547 ◽  
pp. 508-513 ◽  
Author(s):  
Qiong Wu ◽  
Ling Wei Wang ◽  
Jia Wu

Hyperspectral data comprise a large number of bands that are correlated with one another, which makes classification a demanding problem. In this paper, taking the features of hyperspectral remote sensing data and existing classification algorithms as the background, we apply ensemble learning to image classification. The experiments were carried out in Weka. We compared the classification accuracy of Bagging, Boosting and Stacking with the base classifiers J48 and BP. The results show that ensemble learning on hyperspectral data can achieve higher classification accuracy, thus providing a new method for the classification of hyperspectral remote sensing images.
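To make the bagging idea concrete, here is a minimal sketch: each ensemble member is trained on a bootstrap replicate of the training set, and the final label is a majority vote. The paper used Weka with J48 and BP base classifiers; the 1-nearest-neighbour base learner, the toy data, and all names below are stand-ins chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def nearest_neighbour_predict(train, x):
    """Label of the training point closest to x (a stand-in 1-NN base classifier)."""
    closest = min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return closest[1]


def bagging_predict(train, x, n_members=25, seed=0):
    """Majority vote over base classifiers, each fit on a bootstrap replicate."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_members):
        # Bootstrap replicate: sample with replacement, same size as the training set.
        sample = [rng.choice(train) for _ in train]
        votes.append(nearest_neighbour_predict(sample, x))
    return Counter(votes).most_common(1)[0][0]


# Toy two-band "spectra" for two well-separated classes.
train = [((0.1 * i, 0.0), "water") for i in range(10)]
train += [((10.0 + 0.1 * i, 10.0), "forest") for i in range(10)]
label_a = bagging_predict(train, (0.2, 0.1))
label_b = bagging_predict(train, (10.3, 9.8))
```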


2017 ◽  
Author(s):  
Xiaoxiao Du

Imagine you are traveling to Columbia, MO for the first time. On your flight to Columbia, the woman sitting next to you recommends a bakery by a large park with a big yellow umbrella outside. After you land, you need directions to the hotel from the airport. If you are driving a rental car, you will need to park your car at a parking lot or a parking structure. After a good night's sleep in the hotel, you may decide to go for a run in the morning on the closest trail and stop by that recommended bakery under a big yellow umbrella. It would be helpful in the course of completing all these tasks to accurately distinguish the proper car route and walking trail, find a parking lot, and pinpoint the yellow umbrella. Satellite imagery and other geo-tagged data such as Open Street Maps provide effective information for this goal. Open Street Maps can provide road information and suggest bakeries within a five-mile radius. The yellow umbrella has a distinctive color and, perhaps, is made of a distinctive material that can be identified from a hyperspectral camera. Open Street Maps polygons are tagged with information such as "parking lot" and "sidewalk." All this information can and should be fused to help identify and offer better guidance on the tasks you are completing. Supervised learning methods generally require precise labels for each training data point. It is hard (and probably costly) to manually go through and label each pixel in the training imagery. GPS coordinates cannot always be fully trusted, as a GPS device may only be accurate to the level of several pixels. In many cases, it is practically infeasible to obtain accurate pixel-level training labels to perform fusion for all the imagery and maps available. Besides, the training data may come in a variety of data types, such as imagery or a 3D point cloud. The imagery may have different resolutions, scales and even coordinate systems. 
Previous fusion methods are generally limited to data mapped to the same pixel grid, with accurate labels. Furthermore, most fusion methods are restricted to only two sources, even if certain methods, such as pan-sharpening, can deal with different geo-spatial types or data of different resolution. It is therefore necessary and important to come up with a way to perform fusion on multiple sources of imagery and map data, possibly with different resolutions and of different geo-spatial types, with consideration of uncertain labels. I propose a Multiple Instance Choquet Integral framework for multi-resolution multisensor fusion with uncertain training labels. The Multiple Instance Choquet Integral (MICI) framework addresses uncertain training labels and performs both classification and regression. Three classifier fusion models, i.e. the noisy-or, min-max, and generalized-mean models, are derived under MICI. The Multi-Resolution Multiple Instance Choquet Integral (MR-MICI) framework is built upon the MICI framework and further addresses multiresolution in the fusion sources in addition to the uncertainty in training labels. For both MICI and MR-MICI, a monotonic normalized fuzzy measure is learned to be used with the Choquet integral to perform two-class classifier fusion given bag-level training labels. An optimization scheme based on an evolutionary algorithm is used to optimize the proposed models. For regression problems where the desired prediction is real-valued, the primary instance assumption is adopted. The algorithms are applied to target detection, regression and scene understanding applications. Experiments are conducted on the fusion of remote sensing data (hyperspectral and LiDAR) over the campus of University of Southern Mississippi - Gulfpark. Cloth panel sub-pixel and super-pixel targets were placed on campus with varying levels of occlusion, and the proposed algorithms can successfully detect the targets in the scene. 
A semi-supervised approach is developed to automatically generate training labels based on data from Google Maps, Google Earth and Open Street Map. Based on such training labels with uncertainty, the proposed algorithms can also identify materials on campus for scene understanding, such as roads, buildings, and sidewalks. In addition, the algorithms are used for weed detection and real-valued crop yield prediction experiments based on remote sensing data that can provide information for agricultural applications.
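The fusion step at the heart of the framework is the discrete Choquet integral, which can be sketched as follows: the source outputs are sorted in decreasing order, and each drop between consecutive values is weighted by the fuzzy measure of the set of sources seen so far. In the thesis the fuzzy measure is learned from bag-level labels; the hand-picked measure, the source names, and the function name below are purely illustrative assumptions.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of source outputs with respect to a fuzzy measure.

    values:  dict mapping source name -> confidence in [0, 1]
    measure: dict mapping frozenset of source names -> measure value;
             assumed monotonic, with the full set of sources mapped to 1
    """
    ordered = sorted(values, key=values.get, reverse=True)
    total = 0.0
    for i, source in enumerate(ordered):
        # Value of the next source in the ordering, or 0 after the last one.
        next_value = values[ordered[i + 1]] if i + 1 < len(ordered) else 0.0
        coalition = frozenset(ordered[: i + 1])
        total += (values[source] - next_value) * measure[coalition]
    return total


# Hypothetical two-source example (hyperspectral and LiDAR confidences).
values = {"hsi": 0.8, "lidar": 0.4}
measure = {
    frozenset({"hsi"}): 0.3,
    frozenset({"lidar"}): 0.5,
    frozenset({"hsi", "lidar"}): 1.0,
}
fused = choquet_integral(values, measure)  # (0.8 - 0.4) * 0.3 + 0.4 * 1.0
```

With a measure that is 0 on every proper subset the integral reduces to the minimum of the inputs, and with a measure that is 1 on every non-empty subset it reduces to the maximum, which is why a learned measure can interpolate between cautious and optimistic fusion.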


2021 ◽  
Author(s):  
Federico Figari Tomenotti

Change detection is a well-known topic of remote sensing. The goal is to track and monitor the evolution of changes affecting the Earth's surface over time. The recent increase in the availability of remote sensing data for Earth observation and of computational power has raised interest in this field of research. In particular, the keywords “multitemporal” and “heterogeneous” play prominent roles. The former refers to the availability and the comparison of two or more satellite images of the same place on the ground, in order to find changes and track the evolution of the observed surface, possibly with different time sensitivities. The latter refers to the capability of performing change detection with images coming from different sources, corresponding to different sensors, wavelengths, polarizations, acquisition geometries, etc. This thesis addresses the challenging topic of multitemporal change detection with heterogeneous remote sensing images. It proposes a novel approach, taking inspiration from recent developments in the literature. The proposed method is based on deep learning, involving autoencoders of convolutional neural networks, and represents an example of unsupervised change detection. A major novelty of the work consists in including a prior information model, used to make the method unsupervised, within a well-established algorithm such as canonical correlation analysis, and in combining these with a deep learning framework to give rise to an image translation method able to compare heterogeneous images regardless of their highly different domains. The theoretical analysis is supported by experimental results, comparing the proposed methodology to the state of the art of this discipline. Two different datasets were used for the experiments, and the results obtained on both of them show the effectiveness of the proposed method.

