Imbalanced Learning in Land Cover Classification: Improving Minority Classes’ Prediction Accuracy Using the Geometric SMOTE Algorithm

The automatic production of land use/land cover maps continues to be a challenging problem, with important impacts on the ability to promote sustainability and good resource management. The ability to build robust automatic classifiers and produce accurate maps can have a significant impact on the way we manage and optimize natural resources. The difficulty in achieving these results comes from many different factors, such as data quality and uncertainty. In this paper, we address the imbalanced learning problem, a common and difficult problem in remote sensing that affects the quality of classification results, by proposing Geometric-SMOTE, a novel oversampling method. Geometric-SMOTE is an oversampling algorithm that increases the quality of the generated instances relative to previous methods, such as the synthetic minority oversampling technique (SMOTE). The performance of Geometric-SMOTE on the LUCAS (Land Use/Cover Area Frame Survey) dataset is compared to that of other oversamplers using a variety of classifiers. The results show that Geometric-SMOTE significantly outperforms all the other oversamplers and improves the robustness of the classifiers. These results indicate that, when using imbalanced datasets, remote sensing researchers should consider the use of these new-generation oversamplers to increase the quality of the classification results.
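
For orientation, the baseline that this abstract compares against, the synthetic minority oversampling technique (SMOTE), can be sketched in a few lines. This is a minimal illustration of plain SMOTE interpolation only, not the authors' code. The abstract does not describe how Geometric-SMOTE generates its samples, so that step is not shown. The function name `smote_sample`, its parameters, and the toy points are hypothetical.

```python
import math
import random


def smote_sample(minority, k=3, n_new=5, seed=0):
    """Generate n_new synthetic minority points by SMOTE-style interpolation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the starting point.
        base = rng.choice(minority)
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment
        # between the base sample and the chosen neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sample(minority, k=2, n_new=10, seed=42)
```

Because every synthetic point is a convex combination of two existing minority samples, plain SMOTE can only place new data on line segments between them.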

Response to Johnson B.A. Scale Issues Related to the Accuracy Assessment of Land Use/Land Cover Maps Produced Using Multi-Resolution Data: Comments on “The Improvement of Land Cover Classification by Thermal Remote Sensing”. Remote Sens. 2015, 7, 8368–8390

Remote Sensing ◽ 10.3390/rs71013440 ◽ 2015 ◽ Vol 7 (10) ◽ pp. 13440-13447

Cited By ~ 2

Author(s): Liya Sun ◽ Karsten Schulz

Keyword(s): Remote Sensing ◽ Land Use ◽ Land Cover ◽ Accuracy Assessment ◽ Land Cover Classification ◽ Land Use Land Cover ◽ Thermal Remote Sensing ◽ Land Cover Maps ◽ Resolution Data

Scale Issues Related to the Accuracy Assessment of Land Use/Land Cover Maps Produced Using Multi-Resolution Data: Comments on “The Improvement of Land Cover Classification by Thermal Remote Sensing”. Remote Sens. 2015, 7(7), 8368–8390

Remote Sensing ◽ 10.3390/rs71013436 ◽ 2015 ◽ Vol 7 (10) ◽ pp. 13436-13439

Cited By ~ 12

Author(s): Brian Johnson

Keyword(s): Remote Sensing ◽ Land Use ◽ Land Cover ◽ Accuracy Assessment ◽ Land Cover Classification ◽ Land Use Land Cover ◽ Thermal Remote Sensing ◽ Land Cover Maps ◽ Resolution Data

Impact of land-use and land-cover change on groundwater quality in the Lower Shiwalik hills: a remote sensing and GIS based approach

Open Geosciences ◽ 10.2478/v10085-010-0003-x ◽ 2010 ◽ Vol 2 (2)

Cited By ~ 25

Author(s): Sudhir Singh ◽ Chander Singh ◽ Saumitra Mukherjee

Keyword(s): Remote Sensing ◽ Land Use ◽ Land Cover ◽ Groundwater Quality ◽ Large Scale ◽ Hydrological Cycle ◽ Groundwater Resources ◽ Local Level ◽ Remote Sensing And Gis

Human activities have exerted small- to large-scale changes on the hydrological cycle. The current state of groundwater resources suggests a global water crisis in terms of both quantity (availability) and quality. There is therefore a great need to assess and monitor the quality and quantity of groundwater resources at the local level. This paper presents a case study of the lower Shiwalik hills, in Rupnagar, Punjab, India, tracing land-use and land-cover changes during the past 17 years, with an emphasis on groundwater quality and quantity. The study was performed in alluvial and hilly terrain. The results show that the quantity of groundwater increased through natural and artificial recharge, owing to a change in the land-use and land-cover pattern (an increased area of fallow land). The quality of groundwater deteriorated due to the input of fertilizers used to enhance short-term soil fertility. Using a remote sensing and GIS based approach, we present the final results in map form. In particular, we highlight a potential groundwater exploration site, which could be useful for district-level planning. Our research shows that changes in land use and land cover affect both the quantity and the quality of groundwater.

Improving Imbalanced Land Cover Classification with K-Means SMOTE: Detecting and Oversampling Distinctive Minority Spectral Signatures

Information ◽ 10.3390/info12070266 ◽ 2021 ◽ Vol 12 (7) ◽ pp. 266

Author(s): Joao Fonseca ◽ Georgios Douzas ◽ Fernando Bacao

Keyword(s): Remote Sensing ◽ Land Cover ◽ Policy Development ◽ Class Imbalance ◽ Development Planning ◽ Remotely Sensed Data ◽ K Nearest Neighbors ◽ Automatic Production ◽ Spectral Signatures ◽ Land Cover Maps

Land cover maps are a critical tool to support informed policy development, planning, and resource management decisions. Given its significant upsides, the automatic production of Land Use/Land Cover (LULC) maps has been a topic of interest for the remote sensing community for several years, but it is still fraught with technical challenges. One such challenge is the imbalanced nature of most remotely sensed data. The asymmetric class distribution negatively impacts the performance of classifiers and adds a new source of error to the production of these maps. In this paper, we address the imbalanced learning problem by combining K-means with the Synthetic Minority Oversampling Technique (SMOTE) as an improved oversampling algorithm. K-means SMOTE improves the quality of newly created artificial data by addressing not only the between-class imbalance, as traditional oversamplers do, but also the within-class imbalance, avoiding the generation of noisy data while effectively overcoming data imbalance. The performance of K-means SMOTE is compared to that of three popular oversampling methods (Random Oversampling, SMOTE and Borderline-SMOTE) using seven remote sensing benchmark datasets, three classifiers (Logistic Regression, K-Nearest Neighbors and Random Forest Classifier) and three evaluation metrics, with five-fold cross-validation repeated over three different initialization seeds. The statistical analysis of the results shows that the proposed method consistently outperforms the remaining oversamplers, producing higher-quality land cover classifications. These results suggest that LULC data can benefit significantly from the use of more sophisticated oversamplers, as spectral signatures for the same class can vary according to geographical distribution.
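
The evaluation protocol summarized above (five-fold cross-validation repeated over three seeds) can be sketched roughly as follows. This is an illustrative outline under several assumptions, not the authors' experimental code. It uses Random Oversampling, one of the baselines named in the abstract, rather than K-means SMOTE. The nearest-centroid classifier and the macro-averaged recall metric are stand-ins chosen for brevity, since the abstract does not name its three metrics. Oversampling only the training folds is a common practice assumed here, and all function names and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits, rng):
    """Assign sample indices to folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)
    return folds


def random_oversample(X, y, rng):
    """Duplicate random samples until every class matches the largest one."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        X_out.extend(rows + extra)
        y_out.extend([label] * target)
    return X_out, y_out


def fit_centroids(X, y):
    """Stand-in classifier (an assumption): one mean vector per class."""
    sums, counts = {}, defaultdict(int)
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] += 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, features):
    """Return the label of the closest class centroid."""
    def sq_dist(centre):
        return sum((a - b) ** 2 for a, b in zip(features, centre))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def macro_recall(y_true, y_pred):
    """Stand-in metric (an assumption): unweighted mean of per-class recall."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def evaluate(X, y, n_splits=5, seeds=(0, 1, 2)):
    """Run n_splits-fold cross-validation once per seed; return all scores."""
    scores = []
    for seed in seeds:
        rng = random.Random(seed)
        for test_idx in stratified_folds(y, n_splits, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            # Oversample the training fold only; the test fold is untouched.
            X_train, y_train = random_oversample(
                [X[i] for i in train_idx], [y[i] for i in train_idx], rng
            )
            centroids = fit_centroids(X_train, y_train)
            y_pred = [predict(centroids, X[i]) for i in test_idx]
            scores.append(macro_recall([y[i] for i in test_idx], y_pred))
    return scores


# Toy imbalanced data: 40 majority samples near (0, 0), 10 minority near (3, 3).
data_rng = random.Random(123)
X = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(40)]
X += [(data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)) for _ in range(10)]
y = ["majority"] * 40 + ["minority"] * 10
scores = evaluate(X, y)
```

Keeping the oversampler inside the cross-validation loop means that duplicated or synthetic samples never appear in the fold used for scoring, which would otherwise inflate the reported metric.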
