RUESVMs: An Ensemble Method to Handle the Class Imbalance Problem in Land Cover Mapping Using Google Earth Engine
Timely and accurate Land Cover (LC) information is required for various applications, such as climate change analysis and sustainable development. Although machine learning algorithms generally perform well in LC mapping tasks, class imbalance is a common challenge. The problem arises during the training phase, when some classes have far fewer samples than others, and it reduces classification accuracy for infrequent and rare LC classes. To address this issue, this study proposes a new method that integrates random under-sampling of majority classes with an ensemble of Support Vector Machines (SVMs), namely the Random Under-sampling Ensemble of Support Vector Machines (RUESVMs). The performance of RUESVMs for LC classification was evaluated in Google Earth Engine (GEE) over two different case studies using Sentinel-2 time-series data and five well-known spectral indices: the Normalized Difference Vegetation Index (NDVI), Green Normalized Difference Vegetation Index (GNDVI), Soil-Adjusted Vegetation Index (SAVI), Normalized Difference Built-up Index (NDBI), and Normalized Difference Water Index (NDWI). RUESVMs was also compared with the traditional SVM and with SVM combined with three benchmark data-balancing techniques, namely Random Over-Sampling (ROS), Random Under-Sampling (RUS), and the Synthetic Minority Over-sampling Technique (SMOTE). The proposed method considerably improved the accuracy of LC classification, especially for the minority classes. Compared with the most accurate data-balancing method (i.e., SVM-SMOTE), RUESVMs increased the overall accuracy of the generated LC map by approximately 4.95 percentage points and the geometric mean of producer’s accuracies by almost 3.75 percentage points. RUESVMs also outperformed SVM-SMOTE in the geometric mean of user’s accuracies, with an average increase of 6.45 percentage points.
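The three accuracy measures reported above can be illustrated with a minimal Python sketch that derives them from a confusion matrix. The sketch uses the standard definitions: overall accuracy is the share of correctly classified samples, producer's accuracy is per-class recall, and user's accuracy is per-class precision. It is not the evaluation code used in this study. The function name `accuracy_summary`, the convention that rows hold reference classes and columns hold mapped classes, and the example matrix are all assumptions made for illustration.

```python
import math


def accuracy_summary(cm):
    """Summarise a square confusion matrix.

    Assumed layout: cm[i][j] is the number of samples whose reference
    class is i and whose mapped (predicted) class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    row_totals = [sum(cm[i]) for i in range(n)]                       # reference totals
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # mapped totals

    # Producer's accuracy: correct samples of a class / reference samples of that class.
    producers = [cm[i][i] / row_totals[i] if row_totals[i] else 0.0 for i in range(n)]
    # User's accuracy: correct samples of a class / samples mapped to that class.
    users = [cm[j][j] / col_totals[j] if col_totals[j] else 0.0 for j in range(n)]

    def geometric_mean(values):
        # Falls to zero if any single class has zero accuracy.
        return math.prod(values) ** (1.0 / len(values))

    return {
        "overall_accuracy": correct / total,
        "producers_accuracies": producers,
        "users_accuracies": users,
        "gmean_producers": geometric_mean(producers),
        "gmean_users": geometric_mean(users),
    }


# Hypothetical three-class matrix in which the third class is rare.
example_cm = [
    [90, 10, 0],
    [5, 80, 15],
    [2, 3, 5],
]
summary = accuracy_summary(example_cm)
for name, value in summary.items():
    print(name, value)
```

Because a geometric mean is pulled down sharply by any single low per-class value, it is more sensitive to poorly classified minority classes than overall accuracy, which is dominated by the majority classes.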