Synthetic sampling for spatio-temporal land cover mapping with machine learning and the Google Earth Engine in Andalusia, Spain

Author(s):  
Laura Bindereif ◽  
Tobias Rentschler ◽  
Martin Batelheim ◽  
Marta Díaz-Zorita Bonilla ◽  
Philipp Gries ◽  
...  

<p>Land cover information plays an essential role in resource development, environmental monitoring and protection. Amongst other natural resources, soils and soil properties are strongly affected by land cover and land cover change, which can lead to soil degradation. Remote sensing techniques are well suited to spatio-temporal land cover mapping and change detection. Remote sensing programs have established vast data archives. Machine learning applications provide appropriate algorithms to analyse such amounts of data efficiently and with accurate results. However, machine learning methods require specific sampling techniques and are usually designed for balanced datasets with an even training sample frequency. Most real-world datasets, though, are imbalanced, so methods that reduce the imbalance through synthetic sampling are required. Synthetic sampling methods increase the number of samples in the minority class and/or decrease the number in the majority class to achieve higher model accuracy. The Synthetic Minority Over-Sampling Technique (SMOTE) is a method that generates synthetic samples to balance the dataset and is used in many machine learning applications. In the middle Guadalquivir basin, Andalusia, Spain, we used random forests with Landsat images from 1984 to 2018 as covariates to map land cover change with the Google Earth Engine. The sampling design was based on stratified random sampling according to the CORINE land cover classification of 2012. The land cover classes in our study were arable land, permanent crops (plantations), pastures/grassland, forest and shrub. Artificial surfaces and water bodies were excluded from modelling. However, the 130 training samples were imbalanced across classes. The classes pasture (7 samples) and shrub (13 samples) had fewer samples than the other classes (48, 47 and 16 samples). This led to misclassifications and negatively affected the classification accuracy. Therefore, we applied SMOTE to increase the number of samples and the classification accuracy of the model. Preliminary results are promising and show an increase in classification accuracy, especially for the previously underrepresented classes pasture and shrub. This corresponds to the results of studies with other objectives, which also see synthetic sampling methods as an improvement to the performance of classification frameworks.</p>
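The oversampling idea behind SMOTE can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the base sample is drawn at random rather than iterating over every minority sample, and the feature values (two reflectance-like numbers per sample) are invented for the example. Only the class sizes, 7 pasture samples raised to the size of the largest class (48), come from the abstract.

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest neighbours in the same class.
    Simplified variant: base samples are drawn at random."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # needs at least two minority samples
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical feature vectors for the 7 pasture samples (values are made up).
data_rng = random.Random(42)
pasture = [(data_rng.uniform(0.1, 0.2), data_rng.uniform(0.3, 0.5)) for _ in range(7)]

new_points = smote(pasture, n_new=48 - len(pasture))
balanced = pasture + new_points
print(len(balanced))  # 48
```

Because each synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region already occupied by the class instead of duplicating existing points.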

2020 ◽  
Vol 12 (4) ◽  
pp. 602 ◽  
Author(s):  
Qingyu Li ◽  
Chunping Qiu ◽  
Lei Ma ◽  
Michael Schmitt ◽  
Xiao Zhu

The remote sensing based mapping of land cover at extensive scales, e.g., of whole continents, is still a challenging task because of the need for sophisticated pipelines that combine every step from data acquisition to land cover classification. Utilizing the Google Earth Engine (GEE), which provides a catalog of multi-source data and a cloud-based environment, this research generates a land cover map of the whole African continent at 10 m resolution. This land cover map could provide a large-scale base layer for a more detailed local climate zone mapping of urban areas, which lie in the focus of interest of many studies. In this regard, we provide a free download link for our land cover maps of African cities at the end of this paper. It is shown that our product has achieved an overall accuracy of 81% for five classes, which is superior to the existing 10 m land cover product FROM-GLC10 in detecting urban class in city areas and identifying the boundaries between trees and low plants in rural areas. The best data input configurations are carefully selected based on a comparison of results from different input sources, which include Sentinel-2, Landsat-8, Global Human Settlement Layer (GHSL), Night Time Light (NTL) Data, Shuttle Radar Topography Mission (SRTM), and MODIS Land Surface Temperature (LST). We provide a further investigation of the importance of individual features derived from a Random Forest (RF) classifier. In order to study the influence of sampling strategies on the land cover mapping performance, we have designed a transferability analysis experiment, which has not been adequately addressed in the current literature. In this experiment, we test whether trained models from several cities contain valuable information to classify a different city. It was found that samples of the urban class have better reusability than those of other natural land cover classes, i.e., trees, low plants, bare soil or sand, and water. 
After experimental evaluation of different land cover classes across different cities, we conclude that continental land cover mapping results can be considerably improved when training samples of natural land cover classes are collected and combined from areas covering each Köppen climate zone.


2021 ◽  
Vol 13 (8) ◽  
pp. 1433
Author(s):  
Shobitha Shetty ◽  
Prasun Kumar Gupta ◽  
Mariana Belgiu ◽  
S. K. Srivastav

Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, arriving at the right choice of classifier requires understanding the main factors influencing their performance. The present study investigated firstly the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and, secondly, it compared its performance with other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), upon the classification results obtained by the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the classifier comparison, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) with a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
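The difference between equal and proportional stratified allocation can be illustrated with a short Python sketch. The class names, pixel counts, and total sample size below are hypothetical and are not taken from the study; the function names are also only illustrative, and the systematic design (SSS) is not covered.

```python
import random

def allocate(strata_sizes, n_total, mode):
    """Return the number of training samples per class.
    'equal'        -> same count for every class (SRS(Eq)-style)
    'proportional' -> count proportional to class area (SRS(Prop)-style)"""
    classes = list(strata_sizes)
    if mode == "equal":
        return {c: n_total // len(classes) for c in classes}
    if mode == "proportional":
        total = sum(strata_sizes.values())
        return {c: round(n_total * strata_sizes[c] / total) for c in classes}
    raise ValueError(f"unknown mode: {mode}")

def stratified_sample(pixels_by_class, allocation, seed=0):
    """Draw the allocated number of pixels at random within each class."""
    rng = random.Random(seed)
    return {c: rng.sample(pixels_by_class[c], min(n, len(pixels_by_class[c])))
            for c, n in allocation.items()}

# Hypothetical pixel counts per class in a reference map.
strata = {"cropland": 6000, "forest": 3000, "built_up": 800, "water": 200}

equal = allocate(strata, 200, "equal")
proportional = allocate(strata, 200, "proportional")
print(equal)         # {'cropland': 50, 'forest': 50, 'built_up': 50, 'water': 50}
print(proportional)  # {'cropland': 120, 'forest': 60, 'built_up': 16, 'water': 4}

# Pixel IDs stand in for real pixel locations.
pixels = {c: list(range(n)) for c, n in strata.items()}
training = stratified_sample(pixels, proportional)
```

With proportional allocation the rare class receives only 4 of 200 samples, which is consistent with the abstract's observation that this design favors major classes, while equal allocation gives minority classes the same weight as the others.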


2021 ◽  
Vol 13 (4) ◽  
pp. 787
Author(s):  
Lei Zhou ◽  
Ting Luo ◽  
Mingyi Du ◽  
Qiang Chen ◽  
Yang Liu ◽  
...  

Machine learning has been successfully used for object recognition within images. Due to the complexity of the spectrum and texture of construction and demolition waste (C&DW), it is difficult to construct an automatic identification method for C&DW based on machine learning and remote sensing data sources. Machine learning includes many types of algorithms; however, different algorithms and parameters have different identification effects on C&DW. Exploring the optimal method for automatic remote sensing identification of C&DW is an important approach for the intelligent supervision of C&DW. This study investigates the megacity of Beijing, which faces a high risk of C&DW pollution. To improve the classification accuracy of C&DW, buildings, vegetation, water, and crops were selected as comparative training samples based on the Google Earth Engine (GEE), and Sentinel-2 was used as the data source. Three typical machine learning classification algorithms (classification and regression trees (CART), random forest (RF), and support vector machine (SVM)) were selected to classify the C&DW from remote sensing images. Using empirical methods, the experimental trial method, and the grid search method, the optimal parameterization scheme of the three classification methods was studied to determine the optimal method of remote sensing identification of C&DW based on machine learning. Through accuracy evaluation and ground verification, the overall recognition accuracies of CART, RF, and SVM for C&DW were 73.12%, 98.05%, and 85.62%, respectively, under the optimal parameterization scheme determined in this study. Among these algorithms, RF was a better C&DW identification method than CART and SVM when the number of decision trees was 50. This study explores a robust machine learning method for automatic remote sensing identification of C&DW and provides a scientific basis for intelligent supervision and resource utilization of C&DW.


2019 ◽  
Vol 11 (24) ◽  
pp. 3023 ◽  
Author(s):  
Shuai Xie ◽  
Liangyun Liu ◽  
Xiao Zhang ◽  
Jiangning Yang ◽  
Xidong Chen ◽  
...  

The Google Earth Engine (GEE) has emerged as an essential cloud-based platform for land-cover classification as it provides massive amounts of multi-source satellite data and high-performance computation service. This paper proposed an automatic land-cover classification method using time-series Landsat data on the GEE cloud-based platform. The Moderate Resolution Imaging Spectroradiometer (MODIS) land-cover products (MCD12Q1.006) with the International Geosphere–Biosphere Program (IGBP) classification scheme were used to provide accurate training samples using the rules of pixel filtering and spectral filtering, which resulted in an overall accuracy (OA) of 99.2%. Two types of spectral–temporal features (percentile composited features and median composited monthly features) generated from all available Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper Plus (ETM+) data from the year 2010 ± 1 were used as input features to a Random Forest (RF) classifier for land-cover classification. The results showed that the monthly features outperformed the percentile features, giving an average OA of 80% against 77%. In addition, the monthly features composited using the median outperformed those composited using the maximum Normalized Difference Vegetation Index (NDVI) with an average OA of 80% against 78%. Therefore, the proposed method is able to generate accurate land-cover mapping automatically based on the GEE cloud-based platform, which is promising for regional and global land-cover mapping.
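The two monthly compositing rules compared in this abstract, the per-band median and the maximum-NDVI selection, can be sketched for a single pixel in plain Python. This is an illustrative sketch only: it does not use the Earth Engine API, the reflectance values are invented, and the function names are hypothetical.

```python
from statistics import median

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def median_composite(observations):
    """observations: list of (month, red, nir).
    Returns {month: (median red, median nir)}, computed band by band."""
    by_month = {}
    for month, red, nir in observations:
        by_month.setdefault(month, []).append((red, nir))
    return {m: (median(r for r, _ in v), median(n for _, n in v))
            for m, v in sorted(by_month.items())}

def max_ndvi_composite(observations):
    """Returns {month: (red, nir)} of the observation with the highest NDVI."""
    best = {}
    for month, red, nir in observations:
        if month not in best or ndvi(nir, red) > ndvi(best[month][1], best[month][0]):
            best[month] = (red, nir)
    return dict(sorted(best.items()))

# Made-up time series for one pixel; the second June value mimics haze or cloud.
series = [(6, 0.05, 0.40), (6, 0.30, 0.35), (6, 0.06, 0.45),
          (7, 0.04, 0.50), (7, 0.05, 0.48)]

print(median_composite(series)[6])    # (0.06, 0.4)
print(max_ndvi_composite(series)[6])  # (0.05, 0.4)
```

The median builds each band value independently from all observations in the month, whereas the maximum-NDVI rule keeps one complete observation per month; both suppress the contaminated June value in this toy example.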


2019 ◽  
Vol 11 (16) ◽  
pp. 1907 ◽  
Author(s):  
Mohammad Mardani ◽  
Hossein Mardani ◽  
Lorenzo De Simone ◽  
Samuel Varas ◽  
Naoki Kita ◽  
...  

Timely and accurate monitoring of land cover and land use is an essential tool for countries to achieve sustainable food production. However, many developing countries are struggling to efficiently monitor land resources due to the lack of financial support and limited access to adequate technology. This study aims to fill this gap in developing countries by developing a land cover solution that is free of cost. A fully automated framework for land cover mapping was developed using 10-m resolution open access satellite images and machine learning (ML) techniques for the African country of Lesotho. Sentinel-2 satellite images were accessed through Google Earth Engine (GEE) for initial processing and feature extraction at a national level. In addition, the Food and Agriculture Organization’s land cover of Lesotho (FAO LCL) data were used to train support vector machine (SVM) and bagged trees (BT) classifiers. SVM classified urban and agricultural lands with 62% and 67% accuracy, respectively. BT classified the same two categories with 81% and 65% accuracy, respectively. The trained models could provide precise LC maps in minutes or hours. They can also be utilized as a viable solution for developing countries as an alternative to traditional geographic information system (GIS) methods, which are often labor intensive and time consuming, require acquisition of very high-resolution commercial satellite imagery, and call for high budgets.


Ever since the advent of modern geo-information systems, tracking environmental changes due to natural and/or manmade causes with the aid of remote sensing applications has been an indispensable tool in numerous fields of geography, most of the earth science disciplines, defence, intelligence, commerce, economics and administrative planning. One of these applications is the construction of land use and land cover maps through image classification. Land Use / Land Cover (LULC) information is a crucial input in designing efficient strategies for managing natural resources and monitoring environmental changes over time. The present study aims to determine the extent of land cover and land use in the Davangere region of Karnataka, India. In this study, a satellite image of Davangere acquired during October-November 2018 was used for supervised LULC classification with the help of remote sensing tools such as QGIS and Google Earth Engine. Six LULC classes were mapped, and the accuracy assessment was done using a theoretical error matrix and the Kappa coefficient. The key findings include LULC under Water bodies (8%), Built up Area (15.1%), Vegetation (9%), Horticulture (20.8%), Agriculture (39.3%) and Others (7%), with an overall accuracy of 94.8% and a Kappa coefficient of 0.866, indicating a high level of classification agreement.
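Overall accuracy and the Kappa coefficient are both derived from the error (confusion) matrix. The sketch below shows the standard calculation in Python; the 3-class matrix is a made-up example and does not reproduce the six-class matrix of the study.

```python
def overall_accuracy(matrix):
    """Share of reference samples on the diagonal of the error matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Cohen's Kappa: agreement beyond what is expected by chance."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(matrix)
    # Chance agreement from the products of the row and column marginals.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical error matrix: rows = classified class, columns = reference class.
error_matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]

print(round(overall_accuracy(error_matrix), 2))  # 0.88
print(round(cohen_kappa(error_matrix), 2))       # 0.82
```

Kappa is lower than overall accuracy here because it subtracts the agreement that the class marginals alone would produce by chance.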


2020 ◽  
Vol 12 (3) ◽  
pp. 503
Author(s):  
Li ◽  
Chen ◽  
Foody ◽  
Wang ◽  
Yang ◽  
...  

The generation of land cover maps with both fine spatial and temporal resolution would aid the monitoring of change on the Earth’s surface. Spatio-temporal sub-pixel land cover mapping (STSPM) uses a few fine spatial resolution (FR) maps and a time series of coarse spatial resolution (CR) remote sensing images as input to generate FR land cover maps with a temporal frequency of the CR data set. Traditional STSPM selects spatially adjacent FR pixels within a local window as neighborhoods to model the land cover spatial dependence, which can be a source of error and uncertainty in the maps generated by the analysis. This paper proposes a new STSPM using FR remote sensing images that pre- and/or post-date the CR image as ancillary data to enhance the quality of the FR map outputs. Spectrally similar pixels within the locality of a target FR pixel in the ancillary data are likely to represent the same land cover class and hence such same-class pixels can provide spatial information to aid the analysis. Experimental results showed that the proposed STSPM predicted land cover maps more accurately than two comparative state-of-the-art STSPM algorithms.

