Machine Learning Comparison and Parameter Setting Methods for the Detection of Dump Sites for Construction and Demolition Waste Using the Google Earth Engine

Machine learning has been used successfully for object recognition in images. However, because the spectral and textural characteristics of construction and demolition waste (C&DW) are complex, it is difficult to build an automatic C&DW identification method from machine learning and remote sensing data. Machine learning encompasses many algorithms, and different algorithms and parameter settings identify C&DW with different degrees of success. Finding the optimal method for automatic remote sensing identification of C&DW is therefore important for the intelligent supervision of C&DW. This study focuses on the megacity of Beijing, which faces a high risk of C&DW pollution. To improve the classification accuracy of C&DW, buildings, vegetation, water, and crops were selected as comparative training samples on the Google Earth Engine (GEE), with Sentinel-2 imagery as the data source. Three typical machine learning classifiers (classification and regression trees (CART), random forest (RF), and support vector machine (SVM)) were used to classify C&DW in the remote sensing images. The optimal parameterization scheme for each of the three classifiers was determined using empirical methods, experimental trials, and grid search, in order to identify the best machine learning method for remote sensing identification of C&DW. According to accuracy evaluation and ground verification, the overall recognition accuracies of CART, RF, and SVM for C&DW were 73.12%, 98.05%, and 85.62%, respectively, under the optimal parameterization schemes determined in this study. RF, with the number of decision trees set to 50, identified C&DW better than CART and SVM. This study explores a robust machine learning method for automatic remote sensing identification of C&DW and provides a scientific basis for the intelligent supervision and resource utilization of C&DW.
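
The grid search mentioned in the abstract can be illustrated with a minimal, generic sketch. It assumes nothing about the study's actual code: the parameter names (`n_trees`, `min_leaf`) and the `toy_score` function are hypothetical placeholders for "train a classifier with these settings and return its validation accuracy", and the toy scores are contrived, not the study's data.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one.

    param_grid maps a parameter name to the list of values to try;
    score_fn takes a dict of parameters and returns a score (higher is better).
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Hypothetical stand-in for training and validating a classifier.
    # The numbers are contrived so that the example has a clear optimum.
    return -abs(params["n_trees"] - 50) - params["min_leaf"]


grid = {"n_trees": [10, 50, 100], "min_leaf": [1, 5]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

In practice `toy_score` would be replaced by a function that trains the classifier on the training samples and scores it on held-out validation samples; the search loop itself stays the same.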

Assessing the Effect of Training Sampling Design on the Performance of Machine Learning Classifiers for Land Cover Mapping Using Multi-Temporal Remote Sensing Data and Google Earth Engine

Remote Sensing, 2021, Vol 13 (8), pp. 1433. DOI: 10.3390/rs13081433

Author(s): Shobitha Shetty, Prasun Kumar Gupta, Mariana Belgiu, S. K. Srivastav

Keyword(s): Machine Learning, Remote Sensing, Random Sampling, Sampling Design, Remote Sensing Data, Google Earth, Machine Learning Classifiers, Learning Classifiers, Multi Temporal, Google Earth Engine

Machine learning classifiers are increasingly used for Land Use and Land Cover (LULC) mapping from remote sensing images. However, choosing the right classifier requires understanding the main factors that influence classifier performance. The present study first investigated the effect of training sampling design on the classification results obtained by the Random Forest (RF) classifier and then compared RF with other machine learning classifiers for LULC mapping using multi-temporal satellite remote sensing data and the Google Earth Engine (GEE) platform. We evaluated the impact of three sampling methods, namely Stratified Equal Random Sampling (SRS(Eq)), Stratified Proportional Random Sampling (SRS(Prop)), and Stratified Systematic Sampling (SSS), on the classification results of the RF-trained LULC model. Our results showed that the SRS(Prop) method favors major classes while achieving good overall accuracy. The SRS(Eq) method provides good class-level accuracies, even for minority classes, whereas the SSS method performs well for areas with large intra-class variability. In the classifier comparison, RF outperformed Classification and Regression Trees (CART), Support Vector Machine (SVM), and Relevance Vector Machine (RVM) at a >95% confidence level. The performance of the CART and SVM classifiers was found to be similar. RVM achieved good classification results with a limited number of training samples.
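
The difference between equal and proportional stratified allocation can be sketched as follows. This is an illustration only, not the authors' procedure: the class names and pixel counts are hypothetical, and Stratified Systematic Sampling is not covered.

```python
def allocate_equal(class_sizes, total):
    """SRS(Eq)-style allocation: the same number of samples for every class."""
    per_class = total // len(class_sizes)
    return {name: per_class for name in class_sizes}


def allocate_proportional(class_sizes, total):
    """SRS(Prop)-style allocation: samples proportional to class area.

    Rounding means the counts may not sum exactly to `total`.
    """
    population = sum(class_sizes.values())
    return {
        name: round(total * size / population)
        for name, size in class_sizes.items()
    }


# Hypothetical class sizes (e.g. pixel counts per land cover class).
class_sizes = {"forest": 6000, "cropland": 3000, "water": 1000}

equal = allocate_equal(class_sizes, 300)
proportional = allocate_proportional(class_sizes, 300)
print(equal)         # every class gets the same share
print(proportional)  # large classes dominate the sample
```

With equal allocation the small class is sampled as heavily as the large ones, which is consistent with the abstract's observation that SRS(Eq) helps minority classes, while proportional allocation concentrates samples in the major classes.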

Synthetic sampling for spatio-temporal land cover mapping with machine learning and the Google Earth Engine in Andalusia, Spain

2020. DOI: 10.5194/egusphere-egu2020-1153

Author(s): Laura Bindereif, Tobias Rentschler, Martin Batelheim, Marta Díaz-Zorita Bonilla, Philipp Gries, ...

Keyword(s): Machine Learning, Remote Sensing, Land Cover, Classification Accuracy, Sampling Methods, Google Earth, Land Cover Mapping, Machine Learning Applications, Google Earth Engine, Spatio Temporal

Land cover information plays an essential role in resource development, environmental monitoring, and protection. Among other natural resources, soils and soil properties are strongly affected by land cover and land cover change, which can lead to soil degradation. Remote sensing techniques are well suited to spatio-temporal mapping of land cover and to change detection, and remote sensing programs have built up vast data archives. Machine learning applications provide appropriate algorithms to analyse such amounts of data efficiently and accurately. However, machine learning methods require specific sampling techniques and are usually designed for balanced datasets with an even training sample frequency. Most real-world datasets are imbalanced, so methods that reduce the imbalance through synthetic sampling are required. Synthetic sampling methods increase the number of samples in the minority class and/or decrease the number in the majority class to achieve higher model accuracy. The Synthetic Minority Over-Sampling Technique (SMOTE), used in many machine learning applications, generates synthetic samples to balance the dataset. In the middle Guadalquivir basin, Andalusia, Spain, we used random forests with Landsat images from 1984 to 2018 as covariates to map land cover change with the Google Earth Engine. The sampling design was based on stratified random sampling according to the CORINE land cover classification of 2012. The land cover classes in our study were arable land, permanent crops (plantations), pastures/grassland, forest, and shrub. Artificial surfaces and water bodies were excluded from modelling. However, the 130 training samples were imbalanced across classes. The classes pasture (7 samples) and shrub (13 samples) had fewer samples than the other classes (48, 47 and 16 samples). This led to misclassifications and reduced the classification accuracy. Therefore, we applied SMOTE to increase the number of samples and the classification accuracy of the model. Preliminary results are promising and show an increase in classification accuracy, especially for the previously underrepresented classes pasture and shrub. This agrees with studies that had other objectives but likewise found that synthetic sampling methods improve the performance of classification frameworks.
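
The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority-class neighbours, can be sketched in plain Python. This is a simplified illustration, not the implementation used in the study: base samples are picked at random rather than iterated over, and the feature vectors, `k`, and `seed` are hypothetical.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points from a list of minority-class samples.

    Each synthetic point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # All other minority samples, sorted by squared Euclidean distance to x.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, neighbour)))
    return synthetic


# Hypothetical two-band feature vectors for an underrepresented class.
pasture = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(pasture, n_new=5, k=2)
print(new_samples)
```

Because every synthetic point is a convex combination of two existing samples, the new points stay inside the region already spanned by the minority class.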

Remote Sensing Mapping of Cage and Floating-raft Aquaculture in China's Offshore Waters Using Machine Learning Methods and Google Earth Engine

2021. DOI: 10.1109/agro-geoinformatics50104.2021.9530297

Author(s): Yunci Xu, Wenqi Wu, Lizhen Lu

Keyword(s): Machine Learning, Remote Sensing, Google Earth, Learning Methods, Machine Learning Methods, Floating Raft, Google Earth Engine, Remote Sensing Mapping

Mapping Paddy Rice Fields by Combining Multi-Temporal Vegetation Index and Synthetic Aperture Radar Remote Sensing Data Using Google Earth Engine Machine Learning Platform

Remote Sensing, 2020, Vol 12 (18), pp. 2992. DOI: 10.3390/rs12182992. Cited By ~ 1

Author(s): Nengcheng Chen, Lixiaona Yu, Xiang Zhang, Yonglin Shen, Linglin Zeng, ...

Keyword(s): Machine Learning, Remote Sensing, Synthetic Aperture Radar, Vegetation Index, Vegetation Indices, Google Earth, Paddy Rice, Synthetic Aperture, Google Earth Engine, Data Missing

Knowledge of the area and spatial distribution of paddy rice fields is important for water resource management. However, accurate mapping of paddy rice has long been a challenge because of its spatiotemporal discontinuity and short duration. To address this problem, this study proposes a paddy rice area extraction approach that combines optical vegetation indices with synthetic aperture radar (SAR) data. The method is designed to overcome the missing-data problem that cloud contamination and spatiotemporal discontinuities cause for traditional optical remote sensing. More specifically, Sentinel-1A SAR and Sentinel-2 multispectral imager (MSI) Level-2A imagery are used to identify paddy rice at high temporal and spatial resolution. Three vegetation indices, namely the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), and land surface water index (LSWI), are estimated from the optical bands. Two polarization bands (VH (vertical-horizontal) and VV (vertical-vertical)) are used to overcome the cloud contamination problem. The approach was applied with the random forest machine learning algorithm on the Google Earth Engine platform, with the Jianghan Plain in China as the experimental area. The results of 39 experiments revealed the effect of different factors. First, the combination of the VV and VH bands performed better than the other polarization options: the average producer's accuracy (PA) of paddy rice was 72.79%, 1.58% higher than the second-best option, VH. Second, the combination of the three indices also performed better than the other index options, with an average PA of 73.82%, 1.42% higher than using NDVI alone. The best overall combination was EVI with the VV and VH polarization bands, which gave a producer's accuracy of paddy rice of 76.67%, an overall accuracy (OA) of 66.07%, and a Kappa statistic of 0.45. However, NDVI, EVI, and VH performed better in mapping the morphology. The results demonstrate that the method developed in this study can be successfully applied to cloud-prone areas for mapping paddy rice, overcoming the data gaps caused by cloud and rain during the paddy growing season.
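
The three vegetation indices named in the abstract have widely used standard definitions, sketched below. The abstract itself does not give the formulas or say which Sentinel-2 bands were used, so the EVI coefficients (2.5, 6, 7.5, 1) are the commonly cited ones rather than values confirmed by the study, and the reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced vegetation index with the commonly used coefficients."""
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def lswi(nir, swir):
    """Land surface water index: (NIR - SWIR) / (NIR + SWIR)."""
    return (nir - swir) / (nir + swir)


# Hypothetical surface reflectances for a single pixel (unitless, 0..1).
pixel = {"blue": 0.05, "red": 0.10, "nir": 0.50, "swir": 0.20}

indices = {
    "ndvi": ndvi(pixel["nir"], pixel["red"]),
    "evi": evi(pixel["nir"], pixel["red"], pixel["blue"]),
    "lswi": lswi(pixel["nir"], pixel["swir"]),
}
print(indices)
```

In a workflow like the one described, such per-pixel index values would be stacked with the VV and VH backscatter bands to form the feature vector passed to the random forest classifier.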

CROP CLASSIFICATION IN KARSHI STEPPE USING REMOTE SENSING INFORMATION AND GOOGLE EARTH ENGINE TOOL

European Science Review, 2018, pp. 130-132. DOI: 10.29013/esr-18-9.10.1-130-132

Author(s): Z. A. Gafurov, S. B. Eltazarov, T. A. Akhmedova

Keyword(s): Remote Sensing, Google Earth, Google Earth Engine, Crop Classification

Exploratory Analysis of Driving Force of Wildfires in Australia: An Application of Machine Learning within Google Earth Engine

Remote Sensing, 2020, Vol 13 (1), pp. 10. DOI: 10.3390/rs13010010

Author(s): Andrea Sulova, Jamal Jokar Arsanjani

Keyword(s): Climate Change, Machine Learning, Random Forest, Google Earth, Summer Season, Driving Factors, Machine Learning Algorithms, Classification And Regression Tree, Training Dataset, Google Earth Engine

Recent studies have suggested that, due to climate change, the number of wildfires across the globe has been increasing and continues to grow. The massive wildfires that hit Australia during the 2019–2020 summer season raised questions about the extent to which wildfire risk can be linked to various climate, environmental, topographical, and social factors, and about how to predict fire occurrences so that preventive measures can be taken. Hence, the main objective of this study was to develop an automated, cloud-based workflow for generating a training dataset of fire events at a continental level, using freely available remote sensing data at a reasonable computational expense, as input to machine learning models. As a result, a data-driven model was set up on the Google Earth Engine platform, which is publicly accessible and open to further adjustment. The training dataset was applied to different machine learning algorithms, i.e., Random Forest, Naïve Bayes, and Classification and Regression Tree. The findings show that Random Forest outperformed the other algorithms, and it was therefore used to explore the driving factors through variable importance analysis. The study indicates the probability of fire occurrences across Australia and identifies the potential driving factors of Australian wildfires for the 2019–2020 summer season. The methodological approach, results, and conclusions can be of great importance to policymakers, environmentalists, and climate change researchers, among others.

APPLICATION OF MACHINE LEARNING BASED ON GEOSPATIAL DATA FOR OPTIMIZING AGRICULTURAL LAND DURING AND AFTER THE PANDEMIC

Seminar Nasional Geomatika, 2021, pp. 161. DOI: 10.24895/sng.2020.0-0.1131

Author(s): Royyannuur Kurniawan Endrayanto, Adharul Muttaqin

Keyword(s): Machine Learning, Random Forest, Early Warning Systems, Google Earth, Warning Systems, Land Data Assimilation, Google Earth Engine, Land Data Assimilation System, Data Assimilation System, Assimilation System

Agriculture is an important sector because it meets the need for food, a basic necessity. Food supply remains a pressing issue, especially during the current COVID-19 pandemic. Meeting food needs is also closely tied to the amount of food produced by farmers. The environment is one of the factors that determine the success of agricultural activities. Indonesia's varied environmental conditions, such as temperature and precipitation levels, lead to differences in the potential food crops of each region of Indonesia. Efforts are therefore needed to optimize agricultural land production based on the environmental factors of each region. These efforts are expected to help maintain food security both during and after the pandemic. This study introduces the use of geospatial data to classify food crop types with a machine learning algorithm as a means of optimizing agricultural land. The data used are from the Famine Early Warning Systems Network (FEWS NET) Land Data Assimilation System (FLDAS). The machine learning algorithm used is the Random Forest classification algorithm. The technologies used are Google Colab, Google Earth Engine, and Python. The aim of this study is to classify the food crops with the best potential to be grown in a region based on the existing environmental conditions.

Accuracies Achieved in Classifying Five Leading World Crop Types and their Growth Stages Using Optimal Earth Observing-1 Hyperion Hyperspectral Narrowbands on Google Earth Engine

Remote Sensing, 2018, Vol 10 (12), pp. 2027. DOI: 10.3390/rs10122027. Cited By ~ 13

Author(s): Itiya Aneece, Prasad Thenkabail

Keyword(s): United States, Remote Sensing, Hyperspectral Imaging, The United States, Google Earth, Growth Stages, Agricultural Crops, Spectral Library, Crop Types, Google Earth Engine

As the global population increases, we face increasing demand for food and nutrition. Remote sensing can help monitor food availability to assess global food security rapidly and accurately enough to inform decision-making. However, advances in remote sensing technology are still often limited to multispectral broadband sensors. Although these sensors have many applications, they can be limited in studying agricultural crop characteristics such as differentiating crop types and their growth stages with a high degree of accuracy and detail. In contrast, hyperspectral data contain continuous narrowbands that provide data in terms of spectral signatures rather than a few data points along the spectrum, and hence can help advance the study of crop characteristics. To better understand and advance this idea, we conducted a detailed study of five leading world crops (corn, soybean, winter wheat, rice, and cotton) that occupy 75% and 54% of principal crop areas in the United States and the world respectively. The study was conducted in seven agroecological zones of the United States using 99 Earth Observing-1 (EO-1) Hyperion hyperspectral images from 2008–2015 at 30 m resolution. The authors first developed a first-of-its-kind comprehensive Hyperion-derived Hyperspectral Imaging Spectral Library of Agricultural crops (HISA) of these crops in the US based on USDA Cropland Data Layer (CDL) reference data. Principal Component Analysis was used to eliminate redundant bands by using factor loadings to determine which bands most influenced the first few principal components. This resulted in the establishment of 30 optimal hyperspectral narrowbands (OHNBs) for the study of agricultural crops. The rest of the 242 Hyperion HNBs were redundant, uncalibrated, or noisy. Crop types and crop growth stages were classified using linear discriminant analysis (LDA) and support vector machines (SVM) in the Google Earth Engine cloud computing platform using the 30 optimal HNBs (OHNBs). 
The best overall accuracies were between 75% and 95% in classifying crop types and their growth stages, and were achieved using 15–20 HNBs in the majority of cases. However, in complex cases (e.g., 4 or more crops in a Hyperion image), 25–30 HNBs were required to achieve optimal accuracies. Beyond 25–30 bands, accuracies reach an asymptote. This research makes a significant contribution towards understanding, modeling, mapping, and monitoring agricultural crops using data from upcoming hyperspectral satellites, such as NASA’s Surface Biology and Geology mission (formerly HyspIRI mission) and the recently launched HysIS (Indian Hyperspectral Imaging Satellite, 55 bands over 400–950 nm in VNIR and 165 bands over 900–2500 nm in SWIR), and contributes to building a novel, first-of-its-kind global hyperspectral imaging spectral library of agricultural crops (GHISA: www.usgs.gov/WGSC/GHISA).

Mapping Regional Landscape by Using OpenstreetMap (OSM)

Environmental Information Systems, 2019, pp. 771-790. DOI: 10.4018/978-1-5225-7033-2.ch033. Cited By ~ 1

Author(s): Di Yang

Keyword(s): Remote Sensing, Land Use, Forest Cover, Regional Scale, Google Earth, Computing Power, New Approach, Computation Cost, Regional Landscape, Google Earth Engine

Producing a forest pattern map over a large extent at high spatial resolution is computationally demanding but critical for most regions. There are two major difficulties in generating classification maps at regional scale: large training point sets and the high computation cost of classifier modelling. As one of the best-known Volunteered Geographic Information (VGI) initiatives, OpenStreetMap (OSM) contributes not only road network distributions but also the potential to verify land cover and land use. Google Earth Engine is a platform designed for cloud-based mapping with strong computing power. In this study, we propose a new approach to generating a forest cover map and quantifying road-caused forest fragmentation by using OpenStreetMap in conjunction with remote sensing datasets stored in Google Earth Engine. Additionally, the landscape metrics produced after combining OpenStreetMap with the forest spatial pattern layers from our output indicated significant levels of forest fragmentation in the Yucatan peninsula.
