Satellite imagery and machine learning for identification of aridity risk in central Java Indonesia

PeerJ Computer Science ◽

10.7717/peerj-cs.415 ◽

2021 ◽

Vol 7 ◽

pp. e415

Author(s):

Sri Yulianto Joko Prasetyo ◽

Kristoko Dwi Hartomo ◽

Mila Chrismawati Paseleng

Keyword(s):

Machine Learning ◽

Survey Data ◽

Vegetation Indices ◽

Machine Learning Algorithms ◽

Support Vector ◽

Landsat 8 ◽

Area Index ◽

Stable Algorithm ◽

Vegetation Condition ◽

Vegetation Health

This study aims to develop a software framework for predicting aridity using vegetation indices (VI) from Landsat 8 OLI images. VI data are predicted using machine learning (ML): Random Forest (RF) and Classification and Regression Trees (CART). The predictions are compared with those of Artificial Neural Network (ANN), Support Vector Machine (SVM), k-nearest neighbors (k-NN) and Multivariate Adaptive Regression Spline (MARS). Prediction results are interpolated using Inverse Distance Weighting (IDW). The study was conducted in stages: (1) image preprocessing; (2) calculating numerical data extracted from the Landsat band imagery using vegetation indices; (3) analyzing correlation coefficients between VI; (4) prediction using RF and CART; (5) comparing the performance of RF and CART with that of ANN, SVM, k-NN, and MARS; (6) testing the accuracy of prediction using Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE); (7) interpolating with IDW. The correlation coefficients of the VI data are all positive, with the lowest r = 0.07 and the highest r = 0.98. The experiments show that the RF and CART algorithms determine arid areas more efficiently and effectively than the ANN, SVM, k-NN, and MARS algorithms. For RF, the difference between the predicted results and the survey data is a MAPE of 1.04%, and the smallest MSE, close to zero, is 0.05. For CART, the difference between the predicted results and the survey data is a MAPE of 1.05%, and the smallest MSE, close to zero, is 0.05. The prediction results of VI show that in 2020 most of the study areas were low vegetation areas with a Normalized Difference Vegetation Index (NDVI) < 0.21, had an indication of drought with a Vegetation Health Index (VHI) < 31.10, and had a Vegetation Condition Index (VCI) in some areas between 35%–50% (moderate drought) and < 35% (high drought). 
The Burn Area Index (dBAI) values are between −3,971 and −2,376, which shows that the areas have a low fire risk, and index values between −0,208 and −0,412 show that the areas are starting vegetation growth. The results of this study show that the machine learning algorithms are accurate and stable in predicting the risks of drought and land fire based on the VI data extracted from Landsat 8 OLI imagery. The VI data contain a record of vegetation condition and its environment, including humidity, temperature, and environmental vegetation health.
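
The accuracy tests (MSE, MAPE) and the IDW interpolation step named in this abstract can be illustrated with a minimal sketch. The function names, the sample layout `(x, y, value)` and the default distance power of 2 are assumptions made for illustration; the abstract does not state the settings the authors used.

```python
import math


def mse(actual, predicted):
    """Mean square error between survey values and predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no survey value is zero, otherwise the ratio is undefined.
    """
    ratios = (abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * sum(ratios) / len(actual)


def idw(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (x, y, value) tuples; power=2.0 is an assumed default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The query point coincides with a sample: return it unchanged.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total
```

A point halfway between two samples receives the mean of their values, because both samples get the same weight.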

A COMPARISON OF MACHINE-LEARNING REGRESSION ALGORITHMS FOR THE ESTIMATION OF LAI USING LANDSAT - 8 SATELLITE DATA

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-w16-679-2019 ◽

2019 ◽

Vol XLII-4/W16 ◽

pp. 679-683

Author(s):

V. P. Yadav ◽

R. Prasad ◽

R. Bala ◽

A. K. Vishwakarma ◽

S. A. Yadav ◽

...

Keyword(s):

Machine Learning ◽

Satellite Data ◽

Vegetation Index ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Accurate Estimation ◽

Support Vector ◽

Landsat 8 ◽

Area Index ◽

Global Circulation Models

Abstract. The leaf area index (LAI) is one of the key variables of crops and plays an important role in agriculture, ecology and climate change, and is used by global circulation models to compute energy and water fluxes. In recent research, machine-learning algorithms have provided accurate computational approaches for estimating crop biophysical parameters from remotely sensed data. Three machine-learning algorithms, random forest regression (RFR), support vector regression (SVR) and artificial neural network regression (ANNR), were used to estimate the LAI of crops in the present study. Landsat-8 satellite images from three different dates between January 2017 and March 2017 were used, covering different crop growth conditions in Varanasi district, India. The sampling regions were fully covered by major Rabi season crops such as wheat, barley and mustard. Of the total pooled data, 60% of the samples were used to train the algorithms and the remaining 40% were used for testing and validation of the machine-learning regression algorithms. The highest sensitivity of the normalized difference vegetation index (NDVI) to LAI was found using the RFR algorithm (R² = 0.884, RMSE = 0.404), compared with SVR (R² = 0.847, RMSE = 0.478) and ANNR (R² = 0.829, RMSE = 0.404). Therefore, the RFR algorithm can be used for accurate estimation of crop LAI using satellite data.
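
The 60/40 split and the two reported scores can be sketched as follows. This is an illustration only: the random shuffle, the seed, and the reading of R² as the coefficient of determination are assumptions, since the abstract does not say how samples were assigned or how R² was computed.

```python
import math
import random


def split_60_40(samples, seed=0):
    """Shuffle the pooled samples and return (training 60%, testing 40%)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.6 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def rmse(observed, estimated):
    """Root mean square error between observed and estimated LAI."""
    squared = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    return math.sqrt(squared / len(observed))


def r_squared(observed, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Both scores would be computed on the 40% held-out portion, so that they describe samples the regressors did not see during training.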

Exploring the Potential of WorldView-2 Red-Edge Band-Based Vegetation Indices for Estimation of Mangrove Leaf Area Index with Machine Learning Algorithms

Remote Sensing ◽

10.3390/rs9101060 ◽

2017 ◽

Vol 9 (10) ◽

pp. 1060 ◽

Cited By ~ 31

Author(s):

Yuanhui Zhu ◽

Kai Liu ◽

Lin Liu ◽

Soe Myint ◽

Shugong Wang ◽

...

Keyword(s):

Machine Learning ◽

Leaf Area Index ◽

Leaf Area ◽

Learning Algorithms ◽

Vegetation Indices ◽

Machine Learning Algorithms ◽

Area Index ◽

Red Edge ◽

Edge Band

Object-Oriented LULC Classification in Google Earth Engine Combining SNIC, GLCM, and Machine Learning Algorithms

Remote Sensing ◽

10.3390/rs12223776 ◽

2020 ◽

Vol 12 (22) ◽

pp. 3776

Author(s):

Andrea Tassi ◽

Marco Vizzari

Keyword(s):

Machine Learning ◽

Central Italy ◽

Object Oriented ◽

Google Earth ◽

Machine Learning Algorithms ◽

Support Vector ◽

Landsat 8 ◽

Good Reliability ◽

Google Earth Engine ◽

Occurrence Matrix

Google Earth Engine (GEE) is a versatile cloud platform in which pixel-based (PB) and object-oriented (OO) Land Use–Land Cover (LULC) classification approaches can be implemented, thanks to the availability of many state-of-the-art functions comprising various Machine Learning (ML) algorithms. OO approaches, including both object segmentation and object textural analysis, are still not common in the GEE environment, probably due to the difficulty of concatenating the proper functions and of tuning the various parameters to overcome the GEE computational limits. In this context, this work aims to develop and test an OO classification approach combining the Simple Non-Iterative Clustering (SNIC) algorithm to identify spatial clusters, the Gray-Level Co-occurrence Matrix (GLCM) to calculate cluster textural indices, and two ML algorithms (Random Forest (RF) or Support Vector Machine (SVM)) to perform the final classification. A Principal Components Analysis (PCA) is applied to the main seven GLCM indices to synthesize in one band the textural information used for the OO classification. The proposed approach is implemented in a user-friendly, freely available GEE code that performs the OO classification, allows various parameters to be tuned (e.g., choosing the input bands, selecting the classification algorithm, testing various segmentation scales) and compares the result with a PB approach. The accuracy of OO and PB classifications can be assessed both visually and through two confusion matrices that can be used to calculate the relevant statistics (producer’s, user’s, overall accuracy (OA)). The proposed methodology was broadly tested in a 154 km² study area, located in the Lake Trasimeno area (central Italy), using Landsat 8 (L8), Sentinel 2 (S2), and PlanetScope (PS) data. The area was selected considering its complex LULC mosaic mainly composed of artificial surfaces, annual and permanent crops, small lakes, and wooded areas. 
In the study area, the various tests produced interesting results on the different datasets (OA: PB RF (L8 = 72.7%, S2 = 82%, PS = 74.2%), PB SVM (L8 = 79.1%, S2 = 80.2%, PS = 74.8%), OO RF (L8 = 64%, S2 = 89.3%, PS = 77.9%), OO SVM (L8 = 70.4%, S2 = 86.9%, PS = 73.9%)). The broad code application demonstrated very good reliability of the whole process, even though the OO classification process was sometimes too demanding on higher resolution data, considering the available computational GEE resources.
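
The statistics derived from a confusion matrix (producer’s, user’s and overall accuracy) can be sketched in plain Python. This is not the GEE code the abstract refers to; the function name and the convention that rows hold reference classes and columns hold predicted classes are assumptions for illustration.

```python
def accuracy_report(matrix):
    """Compute overall, producer's and user's accuracy.

    matrix[i][j] counts samples of reference class i predicted as class j
    (rows = reference, columns = predicted; an assumed convention).
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(size))
    row_sums = [sum(matrix[i]) for i in range(size)]
    col_sums = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    # Producer's accuracy: share of each reference class that was recovered.
    producers = [matrix[i][i] / row_sums[i] if row_sums[i] else 0.0
                 for i in range(size)]
    # User's accuracy: share of each predicted class that is actually correct.
    users = [matrix[j][j] / col_sums[j] if col_sums[j] else 0.0
             for j in range(size)]
    return {"overall": correct / total, "producers": producers, "users": users}
```

Calling the function once for the OO result and once for the PB result gives the two sets of statistics to compare.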

Latest Advances in Fractional Snow Cover Mapping on MODIS Data by Machine Learning Algorithms

10.5194/egusphere-egu2020-13193 ◽

2020 ◽

Author(s):

Semih Kuter ◽

Zuhal Akyurek

Keyword(s):

Machine Learning ◽

Snow Cover ◽

General Circulation ◽

Snow Water Equivalent ◽

Machine Learning Algorithms ◽

Multivariate Adaptive Regression Splines ◽

Support Vector ◽

Landsat 8 ◽

European Alps ◽

Fractional Snow Cover

The spatial extent of snow has been declared an essential climate variable. Accurate modeling of snow cover is crucial for the better prediction of snow water equivalent and, consequently, for the success of general circulation and weather forecasting models as well as climate change and hydrological studies. This presentation mainly focuses on the latest findings of our efforts in fractional snow cover (FSC) mapping on MODIS images by data-driven machine learning methodologies. For this purpose, a dataset composed of 20 MODIS - Landsat 8 image pairs acquired between Apr 2013 and Dec 2016 over the European Alps was employed. Artificial neural network (ANN), multivariate adaptive regression splines (MARS), support vector regression (SVR) and random forest (RF) models were trained and tested using reference FSC maps generated from higher spatial resolution Landsat 8 binary snow maps. ANN, MARS, SVR and RF models exhibited quite good performance with average R ≈ 0.93, whereas the agreement between the reference FSC maps and MODIS’ own product MOD10A1 (C5) was slightly poorer with R ≈ 0.88.
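
One way to picture how a reference FSC map can be derived from a finer binary snow map, and how agreement R can be scored, is the sketch below. It assumes the fine grid divides evenly into square blocks of `block` × `block` pixels and that R is the Pearson correlation; real MODIS and Landsat grids need reprojection and resampling, which is not shown, and the abstract does not describe the authors' exact procedure.

```python
import math


def fractional_snow_cover(binary_snow, block):
    """Aggregate a binary snow map (1 = snow, 0 = no snow) into snow fractions.

    Each output cell is the share of snow pixels in one block x block window.
    Incomplete windows at the right and bottom edges are dropped.
    """
    rows, cols = len(binary_snow), len(binary_snow[0])
    fractions = []
    for r in range(0, rows - rows % block, block):
        out_row = []
        for c in range(0, cols - cols % block, block):
            snow = sum(binary_snow[r + i][c + j]
                       for i in range(block) for j in range(block))
            out_row.append(snow / (block * block))
        fractions.append(out_row)
    return fractions


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)
```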

Recognition of Maize Phenology in Sentinel Images with Machine Learning

Sensors ◽

10.3390/s22010094 ◽

2021 ◽

Vol 22 (1) ◽

pp. 94

Author(s):

Alvaro Murguia-Cozar ◽

Antonia Macedo-Cruz ◽

Demetrio Salvador Fernandez-Reynoso ◽

Jorge Arturo Salgado Transito

Keyword(s):

Machine Learning ◽

Satellite Image ◽

Spatial Association ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Models ◽

Phenological Stages ◽

Area Index ◽

Stage Of Development ◽

Sentinel 2

The scarcity of water for agricultural use is a serious problem that has increased due to intense droughts, poor management, and deficiencies in the distribution and application of the resource. The monitoring of crops through satellite image processing and the application of machine learning algorithms are technological strategies with which developed countries tend to implement better public policies regarding the efficient use of water. The purpose of this research was to determine the main indicators and characteristics that allow us to discriminate the phenological stages of maize crops (Zea mays L.) in Sentinel 2 satellite images through supervised classification models. The training data were obtained by monitoring cultivated plots during an agricultural cycle. Indicators and characteristics were extracted from 41 Sentinel 2 images acquired during the monitoring dates. With these images, indicators of texture, vegetation, and colour were calculated to train three supervised classifiers: linear discriminant (LD), support vector machine (SVM), and k-nearest neighbours (kNN) models. It was found that 45 of the 86 characteristics extracted contributed to maximizing the accuracy by stage of development and the overall accuracy of the trained classification models. The characteristics of the Moran’s I local indicator of spatial association (LISA) improved the accuracy of the classifiers when applied to the L*a*b* colour model and to the near-infrared (NIR) band. The local binary pattern (LBP) increased the accuracy of the classification when applied to the red, green, blue (RGB) and NIR bands. The colour ratios, leaf area index (LAI), RGB colour model, L*a*b* colour space, LISA, and LBP extracted the most important intrinsic characteristics of maize crops with regard to classifying the phenological stages of the maize cultivation. The quadratic SVM model was the best classifier of maize crop phenology, with an overall accuracy of 82.3%.
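
The local binary pattern (LBP) texture indicator mentioned above can be illustrated with the basic 3 × 3 variant. The neighbour ordering (clockwise from the top-left pixel) and the `>=` comparison are one common convention chosen here as an assumption; the abstract does not say which LBP variant or radius the authors applied.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel (assumed order).
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    A bit is set when the corresponding neighbour is >= the centre pixel.
    """
    centre = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOUR_OFFSETS):
        if image[row + d_row][col + d_col] >= centre:
            code |= 1 << bit
    return code


def lbp_image(image):
    """Compute LBP codes for all interior pixels of a single band."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code(image, r, c) for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]
```

Applying `lbp_image` separately to the red, green, blue and NIR bands yields one texture layer per band.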

FEASIBILITY OF MACHINE LEARNING METHODS FOR SEPARATING WOOD AND LEAF POINTS FROM TERRESTRIAL LASER SCANNING DATA

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-iv-2-w4-157-2017 ◽

2017 ◽

Vol IV-2/W4 ◽

pp. 157-164 ◽

Cited By ~ 6

Author(s):

D. Wang ◽

M. Hollaus ◽

N. Pfeifer

Keyword(s):

Machine Learning ◽

Laser Scanning ◽

Learning Algorithms ◽

Terrestrial Laser Scanning ◽

Gaussian Mixture ◽

Machine Learning Algorithms ◽

Support Vector ◽

Area Index ◽

Training Samples

Classification of wood and leaf components of trees is an essential prerequisite for deriving vital tree attributes, such as wood mass, leaf area index (LAI) and woody-to-total area. Laser scanning has emerged as a promising solution for this task. Intensity based approaches are widely proposed, as different components of a tree can feature discriminatory optical properties at the operating wavelengths of a sensor system. For geometry based methods, machine learning algorithms are often used to separate wood and leaf points, given proper training samples. However, it remains unclear how the chosen machine learning classifier and the features used influence classification results. To this end, we compare four popular machine learning classifiers, namely Support Vector Machine (SVM), Naïve Bayes (NB), Random Forest (RF), and Gaussian Mixture Model (GMM), for separating wood and leaf points from terrestrial laser scanning (TLS) data. Two trees, an Erythrophleum fordii and a Betula pendula (silver birch), are used to test the impact of classifier, feature set, and training samples. Our results showed that RF is the best model in terms of accuracy, and that local density related features are important. Experimental results confirmed the feasibility of machine learning algorithms for the reliable classification of wood and leaf points. It is also noted that our studies are based on isolated trees. Further tests should be performed on more tree species and data from more complex environments.
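
As a rough picture of what a local density related feature can look like, the sketch below counts, for each point, the neighbours within a fixed radius. The radius-count definition and the brute-force search are assumptions for illustration; the abstract does not define the features used, and real TLS clouds would need a spatial index such as a k-d tree.

```python
import math


def local_density(points, radius):
    """Count, for every 3D point, the other points within `radius` of it.

    points is a list of (x, y, z) tuples. Brute force, O(n^2).
    """
    counts = []
    for i, point in enumerate(points):
        neighbours = 0
        for j, other in enumerate(points):
            if i != j and math.dist(point, other) <= radius:
                neighbours += 1
        counts.append(neighbours)
    return counts
```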

Comparative Analysis of Machine Learning Algorithms in Automatic Identification and Extraction of Water Boundaries

Applied Sciences ◽

10.3390/app112110062 ◽

2021 ◽

Vol 11 (21) ◽

pp. 10062

Author(s):

Aimin Li ◽

Meng Fan ◽

Guangduo Qin ◽

Youcheng Xu ◽

Hailong Wang

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Water Bodies ◽

Support Vector ◽

Landsat 8 ◽

Transfer Performance ◽

Remote Sensing Images

Monitoring open water bodies accurately is important for assessing the role of ecosystem services in the context of human survival and climate change. There are many methods available for water body extraction based on remote sensing images, such as the normalized difference water index (NDWI), modified NDWI (MNDWI), and machine learning algorithms. Based on Landsat-8 remote sensing images, this study focuses on the effects of six machine learning algorithms and three threshold methods used to extract water bodies, evaluates the transfer performance of models applied to remote sensing images in different periods, and compares the differences among these models. The results are as follows. (1) Different algorithms require different numbers of samples to reach their optimal performance. The logistic regression algorithm requires a minimum of 110 samples. As the number of samples increases, the order of the optimal model is support vector machine, neural network, random forest, decision tree, and XGBoost. (2) The accuracy of each machine learning algorithm on the test set cannot represent its performance in a local area. (3) When these models are directly applied to remote sensing images in different periods, the area under curve (AUC) indicators of each machine learning algorithm for three regions all show a significant decline, with a decrease range of 0.33–66.52%, and the differences among the algorithms' performances in the three areas are obvious. Generally, the decision tree algorithm has good transfer performance among the machine learning algorithms, with AUC values of 0.790, 0.518, and 0.697 in the three areas, respectively, and an average value of 0.668. The Otsu threshold algorithm is the best of the threshold methods, with AUC values of 0.970, 0.617, and 0.908 in the three regions, respectively, and an average AUC of 0.832.
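
The NDWI index and the Otsu threshold method named above can be sketched together. NDWI is taken here in its common green/NIR form, (Green − NIR) / (Green + NIR); the histogram with 64 bins and the function names are assumptions for illustration, not the study's configuration.

```python
def ndwi(green, nir):
    """Normalized difference water index for one pixel."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def otsu_threshold(values, bins=64):
    """Return the threshold that maximises between-class variance (Otsu)."""
    low, high = min(values), max(values)
    if low == high:
        return low
    width = (high - low) / bins
    histogram = [0] * bins
    for value in values:
        index = min(int((value - low) / width), bins - 1)
        histogram[index] += 1
    centres = [low + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centres, histogram))
    best_variance, best_threshold = -1.0, low
    weight_low, sum_low = 0, 0.0
    for i in range(bins - 1):
        weight_low += histogram[i]
        sum_low += centres[i] * histogram[i]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = low + (i + 1) * width  # upper edge of bin i
    return best_threshold
```

Pixels whose NDWI exceeds the returned threshold would be labelled water, for example `[v > otsu_threshold(index_values) for v in index_values]`.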

Evaluation of Machine Learning Algorithms for Surface Water Extraction in a Landsat 8 Scene of Nepal

Sensors ◽

10.3390/s19122769 ◽

2019 ◽

Vol 19 (12) ◽

pp. 2769 ◽

Cited By ~ 8

Author(s):

Tri Dev Acharya ◽

Anoj Subedi ◽

Dong Ha Lee

Keyword(s):

Machine Learning ◽

Surface Water ◽

Recursive Partitioning ◽

Learning Algorithms ◽

High Elevation ◽

Machine Learning Algorithms ◽

Water Extraction ◽

Support Vector ◽

Landsat 8 ◽

Water Index

With over 6000 rivers and 5358 lakes, surface water is one of the most important resources in Nepal. However, the quantity and quality of Nepal’s rivers and lakes are decreasing due to human activities and climate change. Despite the advancement of remote sensing technology and the availability of open access data and tools, monitoring and surface water extraction work has not been carried out in Nepal. Single or multiple water index methods have been applied in the extraction of surface water with satisfactory results. Extending our previous study, the authors evaluated six different machine learning algorithms: Naive Bayes (NB), recursive partitioning and regression trees (RPART), neural networks (NNET), support vector machines (SVM), random forest (RF), and gradient boosted machines (GBM) to extract surface water in Nepal. With three secondary bands, slope, NDVI and NDWI, the algorithms were evaluated for performance with the addition of extra information. As a result, all the applied machine learning algorithms, except NB and RPART, showed good performance. RF showed overall accuracy (OA) and kappa coefficient (Kappa) of 1 for all the multiband data with the reference dataset, followed by GBM, NNET, and SVM. Performance was better in the hilly regions and flat lands but poorer in the Himalayas with ice, snow and shadows, and the addition of slope and NDWI improved the results. Adding a single secondary band is better than adding multiple bands in most algorithms except NNET. From current and previous studies, it is recommended to separate any study area into parts with and without snow, or of low and high elevation, and then apply machine learning algorithms to the original Landsat data or with the addition of slope or NDWI for better performance.
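
The kappa coefficient reported alongside overall accuracy can be computed from a confusion matrix as in this minimal sketch, which uses the standard Cohen's kappa formula. The function name and the rows = reference, columns = predicted layout are assumptions for illustration.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix.

    matrix[i][j] counts samples of reference class i predicted as class j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / total ** 2
    if expected == 1.0:
        return 1.0  # Degenerate case: every sample falls in a single class.
    return (observed - expected) / (1.0 - expected)
```

A classifier that matches the reference dataset on every sample has an overall accuracy of 1 and a kappa of 1, which is the result the abstract reports for RF.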

Fusion of Multispectral Aerial Imagery and Vegetation Indices for Machine Learning-Based Ground Classification

Remote Sensing ◽

10.3390/rs13081411 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1411

Author(s):

Yanchao Zhang ◽

Wen Yang ◽

Ying Sun ◽

Christine Chang ◽

Jiya Yu ◽

...

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Vegetation Indices ◽

Machine Learning Algorithms ◽

Support Vector ◽

Multispectral Images ◽

Learning Methods ◽

Machine Learning Methods ◽

Spectral Bands ◽

Vegetation Indexes

Unmanned Aerial Vehicles (UAVs) are emerging and promising platforms for carrying different types of cameras for remote sensing. The application of multispectral vegetation indices for ground cover classification has been widely adopted and has proved its reliability. However, the fusion of spectral bands and vegetation indices for machine learning-based land surface investigation has hardly been studied. In this paper, we studied the fusion of spectral band information from UAV multispectral images and derived vegetation indices for almond plantation classification using several machine learning methods. We acquired multispectral images over an almond plantation using a UAV. First, a multispectral orthoimage was generated from the acquired multispectral images using SfM (Structure from Motion) photogrammetry methods. Eleven types of vegetation indexes were proposed based on the multispectral orthoimage. Then, 593 data points that contained multispectral bands and vegetation indexes were randomly collected and prepared for this study. After comparing six machine learning algorithms (Support Vector Machine, K-Nearest Neighbor, Linear Discrimination Analysis, Decision Tree, Random Forest, and Gradient Boosting), we selected three (SVM, KNN, and LDA) to study the fusion of multispectral band information and derived vegetation indexes for classification. As the number of vegetation indexes increased, the classification accuracy of all three selected machine learning methods gradually increased, then dropped. 
Our results revealed that: (1) spectral information from multispectral images can be used for machine learning-based ground classification, and among all methods, SVM had the best performance; (2) combining multispectral bands and vegetation indexes can improve the classification accuracy compared to using only spectral bands for all three selected methods; (3) among all VIs, NDEGE, NDVIG, and NDVGE had consistent performance in improving classification accuracies, and others may reduce the accuracy. Machine learning methods (SVM, KNN, and LDA) can be used for classifying almond plantations using multispectral orthoimages, and fusion of multispectral bands with vegetation indexes can improve machine learning-based classification accuracy if the vegetation indexes are properly selected.
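
Fusing spectral bands with derived indices amounts to appending index values to each data point's band values before classification. The sketch below shows this for generic normalized-difference indices. The band names and the helper `fuse_features` are hypothetical; the definitions of NDEGE, NDVIG and NDVGE are not given in the abstract, so only the well-known NDVI pairing (NIR, red) is used as the example.

```python
def normalized_difference(first, second):
    """Generic normalized-difference index: (first - second) / (first + second)."""
    total = first + second
    return 0.0 if total == 0 else (first - second) / total


def fuse_features(bands, index_pairs):
    """Build one feature vector from raw bands plus derived indices.

    bands maps a band name to its reflectance at one data point.
    index_pairs lists (first_band, second_band) names, one per index.
    Raw bands come first, in alphabetical order of their names.
    """
    features = [bands[name] for name in sorted(bands)]
    for first, second in index_pairs:
        features.append(normalized_difference(bands[first], bands[second]))
    return features
```

For example, `fuse_features({"nir": 0.75, "red": 0.25}, [("nir", "red")])` yields the two band values followed by an NDVI of 0.5. Varying the length of `index_pairs` is one way to reproduce the kind of experiment described, where accuracy is tracked as indices are added.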

Estimating District-Level Electricity Consumption Using Remotely Sensed Data in Eastern Economic Corridor, Thailand

Remote Sensing ◽

10.3390/rs13224654 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4654

Author(s):

Sirikul Hutasavi ◽

Dongmei Chen

Keyword(s):

Machine Learning ◽

Industrial Development ◽

Statistical Data ◽

Local Level ◽

Electricity Consumption ◽

District Level ◽

Machine Learning Algorithms ◽

Support Vector ◽

Landsat 8 ◽

Remotely Sensed Data

The intensive industrial development in special economic zones, such as Thailand’s Eastern Economic Corridor, increases energy consumption, leading to an imbalance of energy supply and a challenge for energy management. Electricity consumption at a local level is crucial for utility planners to manage and invest in the electrical grid. In this study, we propose an electricity consumption estimation model at the district level using machine learning with publicly available statistical data and built-up area (BU), lit area (AL), and sum of light intensity (SL) data extracted from Landsat 8 and Suomi NPP satellite nighttime light images. The models created from three machine learning algorithms, Multiple Linear Regression (MR), Decision Tree (DT), and Support Vector Regression (SVR), were compared. The results show that (1) electricity consumption is highly correlated with SL, AL, and BU; and (2) the DT model performed better in predicting local electricity consumption than MR and SVR, with the lowest error rate and highest R². Local governments in developing countries with limited data and financial resources can adopt the proposed approach to benefit from commonly available remote sensing and statistical data with simple machine learning models such as DT (regression method) for sustainable electricity management.
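
The two nighttime-light predictors, lit area (AL) and sum of light intensity (SL), can be pictured with a short sketch over one district's radiance grid. The threshold that decides whether a pixel counts as lit, the pixel area, and the choice to sum only lit pixels are all assumptions; the abstract does not give the authors' definitions.

```python
def light_features(radiance, pixel_area_km2, lit_threshold=0.0):
    """Derive SL and AL from a 2D grid of nighttime radiance values.

    A pixel counts as lit when its radiance exceeds lit_threshold (assumed rule).
    SL sums the radiance of lit pixels; AL is their combined area in km^2.
    """
    lit_values = [value for row in radiance for value in row
                  if value > lit_threshold]
    return {"SL": sum(lit_values), "AL": len(lit_values) * pixel_area_km2}
```

Computed per district, these values would sit next to BU and the statistical data as inputs to the regression models being compared.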
