Statistical Machine Learning Methods and Remote Sensing for Sustainable Development Goals: A Review

Interest in the statistical analysis of remote sensing data to produce measurements of the environment, agriculture, and sustainable development is well established and continues to grow, leading to increasing interaction between the earth science and statistical domains. With this in mind, we review the literature on statistical machine learning methods commonly applied to remote sensing data. We focus particularly on applications related to the United Nations World Bank Sustainable Development Goals, including agriculture (food security), forests (life on land), and water (water quality). We review useful statistical machine learning methods, explain how they work in a remote sensing context, and give examples of their application to these types of data in the literature. Rather than prescribing particular methods for specific applications, we provide guidance, examples, and case studies from the literature for the remote sensing practitioner and applied statistician. In the supplementary material, we also describe the steps needed before and after the analysis of remote sensing data: pre-processing and evaluation.

GIS-Based Landslide Susceptibility Mapping Using Remote Sensing Data and Machine Learning Methods

Cartography from Pole to Pole - Lecture Notes in Geoinformation and Cartography ◽ 10.1007/978-3-642-32618-9_23 ◽ 2013 ◽ pp. 319-333

Author(s): Fu Ren ◽ Xueling Wu

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Landslide Susceptibility ◽ Remote Sensing Data ◽ Susceptibility Mapping ◽ Landslide Susceptibility Mapping ◽ Learning Methods ◽ Sensing Data ◽ Machine Learning Methods

Forecasting Rainfed Agricultural Production in Arid and Semi-Arid Lands Using Learning Machine Methods: A Case Study

Sustainability ◽ 10.3390/su13094607 ◽ 2021 ◽ Vol 13 (9) ◽ pp. 4607

Author(s): Shahram Rezapour ◽ Erfan Jooyandeh ◽ Mohsen Ramezanzade ◽ Ali Mostafaeipour ◽ Mehdi Jahangiri ◽ ...

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Remote Sensing Data ◽ Weather Data ◽ Support Vector ◽ Learning Methods ◽ K Nearest Neighbors ◽ Sensing Data ◽ Machine Learning Methods ◽ Rainfed Farming

With the rising demand for food products and the direct impact of climate change on food production in many parts of the world, recent years have seen growing interest in food security and the role of rainfed farming in it. Machine learning methods can be used to predict crop yield from a combination of remote sensing data and data collected by ground weather stations. This paper argues that machine learning methods and remote sensing data can forecast dryland farming yield reliably enough for management purposes under uncertain conditions, and it determines which indicators are most important in predicting the yield of chickpea. In this study, the yield of rainfed chickpea farms in the 11 top chickpea-producing counties of Kermanshah province, Iran, was predicted using three machine learning methods: support vector regression (SVR), random forest (RF), and K-nearest neighbors (KNN). To improve prediction accuracy, for each county, remote sensing data were overlaid with satellite images of rainfed farms with a suitable slope and altitude for rainfed farming. An integrated database was created by combining weather data, remote sensing data, and chickpea yield statistics. The methods were evaluated using leave-one-out cross-validation (LOOCV) and compared on multiple measures. Given the sensitivity of rainfed chickpea yield to the timing of the data, the predictions were made in two scenarios: (1) using the averages of the data of all growing months, and (2) using the data of a combination of months. The results showed that RF provides more accurate yield predictions than the other methods. The predictions of this method differed by 7–8% from the statistics reported by the Statistical Center and the Ministry of Agriculture of Iran. It was found that for pre-harvest prediction of rainfed chickpea yield, using the data of the March–April period (the averages of two months) offers the best result in terms of the correlation coefficient for the relationship between the yield and the predictor indices.
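As a rough illustration of the evaluation scheme named in the abstract above, the following sketch runs leave-one-out cross-validation for a small K-nearest-neighbors regressor using only the Python standard library. The function names, the two predictor columns, and all numbers are hypothetical and chosen for illustration; they are not the study's data, code, or settings.

```python
import math

def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k training rows closest to x (Euclidean)."""
    dists = sorted(
        (math.dist(row, x), target) for row, target in zip(train_X, train_y)
    )
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)

def loocv_rmse(X, y, k=3):
    """Hold out each sample once, predict it from the rest, return the RMSE."""
    squared_errors = []
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        pred = knn_predict(train_X, train_y, X[i], k)
        squared_errors.append((pred - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical toy data: one row per county, (vegetation index, rainfall in mm).
# Features are not rescaled here, so rainfall dominates the distance;
# a real analysis would normally standardize the predictors first.
X = [(0.20, 30.0), (0.25, 35.0), (0.30, 42.0),
     (0.35, 50.0), (0.40, 55.0), (0.45, 61.0)]
y = [0.40, 0.45, 0.52, 0.60, 0.66, 0.71]  # made-up yields, t/ha

# Compare a few neighborhood sizes by their leave-one-out error.
for k in (1, 2, 3):
    print(k, round(loocv_rmse(X, y, k), 3))
```

With only a handful of samples, as with 11 counties, LOOCV uses every observation for testing exactly once, which is presumably why it suits such small data sets.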

Improving Unmanned Aerial Vehicle Remote Sensing-Based Rice Nitrogen Nutrition Index Prediction with Machine Learning

Remote Sensing ◽ 10.3390/rs12020215 ◽ 2020 ◽ Vol 12 (2) ◽ pp. 215 ◽ Cited By ~ 9

Author(s): Hainie Zha ◽ Yuxin Miao ◽ Tiantian Wang ◽ Yue Li ◽ Jing Zhang ◽ ...

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Unmanned Aerial Vehicle ◽ Spectral Reflectance ◽ Remote Sensing Data ◽ Learning Methods ◽ Machine Learning Methods ◽ Aerial Vehicle ◽ N Management ◽ N Status

Optimizing nitrogen (N) management in rice is crucial for China’s food security and sustainable agricultural development. Nondestructive crop growth monitoring based on remote sensing technologies can accurately assess crop N status, which may be used to guide in-season site-specific N recommendations. Fixed-wing unmanned aerial vehicle (UAV)-based remote sensing is a low-cost, easy-to-operate technology for collecting spectral reflectance imagery, an important data source for precision N management. The relationships between many vegetation indices (VIs) derived from spectral reflectance data and crop parameters are known to be nonlinear. As a result, nonlinear machine learning methods have the potential to improve the estimation accuracy. The objective of this study was to evaluate five approaches for estimating rice (Oryza sativa L.) aboveground biomass (AGB), plant N uptake (PNU), and N nutrition index (NNI) at the stem elongation (SE) and heading (HD) stages in Northeast China: (1) single VI (SVI); (2) stepwise multiple linear regression (SMLR); (3) random forest (RF); (4) support vector machine (SVM); and (5) artificial neural networks (ANN) regression. The results indicated that machine learning methods improved the NNI estimation compared to VI-SLR and SMLR methods. The RF algorithm performed the best for estimating NNI (R² = 0.94 (SE) and 0.96 (HD) for calibration and 0.61 (SE) and 0.79 (HD) for validation). The root mean square errors (RMSEs) were 0.09, and the relative errors were <10% in all the models. It is concluded that RF machine learning regression can significantly improve the estimation of rice N status using UAV remote sensing. The application of machine learning methods offers a new opportunity to make better use of remote sensing data for monitoring crop growth conditions and guiding precision crop management. More studies are needed to further improve these machine learning-based models by combining remote sensing data with related soil, weather, and management information for applications in precision N and crop management.
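The abstract above reports R² and RMSE separately for calibration and validation. The sketch below shows one common way to compute both metrics, with R² taken as 1 minus the ratio of residual to total sum of squares. Whether the study used this definition or the squared correlation coefficient is not stated, so this is an assumption, and all values in the example are invented.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical NNI values: a model usually fits its calibration data
# more closely than independent validation data.
calibration_obs = [0.80, 0.95, 1.05, 1.20, 1.10]
calibration_pred = [0.82, 0.93, 1.06, 1.17, 1.11]
validation_obs = [0.85, 1.00, 1.15, 0.90]
validation_pred = [0.92, 0.96, 1.05, 0.99]

for name, obs, pred in [
    ("calibration", calibration_obs, calibration_pred),
    ("validation", validation_obs, validation_pred),
]:
    print(name, round(r_squared(obs, pred), 2), round(rmse(obs, pred), 3))
```

A large gap between calibration and validation R², such as the one reported for the SE stage, is often read as a sign of overfitting, which is why both figures are worth reporting side by side.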

Spatial Prediction of Agrochemical Properties on the Scale of a Single Field Using Machine Learning Methods Based on Remote Sensing Data

Agronomy ◽ 10.3390/agronomy11112266 ◽ 2021 ◽ Vol 11 (11) ◽ pp. 2266

Author(s): Ilnas Sahabiev ◽ Elena Smirnova ◽ Kamil Giniyatullin

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Remote Sensing Data ◽ Spatial Prediction ◽ Landsat 8 ◽ Landsat 8 Oli ◽ Learning Methods ◽ Machine Learning Methods ◽ Agrochemical Properties ◽ Sentinel 2

Creating accurate digital maps of the agrochemical properties of soils on a field scale with a limited data set is a problem that slows down the introduction of precision farming. Machine learning methods that use direct and indirect predictors of spatial changes in the agrochemical properties of soils are promising. Spectral indicators of open soil based on remote sensing data, as well as soil properties, were used to create digital maps of available forms of nitrogen, phosphorus, and potassium. It was shown that machine learning methods based on support vectors (SVMr) and random forest (RF) using spectral reflectance data are similarly accurate at spatial prediction. An acceptable prediction was obtained for available nitrogen and available potassium; the variability of available phosphorus was modeled less accurately. The coefficient of determination (R²) of the best model (SVMr) was 0.90 (Landsat 8 OLI) and 0.79 (Sentinel 2) for nitrogen, 0.82 (Landsat 8 OLI) and 0.77 (Sentinel 2) for potassium, and 0.68 (Landsat 8 OLI) and 0.64 (Sentinel 2) for phosphorus. The models based on remote sensing data were refined when soil organic matter (SOC) and fractions of texture (Silt, Clay) were included as predictors. The SVMr models were the most accurate. For Landsat 8 OLI, the SVMr model gave R² = 0.95 for nitrogen, R² = 0.89 for potassium, and R² = 0.65 for phosphorus. For Sentinel 2, the values were R² = 0.92 for nitrogen, R² = 0.88 for potassium, and R² = 0.72 for phosphorus. The spatial prediction of nitrogen content is influenced by SOC, that of potassium by SOC and texture, and that of phosphorus by texture. The final models were validated on an independent sample of soils from a chernozem zone. Based on Landsat 8 OLI, R² = 0.88 for nitrogen, R² = 0.65 for potassium, and R² = 0.31 for phosphorus. Based on Sentinel 2, R² = 0.85 for nitrogen, R² = 0.62 for potassium, and R² = 0.71 for phosphorus. The inclusion of SOC and texture in remote sensing-based machine learning models makes it possible to improve the spatial prediction of nitrogen, phosphorus, and potassium availability of soils in chernozem zones and can potentially be widely used to create digital agrochemical maps on the scale of a single field.

Mapping Allochemical Limestone Formations in Hazara, Pakistan Using Google Cloud Architecture: Application of Machine-Learning Algorithms on Multispectral Data

ISPRS International Journal of Geo-Information ◽ 10.3390/ijgi10020058 ◽ 2021 ◽ Vol 10 (2) ◽ pp. 58

Author(s): Muhammad Fawad Akbar Khan ◽ Khan Muhammad ◽ Shahid Bashir ◽ Shahab Ud Din ◽ Muhammad Hanif

Keyword(s): Machine Learning ◽ Remote Sensing ◽ Learning Algorithms ◽ Remote Sensing Data ◽ Kappa Coefficient ◽ Machine Learning Algorithms ◽ Landsat 8 ◽ Sensing Data ◽ Fossiliferous Limestone

Low-resolution Geological Survey of Pakistan (GSP) maps surrounding the region of interest show oolitic limestone occurrences in the Samanasuk formation and fossiliferous limestone occurrences in the Lockhart and Margalla hill formations in the Hazara division, Pakistan. Machine-learning algorithms (MLAs) have rarely been applied to multispectral remote sensing data to differentiate between limestone formations formed in different depositional environments, such as oolitic or fossiliferous. Unlike previous studies, which mostly report lithological classification by MLAs of rock types with different chemical compositions, this paper aimed to investigate the potential of MLAs for mapping subclasses within the same lithology, i.e., limestone. Additionally, the selection of appropriate data labels, training algorithms, hyperparameters, and remote sensing data sources was investigated while applying these MLAs. In this paper, first, oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) limestone-bearing formations, along with the adjoining Hazara formation, were mapped using random forest (RF), support vector machine (SVM), classification and regression tree (CART), and naïve Bayes (NB) MLAs. The RF algorithm reported the best accuracy of 83.28% and a Kappa coefficient of 0.78. To further improve the targeted allochemical limestone formation map, annotation labels were generated by fusing maps obtained from principal component analysis (PCA), decorrelation stretching (DS), and X-means clustering applied to ASTER-L1T, Landsat-8, and Sentinel-2 datasets. These labels were used to train and validate SVM, CART, NB, and RF MLAs to obtain a binary classification map of limestone occurrences in the Hazara division, Pakistan, using the Google Earth Engine (GEE) platform. The classification of Landsat-8 data by CART reported 99.63% accuracy, with a Kappa coefficient of 0.99, and was in good agreement with the field validation. This binary limestone map was further classified into oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) formations by all four MLAs; in this case, RF surpassed all the other algorithms with an improved accuracy of 96.36%. This improvement can be attributed to better annotation, resulting in a binary limestone classification map, which formed a mask for improved classification of oolitic and fossiliferous limestone in the area.
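The abstract above reports overall accuracy together with a Kappa coefficient. As a minimal sketch of how these two figures relate, the code below computes both from a confusion matrix using the standard Cohen's kappa formula. The matrix is invented for illustration and does not reproduce the study's results.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    # Expected agreement if reference and predicted labels were independent.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical binary map check: rows are reference classes, columns are
# predicted classes, in the order (limestone, non-limestone).
confusion = [
    [45, 5],
    [10, 40],
]

print(overall_accuracy(confusion))
print(round(cohen_kappa(confusion), 3))
```

Kappa is lower than raw accuracy whenever some agreement is expected by chance, which is why the two numbers are usually reported together for classification maps.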
