Information Fusion and Machine Learning in Spatial Prediction for Local Agricultural Markets

Author(s):  
Washington R. Padilla ◽  
Jesús García ◽  
José M. Molina
Geoderma ◽  
2018 ◽  
Vol 316 ◽  
pp. 100-114 ◽  
Author(s):  
Carlos M. Guio Blanco ◽  
Victor M. Brito Gomez ◽  
Patricio Crespo ◽  
Mareike Ließ

2021 ◽  
Author(s):  
Gao Bingbo ◽  
Alfred Stein ◽  
Wang Jinfeng

<p>Soil heavy metal contamination has become a serious problem worldwide. Accurately predicting soil heavy metal concentrations at unsampled locations from a small sample remains a challenge: many natural and human factors produce a complex, heterogeneous pattern, and the relationships among the influencing factors are not homogeneous either. To overcome these heterogeneities and improve prediction accuracy, this paper proposes a two-point machine learning method that fully leverages the spatial relationship and the similarity relationship of high-dimensional ancillary variables. It first models the difference between paired points using a machine learning model, then predicts the concentration differences between the sampling points and the unsampled points, and finally uses the predicted differences to choose near neighbors and obtain the final concentration prediction. The method introduces an innovative way of searching for near neighbors for the local model, based on the difference in the response variable, to overcome the curse of dimensionality. Its performance is illustrated in two diverse case studies, which demonstrate that the proposed method can dramatically improve the prediction accuracy for soil heavy metals. Besides the spatial prediction of soil pollution, it can also be applied to the spatial prediction of other elements of the earth system. Furthermore, the machine learning model used in this paper can be replaced by any other supervised learning model to suit the specific situation.</p>
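The three steps described in the abstract above (model pairwise differences, predict differences to the unsampled point, pick near neighbors from those predicted differences) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the pair features are taken to be simple covariate differences, and an ordinary least-squares fit stands in for the machine learning model.

```python
import numpy as np

def fit_difference_model(X, y):
    """Fit a model of y_i - y_j from x_i - x_j over all sample pairs.

    Assumption: a linear least-squares fit stands in for the
    machine learning model mentioned in the abstract.
    """
    i, j = np.triu_indices(len(y), k=1)      # all unordered pairs (i < j)
    dX = X[i] - X[j]                         # covariate differences per pair
    dy = y[i] - y[j]                         # response differences per pair
    w, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    return w

def two_point_predict(X, y, w, x_new, k=3):
    """Predict y at x_new from the k samples with the smallest predicted difference."""
    d_hat = (X - x_new) @ w                  # predicted y_i - y_new for each sample
    nearest = np.argsort(np.abs(d_hat))[:k]  # neighbors chosen in response space
    # Each neighbor gives an estimate y_i - d_hat_i; average them.
    return float(np.mean(y[nearest] - d_hat[nearest]))

# Synthetic data (purely illustrative, not from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 10.0

w = fit_difference_model(X, y)
x_new = np.array([0.2, -0.1, 0.4])
pred = two_point_predict(X, y, w, x_new)
```

Because neighbors are ranked by the predicted response difference rather than by distance in the full covariate space, the neighbor search takes place in one dimension, which is how the abstract frames its way around the curse of dimensionality.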


2018 ◽  
Author(s):  
Ashton Shortridge ◽  
Clayton Queen ◽  
Alan Arbogast

This paper investigates the use of random forests and spatial random forests (RFsp) for the classification of coastal dune areas along 41 km of Lake Michigan’s shoreline using a lidar-derived DEM. Terrain variables are computed across a range of spatial neighborhood scales and at two different cell resolutions. Distance is explicitly incorporated into the RFsp models through the calculation of buffer distances around small numbers (6-13) of gridded points in the study area. While classification accuracy is generally high, RFsp produced much more accurate results. At the fine scale, topographic variables and their neighborhood ranges were not predictive of dune areas, perhaps because large (> 0.1 hectare) neighborhoods were not tested at that scale. At the coarse scale these variables were much more important. The use of small numbers of gridded (non-sample) points to improve spatial prediction warrants further investigation.
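As a rough illustration of how distance can enter a random forest as covariates, the sketch below computes the distance from each raster cell to a small grid of reference points. The function name, the 3 × 3 grid, and the coordinates are hypothetical; the abstract only states that 6-13 gridded points were used. The resulting columns would be appended to the terrain variables before training a classifier, which is omitted here.

```python
import numpy as np

def buffer_distance_features(cell_xy, reference_xy):
    """Return an (n_cells, n_refs) matrix of Euclidean distances.

    Each column is the distance of every cell to one gridded
    reference point and can be used as an extra covariate.
    """
    cell_xy = np.asarray(cell_xy, dtype=float)
    reference_xy = np.asarray(reference_xy, dtype=float)
    diffs = cell_xy[:, None, :] - reference_xy[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2))

# Hypothetical 3 x 3 grid of reference points over a 100 m x 100 m extent.
gx, gy = np.meshgrid([0.0, 50.0, 100.0], [0.0, 50.0, 100.0])
reference_xy = np.column_stack([gx.ravel(), gy.ravel()])  # 9 points

# Hypothetical cell centers to be classified.
cell_xy = np.array([[0.0, 0.0], [30.0, 40.0], [100.0, 100.0]])

dist = buffer_distance_features(cell_xy, reference_xy)
```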


2020 ◽  
Author(s):  
Hanna Meyer ◽  
Edzer Pebesma

<p>Spatial mapping is an important task in environmental science to reveal spatial patterns and changes of the environment. In this context, predictive modelling using flexible machine learning algorithms has become very popular. However, given the diversity of modelled (global) maps of environmental variables, the impression may increasingly arise that machine learning is a magic tool that can map everything. Recently, the reliability of such maps has been increasingly questioned, calling for a reliable quantification of uncertainties.</p><p>Though spatial (cross-)validation gives a general error estimate for the predictions, models are usually applied to make predictions for a much larger area, or might even be transferred to make predictions for an area they were not trained on. When predictions are made on heterogeneous landscapes, there will be areas with environmental properties that have not been observed in the training data and hence have not been learned by the algorithm. This is problematic because most machine learning algorithms are weak at extrapolation and can only make reliable predictions for environments with conditions the model has knowledge about. Hence, predictions for environmental conditions that differ significantly from the training data have to be considered uncertain.</p><p>To approach this problem, we suggest a measure of uncertainty that allows identifying locations where predictions should be regarded with care. The proposed uncertainty measure is based on distances to the training data in the multidimensional predictor variable space. However, not all distances in the feature space are equally relevant: some variables are more important than others in the machine learning model and hence are mainly responsible for prediction patterns. Therefore, we weight the distances by the model-derived importance of the predictors.</p>
<p>As a case study we use a simulated area-wide response variable for Europe, bio-climatic variables as predictors, as well as simulated field samples. Random Forest is applied as the algorithm to predict the simulated response. The model is then used to make predictions for the whole of Europe. We then calculate the corresponding uncertainty and compare it to the area-wide true prediction error. The results show that the uncertainty map reflects the patterns in the true error very well and considerably outperforms ensemble-based standard deviations of predictions as an indicator of uncertainty.</p><p>The resulting map of uncertainty gives valuable insights into spatial patterns of prediction uncertainty, which is important when the predictions are used as a baseline for decision making or subsequent environmental modelling. Hence, we suggest that a map of distance-based uncertainty should be given in addition to prediction maps.</p>
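The distance-based uncertainty idea in the abstract above can be sketched as follows. This is a minimal illustration under assumptions the abstract does not spell out: predictors are standardized with the training mean and standard deviation, weighting is done by multiplying each scaled predictor by its normalized importance, and the measure is the distance to the nearest training point. The function name and the example values are hypothetical.

```python
import numpy as np

def weighted_distance_to_training(X_train, X_new, importance):
    """Importance-weighted distance from each new point to its nearest training point."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    importance = np.asarray(importance, dtype=float)

    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0                     # avoid division by zero for constant predictors
    weights = importance / importance.sum() # assumption: importances normalized to sum to 1

    train_scaled = (X_train - mean) / std * weights
    new_scaled = (X_new - mean) / std * weights

    # Pairwise Euclidean distances, shape (n_new, n_train).
    diffs = new_scaled[:, None, :] - train_scaled[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1)

# Hypothetical predictors and model-derived importances.
X_train = np.array([[0.0, 0.0], [1.0, 10.0], [2.0, 20.0]])
importance = np.array([0.9, 0.1])
X_new = np.array([[1.0, 10.0],    # identical to a training point
                  [10.0, 10.0]])  # far outside the training range in predictor 1

uncertainty = weighted_distance_to_training(X_train, X_new, importance)
```

Larger values flag locations whose predictor conditions lie further from anything seen in training, with departures along important predictors counting more.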


Forests ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 118 ◽  
Author(s):  
Viet-Hung Dang ◽  
Nhat-Duc Hoang ◽  
Le-Mai-Duyen Nguyen ◽  
Dieu Tien Bui ◽  
Pijush Samui

This study developed and verified a new hybrid machine learning model, named random forest machine (RFM), for the spatial prediction of shallow landslides. RFM is a hybridization of two state-of-the-art machine learning algorithms, random forest classifier (RFC) and support vector machine (SVM), in which RFC is used to generate subsets from training data and SVM is used to build decision functions for these subsets. To construct and verify the hybrid RFM model, a shallow landslide database of the Lang Son area (northern Vietnam) was prepared. The database consisted of 101 shallow landslide polygons and 14 conditioning factors. The relevance of these factors for shallow landslide susceptibility modeling was assessed using the ReliefF method. Experimental results showed that the proposed RFM can achieve the desired prediction with an F1 score of roughly 0.96. The performance of the RFM was better than that of the benchmark approaches, including the SVM, RFC, and logistic regression. Thus, the newly developed RFM is a promising tool to help local authorities in shallow landslide hazard mitigation.


2021 ◽  
Author(s):  
Olga Makarieva ◽  
Aiding Kornejady ◽  
Andrey Shikhov ◽  
Esmaeil Silakhori ◽  
Nataliia Nesterova ◽  
...  

<p>Groundwater-fed aufeis and their dynamics indicate the intensity of water exchange processes in permafrost, including those in a changing climate. Spatiotemporal variation of aufeis has not yet been fully tackled, as it entails a holistic understanding of the main hydrological, geological, geomorphological, and climate processes. Meanwhile, robust machine learning (ML) techniques have advanced ongoing studies by extracting emerging patterns from large sets of data and predicting future patterns. Once they are coupled with optimization algorithms, they can give even more reliable results by improving their learning capability, known as goodness-of-fit. Hence, the current study sets out to study the spatial pattern of aufeis in the North-East of the Northern Hemisphere by adopting a robust pattern recognition algorithm, Support Vector Machine (SVM), coupled with three evolutionary optimization techniques, namely Imperialistic Competitive Algorithm (ICA), Grey Wolf Optimizer (GWO), and Bat optimizer (Bat). This was carried out by incorporating a wide range of topo-hydrological, geological, geomorphological, environmental, and climatic data together with the distribution of aufeis as the ground truth in the study region. Adhering to the spatial partitioning method, a separate basin, with all the aufeis within it, was kept apart from the modeling process and used as a reference for model validation. The spatial prediction models were then sieved through multiple cutoff-dependent and cutoff-independent performance metrics to determine the superior model in terms of both learning and prediction/generalization capacity. To assess how the distribution of aufeis responds to different changing conditions, we employed the Long Ashton Research Station Weather Generator 6 (LARS-WG6) under the CMIP5 protocol and the RCP 4.5 and RCP 8.5 scenarios.
The model was fed with daily time series for a suite of climate variables, namely precipitation, maximum and minimum temperature, and solar radiation, acquired from different synoptic weather stations in the study area. The downscaling performance of the LARS-WG6 model was assessed using the Nash-Sutcliffe efficiency metric, bias, and Root Mean Squared Error (RMSE). Further, the climatic variables were projected for the periods 2041–2060 (2050s) and 2061–2080 (2070s) using various Atmosphere-Ocean General Circulation Models (AOGCMs), including EC-EARTH, GFDL-CM3, HadGEM2-ES, MIROC5, and MPI-ESM-MR. The projected climatic variables, as dynamic drivers, were used in combination with the previous static factors as the new set of inputs to the superior hybridized spatial prediction model to investigate the distribution (be it a decreasing or increasing pattern) of aufeis across the study area under the considered climate change scenarios. The results revealed that the hybridized SVM-GWO has comparatively higher learning and prediction performance, followed by the other two counterparts, SVM-ICA and SVM-Bat. Based on the applied downscaling performance metrics, the LARS-WG6 and implemented models have proved successful. The projection models' results for the two future periods attested to an increasing temperature followed by an opposite pattern for precipitation. The results of SVM-GWO combined with the projected climatic variables revealed a clear pattern in which the aufeis distribution area will diminish. The study is supported by the Russian Fund for Basic Research (projects 19-55-80028, 19-35-90090 and 20-05-00666).</p>
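For reference, the three downscaling scores named in the abstract above can be computed as in the sketch below. The function name and the sample values are hypothetical, and bias is taken here as the mean of simulated minus observed values, which is one common sign convention and not necessarily the one used in the study.

```python
import numpy as np

def downscaling_scores(observed, simulated):
    """Return Nash-Sutcliffe efficiency, bias, and RMSE for paired series."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    err = sim - obs
    # NSE = 1 - sum of squared errors / sum of squared deviations from the observed mean.
    nse = 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2)
    bias = err.mean()                    # assumption: simulated minus observed
    rmse = np.sqrt(np.mean(err ** 2))
    return {"nse": float(nse), "bias": float(bias), "rmse": float(rmse)}

# Illustrative values only, not data from the study.
scores = downscaling_scores([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

An NSE of 1 indicates a perfect match, while values at or below 0 mean the simulation is no better than predicting the observed mean.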

