soil particle size fractions
Recently Published Documents


TOTAL DOCUMENTS

80
(FIVE YEARS 24)

H-INDEX

21
(FIVE YEARS 6)

2021 ◽  
Author(s):  
Mo Zhang ◽  
Wenjiao Shi

Abstract. Digital soil mapping of soil particle-size fractions (PSFs) using log-ratio methods is a widely used technique. As a hybrid interpolator, regression kriging (RK) provides a way to improve prediction accuracy. However, there have been few comparisons with other techniques when RK is applied to compositional data, and it is not known if its performance based on different balances of isometric log-ratio (ILR) transformation is robust. Here, we compared the generalized linear model (GLM), random forest (RF), and their hybrid patterns (RK) using different transformed data based on three ILR balances, with 29 environmental covariables (ECs) for the prediction of soil PSFs in the upper reaches of the Heihe River Basin (HRB), China. The results showed that the RF performed best, with more accurate predictions, but the GLM produced a more unbiased prediction. As a hybrid interpolator, RK was recommended because it widened the data ranges of the prediction values and modified the bias and accuracy of most models, especially the RF. The prediction maps generated from RK revealed more details of the soil sampling points than the other models. Different data distributions were produced for the three ILR balances. Using the most abundant component of the compositional data as the first component of the permutations was not considered to be the right choice because it produced the worst performance. Based on the relative abundance of the components, we recommend focusing on the data distribution. This study provides a reference for the mapping of soil PSFs combined with transformed data at the regional scale.
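The dependence on the ILR balance described in this abstract can be illustrated with a minimal sketch. It assumes the standard pivot-coordinate form of the ILR transformation for a three-part composition (sand, silt, clay), in which the first coordinate contrasts the first component of the permutation with the geometric mean of the rest. The function names (`closure`, `pivot_basis`, `ilr`, `ilr_inv`) and the three orderings are illustrative assumptions; the abstract does not state which sequential binary partitions the authors used.

```python
import numpy as np

def closure(x):
    """Rescale each composition so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def pivot_basis(d):
    """Orthonormal basis (d x d-1) of the clr plane for pivot coordinates."""
    V = np.zeros((d, d - 1))
    for j in range(d - 1):
        r = d - j - 1  # number of parts in the denominator of balance j
        V[j, j] = np.sqrt(r / (r + 1))
        V[j + 1:, j] = -1.0 / np.sqrt(r * (r + 1))
    return V

def ilr(x, order):
    """ILR coordinates after permuting the parts into the given order."""
    x = closure(np.asarray(x, dtype=float)[..., list(order)])
    logx = np.log(x)  # requires strictly positive parts
    clr = logx - logx.mean(axis=-1, keepdims=True)
    return clr @ pivot_basis(x.shape[-1])

def ilr_inv(z, order):
    """Back-transform ILR coordinates to a composition in the original part order."""
    z = np.asarray(z, dtype=float)
    d = z.shape[-1] + 1
    x_perm = closure(np.exp(z @ pivot_basis(d).T))
    out = np.empty_like(x_perm)
    out[..., list(order)] = x_perm  # undo the permutation
    return out

# Hypothetical sample: sand, silt, clay fractions.
sample = np.array([0.6, 0.3, 0.1])
for order in [(0, 1, 2), (1, 0, 2), (2, 0, 1)]:
    print(order, ilr(sample, order))
```

Each ordering yields different coordinate values, and hence a different distribution for a model to fit, although the back-transformed composition is the same in every case.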


Geoderma ◽  
2020 ◽  
Vol 376 ◽  
pp. 114552 ◽  
Author(s):  
R. Taghizadeh-Mehrjardi ◽  
M. Mahdianpari ◽  
F. Mohammadimanesh ◽  
T. Behrens ◽  
N. Toomanian ◽  
...  

2020 ◽  
Author(s):  
Mo Zhang ◽  
Wenjiao Shi

Abstract. Digital soil mapping of soil particle-size fractions (PSFs) using log-ratio methods has been widely used. As a hybrid interpolator, regression kriging (RK) is an alternative way to improve prediction accuracy. However, there is still a lack of systematic comparison and recommendation when RK is applied to compositional data, and it is unclear whether its performance based on different balances of isometric log-ratio (ILR) transformation is robust. Here, we systematically compared the generalized linear model (GLM), random forest (RF), and their hybrid pattern (RK) using different balances of ILR-transformed soil PSF data with 29 environmental covariables for the prediction of soil PSFs in the upper reaches of the Heihe River Basin. The results showed that RF performed better, with more accurate predictions, but GLM produced a more unbiased prediction. Among the hybrid interpolators, RK was recommended because it widened the data ranges of the prediction results and modified the bias and accuracy of most models, especially RF. Drawbacks remained, however, owing to the data distributions and model algorithms. Moreover, prediction maps generated from RK showed more details of the soil sampling points. The three ILR-transformed data sets based on sequential binary partitions (SBP) had different distributions, and using the most abundant component of the compositions as the first component of the permutations is not recommended. This study can serve as a reference for the spatial simulation of soil PSFs combined with environmental covariables and transformed data at a regional scale.
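The hybrid structure of regression kriging (a trend model plus kriged residuals) can be sketched as follows. This is a simplified illustration, not the authors' implementation: ordinary least squares stands in for the GLM or RF trend, the exponential variogram parameters (`nugget`, `sill`, `vrange`) are assumed rather than fitted, and all function names are hypothetical. In the study's setting, `y` would be a single ILR coordinate.

```python
import numpy as np

def exp_variogram(h, nugget, sill, vrange):
    """Exponential semivariogram; gamma(0) = 0 by convention."""
    g = nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / vrange))
    return np.where(h > 0, g, 0.0)

def ordinary_kriging(coords, values, targets, nugget, sill, vrange):
    """Ordinary kriging of `values` observed at `coords` onto `targets`."""
    n = len(coords)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))  # last row/column enforce weights summing to 1
    A[:n, :n] = exp_variogram(d, nugget, sill, vrange)
    A[n, n] = 0.0
    d0 = np.linalg.norm(coords[:, None, :] - targets[None, :, :], axis=-1)
    b = np.ones((n + 1, len(targets)))
    b[:n, :] = exp_variogram(d0, nugget, sill, vrange)
    w = np.linalg.solve(A, b)[:n]  # drop the Lagrange multiplier row
    return w.T @ values

def regression_kriging(coords, X, y, coords_new, X_new,
                       nugget=0.0, sill=1.0, vrange=1.0):
    """Linear trend on covariates plus ordinary kriging of the trend residuals."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    trend_new = np.column_stack([np.ones(len(X_new)), X_new]) @ beta
    return trend_new + ordinary_kriging(coords, resid, coords_new,
                                        nugget, sill, vrange)
```

With a zero nugget, the kriging step interpolates the residuals exactly at the sampling points, which is consistent with the abstract's remark that RK maps show more detail around the soil samples.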


2020 ◽  
Vol 24 (5) ◽  
pp. 2505-2526
Author(s):  
Mo Zhang ◽  
Wenjiao Shi ◽  
Ziwei Xu

Abstract. Soil texture and soil particle size fractions (PSFs) play an increasing role in physical, chemical, and hydrological processes. Many previous studies have used machine-learning and log-ratio transformation methods for soil texture classification and soil PSF interpolation to improve the prediction accuracy. However, few reports have systematically compared their performance with respect to both classification and interpolation. Here, five machine-learning models – K-nearest neighbour (KNN), multilayer perceptron neural network (MLP), random forest (RF), support vector machines (SVM), and extreme gradient boosting (XGB) – combined with the original data and three log-ratio transformation methods – additive log ratio (ALR), centred log ratio (CLR), and isometric log ratio (ILR) – were applied to evaluate soil texture and PSFs using both raw and log-ratio-transformed data from 640 soil samples in the Heihe River basin (HRB) in China. The results demonstrated that the log-ratio transformations decreased the skewness of soil PSF data. For soil texture classification, RF and XGB showed better performance with a higher overall accuracy and kappa coefficient. They were also recommended to evaluate the classification capacity of imbalanced data according to the area under the precision–recall curve (AUPRC). For soil PSF interpolation, RF delivered the best performance among five machine-learning models with the lowest root-mean-square error (RMSE; sand had an RMSE of 15.09 %, silt was 13.86 %, and clay was 6.31 %), mean absolute error (MAE; sand had an MAE of 10.65 %, silt was 9.99 %, and clay was 5.00 %), Aitchison distance (AD; 0.84), and standardized residual sum of squares (STRESS; 0.61), and the highest Spearman rank correlation coefficient (RCC; sand was 0.69, silt was 0.67, and clay was 0.69). STRESS was improved by using log-ratio methods, especially for CLR and ILR.
Prediction maps from both direct and indirect classification were similar in the middle and upper reaches of the HRB. However, indirect classification maps using log-ratio-transformed data provided more detailed information in the lower reaches of the HRB. There was a pronounced improvement of 21.3 % in the kappa coefficient when using indirect methods for soil texture classification compared with direct methods. RF was recommended as the best strategy among the five machine-learning models, based on the accuracy evaluation of the soil PSF interpolation and soil texture classification, and ILR was recommended for component-wise machine-learning models without multivariate treatment, considering the constrained nature of compositional data. In addition, XGB was preferred over other models when the trade-off between the accuracy and runtime was considered. Our findings provide a reference for future work with respect to the spatial prediction of soil PSFs and texture using machine-learning models with skewed distributions of soil PSF data over a large area.
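Two of the log-ratio transformations named in this abstract (ALR and CLR) and the Aitchison distance used as an accuracy measure can be written compactly. The sketch below uses the standard textbook definitions, with the last part as the ALR denominator; the function names and the example compositions are assumptions made for illustration, and all parts must be strictly positive.

```python
import numpy as np

def closure(x):
    """Rescale each composition so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def alr(x):
    """Additive log ratio: log of each part relative to the last part."""
    x = closure(x)
    return np.log(x[..., :-1] / x[..., -1:])

def clr(x):
    """Centred log ratio: log of each part relative to the geometric mean."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Euclidean distance between the CLR vectors of two compositions."""
    return np.linalg.norm(clr(x) - clr(y), axis=-1)

# Hypothetical observed and predicted (sand, silt, clay) compositions.
observed = np.array([0.60, 0.30, 0.10])
predicted = np.array([0.55, 0.33, 0.12])
print(alr(observed), clr(observed), aitchison_distance(observed, predicted))
```

Because both transformations work on ratios, the results do not change if the fractions are given in percent rather than as proportions.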

