Gully erosion susceptibility mapping (GESM) using machine learning methods optimized by the multi‑collinearity analysis and K-fold cross-validation

Abstract: Landslide susceptibility mapping (LSM) is a useful tool to estimate the probability of landslide occurrence, providing a scientific basis for natural hazards prevention, land use planning, and economic development in landslide-prone areas. To date, a large number of machine learning methods have been applied to LSM, and recently the advanced Convolutional Neural Network (CNN) has been gradually adopted to enhance the prediction accuracy of LSM. The objective of this study is to introduce a CNN-based model in LSM and systematically compare its overall performance with the conventional machine learning models of random forest, logistic regression, and support vector machine. Herein, we selected the Jiuzhaigou region in Sichuan Province, China, as the study area. A total of 710 landslides and 12 predisposing factors were stacked to form spatial datasets for LSM. ROC analysis and several statistical metrics, such as accuracy, root mean square error (RMSE), Kappa coefficient, sensitivity, and specificity, were used to evaluate the performance of the models on the training and validation datasets. Finally, the trained models were applied and the landslide susceptibility zones were mapped. Results suggest that both the CNN-based and the conventional machine learning-based models perform satisfactorily (AUC: 85.72%–90.17%). The CNN-based model exhibits excellent goodness-of-fit and prediction capability, achieves the highest performance (AUC: 90.17%), and also significantly reduces the salt-and-pepper effect, which indicates its great potential for application to LSM.
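The statistical metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = landslide, 0 = non-landslide), predicted probabilities, and a 0.5 decision threshold; the function name `binary_metrics` and the toy values are hypothetical and are not taken from the study.

```python
import math

def binary_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, Cohen's kappa, and RMSE.

    Assumes both classes are present, so no denominator is zero.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Agreement expected by chance, from the marginal totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # RMSE between the 0/1 labels and the predicted probabilities.
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / n)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
        "rmse": rmse,
    }

# Toy example (made-up values, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
metrics = binary_metrics(y_true, y_prob)
```

In practice these values would be computed separately on the training and validation datasets, as the abstract describes.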

Effects of Brain Atlases and Machine Learning Methods on the Discrimination of Schizophrenia Patients: A Multimodal MRI Study

Frontiers in Neuroscience, DOI: 10.3389/fnins.2021.697168, 2021, Vol 15

Author(s): Jinyu Zang, Yuanyuan Huang, Lingyin Kong, Bingye Lei, Pengfei Ke, ...

Keyword(s): Machine Learning, Dimensionality Reduction, Cross Validation, Integrated Model, Machine Learning Techniques, Brain Atlas, First Episode, Learning Methods, Machine Learning Methods, Brain Atlases

Recently, machine learning techniques have been widely applied in discriminative studies of schizophrenia (SZ) patients with multimodal magnetic resonance imaging (MRI); however, the effects of brain atlases and machine learning methods remain largely unknown. In this study, we collected MRI data for 61 first-episode SZ patients (FESZ), 79 chronic SZ patients (CSZ), and 205 normal controls (NC) and calculated four MRI measurements: regional gray matter volume (GMV), regional homogeneity (ReHo), amplitude of low-frequency fluctuation, and degree centrality. We systematically analyzed the performance of two classifications (SZ vs NC; FESZ vs CSZ) based on combinations of three brain atlases, five classifiers, two cross-validation methods, and three dimensionality reduction algorithms. Our results showed that the groupwise whole-brain atlas with 268 ROIs outperformed the other two brain atlases. In addition, leave-one-out cross-validation was the better of the two cross-validation methods for selecting the hyperparameter set, but the classification performances of the different classifiers and dimensionality reduction algorithms were quite similar. Importantly, the GMV and ReHo features of brain regions in the prefrontal and temporal gyri contributed most to both classifications. Furthermore, an ensemble learning method was used to establish an integrated model, which improved classification performance. Taken together, these findings indicate the effects of these factors in constructing effective classifiers for psychiatric diseases and show that the integrated model has the potential to improve the clinical diagnosis and treatment evaluation of SZ.
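As an illustration of hyperparameter selection by leave-one-out cross-validation, here is a minimal sketch. It swaps in a toy k-nearest-neighbor classifier on made-up two-dimensional points; the classifier, the candidate values of `k`, and all function names are assumptions for illustration and are not the study's pipeline.

```python
def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    order = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = [train_y[i] for i in order[:k]]
    return 1 if sum(votes) * 2 > len(votes) else 0

def loo_accuracy(xs, ys, k):
    """Hold out each sample once, fit on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)

def select_k(xs, ys, candidates):
    """Return the candidate k with the highest leave-one-out accuracy."""
    scores = {k: loo_accuracy(xs, ys, k) for k in candidates}
    best_k = max(candidates, key=lambda k: scores[k])  # first candidate wins ties
    return best_k, scores

# Toy data (made up): two well-separated clusters.
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
      (1.0, 0.9), (0.9, 1.1), (1.2, 1.0), (1.1, 0.8)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
best_k, scores = select_k(xs, ys, [1, 3, 5])
```

Leave-one-out uses every sample for validation exactly once, which is attractive for small cohorts but costs one model fit per sample.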

Landslide susceptibility mapping using machine learning for Wenchuan County, Sichuan province, China

E3S Web of Conferences, DOI: 10.1051/e3sconf/202019803023, 2020, Vol 198, pp. 03023

Author(s): Xin Yang, Rui Liu, Luyao Li, Mei Yang, Yuantao Yang

Keyword(s): Machine Learning, Landslide Susceptibility, Susceptibility Mapping, Machine Learning Algorithms, Landslide Susceptibility Mapping, Support Vector, ROC Curve Analysis, Learning Methods, Machine Learning Methods, Boosted Decision Tree

Landslide susceptibility mapping is a method used to assess the probability and spatial distribution of landslide occurrences. Machine learning methods have been widely used in landslide susceptibility assessment in recent years. In this paper, six popular machine learning algorithms, namely logistic regression, multi-layer perceptron, random forest, support vector machine, AdaBoost, and gradient boosted decision tree, were used to construct landslide susceptibility models with a total of 1365 landslide points and 14 predisposing factors. Subsequently, landslide susceptibility maps (LSM) were generated by the trained models. The maps show that the main landslide zone is concentrated in the southeastern area of Wenchuan County. ROC curve analysis shows that all models fitted the training datasets and achieved satisfactory results on the validation datasets. The results of this paper indicate that machine learning methods are a feasible way to build robust landslide susceptibility models.

Comparison of Different Machine Learning Methods for Debris Flow Susceptibility Mapping: A Case Study in the Sichuan Province, China

Remote Sensing, DOI: 10.3390/rs12020295, 2020, Vol 12 (2), pp. 295, Cited By ~ 6

Author(s): Ke Xiong, Basanta Raj Adhikari, Constantine A. Stamatopoulos, Yu Zhan, Shaolin Wu, ...

Keyword(s): Machine Learning, Debris Flow, Sichuan Province, Susceptibility Mapping, Boosted Regression Trees, Support Vector, Learning Methods, Machine Learning Methods, Susceptibility Maps, Debris Flow Susceptibility

Debris flow susceptibility mapping is considered to be useful for hazard prevention and mitigation. Sichuan Province, China, is a frequent debris flow area where many hazardous events occur annually and cause considerable damage. Therefore, this study attempted to evaluate and compare the performance of four state-of-the-art machine learning methods, namely Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Boosted Regression Trees (BRT), for debris flow susceptibility mapping in this region. Four models were constructed based on the debris flow inventory and a range of causal factors. A variety of datasets was obtained through the combined application of remote sensing (RS) and geographic information system (GIS). The mean altitude, altitude difference, aridity index, and groove gradient played the most important roles in the assessment. The performance of these models was evaluated using predictive accuracy (ACC) and the area under the receiver operating characteristic curve (AUC). The results of this study showed that all four models were capable of producing accurate and robust debris flow susceptibility maps (ACC and AUC values were above 0.75 and 0.80, respectively). With an excellent spatial prediction capability and strong robustness, the BRT model (ACC = 0.781, AUC = 0.852) outperformed the other models and was the ideal choice. Our results also demonstrated the importance of selecting suitable mapping units and optimal predictors. Furthermore, debris flow susceptibility maps of Sichuan Province were produced, which can provide helpful data for assessing and mitigating debris flow hazards.
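The AUC used to compare the models can be illustrated with a minimal sketch based on its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy scores are hypothetical and are not data from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Assumes at least one positive (1) and one negative (0) label.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Toy example (made-up susceptibility scores, for illustration only).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(y_true, scores)
```

The pairwise loop is quadratic in the number of samples, so for large inventories a sort-based rank computation would be the more practical choice.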

Prediction Success of Machine Learning Methods for Flash Flood Susceptibility Mapping in the Tafresh Watershed, Iran

Sustainability, DOI: 10.3390/su11195426, 2019, Vol 11 (19), pp. 5426, Cited By ~ 47

Author(s): Saeid Janizadeh, Mohammadtaghi Avand, Abolfazl Jaafari, Tran Van Phong, Mahmoud Bayat, ...

Keyword(s): Machine Learning, Flash Flood, Susceptibility Mapping, Annual Rainfall, Slope Aspect, Flood Events, Learning Methods, Machine Learning Methods, Flood Susceptibility, Flood Susceptibility Mapping

Floods are some of the most destructive and catastrophic disasters worldwide. Development of management plans needs a deep understanding of the likelihood and magnitude of future flood events. The purpose of this research was to estimate flash flood susceptibility in the Tafresh watershed, Iran, using five machine learning methods, i.e., alternating decision tree (ADT), functional tree (FT), kernel logistic regression (KLR), multilayer perceptron (MLP), and quadratic discriminant analysis (QDA). A geospatial database including 320 historical flood events was constructed, and eight geo-environmental variables (elevation, slope, slope aspect, distance from rivers, average annual rainfall, land use, soil type, and lithology) were used as flood influencing factors. Based on a variety of performance metrics, the ADT method outperformed the other methods. The FT method ranked second, followed by the KLR, MLP, and QDA. Given the small differences between the goodness-of-fit and prediction success of the methods, we concluded that all five machine learning-based models are applicable to flood susceptibility mapping in other areas to protect societies from devastating floods.

Predictions of chalcospinels with composition ABCX4 (X – S or Se)

Perspektivnye Materialy, DOI: 10.30791/1028-978x-2020-7-5-18, 2020, pp. 5-18

Author(s): N. N. Kiselyova, V. A. Dudarev, V. V. Ryazanov, O. V. Sen’ko, ...

Keyword(s): Machine Learning, Crystal Lattice, Prediction Accuracy, Cross Validation, Chemical Elements, Optical Memory, Support Vector, Learning Methods, Linear Discriminant, Machine Learning Methods

New chalcospinels of the most common compositions were predicted: $A^{I}B^{III}C^{IV}X_4$ (X = S or Se) and $A^{II}B^{III}C^{III}S_4$ (A, B, and C are various chemical elements). They are promising in the search for new materials for magneto-optical memory elements, sensors, and anodes in sodium-ion batteries. The values of the parameter "a" of their crystal lattice were estimated. Only the property values of the chemical elements were used for prediction. The calculations were carried out using machine learning programs that are part of the information-analytical system developed by the authors. Chalcospinels not yet obtained were predicted with various ensembles of algorithms: binary decision trees, the linear machine, the search for logical regularities of classes, the support vector machine, the Fisher linear discriminant, k-nearest neighbors, and the training of a multilayer perceptron and a neural network. The lattice parameter of the new chalcospinels was estimated with an extensive family of regression methods from the scikit-learn package for the Python language, together with multilevel machine learning methods proposed by the authors. According to the cross-validation results, the prediction accuracy for new chalcospinels is not lower than 80%, and the accuracy of the predicted lattice parameter (mean absolute error in leave-one-out cross-validation) is ± 0.1 Å. The effectiveness of multilevel machine learning methods for predicting the physical properties of substances was shown.
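The mean absolute error under leave-one-out cross-validation can be illustrated with a minimal sketch. It assumes a single numeric descriptor and a plain least-squares line as a stand-in for the regression methods mentioned in the abstract; the function names and the toy numbers are hypothetical and are not the authors' data or models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_mae(xs, ys):
    """Mean absolute error when each sample is predicted by a model fit on all the others."""
    errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(ys[i] - (slope * xs[i] + intercept)))
    return sum(errors) / len(errors)

# Toy data (made up): one descriptor per compound and a lattice parameter in angstroms.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
lattice_a = [10.12, 10.29, 10.52, 10.68, 10.91]
mae = loo_mae(descriptor, lattice_a)
```

Because each prediction comes from a model that never saw the held-out compound, the resulting error is an estimate of accuracy on compounds not yet obtained rather than a measure of fit.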

GIS-Based Landslide Susceptibility Mapping Using Remote Sensing Data and Machine Learning Methods

Cartography from Pole to Pole - Lecture Notes in Geoinformation and Cartography, DOI: 10.1007/978-3-642-32618-9_23, 2013, pp. 319-333

Author(s): Fu Ren, Xueling Wu

Keyword(s): Machine Learning, Remote Sensing, Landslide Susceptibility, Remote Sensing Data, Susceptibility Mapping, Landslide Susceptibility Mapping, Learning Methods, Sensing Data, Machine Learning Methods

Mathematical Model for Choosing Counterparty When Assessing Information Security Risks

Risks, DOI: 10.3390/risks9070133, 2021, Vol 9 (7), pp. 133

Author(s): Andrey Koltays, Anton Konev, Alexander Shelupanov

Keyword(s): Machine Learning, Mathematical Model, Software Design, Cross Validation, Accurate Result, Economic Security, Security Risks, Learning Methods, Machine Learning Methods, Comprehensive Study

The need to assess the risks of the trustworthiness of counterparties is increasing every year. The identification of increasing cases of unfair behavior among counterparties only confirms the relevance of this topic. The existing work in the field of information and economic security does not create a reasonable methodology that allows for a comprehensive study and an adequate assessment of a counterparty (for example, a developer company) in the field of software design and development. The purpose of this work is to assess the risks of a counterparty’s trustworthiness in the context of the digital transformation of the economy, which in turn will reduce the risk of offenses and crimes that constitute threats to the security of organizations. This article discusses the main methods used in the construction of a mathematical model for assessing the trustworthiness of a counterparty. The main difficulties in assessing the accuracy and completeness of the model are identified. The use of cross-validation to eliminate difficulties in building a model is described. The developed model, using machine learning methods, gives an accurate result with a small number of compared counterparties, which corresponds to the order of checking a counterparty in a real system. The results of calculations in this model show the possibility of using machine learning methods in assessing the risks of counterparty trustworthiness.

Comparative Study of Convolutional Neural Network and Conventional Machine Learning Methods for Landslide Susceptibility Mapping

Remote Sensing, DOI: 10.3390/rs14020321, 2022, Vol 14 (2), pp. 321

Author(s): Rui Liu, Xin Yang, Chong Xu, Liangshuai Wei, Xiangqiang Zeng

Keyword(s): Neural Network, Machine Learning, Convolutional Neural Network, Landslide Susceptibility, Susceptibility Mapping, Landslide Susceptibility Mapping, Support Vector, Learning Methods, Machine Learning Methods, Conventional Machine

Landslide susceptibility mapping (LSM) is a useful tool to estimate the probability of landslide occurrence, providing a scientific basis for natural hazards prevention, land use planning, and economic development in landslide-prone areas. To date, a large number of machine learning methods have been applied to LSM, and recently the advanced convolutional neural network (CNN) has been gradually adopted to enhance the prediction accuracy of LSM. The objective of this study is to introduce a CNN-based model in LSM and systematically compare its overall performance with the conventional machine learning models of random forest, logistic regression, and support vector machine. Herein, we selected Zhangzha Town in Sichuan Province, China, and Lantau Island in Hong Kong, China, as the study areas. Each landslide inventory and corresponding predisposing factors were stacked to form spatial datasets for LSM. The receiver operating characteristic analysis, area under the curve (AUC), and several statistical metrics, such as accuracy, root mean square error, Kappa coefficient, sensitivity, and specificity, were used to evaluate the performance of the models. Finally, the trained models were applied, and the landslide susceptibility zones were mapped. Results suggest that both the CNN-based and the conventional machine learning-based models perform satisfactorily. The CNN-based model exhibits an excellent prediction capability, achieves the highest performance, and also significantly reduces the salt-and-pepper effect, which indicates its great potential for application to LSM.

Machine Learning Integrates Genomic Signatures for Subclassification Beyond Primary and Secondary Acute Myeloid Leukemia

Blood, DOI: 10.1182/blood.2020010603, 2021

Author(s): Hassan Awada, Arda Durmaz, Carmelo Gurnari, Ashwin Kishtagari, Manja Meggendorfer, ...

Keyword(s): Machine Learning, Acute Myeloid Leukemia, Myeloid Leukemia, Latent Class, Cross Validation, De Novo, Sequencing Data, Learning Methods, Machine Learning Methods, Acute Myeloid

While genomic alterations drive the pathogenesis of acute myeloid leukemia (AML), traditional classifications are largely based on morphology, and prototypic genetic founder lesions define only a small proportion of AML patients. The historical subdivision into primary/de novo AML (pAML) and secondary AML (sAML) has been shown to correlate variably with genetic patterns. Perhaps the combinatorial complexity and heterogeneity of AML genomic architecture have so far precluded a genomic-based subclassification that identifies distinct molecularly defined subtypes more reflective of shared pathogenesis. We integrated cytogenetic and gene sequencing data from a multicenter cohort of 6,788 AML patients and analyzed them using standard and machine learning methods to generate a novel AML molecular subclassification with biological correlates corresponding to underlying pathogenesis. Standard supervised analyses resulted in modest cross-validation accuracy when attempting to use molecular patterns to predict traditional pathomorphological AML classifications. We performed unsupervised analysis by applying a Bayesian latent class method, which identified 4 unique genomic clusters with distinct prognoses. Invariant genomic features driving each cluster were extracted and resulted in 97% cross-validation accuracy when used for genomic subclassification. Subclasses of AML defined by molecular signatures overlapped current pathomorphological and clinically defined AML subtypes. We internally and externally validated our results and share an open-access molecular classification scheme for AML patients. Although the heterogeneity inherent in the genomic changes across nearly 7,000 AML patients is too vast for traditional prediction methods, machine learning methods allowed for the definition of novel genomic AML subclasses, indicating that traditional pathomorphological definitions may be less reflective of overlapping pathogenesis.
