Comparison of Different Machine Learning Methods for Debris Flow Susceptibility Mapping: A Case Study in the Sichuan Province, China

2020 ◽  
Vol 12 (2) ◽  
pp. 295 ◽  
Author(s):  
Ke Xiong ◽  
Basanta Raj Adhikari ◽  
Constantine A. Stamatopoulos ◽  
Yu Zhan ◽  
Shaolin Wu ◽  
...  

Debris flow susceptibility mapping is considered useful for hazard prevention and mitigation. Sichuan Province, China, is a debris-flow-prone area where hazardous events occur every year and cause considerable damage. This study therefore evaluated and compared the performance of four state-of-the-art machine learning methods, namely Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Boosted Regression Trees (BRT), for debris flow susceptibility mapping in this region. Four models were constructed based on the debris flow inventory and a range of causal factors. The datasets were obtained through the combined application of remote sensing (RS) and geographic information systems (GIS). The mean altitude, altitude difference, aridity index, and groove gradient played the most important roles in the assessment. The performance of these models was evaluated using predictive accuracy (ACC) and the area under the receiver operating characteristic curve (AUC). The results showed that all four models were capable of producing accurate and robust debris flow susceptibility maps (ACC and AUC values were well above 0.75 and 0.80, respectively). With an excellent spatial prediction capability and strong robustness, the BRT model (ACC = 0.781, AUC = 0.852) outperformed the other models and was the ideal choice. Our results also showed the importance of selecting suitable mapping units and optimal predictors. Furthermore, debris flow susceptibility maps of the Sichuan Province were produced, which can provide helpful data for assessing and mitigating debris flow hazards.
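
The two evaluation measures named in this abstract, ACC and AUC, can be illustrated with a minimal pure-Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration. AUC is computed here through its rank interpretation, the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one.

```python
def accuracy(y_true, y_score, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the share of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = debris flow unit, 0 = non-debris-flow unit.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
print(accuracy(labels, scores), roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a sketch; a library routine would normally be used for a full susceptibility dataset.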

2021 ◽  
Author(s):  
Rui Liu ◽  
Xin Yang ◽  
Chong Xu ◽  
Luyao Li ◽  
Xiangqiang Zeng

Landslide susceptibility mapping (LSM) is a useful tool to estimate the probability of landslide occurrence, providing a scientific basis for natural hazards prevention, land use planning, and economic development in landslide-prone areas. To date, a large number of machine learning methods have been applied to LSM, and recently the advanced Convolutional Neural Network (CNN) has been gradually adopted to enhance the prediction accuracy of LSM. The objective of this study is to introduce a CNN-based model in LSM and systematically compare its overall performance with the conventional machine learning models of random forest, logistic regression, and support vector machine. We selected the Jiuzhaigou region in Sichuan Province, China, as the study area. A total of 710 landslides and 12 predisposing factors were stacked to form spatial datasets for LSM. ROC analysis and several statistical metrics, such as accuracy, root mean square error (RMSE), Kappa coefficient, sensitivity, and specificity, were used to evaluate the performance of the models on the training and validation datasets. Finally, the trained models were applied and the landslide susceptibility zones were mapped. Results suggest that both the CNN and the conventional machine learning models perform satisfactorily (AUC: 85.72%–90.17%). The CNN-based model exhibits excellent goodness-of-fit and prediction capability; it not only achieves the highest performance (AUC: 90.17%) but also significantly reduces the salt-and-pepper effect, which indicates its great potential for application to LSM.
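
The statistical metrics listed in this abstract follow from a binary confusion matrix. The sketch below shows one common way to compute them, assuming labels coded as 1 (landslide) and 0 (non-landslide) and both classes present; the function names and toy data are hypothetical and the paper's exact implementation is not given in the abstract.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and Cohen's kappa.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    n = tp + fn + fp + tn
    acc = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "accuracy": acc,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "kappa": (acc - expected) / (1 - expected),
    }


def rmse(y_true, y_prob):
    """Root mean square error between 0/1 labels and predicted probabilities."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true))


# Toy validation set (hypothetical values).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
print(binary_metrics(truth, predicted))
```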


2020 ◽  
Vol 198 ◽  
pp. 03023
Author(s):  
Xin Yang ◽  
Rui Liu ◽  
Luyao Li ◽  
Mei Yang ◽  
Yuantao Yang

Landslide susceptibility mapping is a method used to assess the probability and spatial distribution of landslide occurrences. Machine learning methods have been widely used in landslide susceptibility assessment in recent years. In this paper, six popular machine learning algorithms, namely logistic regression, multi-layer perceptron, random forests, support vector machine, AdaBoost, and gradient boosted decision tree, were used to construct landslide susceptibility models from a total of 1365 landslide points and 14 predisposing factors. Subsequently, landslide susceptibility maps (LSM) were generated by the trained models. The maps show that the main landslide zone is concentrated in the southeastern area of Wenchuan County. ROC curve analysis shows that all models fitted the training datasets and achieved satisfactory results on the validation datasets. The results of this paper indicate that machine learning methods are a feasible way to build robust landslide susceptibility models.


2020 ◽  
Vol 12 (18) ◽  
pp. 2933
Author(s):  
Feng Qing ◽  
Yan Zhao ◽  
Xingmin Meng ◽  
Xiaojun Su ◽  
Tianjun Qi ◽  
...  

The China–Pakistan Karakoram Highway is an important land route from China to South Asia and the Middle East via Pakistan. Due to the extremely hazardous geological environment around the highway, landslides, debris flows, collapses, and subsidence are frequent. Among them, debris flows are one of the most serious geological hazards on the Karakoram Highway, and they often cause traffic interruptions and casualties. Therefore, the development of debris flow susceptibility mapping along the highway can potentially facilitate its safe operation. In this study, we used remote sensing, GIS, and machine learning techniques to map debris flow susceptibility along the Karakoram Highway in areas where observation data are scarce and difficult to obtain by field survey. First, 544 catchments prone to debris flow were identified through visual interpretation of remote sensing images. The factors influencing debris flow susceptibility were then analyzed, and a total of 17 parameters related to geomorphology, soil materials, and triggering conditions were selected. Model training was based on multiple common machine learning methods, including Ensemble Methods, Gaussian Processes, Generalized Linear Models, Naive Bayes, Nearest Neighbors, Support Vector Machines, Trees, Discriminant Analysis, and eXtreme Gradient Boosting. Support Vector Classification (SVC) was chosen as the final model after evaluation; its accuracy (ACC) was 0.91, and the area under the ROC curve (AUC) was 0.96. Among the factors involved in SVC, the Melton Ratio (MR) was the most important, followed by drainage density (DD), Hypsometric Integral (HI), and average slope (AS), indicating that geomorphic conditions play an important role in predicting debris flow susceptibility in the study area. SVC was used to map debris flow susceptibility in the study area, and the results will potentially facilitate the safe operation of the highway.
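
The abstract names three catchment-scale geomorphic factors without defining them. The sketch below uses their usual textbook definitions, which are an assumption here and may differ in detail from what the authors computed: the Melton Ratio as basin relief divided by the square root of basin area, drainage density as total stream length per unit area, and the Hypsometric Integral approximated by the elevation-relief ratio. Function names, units, and the example catchment are hypothetical.

```python
import math


def melton_ratio(h_max_m, h_min_m, area_km2):
    """Basin relief divided by the square root of basin area.

    Relief is converted from metres to kilometres so that the ratio
    is dimensionless when area is given in square kilometres.
    """
    relief_km = (h_max_m - h_min_m) / 1000.0
    return relief_km / math.sqrt(area_km2)


def drainage_density(stream_length_km, area_km2):
    """Total channel length per unit catchment area (km per km^2)."""
    return stream_length_km / area_km2


def hypsometric_integral(h_mean_m, h_min_m, h_max_m):
    """Elevation-relief ratio, a common approximation of the integral."""
    return (h_mean_m - h_min_m) / (h_max_m - h_min_m)


# Hypothetical catchment: 3000 to 4200 m elevation, mean 3500 m,
# 4 km^2 area, 10 km of mapped channels.
print(melton_ratio(4200, 3000, 4.0))
print(drainage_density(10.0, 4.0))
print(hypsometric_integral(3500, 3000, 4200))
```

In practice these values would be derived per catchment from a digital elevation model and used as input features alongside the other selected parameters.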


2019 ◽  
Vol 11 (23) ◽  
pp. 2801 ◽  
Author(s):  
Yonghong Zhang ◽  
Taotao Ge ◽  
Wei Tian ◽  
Yuei-An Liou

Debris flows have always been a serious problem in mountain areas. Research on the assessment of debris flow susceptibility (DFS) is useful for preventing and mitigating debris flow risks. The main purpose of this work is to study DFS in the Shigatse area of Tibet using machine learning methods, after assessing the main triggering factors of debris flows. Remote sensing and geographic information systems (GIS) are used to obtain datasets of topography, vegetation, human activities, and soil factors for local debris flows. Class imbalance among debris flow susceptibility levels in the datasets is addressed with the Borderline-SMOTE method. Five machine learning methods, i.e., back propagation neural network (BPNN), one-dimensional convolutional neural network (1D-CNN), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost), have been used to analyze and fit the relationship between debris flow triggering factors and occurrence, and to evaluate the weight of each triggering factor. The ANOVA and Tukey HSD tests revealed that the XGBoost model exhibited the best mean accuracy (0.924) on ten-fold cross-validation, and its performance was significantly better than that of the BPNN (0.871), DT (0.816), and RF (0.901). However, the performance of XGBoost did not differ significantly from that of the 1D-CNN (0.914). This is also the first comparison experiment between the XGBoost and 1D-CNN methods in DFS studies. The DFS maps have been verified with five evaluation metrics: Precision, Recall, F1 score, Accuracy, and area under the curve (AUC). Experiments show that XGBoost has the best score, and the factors that have a greater impact on debris flows are aspect, annual average rainfall, profile curvature, and elevation.
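
The mean accuracies quoted above come from ten-fold cross-validation. A minimal sketch of that procedure follows, assuming a plain shuffled (unstratified) split and a caller-supplied `fit_predict` function; both the helper names and the threshold "model" in the usage example are hypothetical stand-ins and not the models used in the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for k shuffled folds.

    Each sample appears in exactly one test fold.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


def cross_val_mean_accuracy(X, y, fit_predict, k=10, seed=0):
    """Average test-fold accuracy over k folds.

    fit_predict(X_train, y_train, X_test) must return predicted labels
    for X_test; it is a placeholder for any classifier.
    """
    scores = []
    for train, test in k_fold_indices(len(y), k, seed):
        preds = fit_predict([X[i] for i in train],
                            [y[i] for i in train],
                            [X[i] for i in test])
        correct = sum(p == y[i] for p, i in zip(preds, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


# Usage with a trivial rule standing in for a trained model.
X = [[i] for i in range(20)]
y = [1 if i >= 10 else 0 for i in range(20)]
rule = lambda X_train, y_train, X_test: [1 if x[0] >= 10 else 0 for x in X_test]
print(cross_val_mean_accuracy(X, y, rule))
```

The per-fold scores collected in `scores` are the kind of repeated measurements on which significance tests such as ANOVA can then be run across models.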


2022 ◽  
Vol 14 (2) ◽  
pp. 321
Author(s):  
Rui Liu ◽  
Xin Yang ◽  
Chong Xu ◽  
Liangshuai Wei ◽  
Xiangqiang Zeng

Landslide susceptibility mapping (LSM) is a useful tool to estimate the probability of landslide occurrence, providing a scientific basis for natural hazards prevention, land use planning, and economic development in landslide-prone areas. To date, a large number of machine learning methods have been applied to LSM, and recently the advanced convolutional neural network (CNN) has been gradually adopted to enhance the prediction accuracy of LSM. The objective of this study is to introduce a CNN-based model in LSM and systematically compare its overall performance with the conventional machine learning models of random forest, logistic regression, and support vector machine. We selected Zhangzha Town in Sichuan Province, China, and Lantau Island in Hong Kong, China, as the study areas. Each landslide inventory and the corresponding predisposing factors were stacked to form spatial datasets for LSM. Receiver operating characteristic analysis, area under the curve (AUC), and several statistical metrics, such as accuracy, root mean square error, Kappa coefficient, sensitivity, and specificity, were used to evaluate the performance of the models. Finally, the trained models were applied, and the landslide susceptibility zones were mapped. Results suggest that both the CNN and the conventional machine learning models perform satisfactorily. The CNN-based model exhibits an excellent prediction capability; it not only achieves the highest performance but also significantly reduces the salt-and-pepper effect, which indicates its great potential for application to LSM.


Author(s):  
Yumiao Wang ◽  
Xueling Wu ◽  
Zhangjian Chen ◽  
Fu Ren ◽  
Luwei Feng ◽  
...  

The main goal of this study was to use the synthetic minority oversampling technique (SMOTE) to expand the quantity of landslide samples for machine learning methods (i.e., support vector machine (SVM), logistic regression (LR), artificial neural network (ANN), and random forest (RF)) to produce high-quality landslide susceptibility maps for Lishui City in Zhejiang Province, China. Landslide-related factors were extracted from topographic maps, geological maps, and satellite images. Twelve factors were selected as independent variables using correlation coefficient analysis and the neighborhood rough set (NRS) method. In total, 288 soil landslides were mapped using field surveys, historical records, and satellite images. The landslides were randomly divided into two datasets: 70% of all landslides were selected as the original training dataset and 30% were used for validation. Then, SMOTE was employed to generate datasets with sizes ranging from two to thirty times that of the training dataset to establish and compare the four machine learning methods for landslide susceptibility mapping. In addition, we used slope units to subdivide the terrain to determine the landslide susceptibility. Finally, the landslide susceptibility maps were validated using statistical indexes and the area under the curve (AUC). The results indicated that the performances of the four machine learning methods showed different levels of improvement as the sample sizes increased. The RF model exhibited a more substantial improvement (AUC improved by 24.12%) than did the ANN (18.94%), SVM (17.77%), and LR (3.00%) models. Furthermore, the ANN model achieved the highest predictive ability (AUC = 0.98), followed by the RF (AUC = 0.96), SVM (AUC = 0.94), and LR (AUC = 0.79) models. This approach significantly improves the performance of machine learning techniques for landslide susceptibility mapping, thereby providing a better tool for reducing the impacts of landslide disasters.
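
The oversampling step described above can be sketched as follows. This is a simplified illustration of the basic SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbours), not the authors' implementation; the function name, the default `k=5`, the seed, and the toy feature vectors are assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    minority: list of equal-length numeric feature vectors.
    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority samples (Euclidean).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical landslide samples described by two scaled factors.
landslides = [[0.20, 0.80], [0.25, 0.75], [0.30, 0.90], [0.22, 0.85]]
new_samples = smote(landslides, n_synthetic=8, k=3, seed=42)
print(len(new_samples))
```

Calling `smote` with `n_synthetic` equal to a multiple of the training-set size mirrors the idea of growing the sample by a chosen factor. Synthetic points should be generated from the training portion only, so that the validation samples remain real observations, consistent with the 70/30 split described in the abstract.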


2019 ◽  
Vol 19 (25) ◽  
pp. 2301-2317 ◽  
Author(s):  
Ruirui Liang ◽  
Jiayang Xie ◽  
Chi Zhang ◽  
Mengying Zhang ◽  
Hai Huang ◽  
...  

In recent years, the successful implementation of the Human Genome Project has made people realize that genetic, environmental, and lifestyle factors should be studied together in cancer research, given the complexity and varied forms of the disease. The increasing availability and growth rate of ‘big data’ derived from various omics open a new window for the study and therapy of cancer. In this paper, we introduce the application of machine learning methods to cancer big data, including the use of artificial neural networks, support vector machines, ensemble learning, and naïve Bayes classifiers.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Jing Xu ◽  
Xiangdong Liu ◽  
Qiming Dai

Background: Hypertrophic cardiomyopathy (HCM) represents one of the most common inherited heart diseases. To identify key molecules involved in the development of HCM, gene expression patterns of heart tissue samples from HCM patients, obtained from multiple microarray and RNA-seq platforms, were investigated. Methods: The significant genes were obtained through the intersection of two gene sets, corresponding to the differentially expressed genes (DEGs) identified within the microarray data and within the RNA-seq data. Those genes were further ranked using the minimum-Redundancy Maximum-Relevance feature selection algorithm. Moreover, the genes were assessed by three different machine learning methods for classification: support vector machines, random forest, and k-Nearest Neighbor. Results: Outstanding results were achieved by taking exclusively the top eight genes of the ranking into consideration. Since the eight genes were identified as candidate HCM hallmark genes, the interactions between them and known HCM disease genes were explored through the protein–protein interaction (PPI) network. Most candidate HCM hallmark genes were found to have direct or indirect interactions with known HCM disease genes in the PPI network, particularly the hub genes JAK2 and GADD45A. Conclusions: This study highlights the value of transcriptomic data integration, in combination with machine learning methods, in providing insight into the key hallmark genes in the genetic etiology of HCM.


2021 ◽  
Vol 10 (4) ◽  
pp. 199
Author(s):  
Francisco M. Bellas Aláez ◽  
Jesus M. Torres Palenzuela ◽  
Evangelos Spyrakos ◽  
Luis González Vilas

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.

