Data Mining and Statistical Approaches in Debris-Flow Susceptibility Modelling Using Airborne LiDAR Data

Cameron Highlands is a popular tourist hub in the mountainous area of Peninsular Malaysia. Most communities in this area suffer frequent debris flows, especially during monsoon seasons. Despite the loss of lives and property recorded annually from debris flows, most studies in the region concentrate on landslide and flood susceptibility. In this study, debris-flow susceptibility prediction was carried out using two data mining techniques: Multivariate Adaptive Regression Splines (MARS) and Support Vector Regression (SVR). The existing inventory of debris-flow events (640 points) was split into training (70%, 448 points) and validation (30%, 192 points) sets. Twelve conditioning factors, namely elevation, plan curvature, slope angle, total curvature, slope aspect, Stream Transport Index (STI), profile curvature, roughness index, Stream Catchment Area (SCA), Stream Power Index (SPI), Topographic Wetness Index (TWI) and Topographic Position Index (TPI), were selected from Light Detection and Ranging (LiDAR)-derived Digital Elevation Model (DEM) data. Multi-collinearity was checked using Information Factor, Cramer’s V, and Gini Index to identify the relative importance of conditioning factors. The susceptibility maps were produced and categorized into five classes: not susceptible, low, moderate, high and very high. Model performance was evaluated using success and prediction rates, where the area under the curve (AUC) showed a higher performance of MARS (93% and 83%) over SVR (76% and 72%). The results of this study will be important for contingency hazard and risk management plans to reduce the loss of lives and property in the area.
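
The 70/30 partition of the 640 inventory points described above can be illustrated with a minimal sketch. The abstract does not say how the points were assigned, so the random shuffle, the seed, and the names `split_inventory`, `train` and `valid` are assumptions made for illustration only.

```python
import random

def split_inventory(points, train_percent=70, seed=42):
    """Shuffle inventory points and split them into training and validation lists."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split size.
    n_train = len(shuffled) * train_percent // 100
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical inventory: 640 integer IDs standing in for debris-flow locations.
inventory = list(range(640))
train, valid = split_inventory(inventory)
print(len(train), len(valid))
```

With 640 points and a 70% training share, the split sizes match the 448 and 192 points reported in the abstract.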

A COMPARISON BETWEEN THREE CONDITIONING FACTORS DATASET FOR LANDSLIDE PREDICTION IN THE SAJADROOD CATCHMENT OF IRAN

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-v-3-2020-625-2020 ◽

2020 ◽

Vol V-3-2020 ◽

pp. 625-632

Author(s):

B. Kalantar ◽

N. Ueda ◽

H. A. H. Al-Najjar ◽

V. Saeidi ◽

M. B. A. Gibril ◽

...

Keyword(s):

Land Use ◽

Total Curvature ◽

The Other ◽

Slope Aspect ◽

Support Vector ◽

Topographic Wetness Index ◽

Mazandaran Province ◽

Landslide Prediction ◽

Conditioning Factors ◽

Other Hand

Abstract. This study investigates the effectiveness of three datasets for the prediction of landslides in the Sajadrood catchment (Babol County, Mazandaran Province, Iran). The three datasets (D1, D2 and D3) are constructed based on fourteen conditioning factors (CFs) obtained from Digital Elevation Model (DEM) derivatives, topography maps, land use maps and geological maps. Specifically, D1 consists of all 14 CFs, namely altitude, slope, aspect, topographic wetness index (TWI), terrain roughness index (TRI), distance to fault, distance to stream, distance to road, total curvature, profile curvature, plan curvature, land use, stream power index (SPI) and geology. D2, on the other hand, is a subset of D1 consisting of eight CFs. This reduction was achieved by applying the Variance Inflation Factor, Gini Importance Indices and Chi-Square factor optimization methods. Dataset D3 includes only selected factors derived from the DEM. Three supervised classification algorithms were trained for landslide prediction, namely the Support Vector Machine (SVM), Logistic Regression (LR), and Artificial Neural Network (ANN). Experimental results indicate that D2 performed the best for landslide prediction, with the SVM producing the best overall accuracy at 82.81%, followed by LR (81.71%) and ANN (80.18%). Extensive investigations on the results of factor optimization analysis indicate that the CFs distance to road, altitude, and geology were significant contributors to the prediction results. Land use map, slope, total, plan, and profile curvature and TRI, on the other hand, were deemed redundant. The analysis also revealed that sole reliance on Gini Indices could lead to inefficient optimization.
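
As a rough illustration of the Variance Inflation Factor mentioned above, the sketch below computes VIFs as the diagonal of the inverse correlation matrix, which is one standard way to obtain them. It is not the authors' implementation; the synthetic factor values and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        m, s = mean(col), pstdev(col)
        standardized.append([(v - m) / s for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / n for cj in standardized]
            for ci in standardized]

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def variance_inflation_factors(columns):
    """VIF of each factor, read off the diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]

# Synthetic, hypothetical factors: roughness is built to track slope closely,
# while altitude is generated independently.
rng = random.Random(0)
slope = [rng.gauss(0, 1) for _ in range(300)]
roughness = [s + 0.1 * rng.gauss(0, 1) for s in slope]
altitude = [rng.gauss(0, 1) for _ in range(300)]
vifs = variance_inflation_factors([slope, roughness, altitude])
print([round(v, 2) for v in vifs])
```

In this toy setup the two nearly collinear factors receive large VIFs while the independent one stays close to 1, which is the kind of signal a factor-reduction step could act on.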

A New Hybrid Firefly–PSO Optimized Random Subspace Tree Intelligence for Torrential Rainfall-Induced Flash Flood Susceptible Mapping

Remote Sensing ◽

10.3390/rs12172688 ◽

2020 ◽

Vol 12 (17) ◽

pp. 2688 ◽

Cited By ~ 2

Author(s):

Viet-Ha Nhu ◽

Phuong-Thao Thi Ngo ◽

Tien Dat Pham ◽

Jie Dou ◽

Xuan Song ◽

...

Keyword(s):

Soil Type ◽

Flash Flood ◽

Predictive Performance ◽

Slope Aspect ◽

Support Vector ◽

Landsat 8 ◽

Mountainous Area ◽

Random Subspace ◽

Topographic Wetness Index ◽

Inventory Map

Flash floods are among the most dangerous natural phenomena because of their high magnitude and sudden occurrence, resulting in severe damage to people and property. Our work aims to propose a state-of-the-art model for flash flood susceptibility mapping using the decision tree random subspace ensemble optimized by hybrid firefly–particle swarm optimization (HFPS), namely the HFPS-RSTree model. In this work, we used data from a flood inventory map consisting of 1866 polygons derived from Sentinel-1 C-band synthetic aperture radar (SAR) data and a field survey conducted in the northwest mountainous area of the Van Ban district, Lao Cai Province in Vietnam. A total of eleven flooding conditioning factors (soil type, geology, rainfall, river density, elevation, slope, aspect, topographic wetness index (TWI), normalized difference vegetation index (NDVI), plan curvature, and profile curvature) were used as explanatory variables. These indicators were compiled from a geological and mineral resources map, a soil type map, a topographic map, the ALOS PALSAR DEM 30 m, and Landsat-8 imagery. The HFPS-RSTree model was trained and verified using the inventory map and the eleven conditioning variables and then compared with four machine learning algorithms, i.e., the support vector machine (SVM), the random forests (RF), the C4.5 decision trees (C4.5 DT), and the logistic model trees (LMT) models. We employed a range of standard statistical metrics to assess the predictive performance of the proposed model. The results show that the HFPS-RSTree model had the best predictive performance among the benchmarks for flash flood prediction, reaching an overall accuracy of over 90%. It can be concluded that the proposed approach provides new insights into flash flood prediction in mountainous regions.

A Semi-Automated Object-Based Gully Networks Detection Using Different Machine Learning Models: A Case Study of Bowen Catchment, Queensland, Australia

Sensors ◽

10.3390/s19224893 ◽

2019 ◽

Vol 19 (22) ◽

pp. 4893 ◽

Cited By ~ 22

Author(s):

Hejar Shahabi ◽

Ben Jarihani ◽

Sepideh Tavakkoli Piralilou ◽

David Chittleborough ◽

Mohammadtaghi Avand ◽

...

Keyword(s):

Machine Learning ◽

Image Segmentation ◽

Vegetation Index ◽

Scale Parameter ◽

Slope Aspect ◽

Support Vector ◽

Topographic Wetness Index ◽

Slope Length ◽

Optimal Scale ◽

Object Based

Gully erosion is a dominant source of sediment and particulates to the Great Barrier Reef (GBR) World Heritage area. We selected the Bowen catchment, a tributary of the Burdekin Basin, as our area of study; the region is associated with a high density of gully networks. We aimed to use a semi-automated object-based gully networks detection process using a combination of multi-source and multi-scale remote sensing and ground-based data. An advanced approach was employed by integrating geographic object-based image analysis (GEOBIA) with current machine learning (ML) models. These included artificial neural networks (ANN), support vector machines (SVM), and random forests (RF), and an ensemble ML model of stacking to deal with the spatial scaling problem in gully networks detection. Spectral indices such as the normalized difference vegetation index (NDVI) and topographic conditioning factors, such as elevation, slope, aspect, topographic wetness index (TWI), slope length (SL), and curvature, were generated from Sentinel 2A images and the ALOS 12-m digital elevation model (DEM), respectively. For image segmentation, the ESP2 tool was used to obtain three optimal scale factors. The accuracy of each scale in image segmentation was evaluated using the object pureness index (OPI), object matching index (OMI), and object fitness index (OFI). The scale parameter of 45, with an OFI of 0.94 (OFI being a combination of the OPI and OMI indices), proved to be the optimal scale parameter for image segmentation. Furthermore, segmented objects based on scale 45 were overlaid with 70% and 30% of a prepared gully inventory map to select the ML models’ training and testing objects, respectively. Precision, Recall, and the F1 measure were used to quantitatively evaluate the models’ performance. Integration of GEOBIA with the stacking model using a scale of 45 resulted in the highest accuracy in detection of gully networks, with an F1 measure value of 0.89. We conclude that the adoption of optimal scale object definition in the GEOBIA and application of the ensemble stacking of ML models resulted in higher accuracy in the detection of gully networks.
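
The Precision, Recall and F1 measures named above follow their usual definitions from counts of true positives, false positives and false negatives. The sketch below applies them to a small, made-up set of object labels (1 = gully, 0 = non-gully); the data and the function name `precision_recall_f1` are hypothetical and not taken from the study.

```python
def precision_recall_f1(predicted, actual):
    """Compute precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # correctly detected
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # false alarms
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # missed positives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical labels for eight segmented objects.
predicted = [1, 1, 1, 0, 0, 1, 0, 1]
actual = [1, 1, 0, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

F1 is the harmonic mean of precision and recall, so it only approaches 1 when both false alarms and missed detections are rare.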

Spatial Modelling of Gully Erosion Using GIS and R Programing: A Comparison among Three Data Mining Algorithms

Applied Sciences ◽

10.3390/app8081369 ◽

2018 ◽

Vol 8 (8) ◽

pp. 1369 ◽

Cited By ~ 52

Author(s):

Alireza Arabameri ◽

Biswajeet Pradhan ◽

Hamid Reza Pourghasemi ◽

Khalil Rezaei ◽

Norman Kerle

Keyword(s):

Data Mining ◽

Spatial Relationship ◽

Area Under The Curve ◽

Regression Tree ◽

Drainage Density ◽

Gully Erosion ◽

Slope Aspect ◽

Topographic Wetness Index ◽

Boosted Regression Tree ◽

Area Index

Gully erosion triggers land degradation and restricts the use of land. This study assesses the spatial relationship between gully erosion (GE) and geo-environmental variables (GEVs) using Weights-of-Evidence (WoE) Bayes theory, and then applies three data mining methods, namely Random Forest (RF), boosted regression tree (BRT), and multivariate adaptive regression spline (MARS), for gully erosion susceptibility mapping (GESM) in the Shahroud watershed, Iran. Gully locations were identified by extensive field surveys, and a total of 172 GE locations were mapped. Twelve gully-related GEVs (elevation, slope degree, slope aspect, plan curvature, convergence index, topographic wetness index (TWI), lithology, land use/land cover (LU/LC), distance from rivers, distance from roads, drainage density, and NDVI) were selected to model GE. The variable importance results from the RF and BRT models indicated that distance from road, elevation, and lithology had the highest effect on GE occurrence. The area under the curve (AUC) and seed cell area index (SCAI) methods were used to validate the three GE maps. The results showed that AUC for the three models varies from 0.911 to 0.927, whereas the RF model had a prediction accuracy of 0.927 as per SCAI values, when compared to the other models. The findings will be of help for planning and developing the studied region.
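
For readers unfamiliar with the AUC used for validation above, the following sketch computes it through its rank interpretation: the probability that a randomly chosen positive location receives a higher susceptibility score than a randomly chosen negative one, with ties counted as one half. The scores, labels and the function name `auc` are hypothetical; the study's own validation workflow is not described in enough detail to reproduce.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))

# Hypothetical susceptibility scores for six locations (1 = gully present).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]
area = auc(scores, labels)
print(round(area, 3))
```

The pairwise form is quadratic in the number of samples, which is fine for a small illustration; sorting-based formulations are the usual choice for large rasters.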

Comparison of Different Machine Learning Methods for Debris Flow Susceptibility Mapping: A Case Study in the Sichuan Province, China

Remote Sensing ◽

10.3390/rs12020295 ◽

2020 ◽

Vol 12 (2) ◽

pp. 295 ◽

Cited By ~ 6

Author(s):

Ke Xiong ◽

Basanta Raj Adhikari ◽

Constantine A. Stamatopoulos ◽

Yu Zhan ◽

Shaolin Wu ◽

...

Keyword(s):

Machine Learning ◽

Debris Flow ◽

Sichuan Province ◽

Susceptibility Mapping ◽

Boosted Regression Trees ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Susceptibility Maps ◽

Debris Flow Susceptibility

Debris flow susceptibility mapping is considered to be useful for hazard prevention and mitigation. Sichuan Province, China, is an area of frequent debris flows, where many hazardous events occur annually and cause extensive damage. Therefore, this study attempted to evaluate and compare the performance of four state-of-the-art machine-learning methods, namely Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Boosted Regression Trees (BRT), for debris flow susceptibility mapping in this region. Four models were constructed based on the debris flow inventory and a range of causal factors. A variety of datasets was obtained through the combined application of remote sensing (RS) and geographic information system (GIS). The mean altitude, altitude difference, aridity index, and groove gradient played the most important role in the assessment. The performance of these models was evaluated using predictive accuracy (ACC) and the area under the receiver operating characteristic curve (AUC). The results of this study showed that all four models were capable of producing accurate and robust debris flow susceptibility maps (ACC and AUC values were well above 0.75 and 0.80, respectively). With an excellent spatial prediction capability and strong robustness, the BRT model (ACC = 0.781, AUC = 0.852) outperformed the other models and was the ideal choice. Our results also showed the importance of selecting suitable mapping units and optimal predictors. Furthermore, the debris flow susceptibility maps of the Sichuan Province were produced, which can provide helpful data for assessing and mitigating debris flow hazards.

Harris Hawks Optimization: A Novel Swarm Intelligence Technique for Spatial Assessment of Landslide Susceptibility

Sensors ◽

10.3390/s19163590 ◽

2019 ◽

Vol 19 (16) ◽

pp. 3590 ◽

Cited By ~ 38

Author(s):

Bui ◽

Moayedi ◽

Kalantar ◽

Osouli ◽

Gör ◽

...

Keyword(s):

Landslide Susceptibility ◽

Characteristic Curve ◽

Absolute Error ◽

Slope Aspect ◽

Stream Power ◽

Topographic Wetness Index ◽

Conditioning Factors ◽

Performance Error ◽

Stream Power Index ◽

Artificial Neural Network Ann

In this research, the novel metaheuristic algorithm Harris hawks optimization (HHO) is applied to landslide susceptibility analysis in Western Iran. To this end, the HHO is synthesized with an artificial neural network (ANN) to optimize its performance. A spatial database comprising 208 historical landslides, as well as 14 landslide conditioning factors (elevation, slope aspect, plan curvature, profile curvature, soil type, lithology, distance to the river, distance to the road, distance to the fault, land cover, slope degree, stream power index (SPI), topographic wetness index (TWI), and rainfall), is prepared to develop the ANN and HHO–ANN predictive tools. Mean square error and mean absolute error criteria are defined to measure the performance error of the models, and the area under the receiver operating characteristic curve (AUROC) is used to evaluate the accuracy of the generated susceptibility maps. The findings showed that the HHO algorithm effectively improved the performance of ANN in both recognizing (AUROC_ANN = 0.731 and AUROC_HHO–ANN = 0.777) and predicting (AUROC_ANN = 0.720 and AUROC_HHO–ANN = 0.773) the landslide pattern.
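
The two error criteria named above, mean square error and mean absolute error, can be written out in a few lines. This is a generic sketch of the standard definitions; the target and output values are invented for illustration and do not come from the study.

```python
def mean_squared_error(targets, outputs):
    """Average of squared differences between targets and model outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def mean_absolute_error(targets, outputs):
    """Average of absolute differences between targets and model outputs."""
    return sum(abs(t - o) for t, o in zip(targets, outputs)) / len(targets)

# Hypothetical landslide labels (1 = landslide, 0 = none) and network outputs.
targets = [1, 0, 1, 0]
outputs = [0.8, 0.2, 0.6, 0.4]
mse = mean_squared_error(targets, outputs)
mae = mean_absolute_error(targets, outputs)
print(round(mse, 3), round(mae, 3))
```

Because the errors are squared, MSE penalizes a few large misses more heavily than MAE does, which is one reason the two are often reported together.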

Debris-flow susceptibility analysis using fluvio-morphological parameters and data mining: application to the Central-Eastern Pyrenees

Natural Hazards ◽

10.1007/s11069-013-0568-3 ◽

2013 ◽

Vol 67 (2) ◽

pp. 213-238 ◽

Cited By ~ 17

Author(s):

G. G. Chevalier ◽

V. Medina ◽

M. Hürlimann ◽

A. Bateman

Keyword(s):

Data Mining ◽

Debris Flow ◽

Morphological Parameters ◽

Susceptibility Analysis ◽

Eastern Pyrenees ◽

Central Eastern ◽

Data Mining Application ◽

Debris Flow Susceptibility

Landslide Susceptibility Mapping using Bivariate Statistical Models and GIS in Chattagram District, Bangladesh

10.21203/rs.3.rs-531858/v1 ◽

2021 ◽

Author(s):

Md. Sharafat Chowdhury ◽

Bibi Hafsa

Keyword(s):

Statistical Models ◽

Landslide Susceptibility ◽

Slope Angle ◽

Information Value ◽

Weight Of Evidence ◽

Landslide Susceptibility Mapping ◽

Slope Aspect ◽

Susceptibility Map ◽

Topographic Wetness Index ◽

Conditioning Factors

Abstract This study attempts to produce a Landslide Susceptibility Map (LSM) for Chattagram District of Bangladesh by using five GIS-based bivariate statistical models, namely the Frequency Ratio (FR), Shannon’s Entropy (SE), Weight of Evidence (WofE), Information Value (IV) and Certainty Factor (CF). A secondary landslide inventory database was used to correlate the previous landslides with the landslide conditioning factors. Sixteen landslide conditioning factors were used: Slope Aspect, Slope Angle, Geology, Elevation, Plan Curvature, Profile Curvature, General Curvature, Topographic Wetness Index, Stream Power Index, Sediment Transport Index, Topographic Roughness Index, Distance to Stream, Distance to Anticline, Distance to Fault, Distance to Road and NDVI. The Area Under Curve (AUC) was used for validation of the LSMs. The prediction-rate AUC values for FR, SE, WofE, IV and CF were 76.11%, 70.11%, 78.93%, 76.57% and 80.43%, respectively. The CF model indicates that 15.04% of the area is highly susceptible to landslides. All the models showed that highly elevated areas are more susceptible to landslides, whereas the low-lying river basin areas have a low probability of landslide occurrence. The findings of this research will contribute to land use planning, management and hazard mitigation of the CHT region.
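
Of the five bivariate models listed above, the Frequency Ratio is the simplest to illustrate. In its common form, the FR of a factor class is the share of landslide pixels falling in that class divided by the share of all pixels in that class. The sketch below assumes that definition; the slope classes and pixel counts are invented, and `frequency_ratio` is a hypothetical helper name.

```python
def frequency_ratio(class_pixels, landslide_pixels):
    """Frequency Ratio per class: (landslide share) / (area share)."""
    total_pixels = sum(class_pixels.values())
    total_landslides = sum(landslide_pixels.values())
    ratios = {}
    for name, pixels in class_pixels.items():
        landslide_share = landslide_pixels.get(name, 0) / total_landslides
        area_share = pixels / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios

# Hypothetical slope-angle classes with made-up pixel counts.
class_pixels = {"0-15 deg": 5000, "15-30 deg": 3000, ">30 deg": 2000}
landslide_pixels = {"0-15 deg": 10, "15-30 deg": 30, ">30 deg": 60}
fr = frequency_ratio(class_pixels, landslide_pixels)
for name, value in fr.items():
    print(name, round(value, 2))
```

A ratio above 1 means landslides are over-represented in that class relative to its area, and a ratio below 1 means they are under-represented.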

Flood Susceptibility Prediction Using Hybrid-Based Approaches of Support Vector Regression Model and Meta-Heuristic Algorithms

10.21203/rs.3.rs-876711/v1 ◽

2021 ◽

Author(s):

Hossein Hamedi Sorajar ◽

Ali Asghar Alesheikh ◽

Mahdi Panahi ◽

Saro Lee

Keyword(s):

Land Use ◽

Critical Role ◽

Influential Factor ◽

Slope Aspect ◽

Support Vector ◽

Human Beings ◽

Operating Characteristics ◽

Topographic Wetness Index ◽

Support Vector Regression Model ◽

Testing Dataset

Abstract Landslides are one of the most destructive natural phenomena in the world. They occur mostly in mountainous areas, cause damage to the economic sectors, agricultural lands, residential areas and infrastructure of any country, and threaten human lives and property. Therefore, landslide susceptibility mapping (LSM) can play a critical role in identifying prone areas and reducing the damage caused by landslides in each area. In the present study, deep learning algorithms including convolutional neural network (CNN) and long short-term memory (LSTM) were used to identify landslide-prone areas in Ardabil province, Iran. A total of 312 landslide locations were identified and randomly divided into training and testing datasets at a 70:30 ratio. Then, according to previous studies and environmental conditions in the study area, twelve factors affecting the occurrence of landslides were selected, namely altitude, slope angle, slope aspect, topographic wetness index (TWI), profile curvature, plan curvature, land use, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. The relative importance of each influential factor in landslide occurrence was obtained through the information gain ranking filter (IGRF) method, and it was found that land use and profile curvature had the highest and lowest impacts, respectively. Afterwards, LSMs were generated using the CNN and LSTM algorithms. In the next step, the performance of the models was evaluated based on the area under the curve (AUC) of the receiver operating characteristic curve and the root mean square error (RMSE). The AUC values for the CNN and LSTM models were 0.821 and 0.832, respectively. Furthermore, the RMSE values of the CNN model for the training and testing datasets were 0.121 and 0.132, respectively. The RMSE values of the LSTM model for the training and testing datasets were 0.185 and 0.188, respectively. Therefore, it can be concluded that CNN performance is slightly better than LSTM; but in general, both models have close performance and the accuracy of both models is acceptable.

Debris flow susceptibility mapping using a qualitative heuristic method and Flow-R along the Yukon Alaska Highway Corridor, Canada

Natural Hazards and Earth System Science ◽

10.5194/nhess-16-449-2016 ◽

2016 ◽

Vol 16 (2) ◽

pp. 449-462 ◽

Cited By ~ 11

Author(s):

A. Blais-Stevens ◽

P. Behnia

Keyword(s):

Debris Flow ◽

Slope Angle ◽

Heuristic Method ◽

Northern Region ◽

Slope Aspect ◽

Research Activity ◽

Susceptibility Model ◽

Alaska Highway Corridor ◽

Kluane Lake ◽

Debris Flow Susceptibility

Abstract. This research activity aimed at reducing risk to infrastructure, such as a proposed pipeline route roughly parallel to the Yukon Alaska Highway Corridor (YAHC), by filling geoscience knowledge gaps in geohazards. Hence, the Geological Survey of Canada compiled an inventory of landslides including debris flow deposits, which were subsequently used to validate two different debris flow susceptibility models. A qualitative heuristic debris flow susceptibility model was produced for the northern region of the YAHC, from Kluane Lake to the Alaska border, by integrating data layers with assigned weights and class ratings. These were slope angle, slope aspect, surficial geology, plan curvature, and proximity to drainage system. Validation of the model was carried out by calculating a success rate curve, which revealed a good correlation between the susceptibility model and the debris flow deposit inventory compiled from air photos, high-resolution satellite imagery, and field verification. In addition, the quantitative Flow-R method was tested in order to define the potential source and debris flow susceptibility for the southern region of Kluane Lake, an area where documented debris flow events have blocked the highway in the past (e.g. 1988). Trial and error calculations were required for this method because there was no detailed information on the debris flows for the YAHC to allow us to define threshold values for some parameters when calculating source areas, spreading, and runout distance. Nevertheless, correlation with known documented events helped define these parameters and produce a map that captures most of the known events and displays debris flow susceptibility in other, usually smaller, steep channels that had not been previously documented.
