Landslide susceptibility modelling based on decision tree classification model of machine learning: A case study of Kullu-Rohtang Pass transport corridor (Himachal Pradesh), India

Author(s):  
Nirbhav Sharma ◽  
Ram Babu Singh ◽  
Anand Malik ◽  
Maheshwar Sharma

Abstract Landslide hazards cause substantial destruction and losses in mountainous regions. To lessen the damage in these vulnerable areas, the key challenge is to predict landslide events with accuracy and precision. The principal objective of this study is to assess landslide susceptibility along the transport corridor from Kullu to Rohtang Pass in Himachal Pradesh, India. To achieve this objective, a detailed landslide inventory was prepared from imagery data and frequent field visits. A total of 197 landslides were considered, including 153 rock slides and 44 debris slides. Nine landslide factors were prepared initially, and their relationships with each other and with the type of landslide were analysed. The information gain ratio measure was then used to score the triggering factors so that unimportant factors could be eliminated. The train_test_split method was used to divide the dataset into training and testing groups. A decision tree classification model was applied for landslide susceptibility modelling (LSM). Performance was evaluated using a classification report and the receiver operating characteristic (ROC) curve. The results show that the decision tree classification model performed well and has good accuracy in forecasting landslide susceptibility in the study area.
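
The factor-screening step described in this abstract can be illustrated with a minimal sketch of the information gain ratio for categorical factors. The factor names and values below are hypothetical toy data, not taken from the study, and the study's own implementation is not given in the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Information gain of a categorical feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    # Split information is the entropy of the feature's own value distribution.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: illustrative values only, not from the study.
landslide_type = ["rock", "rock", "debris", "debris", "rock", "debris"]
factors = {
    "slope_class": ["steep", "steep", "gentle", "gentle", "steep", "gentle"],
    "aspect_class": ["north", "south", "north", "south", "north", "south"],
}
scores = {name: gain_ratio(values, landslide_type) for name, values in factors.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy case `slope_class` separates the two landslide types perfectly and ranks first, while `aspect_class` carries almost no information and would be a candidate for elimination.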

Author(s):  
Tsehay Admassu Assegie ◽  
Pramod Sekharan Nair

Handwritten digit recognition is an area of machine learning in which a machine is trained to identify handwritten digits. One method of achieving this is a decision tree classification model. Decision tree classification is a machine learning approach that uses the predefined labels of known data sets to predict the classes of future data sets whose class labels are unknown. In this paper we use the standard Kaggle digits dataset to recognize handwritten digits with a decision tree classification approach, and we evaluate the accuracy of the model for each digit from 0 to 9.
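
As a rough illustration of evaluating accuracy per digit, the sketch below computes, for each true digit, the share of samples classified correctly (per-class recall). This reading of "accuracy against each digit" is an assumption, and the labels and predictions are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of samples with that true label predicted correctly}."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in sorted(total)}


# Hypothetical true digits and model predictions, for illustration only.
y_true = [0, 0, 1, 1, 7, 7, 7, 9]
y_pred = [0, 0, 1, 7, 7, 7, 1, 9]
accuracy = per_class_accuracy(y_true, y_pred)
```

Reporting the result per digit exposes confusions (here between 1 and 7) that a single overall accuracy figure would hide.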


2012 ◽  
Vol 532-533 ◽  
pp. 1685-1690 ◽  
Author(s):  
Zhi Kang Luo ◽  
Huai Ying Sun ◽  
De Wang

This paper presents an improved SPRINT algorithm. The original SPRINT algorithm is a scalable and parallelizable decision tree algorithm that is popular in the data mining and machine learning communities. To improve its efficiency, we propose an improved algorithm. First, we compute the information gain ratio of each candidate splitting attribute and select the best splitting attribute. We then calculate the best splitting point for that attribute only. Because this avoids many split-point calculations for the other attributes, the improved algorithm can effectively reduce the computation.
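
The second stage, finding the best splitting point for a single numeric attribute that has already been chosen, might look like the sketch below. It assumes the Gini index as the split criterion, as in the original SPRINT; the abstract does not state which criterion the improved version uses. The scan is a plain quadratic loop rather than SPRINT's presorted attribute lists, and the data is hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split_point(values, labels):
    """Scan midpoints between sorted distinct values; return (threshold, weighted_gini)."""
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    best = (None, float("inf"))
    for i in range(1, total):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = len(left) / total * gini(left) + len(right) / total * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical attribute, assumed to be already selected by gain ratio.
ages = [23, 35, 41, 52, 60, 18]
classes = ["no", "no", "yes", "yes", "yes", "no"]
threshold, impurity = best_split_point(ages, classes)
```

Running this search for one attribute instead of all of them is where the saving described in the abstract would come from.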


Author(s):  
N. REN ◽  
M. ZARGHAM ◽  
S. RAHIMI

Stock selection rules are extensively used as guidelines for constructing high-performance stock portfolios. However, the predictive performance of rules developed by economic experts in the past has decreased dramatically in the current stock market. In this paper, the C4.5 decision tree classification method was adopted to construct a stock prediction model based on fundamental stock data, from which a set of stock selection rules was derived. The experimental results showed that the generated rules have exceptional predictive performance. They also demonstrated that the C4.5 decision tree classification model can work efficiently in the high-noise stock data domain.


Entropy ◽  
2019 ◽  
Vol 21 (2) ◽  
pp. 106 ◽  
Author(s):  
Qingfeng He ◽  
Zhihao Xu ◽  
Shaojun Li ◽  
Renwei Li ◽  
Shuai Zhang ◽  
...  

Landslides are a major geological hazard worldwide. Landslide susceptibility assessments are useful to mitigate human casualties, loss of property, and damage to natural resources, ecosystems, and infrastructures. This study aims to evaluate landslide susceptibility using a novel hybrid intelligence approach with the rotation forest-based credal decision tree (RF-CDT) classifier. First, 152 landslide locations and 15 landslide conditioning factors were collected from the study area. Then, these conditioning factors were assigned values using an entropy method and subsequently optimized using correlation attribute evaluation (CAE). Finally, the performance of the proposed hybrid model was validated using the receiver operating characteristic (ROC) curve and compared with two well-known ensemble models, bagging (bag-CDT) and MultiBoostAB (MB-CDT). Results show that the proposed RF-CDT model had better performance than the single CDT model and hybrid bag-CDT and MB-CDT models. The findings in the present study overall confirm that a combination of the meta model with a decision tree classifier could enhance the prediction power of the single landslide model. The resulting susceptibility maps could be effective for enforcement of land management regulations to reduce landslide hazards in the study area and other similar areas in the world.


2020 ◽  
Vol 21 (15) ◽  
pp. 5280
Author(s):  
Irini Furxhi ◽  
Finbarr Murphy

Non-testing approaches in nanoparticle hazard assessment are necessary to identify and classify potential risks in a cost-effective and timely manner. Machine learning techniques have been applied in the field of nanotoxicology with encouraging results. A neurotoxicity classification model for diverse nanoparticles is presented in this study. A data set compiled from multiple literature sources, consisting of nanoparticle physicochemical properties, exposure conditions, and in vitro characteristics, is used to predict cell viability. Pre-processing techniques were applied, including normalization methods and two supervised instance methods: a synthetic minority over-sampling technique to address biased predictions, and production of subsamples via bootstrapping. The classification model was developed using random forest, and goodness-of-fit together with additional robustness and predictability metrics was used to evaluate performance. Information gain analysis identified the exposure dose and duration, toxicological assay, cell type, and zeta potential as the five most important attributes for predicting neurotoxicity in vitro. This is the first tissue-specific machine learning tool for predicting neurotoxicity caused by nanoparticles in in vitro systems. The model performs better than non-tissue-specific models.
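
The synthetic minority over-sampling step can be sketched as interpolation between a minority-class sample and one of its nearest minority-class neighbours. This is a simplified illustration of the general technique, not the authors' pipeline. The feature vectors, the value of `k`, and the fixed seed are all assumptions.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic points by interpolating toward one of the k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # Rank the other minority points by squared Euclidean distance to the base point.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class feature vectors (e.g. scaled dose, scaled duration).
minority = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.15, 0.25)]
new_points = smote(minority, n_synthetic=6)
```

Because each synthetic point lies on a segment between two real minority samples, the added data stays inside the region the minority class already occupies. Bootstrapping, by contrast, would resample the existing rows without creating new ones.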


2020 ◽  
Author(s):  
Lanbing Yu ◽  
Yang Wang ◽  
Yujie Zhang

Landslide development laws vary between landslide-prone areas, so susceptibility models often perform differently in different regions. Due to the periodic regulation of the reservoir water level, a large number of landslides occur in the Three Gorges Reservoir area (TGRA). These landslides seriously threaten the safety of local residents and their property. It is crucial to find the model that generates the most accurate landslide susceptibility map in the TGRA. The main objective of this study was to explore which machine learning models are preferable for landslide susceptibility mapping in the TGRA.

The Wushan segment of the TGRA, located in the middle reaches of the TGRA in southwest China, was selected as a case study. First, 165 landslides were identified and 14 landslide causal factors were constructed from different data sources: altitude, slope, aspect, curvature, plan curvature, profile curvature, stream power index, topographic wetness index (TWI), terrain roughness index, lithology, bedding structure, distance to faults, distance to rivers, and distance to gully. Subsequently, multicollinearity analysis and the information gain ratio model were applied to select landslide causal factors. After removing five factors (altitude, TWI, profile curvature, plan curvature, curvature), landslide susceptibility was mapped using the results of four models: support vector machines (SVM), artificial neural networks, classification and regression tree, and logistic regression. Finally, the accuracy of the four models was evaluated and compared using accuracy statistics and the receiver operating characteristic (ROC). The accuracy analysis showed that the SVM model performed best. In addition, reported SVM performance for susceptibility modelling in other areas was collected. In these regions, the accuracy of SVM was always larger than 0.8. SVM thus performed acceptably in different regions, and it can be recommended as a model for the TGRA and other landslide-prone regions.

In this study area, 62% of the landslides were within 300 m of the Yangtze River, and distance to rivers was the most important factor. The impoundment of the TGRA affected landslide development in three ways: (1) long-term immersion in reservoir water gradually reduces the strength of rock (soil) in the saturated zone (mostly near the Yangtze River), reducing the resistance force of the landslide; (2) the strong dynamic action of water enhances lateral erosion of the bank slope, changing the slope shape and thus reducing slope stability; (3) the periodic fluctuation of the reservoir water changes the self-weight and the static and dynamic water pressure of the landslide, which could increase the resistance force or reduce the sliding force of the landslide and even cause overall instability and damage. Hence, to reduce the losses caused by landslides in the TGRA, more attention should be paid to the early warning of reservoir bank landslides.
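
Comparing models by ROC, as described here, usually comes down to the area under the curve (AUC). A minimal sketch follows, using the rank interpretation of AUC: the probability that a randomly chosen landslide cell scores higher than a randomly chosen non-landslide cell. The labels and scores are hypothetical and the two models are placeholders, not the study's SVM or other models.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count half)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores from two models on the same test cells.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
model_b = [0.6, 0.4, 0.5, 0.7, 0.5, 0.1]
auc_a = roc_auc(y_true, model_a)
auc_b = roc_auc(y_true, model_b)
```

An AUC of 0.5 corresponds to random ranking, so in this toy comparison only the first model would clear a threshold such as 0.8.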


2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Teguh Budi Santoso ◽  
Dela Sekardiana

Credit approval at KOPERIA (Koperasi Warga Komplek Gandaria) is currently still based on an objective process. Cooperative managers often have difficulty determining whether credit should be granted, and as a result the cooperative faces defaults on customers' credit installment payments. This study aims to build a decision tree classification model to determine customers' creditworthiness. The study applies the C4.5 algorithm to the following attributes: income, divided into two categories (> 5 million and 3-5 million); balance, divided into three (> 3 million, 1-3 million, and < 1 million); loan amount, divided into three (1-4 months, 5-8 months, and 9-12 months); and requirements, with the values business capital, buying goods, and others. After the appropriate root nodes were determined, classification with the C4.5 algorithm achieved an accuracy of 97.5%, which indicates that the C4.5 algorithm is suitable for determining the feasibility of lending to KOPERIA customers.

Keywords: Data Mining, C4.5 Algorithm, loan feasibility


Bariatrics is the branch of science that deals with obesity and its related surgical procedures. A person's physical inactivity, unhealthy food habits, and genetic constitution are the root causes of health problems. Many investigations have addressed the diverse and persistent health concerns caused by obesity. Careful analysis of body fat percentage has become a basic routine for every individual. Previous work on body fat percentage relied on Body Mass Index (BMI) together with the age and gender of a person. BMI does not computationally account for an individual's anatomical makeup, that is, the fat constitution and the muscle tissue composition. Thus, a BMI-based formula loses accuracy for a person with more muscle mass than fat mass and misstates that person's fat percentage. The proposed novel formula is analyzed with a cross-validated decision tree classification model and is implemented using information gain. This underlines the coherence, efficacy, and accuracy of the derived body fat percentage. Ethical approval for this study was obtained from the Institutional Ethics Committee, Madras Medical College, Chennai. The empirical study was simulated in Matlab, and the results were obtained in GUI mode.

