logistic model tree
Recently Published Documents





2021 ◽  
pp. 154-169
Abdullateef O. Balogun ◽  
Noah O. Akande ◽  
Fatimah E. Usman-Hamza ◽  
Victor E. Adeyemo ◽  
Modinat A. Mabayoje ◽  

2020 ◽  
Vol 5 (9) ◽  
pp. 1097-1101
Folake Akinbohun ◽  
Ambrose Akinbohun ◽  
Adekunle Daniel ◽  
Oghenerukevwe Elohor Ojajuni

Head and neck cancers (HNC) are indicated when cells grow abnormally.  The incidence of HNC is on the increase owing to several factors. There is often late presentation that can result in loss of lives (mortality) especially in Africa due to paucity of specialists. These challenges prompted the development of a stacked ensemble model for diagnosis of HNC to facilitate prompt referral.  The data were collected which consists of 1473 instances with 18 features.   Information Gain was used for selecting important features and three supervised learning algorithms were deployed for the base learners: Decision Tree (C4.5), K-Nearest Neighbors and Naïve Bayes. The predictions of the base learners were combined and passed to meta learners: Logistic Model Tree (LMT). The result showed that Information Gain method with stacked LMTwas 95.11%. It was deduced that both Information Gain with stacked MLR produced higher accuracy that the base learners’ results. Hence, this stacked model can be used for diagnosis of HNC in healthcare systems.

2020 ◽  
Vol 12 (17) ◽  
pp. 2742
Ehsan Kamali Maskooni ◽  
Seyed Amir Naghibi ◽  
Hossein Hashemi ◽  
Ronny Berndtsson

Groundwater (GW) is being uncontrollably exploited in various parts of the world resulting from huge needs for water supply as an outcome of population growth and industrialization. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study seeks to use remote sensing (RS)-derived driving factors as an input of the advanced machine learning algorithms (MLAs), comprising deep boosting and logistic model trees to evaluate their efficiency. To do so, their results are compared with three benchmark MLAs such as boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we firstly assembled different topographical, hydrological, RS-based, and lithological driving factors such as altitude, slope degree, aspect, slope length, plan curvature, profile curvature, relative slope position, distance from rivers, river density, topographic wetness index, land use/land cover (LULC), normalized difference vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring indicator was divided into two classes for training (434 springs) and validation (186 springs) with a proportion of 70:30. The training dataset of the springs accompanied by the driving factors were incorporated into the MLAs and the outputs were validated by different indices such as accuracy, kappa, receiver operating characteristics (ROC) curve, specificity, and sensitivity. Based upon the area under the ROC curve, the logistic model tree (87.813%) generated similar performance to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%) MLAs. The findings confirm the great performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Thus, their application can be suggested for other areas to obtain an insight about GW-related barriers toward sustainability. Further, the outcome based on the logistic model tree algorithm depicts the high impact of the RS-based factor, such as NDVI with 100 relative influence, as well as high influence of the distance from river, altitude, and RSP variables with 46.07, 43.47, and 37.20 relative influence, respectively, on GW potential.

Forests ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 830 ◽  
Viet-Ha Nhu ◽  
Ayub Mohammadi ◽  
Himan Shahabi ◽  
Baharin Bin Ahmad ◽  
Nadhir Al-Ansari ◽  

We used remote sensing techniques and machine learning to detect and map landslides, and landslide susceptibility in the Cameron Highlands, Malaysia. We located 152 landslides using a combination of interferometry synthetic aperture radar (InSAR), Google Earth (GE), and field surveys. Of the total slide locations, 80% (122 landslides) were utilized for training the selected algorithms, and the remaining 20% (30 landslides) were applied for validation purposes. We employed 17 conditioning factors, including slope angle, aspect, elevation, curvature, profile curvature, stream power index (SPI), topographic wetness index (TWI), lithology, soil type, land cover, normalized difference vegetation index (NDVI), distance to river, distance to fault, distance to road, river density, fault density, and road density, which were produced from satellite imageries, geological map, soil maps, and a digital elevation model (DEM). We used these factors to produce landslide susceptibility maps using logistic regression (LR), logistic model tree (LMT), and random forest (RF) models. To assess prediction accuracy of the models we employed the following statistical measures: negative predictive value (NPV), sensitivity, positive predictive value (PPV), specificity, root-mean-squared error (RMSE), accuracy, and area under the receiver operating characteristic (ROC) curve (AUC). Our results indicated that the AUC was 92%, 90%, and 88% for the LMT, LR, and RF algorithms, respectively. To assess model performance, we also applied non-parametric statistical tests of Friedman and Wilcoxon, where the results revealed that there were no practical differences among the used models in the study area. While landslide mapping in tropical environment such as Cameron Highlands remains difficult, the remote sensing (RS) along with machine learning techniques, such as the LMT model, show promise for landslide susceptibility mapping in the study area.

2020 ◽  
Vol 12 (14) ◽  
pp. 2180 ◽  
Xia Zhao ◽  
Wei Chen

This paper focuses on landslide susceptibility prediction in Nanchuan, a high-risk landslide disaster area. The evidential belief function (EBF)-based function tree (FT), logistic regression (LR), and logistic model tree (LMT) were applied to Nanchuan District, China. Firstly, an inventory with 298 landslides was compiled and separated into two parts (70%: 209; 30%: 89) as training and validation datasets. Then, based on the EBF method, the Bel values of 16 conditioning factors related to landslide occurrence were calculated, and these Bel values were used as input data for building other models. The receiver operating characteristic (ROC) curve and the values of the area under the ROC curve (AUC) were used to evaluate and compare the prediction ability of the four models. All the models achieved good results and performed well. In particular, the LMT model had the best performance (0.847 and 0.765, obtained from the training and validation datasets, respectively). This paper also demonstrates the superiority of integration and optimization of models in landslide susceptibility evaluation. Finally, the best classification method was selected to draw landslide susceptibility maps, which may be helpful for government administrators and engineers to carry out land design and planning.

Talha Mahboob Alam ◽  
Kamran Shaukat ◽  
Mubbashar Mushtaq ◽  
Yasir Ali ◽  
Matloob Khushi ◽  

Abstract The area of corporate bankruptcy prediction attains high economic importance, as it affects many stakeholders. The prediction of corporate bankruptcy has been extensively studied in economics, accounting and decision sciences over the past two decades. The corporate bankruptcy prediction has been a matter of talk among academic literature and professional researchers throughout the world. Different traditional approaches were suggested based on hypothesis testing and statistical modeling. Therefore, the primary purpose of the research is to come up with a model that can estimate the probability of corporate bankruptcy by evaluating its occurrence of failure using different machine learning models. As the dataset was not well prepared and contains missing values, various data mining and data pre-processing techniques were utilized for data preparation. Within this research, the task of resolving the issues induced by the imbalance between the two classes is approached by applying different data balancing techniques. We address the problem of imbalanced data with the random undersampling and Synthetic Minority Over Sampling Technique (SMOTE). We used five machine learning models (support vector machine, J48 decision tree, Logistic model tree, random forest and decision forest) to predict corporate bankruptcy earlier to the occurrence. We use data from 2009 to 2013 on Poland manufacturing corporates and selected the 64 financial indicators to be broken down. The main finding of the study is a significant improvement in predictive accuracy using machine learning techniques. We also include other economic indicators ratios, along with Altman’s Z-score variables related to profitability, liquidity, leverage and solvency (short/long term) to propose an efficient model. Machine learning models give better results while balancing the data through SMOTE as compared to random undersampling. The machine learning technique related to decision forest led to 99% accuracy, whereas support vector machine (SVM), J48 decision tree, Logistic Model Tree (LMT) and Random Forest (RF) led to 92%, 92.3%, 93.8% and 98.7% accuracy, respectively, with all predictive financial indicators. We find that the decision forest outperforms the other techniques and previous techniques discussed in the literature. The proposed method is also deployed on the web to assist regulators, investors, creditors and scholars to predict corporate bankruptcy.

Viet-Ha Nhu ◽  
Ataollah Shirzadi ◽  
Himan Shahabi ◽  
Sushant K. Singh ◽  
Nadhir Al-Ansari ◽  

Shallow landslides damage buildings and other infrastructure, disrupt agriculture practices, and can cause social upheaval and loss of life. As a result, many scientists study the phenomenon, and some of them have focused on producing landslide susceptibility maps that can be used by land-use managers to reduce injury and damage. This paper contributes to this effort by comparing the power and effectiveness of five machine learning, benchmark algorithms—Logistic Model Tree, Logistic Regression, Naïve Bayes Tree, Artificial Neural Network, and Support Vector Machine—in creating a reliable shallow landslide susceptibility map for Bijar City in Kurdistan province, Iran. Twenty conditioning factors were applied to 111 shallow landslides and tested using the One-R attribute evaluation (ORAE) technique for modeling and validation processes. The performance of the models was assessed by statistical-based indexes including sensitivity, specificity, accuracy, mean absolute error (MAE), root mean square error (RMSE), and area under the receiver operatic characteristic curve (AUC). Results indicate that all the five machine learning models performed well for shallow landslide susceptibility assessment, but the Logistic Model Tree model (AUC = 0.932) had the highest goodness-of-fit and prediction accuracy, followed by the Logistic Regression (AUC = 0.932), Naïve Bayes Tree (AUC = 0.864), ANN (AUC = 0.860), and Support Vector Machine (AUC = 0.834) models. Therefore, we recommend the use of the Logistic Model Tree model in shallow landslide mapping programs in semi-arid regions to help decision makers, planners, land-use managers, and government agencies mitigate the hazard and risk.

Sign in / Sign up

Export Citation Format

Share Document