Logistic Model Tree and Decision Tree J48 Algorithms for Predicting the Length of Study Period

One aspect assessed during an institution's accreditation is the length of its students' study period. The Informatics department at XYZ college has been accredited by the national accreditation bureau for higher education (BAN-PT), but its accreditation could still be improved. One factor that lowers the accreditation score is that many students do not graduate on time. This study therefore applied data mining to the available student data, both academic and non-academic. Two classification models were used: Logistic Model Tree (LMT) and Decision Tree J48. The study aimed to compare the LMT and Decision Tree J48 algorithms in predicting the length of a student's study period and to identify the influencing factors. The data covered Informatics Engineering students who graduated between February 2018 and February 2019 (135 records). The LMT algorithm achieved an accuracy of 71%, better than Decision Tree J48 (62.8%), in predicting the length of study. The factors influencing the length of study are the temporary grade point average (GPA) of the first semester, the temporary GPA of the second semester, organizational status, and employment status.
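The accuracy figures quoted above are the share of records whose predicted class matches the actual class. A minimal sketch of that calculation follows; the class labels and the two prediction lists are hypothetical placeholders, not data or output from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of records whose predicted class equals the actual class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("at least one record is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for five graduates (NOT data from the study).
actual = ["on_time", "late", "late", "on_time", "late"]
model_a = ["on_time", "late", "on_time", "on_time", "late"]  # 4 of 5 correct
model_b = ["late", "late", "on_time", "on_time", "on_time"]  # 2 of 5 correct

print(f"model A accuracy: {accuracy(actual, model_a):.1%}")
print(f"model B accuracy: {accuracy(actual, model_b):.1%}")
```

With only 135 records, each record changes the accuracy by about 0.74 percentage points, so a gap between two classifiers is best read alongside the evaluation protocol used.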

Speeding Up Logistic Model Tree Induction

Knowledge Discovery in Databases: PKDD 2005 - Lecture Notes in Computer Science, 2005, pp. 675-683. DOI: 10.1007/11564126_72. Cited By ~ 87

Author(s): Marc Sumner, Eibe Frank, Mark Hall

Keyword(s): Logistic Model, Model Tree, Logistic Model Tree

Application of Advanced Machine Learning Algorithms to Assess Groundwater Potential Using Remote Sensing-Derived Data

Remote Sensing, 2020, Vol 12 (17), pp. 2742. DOI: 10.3390/rs12172742

Author(s): Ehsan Kamali Maskooni, Seyed Amir Naghibi, Hossein Hashemi, Ronny Berndtsson

Keyword(s): Machine Learning, Remote Sensing, Logistic Model, Nearest Neighbors, Driving Factors, Machine Learning Algorithms, Boosted Regression Trees, K Nearest Neighbors, Model Tree, Logistic Model Tree

Groundwater (GW) is being exploited uncontrollably in various parts of the world because population growth and industrialization have created a huge demand for water supply. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study uses remote sensing (RS)-derived driving factors as inputs to advanced machine learning algorithms (MLAs), namely deep boosting and logistic model trees, to evaluate their efficiency. Their results are compared with three benchmark MLAs: boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we first assembled topographical, hydrological, RS-based, and lithological driving factors: altitude, slope degree, aspect, slope length, plan curvature, profile curvature, relative slope position (RSP), distance from rivers, river density, topographic wetness index, land use/land cover (LULC), normalized difference vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring dataset was divided into training (434 springs) and validation (186 springs) sets in a 70:30 proportion. The training springs, together with the driving factors, were fed into the MLAs, and the outputs were validated with several indices: accuracy, kappa, the receiver operating characteristic (ROC) curve, specificity, and sensitivity. Based on the area under the ROC curve, the logistic model tree (87.813%) performed similarly to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%). The findings confirm the strong performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Their application can therefore be suggested for other areas to gain insight into GW-related barriers to sustainability. Further, the logistic model tree results show the high impact of the RS-based factor NDVI, with a relative influence of 100, as well as the high influence of distance from river, altitude, and RSP, with relative influences of 46.07, 43.47, and 37.20, respectively, on GW potential.
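The 70:30 partition described above can be sketched as a seeded random split. This is only an illustration: the function name `split_70_30`, the seed, and the integer spring identifiers are assumptions, and the total of 620 springs is simply the sum of the reported training (434) and validation (186) counts.

```python
import random


def split_70_30(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` reproducibly and cut it into train/validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 434 + 186 = 620 springs.
spring_ids = list(range(620))
train, valid = split_70_30(spring_ids)
print(len(train), len(valid))  # 434 186
```

Fixing the seed keeps the partition reproducible, so that every algorithm is trained and validated on the same springs.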

New Combined S-transform and Logistic Model Tree Technique for Recognition and Classification of Power Quality Disturbances

Electric Power Components and Systems, 2011, Vol 39 (1), pp. 80-98. DOI: 10.1080/15325008.2010.513364. Cited By ~ 12

Author(s): Z. Moravej, A. A. Abdoos, M. Pazoki

Keyword(s): Power Quality, Logistic Model, Model Tree, S Transform, Logistic Model Tree, Power Quality Disturbances

Logistic Model Tree Classifier for Condition Monitoring of Wind Turbine Blades

International Journal of Recent Technology and Engineering - 2, 2019, Vol 8 (2S11), pp. 202-209. DOI: 10.35940/ijrte.b1033.0982s1119

Keyword(s): Feature Selection, Wind Energy, Condition Monitoring, Logistic Model, Turbine Blades, Fault Classification, Renewable Energy Resources, Model Tree, Tree Classifier, Logistic Model Tree

Wind energy is one of the essential renewable energy resources because of its consistency, which stems from technological development, and its relative cost affordability. Wind energy is converted into electrical energy by rotating blades connected to a generator. Because of environmental conditions and their large size, the blades are subject to various faults, which reduce productivity. Downtime can be reduced when the blades are diagnosed periodically using condition monitoring techniques. Condition monitoring is treated as a machine learning problem consisting of three phases: feature extraction, feature selection, and fault classification. In this study, statistical features are extracted from vibration signals, feature selection is carried out using the J48 algorithm, and fault classification is carried out using the logistic model tree algorithm.
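The feature extraction phase can be illustrated with a few common descriptive statistics computed from one vibration record. The abstract does not list which statistical features were used, so the set below (mean, standard deviation, RMS, range, skewness, kurtosis) is an assumption, and the sample signal is a hypothetical placeholder.

```python
import math
import statistics


def statistical_features(signal):
    """Return a dict of simple descriptive statistics for one vibration record."""
    n = len(signal)
    if n < 2:
        raise ValueError("signal needs at least two samples")
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    rms = math.sqrt(sum(x * x for x in signal) / n)
    centered = [x - mean for x in signal]
    if std == 0:
        skewness = kurtosis = 0.0  # constant signal: moments are undefined, use 0
    else:
        skewness = sum(c ** 3 for c in centered) / (n * std ** 3)
        kurtosis = sum(c ** 4 for c in centered) / (n * std ** 4)
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "range": max(signal) - min(signal),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical vibration samples (NOT measurements from the study).
features = statistical_features([1.0, -1.0, 1.0, -1.0])
print(features)
```

Each record then becomes one row of feature values, which a feature selection step and a classifier can consume.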

Modeling the Study Persistence of Distance Higher Education Students Using a Binary Logistic Regression Approach (Case Study: Students of the Department of Management, Faculty of Economics)

Jurnal Matematika Sains dan Teknologi, 2011, Vol 12 (2), pp. 57-67. DOI: 10.33830/jmst.v12i2.512.2011

Author(s): Dewi Juliah Ratnaningsih

Keyword(s): Logistic Regression, Binary Data, Study Time, Statistical Data Analysis, Grade Point, First Semester, Explanatory Variables, The Relationship, Length Of Study

Student persistence is the ability of students to keep going in their studies. At Universitas Terbuka (UT), there are no actual dropouts; instead, such students are considered non-active or non-persistent. Length of study time among UT students can be divided into binary categories, coded as persistent (1) and non-persistent (0). Logistic regression is a type of statistical analysis suited to binary data. The purposes of this article are to identify the factors that influence the length of study time among students of the Department of Management, Faculty of Economics at UT, and to determine an appropriate model to explain the relationship between the response variable (length of study time) and the explanatory variables using logistic regression. The research method is a case study with a sample of 2,936 students. The results show that the factors influencing the length of study time at the 0.05 significance level are: age, the number of courses taken, the employment status of the student, participation in tutorials, the first-semester achievement index, and the cumulative grade point.
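A binary logistic regression model maps a weighted sum of the explanatory variables to a probability of the outcome coded 1 (here, persistent). The sketch below shows only that mapping; the variable names are informal labels for the factors listed above, and the intercept and coefficients are hypothetical values chosen for illustration, not estimates from the study.

```python
import math


def persistence_probability(features, coefficients, intercept):
    """P(persistent = 1) under a logistic model: 1 / (1 + exp(-z))."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# HYPOTHETICAL coefficients (NOT the fitted values reported by the author).
coefficients = {
    "age": -0.02,
    "courses_taken": 0.10,
    "employed": 0.30,
    "tutorial_participation": 0.50,
    "first_semester_index": 0.80,
    "cumulative_grade_point": 0.60,
}
student = {
    "age": 30,
    "courses_taken": 5,
    "employed": 1,
    "tutorial_participation": 1,
    "first_semester_index": 2.5,
    "cumulative_grade_point": 2.8,
}

p = persistence_probability(student, coefficients, intercept=-3.0)
label = 1 if p >= 0.5 else 0  # 1 = persistent, 0 = non-persistent
print(f"P(persistent) = {p:.3f}, predicted class = {label}")
```

In a fitted model, the sign of each coefficient indicates whether the factor raises or lowers the odds of persistence, and its significance is tested against the chosen alpha level.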

Implementation of Principal Component Analysis for Diagnosing Lung Cancer Using Logistic Model Tree Algorithm & J48 Algorithm

International Journal for Research in Applied Science and Engineering Technology, 2019, Vol 7 (6), pp. 886-892. DOI: 10.22214/ijraset.2019.6152

Author(s): Firoz Sajad

Keyword(s): Lung Cancer, Principal Component Analysis, Logistic Model, Principal Component, Component Analysis, Tree Algorithm, Model Tree, Logistic Model Tree

Groundwater Potential Mapping Using an Integrated Ensemble of Three Bivariate Statistical Models with Random Forest and Logistic Model Tree Models

Water, 2019, Vol 11 (8), pp. 1596. DOI: 10.3390/w11081596. Cited By ~ 12

Author(s): S. Vahid Razavi-Termeh, Abolghasem Sadeghi-Niaraki, Soo-Mi Choi

Keyword(s): Data Mining, Random Forest, Statistical Models, Logistic Model, Characteristic Curve, Groundwater Potential, Model Tree, Potential Map, Groundwater Potential Map, Logistic Model Tree

In the future, groundwater will be the major source of water for agriculture, drinking, and food production as a result of global climate change. With population growth, demand for groundwater has increased, and sustainable groundwater storage management has become a major challenge. This study introduces a new ensemble data mining approach that combines bivariate statistical models, namely FR (frequency ratio), CF (certainty factor), and EBF (evidential belief function), with RF (random forest) and LMT (logistic model tree) to prepare a groundwater potential map (GPM) for the Booshehr plain. In the first step, 339 wells with groundwater yields above 11 m3/h were chosen and randomly split into two groups. A total of 238 wells (70%) were used for model training, and 101 wells (30%) were used for model validation. Then, 15 effective factors, including topographic and hydrologic factors, were selected for the modeling. The accuracy of the groundwater potential maps was determined using the ROC (receiver operating characteristic) curve and the AUC (area under the curve). The AUC values obtained using the CF-RF, EBF-RF, FR-RF, CF-LMT, EBF-LMT, and FR-LMT methods were 0.927, 0.924, 0.917, 0.906, 0.885, and 0.83, respectively. It can therefore be inferred that an ensemble of bivariate statistical and data mining models can improve the effectiveness of the methods in developing a groundwater potential map.
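The AUC used to rank the models above can be computed without drawing the ROC curve: it equals the probability that a randomly chosen positive location receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses that pairwise definition; the scores are hypothetical placeholders, not outputs of any model in the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    if not scores_pos or not scores_neg:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical groundwater potential scores (NOT values from the study).
well_scores = [0.9, 0.8, 0.4]      # validation locations with a productive well
non_well_scores = [0.7, 0.3, 0.2]  # validation locations without one

print(f"AUC = {auc(well_scores, non_well_scores):.3f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for validation sets of around a hundred wells; a rank-based formulation would be preferable for much larger sets.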
