Logistic Model Tree and Decision Tree J48 Algorithms for Predicting the Length of Study Period

2020 ◽  
Vol 8 (1) ◽  
pp. 39-48
Author(s):  
Mohamad Firman Maulana ◽  
Meriska Defriani

One aspect assessed during institutional accreditation is the length of students’ study period. The Informatics department at XYZ college has been accredited by the national accreditation bureau for higher education (BAN-PT), but its accreditation could still be improved. One factor that affects the accreditation score is that many students do not graduate on time. The current study therefore applied data mining to the available student data, both academic and non-academic. Two classification models were used: Logistic Model Tree (LMT) and Decision Tree J48. The study aimed to compare the LMT and Decision Tree J48 algorithms in predicting the length of a student’s study period and to identify the influencing factors. The data covered Informatics Engineering students who graduated between February 2018 and February 2019 (135 records). The results showed that the LMT algorithm achieved an accuracy of 71%, higher than Decision Tree J48 (62.8%), in predicting the length of a student’s study period. The factors influencing the length of study are the temporary grade point average (GPA) of the first semester, the temporary GPA of the second semester, organizational status, and employment status.
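As a rough illustration of how a J48-style tree can rank attributes such as employment status: J48 is Weka's implementation of C4.5, which chooses splits by gain ratio. The sketch below computes the gain ratio of one categorical attribute. The function names and the toy records are hypothetical and are not taken from the study's 135-record dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of a categorical attribute with respect to the class labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Made-up toy records: employment status vs. graduation outcome.
employed = ["yes", "yes", "no", "no", "no", "yes"]
outcome = ["late", "late", "on_time", "on_time", "late", "on_time"]
print(gain_ratio(employed, outcome))
```

An attribute with a higher gain ratio separates the classes better and would be tested nearer the root of the tree.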

2020 ◽  
Vol 12 (17) ◽  
pp. 2742
Author(s):  
Ehsan Kamali Maskooni ◽  
Seyed Amir Naghibi ◽  
Hossein Hashemi ◽  
Ronny Berndtsson

Groundwater (GW) is being exploited uncontrollably in various parts of the world because population growth and industrialization have created a huge demand for water supply. Bearing in mind the importance of GW potential assessment in reaching sustainability, this study uses remote sensing (RS)-derived driving factors as inputs to two advanced machine learning algorithms (MLAs), deep boosting and logistic model trees, to evaluate their efficiency. Their results are compared with three benchmark MLAs: boosted regression trees, k-nearest neighbors, and random forest. For this purpose, we first assembled topographical, hydrological, RS-based, and lithological driving factors: altitude, slope degree, aspect, slope length, plan curvature, profile curvature, relative slope position (RSP), distance from rivers, river density, topographic wetness index, land use/land cover (LULC), normalized difference vegetation index (NDVI), distance from lineament, lineament density, and lithology. The GW spring indicator was divided into two sets, training (434 springs) and validation (186 springs), in a 70:30 proportion. The training dataset of the springs, together with the driving factors, was fed into the MLAs, and the outputs were validated with several indices: accuracy, kappa, receiver operating characteristic (ROC) curve, specificity, and sensitivity. Based on the area under the ROC curve, the logistic model tree (87.813%) performed similarly to deep boosting (87.807%), followed by boosted regression trees (87.397%), random forest (86.466%), and k-nearest neighbors (76.708%). The findings confirm the strong performance of the logistic model tree and deep boosting algorithms in modelling GW potential. Thus, their application can be suggested for other areas to gain insight into GW-related barriers to sustainability. Further, the outcome of the logistic model tree algorithm shows the high impact of the RS-based factor NDVI, with a relative influence of 100, as well as the high influence of distance from rivers, altitude, and RSP, with relative influences of 46.07, 43.47, and 37.20, respectively, on GW potential.
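The 70:30 split and the area under the ROC curve described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function names, the fixed random seed, and the use of integer IDs in place of spring records are assumptions.

```python
import random


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and split it into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# 620 placeholder spring IDs split 70:30.
train, validation = split_train_validation(range(620))
print(len(train), len(validation))  # 434 186
```

The AUC values reported in the abstract are percentages, so a value of 0.878 from `roc_auc` corresponds to 87.8%.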


Wind energy is one of the essential renewable energy resources because of its consistency, resulting from the development of the technology, and its relative affordability. Wind energy is converted into electrical energy by rotating blades connected to a generator. Because of environmental conditions and their large construction, the blades are subject to various faults that reduce productivity. Downtime can be reduced when the blades are diagnosed periodically using condition monitoring techniques. Fault diagnosis is treated as a machine learning problem consisting of three phases: feature extraction, feature selection, and fault classification. In this study, statistical features are extracted from vibration signals, feature selection is carried out using the J48 algorithm, and fault classification is carried out using the logistic model tree algorithm.
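The feature extraction phase can be sketched as below. The abstract does not list which statistical features were used, so the set computed here (mean, standard deviation, RMS, skewness, kurtosis, minimum, maximum) is an assumption, as are the function name and the toy signal.

```python
from math import sqrt


def statistical_features(signal):
    """Compute simple descriptive statistics of a vibration signal (list of floats)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    variance = sum(d * d for d in centered) / n  # population variance
    std = sqrt(variance)
    rms = sqrt(sum(x * x for x in signal) / n)
    # Standardized third and fourth moments; defined as 0 for a constant signal.
    skewness = (sum(d ** 3 for d in centered) / n) / std ** 3 if std > 0 else 0.0
    kurtosis = (sum(d ** 4 for d in centered) / n) / variance ** 2 if variance > 0 else 0.0
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "min": min(signal),
        "max": max(signal),
    }


# Toy signal standing in for one window of accelerometer samples.
print(statistical_features([0.1, -0.3, 0.25, -0.05, 0.4, -0.35]))
```

Each window of the signal yields one such feature vector, which would then feed the feature selection and classification phases.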


2011 ◽  
Vol 12 (2) ◽  
pp. 57-67
Author(s):  
Dewi Juliah Ratnaningsih

Students’ persistence is the ability of students to continue their studies. At Universitas Terbuka (UT), no students are formally dropped out; instead, they are considered non-active or non-persistent students. Length of study time among UT students can be coded as binary data: persistent (1) and non-persistent (0). Logistic regression analysis is a type of statistical analysis suited to binary data. The purposes of this article are to identify the factors that influence the length of study time among students of the Department of Management, Faculty of Economics at UT, and to determine an appropriate model to explain the relationship between the response variable (length of study time) and the explanatory variables using logistic regression. The method used is a case study with a sample of 2,936 college students. The results show that the factors influencing the length of study time at an alpha level of 0.05 are: age, the number of courses taken, the employment status of the student, participation in tutorials, the first semester achievement index, and the cumulative grade point.
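To make the binary-response model concrete, the sketch below shows how a fitted logistic regression turns explanatory variables into a probability of persistence (1) and how a coefficient translates into an odds ratio. The function names, the coefficients, and the two example features are hypothetical; they are not estimates from the study.

```python
from math import exp


def logistic_probability(intercept, coefficients, features):
    """Probability of the positive class (persistence = 1) under a logistic model."""
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in a variable."""
    return exp(coefficient)


# Hypothetical model: intercept -1.0, coefficients for tutorial participation (0/1)
# and employment status (0/1).
p = logistic_probability(-1.0, [0.8, 0.5], [1, 0])
print(p, odds_ratio(0.8))
```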


Water ◽  
2019 ◽  
Vol 11 (8) ◽  
pp. 1596 ◽  
Author(s):  
S. Vahid Razavi-Termeh ◽  
Abolghasem Sadeghi-Niaraki ◽  
Soo-Mi Choi

In the future, groundwater will be the major source of water for agriculture, drinking and food production as a result of global climate change. Demand for groundwater has increased with population growth. Therefore, sustainable groundwater storage management has become a major challenge. This study introduces a new ensemble data mining approach with bivariate statistical models, using FR (frequency ratio), CF (certainty factor), EBF (evidential belief function), RF (random forest) and LMT (logistic model tree) to prepare a groundwater potential map (GPM) for the Booshehr plain. In the first step, 339 wells with groundwater yields above 11 m3/h were chosen and randomly split into two groups: 238 wells (70%) were used for model training, and 101 wells (30%) were used for model validation. Then, 15 effective factors, including topographic and hydrologic factors, were selected for the modeling. The accuracy of the groundwater potential maps was determined using the ROC (receiver operating characteristic) curve and the AUC (area under the curve). The results show that the AUC values obtained using the CF-RF, EBF-RF, FR-RF, CF-LMT, EBF-LMT and FR-LMT methods were 0.927, 0.924, 0.917, 0.906, 0.885 and 0.83, respectively. Therefore, it can be inferred that ensembles of bivariate statistical and data mining models can improve the effectiveness of the methods in developing a groundwater potential map.

