Oversampling technique in student performance classification from engineering course

<span>The first year is important for an engineering student's academic planning, because all first-year subjects form the basis of an engineering education. Predicting student performance helps academics improve student performance, and it lets students check their own performance: students who know that their performance is low can take steps to improve it. This research focused on combining oversampling of minority-class data with several classifier models. The oversampling techniques were SMOTE, Borderline-SMOTE, SVMSMOTE, and ADASYN, and the four classifiers were MLP, gradient boosting, AdaBoost, and random forest. The results showed that Borderline-SMOTE gave the best minority-class prediction with several of the classifiers.</span>
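To illustrate the basic idea behind SMOTE-style oversampling, here is a minimal Python sketch that creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation and does not cover the Borderline-SMOTE, SVMSMOTE, or ADASYN variants. The function name `smote`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point somewhere on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two numeric features for four minority-class students.
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(minority_samples, n_new=6, k=2)
print(len(new_points), "synthetic samples generated")
```

Because every synthetic point lies on a line segment between two real minority samples, the new data stays inside the region the minority class already occupies. In practice the oversampling would be applied to the training split only, so that evaluation data remains untouched.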


Using Decision Trees and Random Forest Algorithms to Predict and Determine Factors Contributing to First-Year University Students’ Learning Performance

Algorithms ◽

10.3390/a14110318 ◽

2021 ◽

Vol 14 (11) ◽

pp. 318

Author(s):

Thao-Trang Huynh-Cam ◽

Long-Sheng Chen ◽

Huynh Le

Keyword(s):

Random Forest ◽

Performance Prediction ◽

Prediction Models ◽

Family Background ◽

Educational Practice ◽

Poor Performance ◽

Learning Performance ◽

First Year ◽

First Year Students ◽

Early Performance

First-year students’ learning performance has received much attention in educational practice and theory. Previous works built prediction models from variables that must be obtained during the course or as the semester progresses, through questionnaire surveys and interviews. These models cannot provide sufficiently timely support for students whose poor performance is caused by economic factors. Therefore, other variables are needed that allow prediction results to be reached earlier. This study attempts to use family background variables, which can be obtained before the semester starts, to build learning performance prediction models for freshmen using random forest (RF), C5.0, CART, and multilayer perceptron (MLP) algorithms. A real sample of 2407 freshmen enrolled in 12 departments of a Taiwanese vocational university was used. The experimental results showed that CART outperformed the C5.0, RF, and MLP algorithms. The most important features were mother’s occupation, department, father’s occupation, main source of living expenses, and admission status. The extracted knowledge rules are expected to serve as indicators for predicting students’ early performance, so that strategic intervention can be planned before students begin the semester.


A Statistical Evaluation of the Value of Pre-Engineering Curricula on First-Year Civil Engineering Student Performance

AEI 2017 ◽

10.1061/9780784480502.001 ◽

2017 ◽

Author(s):

Christopher H. Raebel ◽

Blake Wentz ◽

Frank Mahuta

Keyword(s):

Student Performance ◽

Civil Engineering ◽

Statistical Evaluation ◽

First Year ◽

Engineering Student


Self-Efficacy Development among Students Enrolled in an Engineering Service-Learning Section

International Journal for Service Learning in Engineering Humanitarian Engineering and Social Entrepreneurship ◽

10.24908/ijsle.v13i2.11483 ◽

2018 ◽

Vol 13 (2) ◽

pp. 25-44 ◽

Cited By ~ 1

Author(s):

Lauren Dent ◽

Patricia Maloney ◽

Tanja Karp

Keyword(s):

Service Learning ◽

Student Performance ◽

Self Efficacy ◽

Interpersonal Skills ◽

Skills Assessment ◽

First Year ◽

Quantified Self ◽

Engineering Student ◽

Student Responses ◽

Psychometric Tool

Service-learning presents exciting new ways for students to enhance their learning. Educators and scholars agree that service-learning is connected to self-efficacy, which affects student performance. This research tests the development of self-efficacy in students enrolled in service-learning and traditional sections of a first-year engineering course. Using a previously developed metric, the Engineering Skills Assessment (ESA), students enrolled in service-learning (SL) and “traditional” (non-SL) sections quantified self-efficacy on 11 skills previously deemed important for engineering. Student responses were compared between SL and non-SL students at the beginning and end of the semester. Analysis of the collected data using exploratory factor analysis (EFA) grouped self-efficacy ratings for the 11 skills into three meaningful constructs: (1) job-related skills, (2) interpersonal skills, and (3) life skills. Mean self-efficacy scores were significantly better at the end of the course for non-SL students in all areas and for SL students in four of the 11 skills and two of the three constructs. Self-efficacy growth was significantly higher for non-SL students, which may be due to the Dunning-Kruger effect. However, similar percentages of both populations self-reported that their skills had improved by the end of the semester because of the class. This research also supports the use of the ESA as a reliable psychometric tool to evaluate student self-efficacy and its relationship to service-learning.


Four Grade Levels-Based Models with Random Forest for Student Performance Prediction at a Multidisciplinary University

Complex, Intelligent and Software Intensive Systems - Lecture Notes in Networks and Systems ◽

10.1007/978-3-030-79725-6_1 ◽

2021 ◽

pp. 1-12

Author(s):

Tran Thanh Dien ◽

Le Duy-Anh ◽

Nguyen Hong-Phat ◽

Nguyen Van-Tuan ◽

Trinh Thanh-Chanh ◽

...

Keyword(s):

Random Forest ◽

Student Performance ◽

Performance Prediction ◽

Grade Levels


Can a Five-Minute, Three-Question Survey Foretell First-Year Engineering Student Performance and Retention?

10.18260/p.26427 ◽

2016 ◽

Cited By ~ 1

Author(s):

Stephanie Gratiano ◽

William Palm

Keyword(s):

Student Performance ◽

First Year ◽

Engineering Student ◽

Question Survey


Investigating the use of random forest, gradient boosting machine, support vector machine and their ensemble applied to fault detection

10.26678/abcm.cobem2017.cob17-1600 ◽

2017 ◽

Author(s):

Luis Felipe Nogoseke ◽

Gabriel Herman Bernardim Andrade ◽

Marco Boaretto ◽

Leandro Coelho

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Fault Detection ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine


Estimation of modified expansive soil CBR with multivariate adaptive regression splines, random forest and gradient boosting machine

Innovative Infrastructure Solutions ◽

10.1007/s41062-021-00568-z ◽

2021 ◽

Vol 6 (4) ◽

Author(s):

Chijioke Christopher Ikeagwuani

Keyword(s):

Random Forest ◽

Expansive Soil ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Regression Splines ◽

Gradient Boosting Machine ◽

Adaptive Regression ◽

Adaptive Regression Splines


Evaluation of Three Different Machine Learning Methods for Object-Based Artificial Terrace Mapping—A Case Study of the Loess Plateau, China

Remote Sensing ◽

10.3390/rs13051021 ◽

2021 ◽

Vol 13 (5) ◽

pp. 1021

Author(s):

Hu Ding ◽

Jiaming Na ◽

Shangjing Jiang ◽

Jie Zhu ◽

Kai Liu ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Loess Plateau ◽

Water Conservation ◽

Nearest Neighbor ◽

Gradient Boosting ◽

K Nearest Neighbor ◽

The Loess Plateau ◽

Object Based ◽

Extreme Gradient Boosting

Artificial terraces are of great importance for agricultural production and for soil and water conservation. Automatic high-accuracy mapping of artificial terraces is the basis of monitoring and related studies. Previous research achieved artificial terrace mapping based on high-resolution digital elevation models (DEMs) or imagery. Because contextual information is important for terrace mapping, object-based image analysis (OBIA) combined with machine learning (ML) technologies is widely used. However, the selection of an appropriate classifier is of great importance for the terrace mapping task. In this study, the performance of an integrated framework using OBIA and ML for terrace mapping was tested. A catchment, Zhifanggou, in the Loess Plateau, China, was used as the study area. First, optimized image segmentation was conducted. Then, features from the DEMs and imagery were extracted, and the correlations between the features were analyzed and ranked for classification. Finally, three commonly used ML classifiers, namely extreme gradient boosting (XGBoost), random forest (RF), and k-nearest neighbor (KNN), were used for terrace mapping. The comparison with the ground truth, as delineated by field survey, indicated that random forest performed best, with a 95.60% overall accuracy (followed by 94.16% and 92.33% for XGBoost and KNN, respectively). The influence of class imbalance and feature selection is discussed. This work provides a credible framework for mapping artificial terraces.
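The abstract above reports overall accuracy and mentions class imbalance. As a small illustration of how the two interact, the following Python sketch computes overall accuracy and per-class recall from a confusion matrix. The counts and class labels are hypothetical and are not taken from the study; the function names are assumptions made for this example.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_recall(confusion):
    """Recall for each true class: diagonal count divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Hypothetical counts: rows are true classes, columns are predicted classes.
# Class 0 is assumed to be the majority class, class 1 the minority class.
confusion = [[900, 20],
             [30, 50]]

acc = overall_accuracy(confusion)
recalls = per_class_recall(confusion)
print(f"overall accuracy: {acc:.3f}")
print("recall per class:", [round(r, 3) for r in recalls])
```

With these made-up counts the overall accuracy is high while the minority-class recall is much lower, which is why a single overall accuracy figure is usually read alongside per-class metrics when classes are imbalanced.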


Behavior-driven Student Performance Prediction with Tri-branch Convolutional Neural Network

Proceedings of the 29th ACM International Conference on Information & Knowledge Management ◽

10.1145/3340531.3412110 ◽

2020 ◽

Author(s):

Jian Zong ◽

Chaoran Cui ◽

Yuling Ma ◽

Li Yao ◽

Meng Chen ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Student Performance ◽

Performance Prediction


Improving the performance of a radio-frequency localization system in adverse outdoor applications

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-021-02001-6 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Marcelo N. de Sousa ◽

Ricardo Sant’Ana ◽

Rigel P. Fernandes ◽

Julio Cesar Duarte ◽

José A. Apolinário ◽

...

Keyword(s):

Random Forest ◽

Ray Tracing ◽

Real World ◽

Practical Implication ◽

Real Life ◽

Simulated Data ◽

Real Data ◽

Gradient Boosting ◽

Real World Data ◽

Localization Accuracy

In outdoor RF localization systems, particularly where line of sight cannot be guaranteed or where multipath effects are severe, information about the terrain may improve the position estimate’s performance. Given the difficulties in obtaining real data, a ray-tracing fingerprint is a viable option. Nevertheless, despite good simulation results, systems trained only with simulated features suffer degraded performance when processing real-life data. This work intends to improve the localization accuracy when using ray-tracing fingerprints and a few field data obtained from an adverse environment where a large number of measurements is not an option. We employ a machine learning (ML) algorithm to explore the multipath information. We selected the random forest and gradient boosting algorithms, both considered efficient tools in the literature. In a strict simulation scenario (simulated data for training, validating, and testing), we obtained the same good results found in the literature (error around 2 m). In a real-world system (simulated data for training, real data for validating and testing), both ML algorithms resulted in a mean positioning error around 100 m. We have also obtained experimental results for noisy (artificially added Gaussian noise) and mismatched (with a null subset of) features. From the simulations carried out in this work, our study revealed that enhancing the ML model with a few real-world data improves localization’s overall performance. From the ML algorithms employed herein, we also observed that, under noisy conditions, the random forest algorithm achieved a slightly better result than the gradient boosting algorithm. However, they achieved similar results in a mismatch experiment. This work’s practical implication is that multipath information, once rejected in old localization techniques, now represents a significant source of information whenever we have prior knowledge to train the ML algorithm.
