Student Performance Prediction Model Based on Discriminative Feature Selection

Author(s):  
Haixia Lu ◽  
Jinsong Yuan

Identifying the factors that affect students' performance is a widely studied problem in data mining. To find the key factors that significantly affect students' performance in complex data, this paper proposes an integrated Optimized Ensemble Feature Selection Algorithm by Density Peaks (DPEFS). The algorithm is applied to education data collected by two high schools in China, and the selected discriminative features are used to construct a student performance prediction model based on a support vector machine (SVM). The results of a 10-fold cross-validation experiment show that the SVM student performance prediction model built on the proposed feature selection algorithm predicts better than models built on other feature selection algorithms such as mRMR, Relief, SVM-RFE and AVC. In addition, some factors and rules affecting student performance can be extracted from the discriminative features selected by the proposed algorithm, which provides a methodological and technical reference for teachers, education management staff and schools to predict and analyze students' performance.
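The 10-fold cross-validation protocol mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `k_fold_indices` and `cross_validate` and the `fit_and_score` callback are hypothetical, and the abstract does not say whether folds were shuffled or stratified. In practice the callback would run feature selection and SVM training on the training indices only, then score on the held-out fold.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(n_samples, fit_and_score, k=10):
    """Average the score returned by a hypothetical fit_and_score(train, test)."""
    scores = [fit_and_score(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)
```

Keeping feature selection inside the callback means the held-out fold never influences which features are chosen. Whether the paper followed this design is not stated in the abstract.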

2021 ◽  
Vol 30 (1) ◽  
pp. 511-523
Author(s):  
Ephrem Admasu Yekun ◽  
Abrahaley Teklay Haile

One of the important measures of the quality of education is the performance of students in academic settings. Educational institutions now store abundant data about students, and data mining techniques can use it to reveal how students are learning and to improve their performance ahead of time. In this paper, we developed a student performance prediction model that predicts the performance of high school students for the next semester in five courses. We modeled our prediction system as a multi-label classification task and used support vector machine (SVM), Random Forest (RF), K-nearest Neighbors (KNN), and Multi-layer perceptron (MLP) as base classifiers to train our model. We further improved the performance of the prediction model by using a state-of-the-art partitioning scheme to divide the label space into smaller spaces and the Label Powerset (LP) transformation method to transform each labelset into a multi-class classification task. The proposed model achieved better performance in terms of different evaluation metrics when compared to other multi-label learning approaches such as binary relevance and classifier chains.
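The Label Powerset transformation named in the abstract treats every distinct combination of labels as one class of a multi-class problem. The sketch below shows only that mapping and its inverse. The function names are hypothetical, and neither the label-space partitioning step nor the base classifiers from the paper are reproduced.

```python
def label_powerset_fit(Y):
    """Map each distinct label combination (a row of Y) to one class id.

    Y is a list of binary label vectors, e.g. one pass/fail flag per course.
    Returns the per-sample class ids and the combination -> id mapping.
    """
    mapping = {}
    classes = []
    for row in Y:
        key = tuple(row)
        if key not in mapping:
            mapping[key] = len(mapping)  # ids are assigned in order of first appearance
        classes.append(mapping[key])
    return classes, mapping


def label_powerset_inverse(classes, mapping):
    """Turn predicted class ids back into label vectors."""
    inverse = {class_id: list(key) for key, class_id in mapping.items()}
    return [inverse[c] for c in classes]
```

A multi-class classifier trained on the class ids can only predict label combinations seen during training. That limitation is one common motivation for splitting the label space into smaller subsets before applying the transformation.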


2008 ◽  
Vol 15 (2) ◽  
pp. 203-218
Author(s):  
Luiz E. S. Oliveira ◽  
Paulo R. Cavalin ◽  
Alceu S. Britto Jr ◽  
Alessandro L. Koerich

This paper addresses the issue of detecting defects in pine wood using features extracted from grayscale images. The feature set proposed here is based on the concept of texture and is computed from co-occurrence matrices. The features provide measures of properties such as smoothness, coarseness, and regularity. Comparative experiments using a color-image-based feature set extracted from percentile histograms are carried out to demonstrate the efficiency of the proposed feature set. Two different learning paradigms, neural networks and support vector machines, and a feature selection algorithm based on multi-objective genetic algorithms were considered in our experiments. The experimental results show that, after feature selection, the grayscale-image-based feature set achieves very competitive performance for the problem of wood defect detection relative to the color-image-based features.
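As an illustration of texture features derived from a co-occurrence matrix, the sketch below builds a normalized gray-level co-occurrence matrix for one pixel offset and computes three common descriptors. The choice of contrast, energy, and homogeneity is an assumption: the abstract does not list the exact features used. The function names are hypothetical, and the image is assumed to be already quantized to integer gray levels in `0..levels-1`.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy and homogeneity of a normalized co-occurrence matrix."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "energy": sum(p[i][j] ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in cells),
    }
```

A perfectly uniform patch gives zero contrast and an energy of 1, while abrupt gray-level changes raise contrast and lower homogeneity.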


2006 ◽  
Vol 15 (06) ◽  
pp. 893-915
Author(s):  
JIANG LI ◽  
JIANHUA YAO ◽  
RONALD M. SUMMERS ◽  
NICHOLAS PETRICK ◽  
MICHAEL T. MANRY ◽  
...  

We present an efficient feature selection algorithm for computer aided detection (CAD) computed tomographic (CT) colonography. The algorithm (1) determines an appropriate piecewise linear network (PLN) model by cross validation, (2) applies the orthonormal least square (OLS) procedure to the PLN model utilizing a Modified Schmidt procedure, and (3) uses a floating search algorithm to select features that minimize the output variance. The undesirable "nesting effect" is prevented by the floating search approach, and the piecewise linear OLS procedure makes this algorithm very computationally efficient because the Modified Schmidt procedure only requires one data pass during the whole searching process. The selected features are compared to those obtained by other methods, through cross validation with support vector machines (SVMs).


Author(s):  
Maryam Zaffar ◽  
Manzoor Ahmad Hashmani ◽  
K.S. Savita ◽  
Syed Sajjad Hussain Rizvi ◽  
Mubashar Rehman

Educational Data Mining (EDM) is a very active area of Data Mining (DM), and it is helpful in predicting the performance of students. Student performance prediction is important not only for the student but also for academic organizations seeking to detect the causes of students' success and failure. Furthermore, the features selected through student performance prediction models help in developing action plans for academic welfare. Feature selection can increase the accuracy of the prediction model. In a student performance prediction model every feature matters, because neglecting an important feature can lead to poorly designed academic action plans. Feature selection is therefore a very important step in the development of student performance prediction models. There are different types of feature selection algorithms. In this paper, the Fast Correlation-Based Filter (FCBF) is selected as the feature selection algorithm. This paper is a step toward identifying the factors affecting the academic performance of students. The performance of FCBF is evaluated on three different student datasets. FCBF is found to perform well on a student dataset with a greater number of features.
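FCBF scores features with symmetric uncertainty, SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), where H is entropy and I is mutual information. The sketch below covers only the relevance-ranking step for discrete features. FCBF's subsequent redundancy-removal pass is omitted, and the function names, feature names, and threshold value are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); defined as 0 when both are constant."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))  # I(X; Y) = H(X) + H(Y) - H(X, Y)
    return 2 * mutual_info / (hx + hy)


def rank_features(columns, target, threshold=0.0):
    """Keep features whose SU with the target exceeds threshold, best first."""
    scored = [(name, symmetric_uncertainty(col, target)) for name, col in columns.items()]
    kept = [item for item in scored if item[1] > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

For example, with a hypothetical target `[0, 0, 1, 1]`, a feature identical to the target scores 1.0 and a feature `[0, 1, 0, 1]` scores 0.0, so only the first survives a threshold of 0.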

