Application of Feature Selection Methods in Educational Data Mining

The continuous availability of massive experimental medical data has given impetus to a large effort in developing mathematical, statistical, and computationally intelligent techniques to infer models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input to data mining systems for medical data. In this chapter, the authors examine the effectiveness of several feature selection methods in preprocessing input medical data. They evaluate several feature selection algorithms, such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC), with the naive Bayesian and Linear Discriminant Analysis machine learning techniques. The experimental analysis of feature selection techniques on medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing the processing time.

Benchmarking relief-based feature selection methods for bioinformatics data mining

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2018.07.015 ◽

2018 ◽

Vol 85 ◽

pp. 168-188 ◽

Cited By ~ 45

Author(s):

Ryan J. Urbanowicz ◽

Randal S. Olson ◽

Peter Schmitt ◽

Melissa Meeker ◽

Jason H. Moore

Keyword(s):

Data Mining ◽

Feature Selection ◽

Selection Methods

Feature Selection Methods on Biological Knowledge Discovery and Data Mining: A Survey

2014 25th International Workshop on Database and Expert Systems Applications ◽

10.1109/dexa.2014.26 ◽

2014 ◽

Cited By ~ 4

Author(s):

Hanen Mhamdi ◽

Faouzi Mhamdi

Keyword(s):

Data Mining ◽

Feature Selection ◽

Knowledge Discovery ◽

Biological Knowledge ◽

Selection Methods

Developed third iterative dichotomizer based on feature decisive values for educational data mining

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v18.i1.pp209-217 ◽

2020 ◽

Vol 18 (1) ◽

pp. 209

Author(s):

Saja Taha Ahmed ◽

Rafah Al-Hamdani ◽

Muayad Sadik Croock

Keyword(s):

Data Mining ◽

Feature Selection ◽

Decision Tree ◽

Predictive Analytics ◽

Educational Data Mining ◽

Target Class ◽

Id3 Algorithm ◽

Feature Weight ◽

Holdout Validation ◽

Fold Cross Validation

Recently, decision trees have become some of the most widely used classification models. They owe their popularity to their efficiency in predictive analytics, their ease of interpretation, and the fact that they implicitly perform feature selection. This last property is of essential significance in Educational Data Mining (EDM), in which selecting the most relevant features has a major impact on classification accuracy. The main contribution is a new multi-objective decision tree that can be used for both feature selection and classification. The proposed Decisive Decision Tree (DDT) is constructed based on a decisive feature value, used as a feature weight related to the target class label. The traditional Iterative Dichotomizer 3 (ID3) algorithm and the proposed DDT are compared using three datasets in terms of some ID3 issues, including logarithmic calculation complexity and multi-valued feature selection. The results indicate that the proposed DDT outperforms ID3 in development time. Classification accuracy is improved under 10-fold cross-validation for all datasets, with the highest accuracy achieved by the proposed method being 92% for the student.por dataset, and under holdout validation for two datasets, i.e. Iraqi and Student-Math. The experiment also shows that the proposed DDT tends to select attributes that are important rather than multi-valued.
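
The two ID3 issues named in this abstract, the logarithmic calculation and the preference for multi-valued features, can be seen in the standard information gain criterion that ID3 uses to choose splits. The following is a minimal sketch of that criterion only, not the proposed DDT; the toy columns (`passed`, `studies`, `student_id`, `school`) are hypothetical and are not taken from the paper's datasets.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data (illustration only).
passed = ["yes", "yes", "no", "no"]        # target class
studies = ["high", "high", "low", "low"]   # genuinely predictive feature
student_id = ["a", "b", "c", "d"]          # unique per row, i.e. multi-valued
school = ["x", "y", "x", "y"]              # uninformative feature

gain_studies = information_gain(studies, passed)
gain_student_id = information_gain(student_id, passed)
gain_school = information_gain(school, passed)
```

On this toy data the unique identifier scores exactly as high as the genuinely predictive feature, because every one-row group is trivially pure. That is the multi-valued bias the abstract refers to, and every evaluation of a candidate split requires a logarithm per class per group.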

Role of FCBF Feature Selection in Educational Data Mining

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.2004.09 ◽

2020 ◽

Vol 39 (4) ◽

pp. 772-778

Author(s):

Maryam Zaffar ◽

Manzoor Ahmad Hashmani ◽

K.S. Savita ◽

Syed Sajjad Hussain Rizvi ◽

Mubashar Rehman

Keyword(s):

Data Mining ◽

Feature Selection ◽

Prediction Model ◽

Student Performance ◽

Performance Prediction ◽

Prediction Models ◽

Educational Data Mining ◽

Action Plans ◽

Factors Affecting ◽

Academic Organization

Educational Data Mining (EDM) is a very active area of Data Mining (DM), and it is helpful in predicting the performance of students. Student performance prediction is important not only for the student but also for the academic organization, which can use it to detect the causes of students’ successes and failures. Furthermore, the features selected by student performance prediction models help in developing action plans for academic welfare. Feature selection can increase the prediction accuracy of the prediction model. In a student performance prediction model every feature matters, as neglecting any important feature can lead to the wrong academic action plans. Feature selection is therefore a very important step in the development of student performance prediction models. There are different types of feature selection algorithms. In this paper, Fast Correlation-Based Filter (FCBF) is selected as the feature selection algorithm. This paper is a step toward identifying the factors affecting the academic performance of students. The performance of FCBF is evaluated on three different student datasets. FCBF is found to perform well on the student dataset with the greater number of features.
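
FCBF ranks features by symmetrical uncertainty (SU), a normalized form of mutual information, both for feature-to-class relevance and for feature-to-feature redundancy. The sketch below shows only the SU measure for discrete features, not the full FCBF search and not the paper's experimental setup; the columns `grade`, `attendance`, and `gender` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), ranging from 0 to 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    h_xy = entropy(list(zip(x, y)))          # joint entropy H(X, Y)
    mutual_information = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_information / (h_x + h_y)


# Hypothetical toy data (illustration only).
grade = ["pass", "pass", "fail", "fail"]       # target class
attendance = ["high", "high", "low", "low"]    # fully determines the grade here
gender = ["f", "m", "f", "m"]                  # independent of the grade here

su_attendance = symmetrical_uncertainty(attendance, grade)
su_gender = symmetrical_uncertainty(gender, grade)
```

In this toy case `attendance` reaches the maximum SU of 1 and `gender` the minimum of 0. A filter such as FCBF would keep features whose SU with the class exceeds a chosen threshold and then drop those that are more strongly tied to an already selected feature than to the class.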

An Optimal Categorization of Feature Selection Methods for Knowledge Discovery

Visual Analytics and Interactive Technologies ◽

10.4018/978-1-60960-102-7.ch006 ◽

2011 ◽

pp. 94-108 ◽

Cited By ~ 4

Author(s):

Harleen Kaur ◽

Ritu Chauhan ◽

M. Alam

Keyword(s):

Data Mining ◽

Feature Selection ◽

Discriminant Analysis ◽

Medical Data ◽

Stepwise Discriminant Analysis ◽

Selection Methods ◽

Medical Databases ◽

Active Research ◽

Potential Improvement ◽

Large Effort

The continuous availability of massive experimental medical data has given impetus to a large effort in developing mathematical, statistical, and computationally intelligent techniques to infer models from medical databases. Feature selection has been an active research area in the pattern recognition, statistics, and data mining communities. However, there have been relatively few studies on preprocessing the data used as input to data mining systems for medical data. In this chapter, the authors examine the effectiveness of several feature selection methods in preprocessing input medical data. They evaluate several feature selection algorithms, such as Mutual Information Feature Selection (MIFS), Fast Correlation-Based Filter (FCBF), and Stepwise Discriminant Analysis (STEPDISC), with the naive Bayesian and Linear Discriminant Analysis machine learning techniques. The experimental analysis of feature selection techniques on medical databases has enabled the authors to find a small number of informative features, leading to potential improvement in medical diagnosis by reducing the size of the data set, eliminating irrelevant features, and decreasing the processing time.
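
Of the algorithms named in this abstract, MIFS is a greedy filter: at each step it adds the feature with the highest mutual information with the class, minus a penalty proportional to its mutual information with the features already chosen. The following is a minimal sketch of that greedy criterion for discrete features, under the assumption of a penalty weight `beta`; the feature names and toy values are hypothetical and do not come from the chapter's medical databases.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y) for discrete sequences."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def mifs(features, target, k, beta=0.5):
    """Greedily pick k feature names, scoring relevance minus beta * redundancy."""
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance - beta * redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a"; "b" is weaker but not redundant.
target = [0, 0, 1, 1, 2, 2, 2, 2]
features = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b": [0, 0, 1, 1, 0, 0, 1, 1],
}

with_penalty = mifs(features, target, k=2, beta=1.0)
without_penalty = mifs(features, target, k=2, beta=0.0)
```

With the redundancy penalty enabled the duplicate column is skipped in favour of the weaker but complementary feature, while with `beta=0.0` the ranking degenerates to plain relevance and the duplicate is selected.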
