Feature Selection in Credit Scoring Model for Credit Card Applicants in XYZ Bank: A Comparative Study

Author(s): Mediana Aryuni, Evaristus Didik Madyatmadja
2019, Vol 35 (2), pp. 371-394

Author(s): Diwakar Tripathi, Damodar Reddy Edla, Ramalingaswamy Cheruku, Venkatanareshbabu Kuppili
2020, Vol 2020, pp. 1-12

Author(s): Dayu Xu, Xuyao Zhang, Junguo Hu, Jiahao Chen

This paper discusses the combined application of ensemble learning, classification, and feature selection (FS) algorithms, built on training-data balancing, to make the proposed credit scoring model more effective. The model proceeds in four major stages. First, the collected credit data are preprocessed. Second, an efficient feature selection algorithm based on the adaptive elastic net is employed to remove weakly related or uncorrelated variables and obtain high-quality training data. Third, a novel ensemble strategy is proposed to balance the imbalanced training set supplied to each extreme learning machine (ELM) classifier. Finally, a new weighting method for the individual ELM classifiers in the ensemble is established with respect to their classification accuracy, based on generalized fuzzy soft set (GFSS) theory; a novel cosine-based distance measure for GFSS is also proposed to compute each classifier's weight. To confirm the efficiency of the proposed ensemble credit scoring model, experiments were conducted on real-world credit data sets. The analysis, results, and statistical tests show that the proposed model improves classification performance in average accuracy, area under the curve (AUC), H-measure, and Brier score compared with all other single classifiers and ensemble approaches.
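The accuracy-based, cosine-weighted combination of classifiers can be illustrated with a simplified sketch. The snippet below (plain NumPy; the helper names `cosine_weights` and `weighted_vote` are hypothetical) weights each classifier by the cosine similarity between its per-fold accuracy vector and an ideal all-ones vector, then takes a weighted majority vote. The paper's actual scheme operates on generalized fuzzy soft sets, which this sketch does not model.

```python
import numpy as np

def cosine_weights(fold_acc):
    """Weight each classifier by the cosine similarity between its
    per-fold accuracy vector (rows of fold_acc) and the ideal
    all-ones vector; normalize so the weights sum to 1."""
    ideal = np.ones(fold_acc.shape[1])
    sims = fold_acc @ ideal / (
        np.linalg.norm(fold_acc, axis=1) * np.linalg.norm(ideal)
    )
    return sims / sims.sum()

def weighted_vote(preds, weights):
    """Combine binary {0, 1} predictions (one row per classifier)
    with the given weights; return 1 where the weighted vote
    exceeds 0.5."""
    return (weights @ preds > 0.5).astype(int)
```

A more accurate classifier (higher per-fold accuracies) ends up with a larger weight, so its vote counts for more in the final decision.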


Mathematics, 2021, Vol 9 (7), pp. 746
Author(s): Juan Laborda, Seyong Ryoo

This paper evaluates four classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) for identifying which applicants are likely to default in a credit scoring model. Three feature selection methods are used to mitigate overfitting arising from the curse of dimensionality in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of the three methods is discussed using two measures: mean absolute error and the number of selected features. The methodology is applied to a valuable credit database from Taiwan. The results suggest that forward stepwise selection yields the best performance for every classification algorithm considered. The conclusions are related to those in the literature, and their managerial implications are analyzed.
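Forward stepwise selection, the wrapper method the authors find strongest, can be sketched in a few lines. The code below is an illustrative, generic implementation (the helper names `forward_stepwise` and `lsq_score` are mine, and a least-squares fit stands in for the paper's classifiers): starting from the empty feature set, it greedily adds the feature whose inclusion most improves the score, stopping when no remaining candidate helps.

```python
import numpy as np

def lsq_score(X_subset, y):
    # Score a feature subset by the negated training MSE of an
    # ordinary least-squares fit with intercept; higher is better.
    A = np.column_stack([np.ones(len(y)), X_subset])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return -np.mean((A @ beta - y) ** 2)

def forward_stepwise(X, y, score_fn, max_features=None):
    # Greedy forward selection: start with no features, repeatedly
    # add the single feature whose inclusion yields the best score,
    # and stop when no remaining feature improves on the current best.
    n_features = X.shape[1]
    max_features = max_features or n_features
    selected, best = [], -np.inf
    while len(selected) < max_features:
        candidates = [j for j in range(n_features) if j not in selected]
        top_score, top_j = max(
            (score_fn(X[:, selected + [j]], y), j) for j in candidates
        )
        if top_score <= best:
            break
        selected.append(top_j)
        best = top_score
    return selected
```

Backward stepwise selection is the mirror image: it starts from the full feature set and greedily removes the feature whose deletion hurts the score least.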

