Building a Credit Scoring Model Based on Data Mining Approaches

One of the biggest challenges in the banking sector today is assessing a client’s creditworthiness. To improve decision-making and risk management, banks use data mining techniques to recognize hidden patterns in large volumes of data. The main objective of this study is to build a high-performance, customized credit scoring model. The model, named Reliable client, is based on a bank’s real dataset and was originally built by applying four different classification algorithms: decision tree (DT), naive Bayes (NB), generalized linear model (GLM) and support vector machine (SVM). The adopted model is based on the GLM algorithm, because it produced the best results and also appeared to be the most appropriate algorithm. The results of this model are reported using several performance measures, which showed high predictive confidence and accuracy; we also demonstrate that data pre-processing has a significant impact on model performance. Statistical analysis of the model identified the parameters with the most significant influence on the model outcome. Finally, the credit scoring model was evaluated using another set of real data from the same bank.
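
The abstract above compares classifiers by performance measures without naming them. As a minimal sketch, assuming the measures include accuracy, precision, recall and F1 (an assumption, not stated in the source), they can be computed from a confusion matrix as follows. The function names and the toy labels are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def score_report(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = reliable client, 0 = not reliable.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
report = score_report(y_true, y_pred)
```

Running the same report on the predictions of each candidate algorithm gives a like-for-like basis for choosing among them.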


Credit concession through credit scoring: Analysis and application proposal

Intangible Capital ◽

10.3926/ic.903 ◽

2017 ◽

Vol 13 (1) ◽

pp. 51 ◽

Cited By ~ 5

Author(s):

Oriol Amat ◽

Raffaele Manini ◽

Marcos Antón Renart

Keyword(s):

Financial Institutions ◽

Design Methodology ◽

Credit Scoring ◽

Real Data ◽

Statistical Techniques ◽

Probability Of Default ◽

Relative Importance ◽

Linear Discriminant ◽

Scoring Model ◽

Credit Scoring Model

Purpose: This study develops and tests a credit scoring model that can help financial institutions assess credit requests. Design/methodology/approach: The empirical study aims to answer two questions: (1) Which ratios best discriminate between solvent and insolvent companies? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus were used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies were used to obtain and test the model. Findings: Through the application of several statistical techniques, the credit scoring model proved effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions decide to whom to grant credit. Social implications: Because of the growing importance of credit for our society and the fear of granting it after the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions’ assessments. Originality/value: There is already a stream of literature on credit scoring. However, this paper focuses on Spanish firms and validates the results of the model on real data. The application of the model to detect the probability of default in loans is original.
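
The logit and probit models named above both map a linear combination of financial ratios to a probability of default; they differ only in the link function. The sketch below illustrates that difference. The ratios, coefficients and intercept are hypothetical values chosen for illustration and are not taken from the paper.

```python
import math


def linear_score(ratios, coefficients, intercept):
    """Linear predictor z = intercept + sum(coefficient_i * ratio_i)."""
    return intercept + sum(c * r for c, r in zip(coefficients, ratios))


def logit_pd(z):
    """Probability of default under the logit (logistic) link."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_pd(z):
    """Probability of default under the probit (standard normal CDF) link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical company: debt ratio 0.8, current ratio 1.2.
# Hypothetical coefficients: more debt raises risk, more liquidity lowers it.
z = linear_score(ratios=[0.8, 1.2], coefficients=[2.0, -1.5], intercept=-1.0)
pd_logit = logit_pd(z)
pd_probit = probit_pd(z)
```

Both links return 0.5 at z = 0 and increase with z, so the sign of each fitted coefficient indicates whether a ratio raises or lowers the estimated default risk.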


A credit scoring model using support vector machine

Fifth World Congress on Intelligent Control and Automation (IEEE Cat. No.04EX788) ◽

10.1109/wcica.2004.1341919 ◽

2004 ◽

Author(s):

Xiang Tian ◽

Feiqi Deng

Keyword(s):

Support Vector Machine ◽

Credit Scoring ◽

Support Vector ◽

Scoring Model ◽

Credit Scoring Model


A Hybrid Credit Scoring Model Based on Genetic Programming and Support Vector Machines

2008 Fourth International Conference on Natural Computation ◽

10.1109/icnc.2008.205 ◽

2008 ◽

Cited By ~ 13

Author(s):

Defu Zhang ◽

Mhand Hifi ◽

Qingshan Chen ◽

Weiguo Ye

Keyword(s):

Support Vector Machines ◽

Genetic Programming ◽

Credit Scoring ◽

Support Vector ◽

Scoring Model ◽

Model Based ◽

Vector Machines ◽

Credit Scoring Model


Methodology for the validation of the credit scoring model of the retail portfolio

10.18411/lj-05-2021-265 ◽

2021 ◽

Vol 73 (7) ◽

pp. 41-44

Author(s):

Y.S. Zhieru

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Final Stage ◽

Logistic Regression Model ◽

Credit Scoring ◽

Real Data ◽

Scoring Model ◽

Credit Scoring Model

The final stage of constructing a logistic regression model is checking its validity and testing it on real data. The validity of a logistic regression model is evidenced by its ability to classify borrowers correctly, that is, to distinguish "good" borrowers from "bad" ones.
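
The abstract does not say which statistic is used to measure how well the model separates "good" from "bad" borrowers. One common choice, assumed here, is the area under the ROC curve (AUC) and the Gini coefficient derived from it. The sketch below computes both by pairwise comparison; the labels and scores are hypothetical.

```python
def auc(labels, scores):
    """Share of (bad, good) pairs in which the bad borrower has the higher
    predicted risk score; ties count as half a win."""
    bad = [s for y, s in zip(labels, scores) if y == 1]
    good = [s for y, s in zip(labels, scores) if y == 0]
    if not bad or not good:
        raise ValueError("need at least one borrower of each class")
    wins = 0.0
    for b in bad:
        for g in good:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bad) * len(good))


def gini(labels, scores):
    """Gini coefficient: 0 for a random model, 1 for perfect separation."""
    return 2.0 * auc(labels, scores) - 1.0


# Hypothetical validation sample: 1 = "bad" borrower, 0 = "good" borrower.
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.8]
model_auc = auc(labels, scores)
model_gini = gini(labels, scores)
```

The pairwise loop is quadratic in sample size, which is acceptable for a sketch; a rank-based computation would be preferable on a full retail portfolio.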


Using data mining approaches to build credit scoring model: Case study — Implementation of credit scoring model in microfinance institution

2018 17th International Symposium INFOTEH-JAHORINA (INFOTEH) ◽

10.1109/infoteh.2018.8345543 ◽

2018 ◽

Cited By ~ 3

Author(s):

Jasmina Nalic ◽

Amar Svraka

Keyword(s):

Data Mining ◽

Credit Scoring ◽

Model Case ◽

Scoring Model ◽

Study Implementation ◽

Microfinance Institution ◽

Using Data ◽

Credit Scoring Model


Credit Scoring Model based on Kernel Density Estimation and Support Vector Machine for Group Feature Selection

2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2018.8554524 ◽

2018 ◽

Author(s):

Xingzhi Zhang ◽

Zhurong Zhou

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Density Estimation ◽

Kernel Density Estimation ◽

Credit Scoring ◽

Kernel Density ◽

Support Vector ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model


Multi-Class Support Vector Machine for Credit Scoring

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.235.419 ◽

2012 ◽

Vol 235 ◽

pp. 419-422 ◽

Cited By ~ 1

Author(s):

Bo Tang ◽

Sai Bing Qiu

Keyword(s):

Support Vector Machine ◽

Credit Scoring ◽

Real Life ◽

Assessment Model ◽

Behavior Assessment ◽

Support Vector ◽

Classification Problems ◽

Scoring Model ◽

Multiple Classification ◽

Credit Scoring Model

A general credit scoring model solves a two-class classification problem, but in real life we often encounter multi-class classification problems. This paper proposes a multi-class support vector machine that can solve multi-class classification problems in a behavior assessment model.
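
The abstract does not state how the multi-class support vector machine is constructed. One common construction, assumed here purely for illustration, is one-vs-rest: one linear decision function per class, with the highest-scoring class winning. The class names, weights and biases below are hypothetical hand-picked values, not trained parameters.

```python
def decision_value(weights, bias, x):
    """Linear SVM decision function w . x + b for one class-vs-rest model."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def predict_one_vs_rest(models, x):
    """Return the class whose one-vs-rest decision value is largest."""
    return max(models, key=lambda name: decision_value(*models[name], x))


# Hypothetical pre-trained models: class name -> (weights, bias).
models = {
    "low_risk": ([-1.0, -1.0], 2.5),
    "medium_risk": ([1.0, 0.0], -2.5),
    "high_risk": ([0.0, 1.0], -2.5),
}

# Hypothetical two-feature behavior profiles.
predictions = [predict_one_vs_rest(models, x)
               for x in ([0.0, 0.0], [5.0, 0.0], [0.0, 5.0])]
```

With k classes this approach trains k binary machines; a one-vs-one scheme would instead train k(k-1)/2 and decide by voting.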


Improved credit scoring model using XGBoost with Bayesian hyper-parameter optimization

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i6.pp5477-5487 ◽

2021 ◽

Vol 11 (6) ◽

pp. 5477

Author(s):

Wirot Yotsawat ◽

Pakaket Wattuya ◽

Anongnart Srivihok

Keyword(s):

Parameter Optimization ◽

Missing Values ◽

Credit Scoring ◽

Gradient Boosting ◽

Support Vector ◽

Scoring Model ◽

Ensemble Models ◽

Proposed Model ◽

Extreme Gradient Boosting ◽

Credit Scoring Model

Several credit-scoring models have been developed using ensemble classifiers in order to improve the accuracy of assessment. However, among the ensemble models, little attention has been paid to tuning the hyper-parameters of base learners, although these are crucial to constructing ensemble models. This study proposes an improved credit scoring model based on the extreme gradient boosting (XGB) classifier using Bayesian hyper-parameter optimization (XGB-BO). The model comprises two steps. Firstly, data pre-processing is used to handle missing values and scale the data. Secondly, Bayesian hyper-parameter optimization is applied to tune the hyper-parameters of the XGB classifier, and the tuned hyper-parameters are used to train the model. The model is evaluated on four widely used public datasets, i.e., the German, Australian, Lending Club, and Polish datasets. Several state-of-the-art classification algorithms are implemented for predictive comparison with the proposed method. The proposed model showed promising results, with an improvement in accuracy of 4.10%, 3.03%, and 2.76% on the German, Lending Club, and Australian datasets, respectively. According to the evaluation results, the proposed model outperformed commonly used techniques, e.g., decision tree, support vector machine, neural network, logistic regression, random forest, and bagging. The experimental results confirmed that the XGB-BO model is suitable for assessing the creditworthiness of applicants.
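
The first step described above handles missing values and scales the data, but the abstract does not say how. The sketch below assumes median imputation followed by min-max scaling to the range [0, 1]; both choices and the toy rows are assumptions made for illustration. The Bayesian tuning step is not reproduced here.

```python
from statistics import median


def impute_median(column):
    """Replace missing entries (None) with the median of the observed ones."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale a column linearly so its minimum is 0 and its maximum is 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]


def preprocess(rows):
    """Impute and scale every column of a row-oriented table."""
    columns = [list(col) for col in zip(*rows)]
    cleaned = [min_max_scale(impute_median(col)) for col in columns]
    return [list(row) for row in zip(*cleaned)]


# Hypothetical applicants: [age, loan amount], with None marking a gap.
rows = [[20, 1000], [None, 3000], [40, None], [60, 5000]]
prepared = preprocess(rows)
```

In practice the imputation value and the scaling bounds would be computed on the training split only and then reused on the test split, to avoid leaking information.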


Automated credit decision process - an insight into developing a credit-scoring model within the Nepalese banking sector

International Journal of Decision Sciences Risk and Management ◽

10.1504/ijdsrm.2012.053380 ◽

2012 ◽

Vol 4 (3/4) ◽

pp. 233 ◽

Cited By ~ 1

Author(s):

Satish Sharma ◽

Jackie Harvey ◽

Andrew Robson

Keyword(s):

Decision Process ◽

Banking Sector ◽

Credit Scoring ◽

Scoring Model ◽

Credit Scoring Model ◽

Insight Into


Feature Selection in a Credit Scoring Model

Mathematics ◽

10.3390/math9070746 ◽

2021 ◽

Vol 9 (7) ◽

pp. 746

Author(s):

Juan Laborda ◽

Seyong Ryoo

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Superior Performance ◽

Filter Method ◽

Support Vector ◽

Classification Algorithms ◽

Scoring Model ◽

Stepwise Selection ◽

Forward Stepwise ◽

Credit Scoring Model

This paper applies different classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three different feature selection methods are used to mitigate the overfitting caused by the curse of dimensionality in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of these three methods is discussed using two measures: the mean absolute error and the number of selected features. The methodology is applied to a valuable database from Taiwan. The results suggest that forward stepwise selection yields superior performance for each of the classification algorithms used. The conclusions are related to those in the literature, and their managerial implications are analyzed.
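
Forward stepwise selection, the wrapper method highlighted above, can be sketched as a greedy loop that adds the feature giving the largest drop in error and stops when no candidate helps. The toy "classifier" below (predict default if any selected flag is set), the feature names and the data are hypothetical stand-ins for the paper's real models and dataset.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between labels and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def forward_stepwise(candidates, evaluate):
    """Greedily add the feature that lowers the error most; stop when
    no remaining feature improves on the current best error."""
    selected = []
    best_error = evaluate(selected)
    remaining = list(candidates)
    while remaining:
        error, feature = min((evaluate(selected + [f]), f) for f in remaining)
        if error >= best_error:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_error = error
    return selected, best_error


def make_evaluator(rows, labels):
    """Toy model: predict default (1) if any selected binary flag is set."""
    def evaluate(subset):
        preds = [1 if any(row[f] for f in subset) else 0 for row in rows]
        return mean_absolute_error(labels, preds)
    return evaluate


# Hypothetical binary features per applicant and default labels.
names = ["late_payment", "high_utilisation", "noise"]
table = [(1, 0, 0), (1, 0, 1), (0, 1, 0), (0, 0, 1),
         (0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 0, 0)]
rows = [dict(zip(names, values)) for values in table]
labels = [1, 1, 1, 0, 0, 0, 1, 0]

selected, error = forward_stepwise(names, make_evaluator(rows, labels))
```

Backward stepwise selection follows the same pattern in reverse: start from all features and repeatedly drop the one whose removal lowers the error most.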
