Multi-Layer Hybrid Credit Scoring Model Based on Feature Selection, Ensemble Learning, and Ensemble Classifier

Author(s):  
Diwakar Tripathi ◽  
Alok Kumar Shukla ◽  
Ramchandra Reddy B. ◽  
Ghanshyam S. Bopche

Credit scoring is the process of calculating the risk associated with a credit product, and it directly affects the profitability of the credit industry. Financial institutions periodically apply credit scoring in various steps. The main focus of this study is to improve the predictive performance of the credit scoring model, and to that end it proposes a multi-layer hybrid credit scoring model. The first stage concerns pre-processing, which includes treatment of missing values, data transformation, and removal of irrelevant and noisy features, because these may degrade the predictive performance of the model. The second stage applies various ensemble learning approaches such as Bagging and AdaBoost. The last layer applies an ensemble classifier approach that combines three heterogeneous classifiers for classification: random forest (RF), logistic regression (LR), and sequential minimal optimization (SMO). Finally, the proposed multi-layer model is validated on various real-world credit scoring datasets.
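The abstract does not say how the outputs of the three heterogeneous classifiers are combined in the last layer. As a minimal sketch, assuming simple majority voting over per-classifier label predictions, the combination step could look like the following. The function name `majority_vote` and the toy predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine label predictions from several classifiers by majority vote.

    predictions_per_classifier: list of equal-length label lists,
    one list per classifier (e.g. RF, LR, SMO-trained SVM).
    Assumption: plain majority voting; the paper may use another rule.
    """
    n_samples = len(predictions_per_classifier[0])
    if any(len(p) != n_samples for p in predictions_per_classifier):
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    # zip(*...) groups the votes cast for each sample across classifiers.
    for votes in zip(*predictions_per_classifier):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


# Toy predictions (1 = default, 0 = no default), made up for illustration.
rf_pred = [1, 0, 1, 1]
lr_pred = [1, 1, 0, 1]
smo_pred = [0, 0, 1, 1]
ensemble_pred = majority_vote([rf_pred, lr_pred, smo_pred])  # [1, 0, 1, 1]
```

With three voters and two classes a tie cannot occur; with an even number of voters, `Counter.most_common` would favour the label encountered first.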

Author(s):  
Wirot Yotsawat ◽  
Pakaket Wattuya ◽  
Anongnart Srivihok

Several credit scoring models have been developed using ensemble classifiers in order to improve the accuracy of assessment. However, among the ensemble models, little attention has been paid to tuning the hyper-parameters of the base learners, although these are crucial to constructing ensemble models. This study proposes an improved credit scoring model based on the extreme gradient boosting (XGB) classifier using Bayesian hyper-parameter optimization (XGB-BO). The model comprises two steps. Firstly, data pre-processing is used to handle missing values and scale the data. Secondly, Bayesian hyper-parameter optimization is applied to tune the hyper-parameters of the XGB classifier, and the tuned hyper-parameters are used to train the model. The model is evaluated on four widely used public datasets, i.e., the German, Australian, Lending Club, and Polish datasets. Several state-of-the-art classification algorithms are implemented for predictive comparison with the proposed method. The proposed model showed promising results, with improvements in accuracy of 4.10%, 3.03%, and 2.76% on the German, Lending Club, and Australian datasets, respectively. According to the evaluation results, the proposed model outperformed commonly used techniques, e.g., decision tree, support vector machine, neural network, logistic regression, random forest, and bagging. The experimental results confirmed that the XGB-BO model is suitable for assessing the creditworthiness of applicants.
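The first step, handling missing values and scaling, is not specified further in the abstract. The sketch below assumes mean imputation followed by min-max scaling to [0, 1]; both choices, the function name `impute_and_scale`, and the toy rows are illustrative assumptions rather than the authors' procedure.

```python
def impute_and_scale(rows):
    """Mean-impute missing values (None) and min-max scale each column.

    rows: list of equal-length lists holding floats or None.
    Assumption: mean imputation and min-max scaling; the paper's exact
    pre-processing is not given in the abstract.
    """
    n_cols = len(rows[0])
    result = [list(r) for r in rows]  # copy so the input is not modified
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        mean = sum(observed) / len(observed)
        low, high = min(observed), max(observed)
        span = high - low
        for r in result:
            value = mean if r[j] is None else r[j]
            # A constant column carries no information; map it to 0.0.
            r[j] = 0.0 if span == 0 else (value - low) / span
    return result


# Toy data with two missing entries, made up for illustration.
raw = [[1.0, 10.0], [None, 20.0], [3.0, None]]
clean = impute_and_scale(raw)  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

In a real evaluation the column means and ranges would normally be computed on the training split only and then reused on the test split, to avoid leaking test information.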


2018 ◽  
Vol 10 (7) ◽  
pp. 56
Author(s):  
Jie Li ◽  
Zhenyu Sheng

Chinese microfinance institutions need to measure and manage credit risk quantitatively in order to improve their competitiveness. To establish a credit scoring model (CSM) with sound predictive power, they should examine various models carefully, identify variables, assign values to variables, and reduce variable dimensions appropriately. Microfinance institutions could combine a CSM with loan officers' subjective appraisals to gradually improve their level of risk management. The paper sets up a CSM based on data from a microfinance company in Jiangsu province covering October 2009 to June 2014. To establish the model, the paper uses the Linear Discriminant Analysis (LDA) method, selects 16 initial variables, employs the direct method to assign values to variables, and enters all the variables into the model. Ten samples are constructed by randomly selecting records. Based on the samples, the coefficients are determined and the final non-standardized discriminant function is established. The Bank credit, Education, Old client, and Rate variables are found to have the greatest impact on the discriminant effect. Compared with similar international models, this model classifies well. The paper presents the key technical points of building a credit scoring model in a practical application, which provides help and references for Chinese microfinance institutions seeking to measure and manage credit risk quantitatively.
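To illustrate what a non-standardized discriminant function looks like, here is a minimal two-class Fisher LDA sketch restricted to two variables so that the matrix inverse can be written by hand. The paper uses 16 variables; the two features, the toy data, and the equal-prior cutoff at zero are assumptions made for illustration only.

```python
def mean_vector(rows):
    """Column means of a list of 2-element rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mu):
    """2x2 within-class scatter matrix around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda(good, bad):
    """Return (weights, intercept) of D(x) = intercept + w . x.

    D(x) > 0 assigns x to the 'good' class, assuming equal priors.
    """
    mu_g, mu_b = mean_vector(good), mean_vector(bad)
    sg, sb = scatter_matrix(good, mu_g), scatter_matrix(bad, mu_b)
    sw = [[sg[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("pooled scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_g[0] - mu_b[0], mu_g[1] - mu_b[1]]
    # Fisher direction: w = Sw^{-1} (mu_good - mu_bad)
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu_g[0] + mu_b[0]) / 2, (mu_g[1] + mu_b[1]) / 2]
    intercept = -(w[0] * mid[0] + w[1] * mid[1])
    return w, intercept


def discriminant_score(x, w, intercept):
    """Evaluate the non-standardized discriminant function at x."""
    return intercept + w[0] * x[0] + w[1] * x[1]


# Made-up records: two hypothetical variables per borrower.
good_clients = [[4.0, 3.0], [5.0, 4.0], [4.0, 5.0], [5.0, 5.0]]
bad_clients = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
weights, bias = fit_lda(good_clients, bad_clients)
```

The coefficients in `weights` are on the original variable scales, which is what "non-standardized" refers to; a larger absolute coefficient does not by itself mean a more influential variable unless the variables share a scale.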


2021 ◽  
pp. 1-16
Author(s):  
Fang He ◽  
Wenyu Zhang ◽  
Zhijia Yan

Credit scoring has become increasingly important for financial institutions. With the advancement of artificial intelligence, machine learning methods, especially ensemble learning methods, have become increasingly popular for credit scoring. However, the problems of imbalanced data distribution and underutilized feature information have not been sufficiently addressed. To make the credit scoring model more adaptable to imbalanced datasets, the original model-based synthetic sampling method is extended herein to balance the datasets by generating appropriate minority samples to alleviate class overlap. To enable the credit scoring model to extract inherent correlations from features, a new bagging-based feature transformation method is proposed, which transforms features using a tree-based algorithm and selects features using the chi-square statistic. Furthermore, a two-layer ensemble method that combines the advantages of dynamic ensemble selection and stacking is proposed to improve the classification performance of the proposed multi-stage ensemble model. Finally, four standardized datasets are used to evaluate the performance of the proposed ensemble model using six evaluation metrics. The experimental results confirm that the proposed ensemble model is effective in improving classification performance and is superior to other benchmark models.
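The chi-square selection step can be sketched as follows, assuming the tree-based transformation has already produced categorical features (for example leaf indices) and that Pearson's chi-square statistic on the feature-by-label contingency table is the ranking criterion. The function names, feature names, and data are hypothetical; the paper's exact statistic and selection threshold are not given in the abstract.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic between a categorical feature and labels."""
    n = len(labels)
    if len(feature) != n:
        raise ValueError("feature and labels must have the same length")
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))

    stat = 0.0
    for f_value, f_count in feature_counts.items():
        for l_value, l_count in label_counts.items():
            # Expected cell count under independence of feature and label.
            expected = f_count * l_count / n
            observed = joint_counts.get((f_value, l_value), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def select_top_k(features, labels, k):
    """Return the names of the k features with the largest statistic.

    features: dict mapping feature name -> list of categorical values.
    """
    ranked = sorted(features, key=lambda name: chi_square(features[name], labels),
                    reverse=True)
    return ranked[:k]


# Made-up transformed features: leaf_a tracks the label, leaf_b does not.
y = [0, 0, 1, 1]
transformed = {"leaf_a": [0, 0, 1, 1], "leaf_b": [0, 1, 0, 1]}
selected = select_top_k(transformed, y, k=1)  # ["leaf_a"]
```

A feature that is independent of the label scores zero, so it is ranked last and dropped first.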


2017 ◽  
Vol 13 (1) ◽  
pp. 51
Author(s):  
Oriol Amat ◽  
Raffaele Manini ◽  
Marcos Antón Renart

Purpose: This study develops and tests a credit scoring model that can help financial institutions assess credit requests. Design/methodology/approach: The empirical study aims to answer two questions: (1) Which ratios better discriminate between solvent and insolvent companies? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, and Logit and Probit Models). Several samples of companies have been used to obtain and test the model. Findings: Through the application of several statistical techniques, the credit scoring model has proved effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial, and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions decide to whom to grant credit. Social implications: Because of the growing importance of credit for society and the fear of granting it following the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions' assessments. Originality/value: There is already a stream of literature on credit scoring. However, this paper focuses on Spanish firms and validates the results of the model on real data. The application of the model to detect the probability of default in loans is original.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Pranith Kumar Roy ◽  
Krishnendu Shaw

Small- and medium-sized enterprises (SMEs) have a crucial influence on the economic development of every nation, but access to formal finance remains a barrier. Similarly, financial institutions encounter challenges in assessing SMEs' creditworthiness for the provision of financing. Financial institutions employ credit scoring models to identify potential borrowers and to determine loan pricing and collateral requirements. SMEs are perceived as unorganized in terms of financial data management compared to large corporations, so assessing credit risk on the basis of inadequate financial data is a cause for concern among financial institutions. The majority of existing models are data-driven and have faced criticism for failing to meet their assumptions. To address the issue of limited financial record keeping, this study developed and validated a system to predict SMEs' credit risk by introducing a multicriteria credit scoring model. The model was constructed using a hybrid of the best–worst method (BWM) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). Initially, the BWM determines the criteria weights, and TOPSIS is then applied to score SMEs. A real-life case study was examined to demonstrate the effectiveness of the proposed model, and a sensitivity analysis varying the weights of the criteria was performed to assess robustness against unpredictable financial situations. The findings indicated that SMEs' credit history, cash liquidity, and repayment period are the most crucial factors in lending, followed by return on capital, financial flexibility, and integrity. The proposed credit scoring model outperformed the existing commercial model in terms of its accuracy in predicting defaults. This model could assist financial institutions by providing a simple means of identifying potential SMEs to grant credit, and it could advance further research using alternative approaches.
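The TOPSIS scoring step can be sketched as below, taking the criteria weights as given. In the paper these weights come from the BWM, which requires solving an optimization problem that is not shown here. The weights, the three criteria, their benefit/cost direction, and the SME values are made-up assumptions for illustration.

```python
import math


def topsis(matrix, weights, is_benefit):
    """Return one closeness score in [0, 1] per alternative (row).

    matrix: rows are alternatives (SMEs), columns are criteria.
    weights: criteria weights (assumed given, e.g. from BWM).
    is_benefit: True where larger is better, False where smaller is better.
    """
    n_criteria = len(weights)
    # Vector normalization of each column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_criteria)]
    if any(norm == 0 for norm in norms):
        raise ValueError("a criterion column is all zeros")
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in matrix]

    columns = list(zip(*weighted))
    ideal = [max(col) if is_benefit[j] else min(col) for j, col in enumerate(columns)]
    anti_ideal = [min(col) if is_benefit[j] else max(col) for j, col in enumerate(columns)]

    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_neg = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, anti_ideal)))
        total = d_pos + d_neg
        # All alternatives identical: no basis for ranking, return the midpoint.
        scores.append(0.5 if total == 0 else d_neg / total)
    return scores


# Hypothetical criteria: credit history (benefit), liquidity (benefit),
# repayment period (treated as a cost here, purely as an assumption).
smes = [[9.0, 8.0, 2.0],   # SME A
        [5.0, 5.0, 5.0],   # SME B
        [1.0, 2.0, 9.0]]   # SME C
closeness = topsis(smes, weights=[0.5, 0.3, 0.2], is_benefit=[True, True, False])
```

An alternative that is best on every criterion coincides with the ideal solution and scores 1, while one that is worst on every criterion scores 0; here SME A and SME C are those two extremes and SME B falls in between.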

