A novel multi-stage ensemble model for credit scoring based on synthetic sampling and feature transformation

2021 ◽  
pp. 1-16
Author(s):  
Fang He ◽  
Wenyu Zhang ◽  
Zhijia Yan

Credit scoring has become increasingly important for financial institutions. With the advancement of artificial intelligence, machine learning methods, especially ensemble learning methods, have become increasingly popular for credit scoring. However, the problems of imbalanced data distribution and underutilized feature information have not been sufficiently addressed. To make the credit scoring model more adaptable to imbalanced datasets, the original model-based synthetic sampling method is extended herein to balance the datasets by generating appropriate minority samples to alleviate class overlap. To enable the credit scoring model to extract inherent correlations from features, a new bagging-based feature transformation method is proposed, which transforms features using a tree-based algorithm and selects features using the chi-square statistic. Furthermore, a two-layer ensemble method that combines the advantages of dynamic ensemble selection and stacking is proposed to improve the classification performance of the proposed multi-stage ensemble model. Finally, four standardized datasets are used to evaluate the performance of the proposed ensemble model using six evaluation metrics. The experimental results confirm that the proposed ensemble model is effective in improving classification performance and is superior to other benchmark models.
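
The chi-square selection step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the authors' procedure, so everything below is an assumption: the function names `chi_square` and `select_top_k` are hypothetical, the features are taken to be categorical indicators (for example, leaf-membership flags that a tree-based transformation might produce), and the score is the plain Pearson chi-square statistic of the feature-by-class contingency table.

```python
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic of a categorical feature against class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # contingency-table cell counts
    feature_totals = Counter(feature)          # row totals
    label_totals = Counter(labels)             # column totals
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n   # count expected under independence
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat

def select_top_k(features, labels, k):
    """Return the names of the k features with the largest chi-square score."""
    scores = {name: chi_square(col, labels) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: two binary indicator features and a binary default label.
labels = [0, 0, 1, 1]
features = {"leaf_a": [0, 0, 1, 1], "leaf_b": [0, 1, 0, 1]}
print(select_top_k(features, labels, k=1))
```

A feature that is independent of the label scores zero, so ranking by this statistic keeps the transformed features most associated with default.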

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As a widely recognized machine learning method, ensemble learning has demonstrated significant improvements in predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
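
As a rough illustration of clustering-based undersampling, the sketch below clusters the majority class with K-means and keeps one representative per cluster. It is not the authors' "multiple K-means-based" method, whose details the abstract does not give. The assumptions are: the number of clusters equals the minority-class size, the retained sample is the majority point nearest each centroid, and the names `kmeans` and `kmeans_undersample` are hypothetical.

```python
import random

def _dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns k centroids (requires k <= len(points))."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: _dist2(p, centroids[i]))
            clusters[idx].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids

def kmeans_undersample(majority, minority, seed=0):
    """Keep len(minority) majority samples, one nearest to each centroid."""
    centroids = kmeans(majority, len(minority), seed=seed)
    remaining = list(majority)
    kept = []
    for c in centroids:
        nearest = min(remaining, key=lambda p: _dist2(p, c))
        kept.append(nearest)
        remaining.remove(nearest)  # avoid selecting the same sample twice
    return kept

# Hypothetical toy data: six majority ("good") and two minority ("bad") applicants.
majority = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
minority = [(5, 5), (6, 6)]
balanced_majority = kmeans_undersample(majority, minority)
print(balanced_majority)
```

Picking real samples near the centroids, rather than the centroids themselves, keeps the reduced majority set made up of observed applicants while still covering each region of the majority class.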


2018 ◽  
Vol 10 (7) ◽  
pp. 56
Author(s):  
Jie Li ◽  
Zhenyu Sheng

Chinese microfinance institutions need to measure and manage credit risk quantitatively in order to improve competitiveness. To establish a credit scoring model (CSM) with sound predictive power, they should examine various models carefully, identify variables, assign values to variables and reduce variable dimensions in an appropriate way. Microfinance institutions could employ both a CSM and loan officers' subjective appraisals to improve their level of risk management gradually. The paper sets up a CSM based on the data of a microfinance company operating in Jiangsu province from October 2009 to June 2014. To establish the model, the paper uses the Linear Discriminant Analysis (LDA) method, selects 16 initial variables, employs the direct method to assign variables and includes all the variables in the model. Ten samples are constructed by randomly selecting records. Based on the samples, the coefficients are determined and the final non-standardized discriminant function is established. It is found that the Bank credit, Education, Old client and Rate variables have the greatest impact on the discriminant effect. Compared with similar international models, the model's classification performance is satisfactory. The paper presents the key technical points of building a credit scoring model based on a practical application, which provides help and references for Chinese microfinance institutions to measure and manage credit risk quantitatively.


2017 ◽  
Vol 13 (1) ◽  
pp. 51
Author(s):  
Oriol Amat ◽  
Raffaele Manini ◽  
Marcos Antón Renart

Purpose: The study herein develops and tests a credit scoring model which can help financial institutions in assessing credit requests. Design/methodology/approach: The empirical study has the objective of answering two questions: (1) Which ratios better discriminate the companies based on their being solvent or insolvent? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies have been used in order to obtain and to test the model. Findings: Through the application of several statistical techniques, the credit scoring model has been proved to be effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions assess to whom to grant credit and to whom not to grant credit. Social implications: Because of the growing importance of credit for our society and the fear of granting it due to the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions' assessments. Originality/value: There is already a stream of literature related to credit scoring. However, this paper focuses on Spanish firms and proves the results of our model based on real data. The application of the model to detect the probability of default in loans is original.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Pranith Kumar Roy ◽  
Krishnendu Shaw

Small- and medium-sized enterprises (SMEs) have a crucial influence on the economic development of every nation, but access to formal finance remains a barrier. Similarly, financial institutions encounter challenges in the assessment of SMEs’ creditworthiness for the provision of financing. Financial institutions employ credit scoring models to identify potential borrowers and to determine loan pricing and collateral requirements. SMEs are perceived as unorganized in terms of financial data management compared to large corporations, making the assessment of credit risk based on inadequate financial data a cause for financial institutions’ concern. The majority of existing models are data-driven and have faced criticism for failing to meet their assumptions. To address the issue of limited financial record keeping, this study developed and validated a system to predict SMEs’ credit risk by introducing a multicriteria credit scoring model. The model was constructed using a hybrid best–worst method (BWM) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). Initially, the BWM determines the weight criteria, and TOPSIS is applied to score SMEs. A real-life case study was examined to demonstrate the effectiveness of the proposed model, and a sensitivity analysis varying the weight of the criteria was performed to assess robustness against unpredictable financial situations. The findings indicated that SMEs’ credit history, cash liquidity, and repayment period are the most crucial factors in lending, followed by return on capital, financial flexibility, and integrity. The proposed credit scoring model outperformed the existing commercial model in terms of its accuracy in predicting defaults. This model could assist financial institutions, providing a simple means for identifying potential SMEs to grant credit, and advance further research using alternative approaches.
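
The TOPSIS scoring step can be sketched as follows. This is a generic textbook TOPSIS, not the study's implementation. The weights, which the study derives with the BWM, are simply assumed here, and the SME figures, the treatment of repayment period as a cost criterion, and the function name `topsis` are all hypothetical.

```python
import math

def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row) of the decision matrix.

    benefit[j] is True when larger values of criterion j are better.
    Assumes no criterion column is all zeros and the alternatives are not all identical.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti_ideal = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((x - i) ** 2 for x, i in zip(row, ideal)))
        d_neg = math.sqrt(sum((x - a) ** 2 for x, a in zip(row, anti_ideal)))
        scores.append(d_neg / (d_pos + d_neg))  # 1.0 means the alternative is the ideal
    return scores

# Hypothetical ratings of three SMEs on credit history, cash liquidity and repayment period.
smes = [[9, 8, 2], [5, 5, 5], [1, 2, 9]]
weights = [0.5, 0.3, 0.2]          # assumed; the study obtains its weights from the BWM
benefit = [True, True, False]      # assumption: a shorter repayment period is better
print(topsis(smes, weights, benefit))
```

Because the first SME is best on every criterion and the third is worst on every criterion, they coincide with the ideal and anti-ideal points, so the second SME's score falls strictly between theirs.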

