A novel multi-stage ensemble model for credit scoring based on synthetic sampling and feature transformation

Credit scoring has become increasingly important for financial institutions. With the advancement of artificial intelligence, machine learning methods, especially ensemble learning methods, have become increasingly popular for credit scoring. However, the problems of imbalanced data distribution and underutilized feature information have not been sufficiently addressed. To make the credit scoring model more adaptable to imbalanced datasets, the original model-based synthetic sampling method is extended herein to balance the datasets by generating appropriate minority samples to alleviate class overlap. To enable the credit scoring model to extract inherent correlations from features, a new bagging-based feature transformation method is proposed, which transforms features using a tree-based algorithm and selects features using the chi-square statistic. Furthermore, a two-layer ensemble method that combines the advantages of dynamic ensemble selection and stacking is proposed to improve the classification performance of the proposed multi-stage ensemble model. Finally, four standardized datasets are used to evaluate the performance of the proposed ensemble model using six evaluation metrics. The experimental results confirm that the proposed ensemble model is effective in improving classification performance and is superior to other benchmark models.
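The chi-square selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the tree-based transformation has already produced binary leaf-membership indicators; that representation, the function names `chi_square_score` and `select_top_k`, and the toy data are all assumptions for illustration and are not taken from the paper.

```python
def chi_square_score(feature, labels):
    """Chi-square statistic of the 2x2 table: binary feature vs. binary label."""
    n = len(labels)
    # Observed counts, indexed as observed[feature_value][label_value].
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, labels):
        observed[f][y] += 1
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    score = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if feature and label were independent.
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                score += (observed[i][j] - expected) ** 2 / expected
    return score


def select_top_k(features, labels, k):
    """Return indices of the k features with the highest chi-square score."""
    scores = [chi_square_score(col, labels) for col in features]
    ranked = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical leaf indicators (one list per transformed feature).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
leaf_indicators = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly aligned with the label
    [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
    [0, 0, 0, 1, 1, 1, 1, 1],  # mostly aligned with the label
]
selected = select_top_k(leaf_indicators, labels, k=2)
```

In this toy example the indicator that is independent of the label scores zero and is dropped, which is the behaviour a chi-square filter is meant to provide.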

A novel multi-stage ensemble model with multiple K-means-based selective undersampling: An application in credit scoring

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201954 ◽

2021 ◽

Vol 40 (5) ◽

pp. 9471-9484

Author(s):

Yilun Jin ◽

Yanan Liu ◽

Wenyu Zhang ◽

Shuai Zhang ◽

Yu Lou

Keyword(s):

Machine Learning ◽

Predictive Accuracy ◽

Credit Scoring ◽

Imbalanced Data ◽

Ensemble Model ◽

Selective Sampling ◽

Machine Learning Methods ◽

Multi Stage ◽

Proposed Model ◽

New Feature

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
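As a rough illustration of cluster-based undersampling, the sketch below clusters the majority class with a plain K-means and keeps the real sample nearest each centroid. This is a generic technique, not the authors' multiple K-means-based selective method, whose details are not given in the abstract; the function names and toy data are hypothetical.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


def kmeans_undersample(majority, n_keep, seed=0):
    """Keep the majority sample closest to each of n_keep centroids."""
    kept = []
    for centroid in kmeans(majority, n_keep, seed=seed):
        nearest = min(majority, key=lambda p: squared_distance(p, centroid))
        if nearest not in kept:  # two centroids may share a nearest sample
            kept.append(nearest)
    return kept


# Hypothetical toy data: nine majority samples, three minority samples.
majority = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
            (5.0, 5.1), (5.2, 5.0), (5.1, 5.3),
            (9.0, 0.1), (9.2, 0.0), (9.1, 0.3)]
minority = [(2.0, 2.0), (2.5, 2.2), (3.0, 1.8)]
kept = kmeans_undersample(majority, n_keep=len(minority))
balanced = kept + minority
```

Keeping real samples rather than the centroids themselves means the reduced majority set contains only observed records, which is one common design choice among several.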

A novel multi-stage ensemble model with a hybrid genetic algorithm for credit scoring on imbalanced data

IEEE Access ◽

10.1109/access.2021.3120086 ◽

2021 ◽

pp. 1-1

Author(s):

Yilun Jin ◽

Wenyu Zhang ◽

Xin Wu ◽

Yanan Liu ◽

Zeqian Hu

Keyword(s):

Genetic Algorithm ◽

Credit Scoring ◽

Hybrid Genetic Algorithm ◽

Imbalanced Data ◽

Ensemble Model ◽

Multi Stage

A heterogeneous ensemble credit scoring model based on adaptive classifier selection: An application on imbalanced data

International Journal of Finance & Economics ◽

10.1002/ijfe.2019 ◽

2020 ◽

Author(s):

Tong Zhang ◽

Guotai Chi

Keyword(s):

Credit Scoring ◽

Imbalanced Data ◽

Scoring Model ◽

Model Based ◽

Classifier Selection ◽

Heterogeneous Ensemble ◽

Credit Scoring Model

Two‐Stage Credit Scoring Model Based on Evolutionary Feature Selection and Ensemble Neural Networks

Machine Learning Algorithms and Applications ◽

10.1002/9781119769262.ch6 ◽

2021 ◽

pp. 99-115

Author(s):

Diwakar Tripathi ◽

Damodar Reddy Edla ◽

Annushree Bablani ◽

Venkatanareshbabu Kuppili

Keyword(s):

Neural Networks ◽

Feature Selection ◽

Credit Scoring ◽

Two Stage ◽

Scoring Model ◽

Model Based ◽

Ensemble Neural Networks ◽

Credit Scoring Model

Credit scoring model based on multilayer perceptron

Proceedings of the 2003 IEEE International Symposium on Intelligent Control ISIC-03 ◽

10.1109/isic.2003.1254686 ◽

2003 ◽

Author(s):

Sulin Pang ◽

Yanming Wang ◽

Yuanhuai Bai

Keyword(s):

Multilayer Perceptron ◽

Credit Scoring ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model

Measuring and Managing Credit Risk for Chinese Microfinance Institutions

International Journal of Economics and Finance ◽

10.5539/ijef.v10n7p56 ◽

2018 ◽

Vol 10 (7) ◽

pp. 56

Author(s):

Jie Li ◽

Zhenyu Sheng

Keyword(s):

Credit Risk ◽

Predictive Power ◽

Direct Method ◽

Credit Scoring ◽

Microfinance Institutions ◽

Linear Discriminant ◽

Management Level ◽

Scoring Model ◽

Subjective Appraisals ◽

Credit Scoring Model

Chinese microfinance institutions need to measure and manage credit risk quantitatively in order to improve their competitiveness. To establish a credit scoring model (CSM) with sound predictive power, they should examine various models carefully, identify variables, assign values to variables and reduce variable dimensions in an appropriate way. Microfinance institutions could employ both a CSM and loan officers' subjective appraisals to improve their level of risk management gradually. The paper sets up a CSM based on the data of a microfinance company operating from October 2009 to June 2014 in Jiangsu province. To establish the model, the paper uses the Linear Discriminant Analysis (LDA) method, selects 16 initial variables, employs the direct method to assign variables and includes all the variables in the model. Ten samples are constructed by randomly selecting records. Based on the samples, the coefficients are determined and the final non-standardized discriminant function is established. It is found that the Bank credit, Education, Old client and Rate variables have the greatest impact on the discriminant effect. Compared with similar international models, this model's classification performance is satisfactory. The paper presents the key technical points of building a credit scoring model based on a practical application, which provides help and references for Chinese microfinance institutions to measure and manage credit risk quantitatively.
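To make the idea of a non-standardized linear discriminant function concrete, here is a minimal two-class Fisher discriminant on two raw variables. The paper uses 16 variables and its own data; the two numeric variables, the sample values and the function names below are hypothetical.

```python
def mean(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mu):
    """2x2 within-class scatter matrix around the class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda(good, bad):
    """Return (weights, threshold) of the discriminant w . x > threshold."""
    mu_g, mu_b = mean(good), mean(bad)
    sg, sb = scatter(good, mu_g), scatter(bad, mu_b)
    # Pooled within-class scatter and its 2x2 inverse.
    s = [[sg[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [mu_g[0] - mu_b[0], mu_g[1] - mu_b[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Cut-off at the projection of the midpoint between the class means.
    mid = [(mu_g[0] + mu_b[0]) / 2, (mu_g[1] + mu_b[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def is_good(w, threshold, x):
    return w[0] * x[0] + w[1] * x[1] > threshold


# Hypothetical records: (variable_1, variable_2) per borrower.
good = [(4.0, 1.0), (5.0, 2.0), (6.0, 1.5), (5.5, 0.5)]
bad = [(1.0, 3.0), (2.0, 4.0), (1.5, 3.5), (2.5, 5.0)]
w, threshold = fit_lda(good, bad)
```

The weights act directly on the raw variables, which is what "non-standardized" refers to; a midpoint cut-off implicitly assumes equal priors and misclassification costs, which a real scoring model may not want.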

Credit concession through credit scoring: Analysis and application proposal

Intangible Capital ◽

10.3926/ic.903 ◽

2017 ◽

Vol 13 (1) ◽

pp. 51 ◽

Cited By ~ 5

Author(s):

Oriol Amat ◽

Raffaele Manini ◽

Marcos Antón Renart

Keyword(s):

Financial Institutions ◽

Design Methodology ◽

Credit Scoring ◽

Real Data ◽

Statistical Techniques ◽

Probability Of Default ◽

Relative Importance ◽

Linear Discriminant ◽

Scoring Model ◽

Credit Scoring Model

Purpose: The study herein develops and tests a credit scoring model which can help financial institutions in assessing credit requests. Design/methodology/approach: The empirical study has the objective of answering two questions: (1) Which ratios better discriminate the companies based on their being solvent or insolvent? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies have been used in order to obtain and to test the model. Findings: Through the application of several statistical techniques, the credit scoring model has been proved to be effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions decide to whom to grant credit and to whom not to grant it. Social implications: Because of the growing importance of credit for our society and the fear of granting it due to the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions' assessments. Originality/value: There is already a stream of literature related to credit scoring. However, this paper focuses on Spanish firms and proves the results of our model based on real data. The application of the model to detect the probability of default in loans is original.

Credit scoring model for SME customer assessment in a telco company

Understanding Digital Industry ◽

10.1201/9780367814557-35 ◽

2020 ◽

pp. 141-144

Author(s):

L.A. Baranti ◽

A. Lutfi

Keyword(s):

Credit Scoring ◽

Scoring Model ◽

Credit Scoring Model

A multicriteria credit scoring model for SMEs using hybrid BWM and TOPSIS

Financial Innovation ◽

10.1186/s40854-021-00295-5 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Pranith Kumar Roy ◽

Krishnendu Shaw

Keyword(s):

Credit Risk ◽

Financial Institutions ◽

Credit Scoring ◽

Real Life ◽

Financial Data ◽

Loan Pricing ◽

Scoring Model ◽

Record Keeping ◽

Return On Capital ◽

Credit Scoring Model

Small- and medium-sized enterprises (SMEs) have a crucial influence on the economic development of every nation, but access to formal finance remains a barrier. Similarly, financial institutions encounter challenges in the assessment of SMEs’ creditworthiness for the provision of financing. Financial institutions employ credit scoring models to identify potential borrowers and to determine loan pricing and collateral requirements. SMEs are perceived as unorganized in terms of financial data management compared to large corporations, making the assessment of credit risk based on inadequate financial data a cause for financial institutions’ concern. The majority of existing models are data-driven and have faced criticism for failing to meet their assumptions. To address the issue of limited financial record keeping, this study developed and validated a system to predict SMEs’ credit risk by introducing a multicriteria credit scoring model. The model was constructed using a hybrid best–worst method (BWM) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). Initially, the BWM determines the weight criteria, and TOPSIS is applied to score SMEs. A real-life case study was examined to demonstrate the effectiveness of the proposed model, and a sensitivity analysis varying the weight of the criteria was performed to assess robustness against unpredictable financial situations. The findings indicated that SMEs’ credit history, cash liquidity, and repayment period are the most crucial factors in lending, followed by return on capital, financial flexibility, and integrity. The proposed credit scoring model outperformed the existing commercial model in terms of its accuracy in predicting defaults. This model could assist financial institutions, providing a simple means for identifying potential SMEs to grant credit, and advance further research using alternative approaches.
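The TOPSIS scoring step described in this abstract can be sketched as follows. The criterion weights are taken as given here (in the paper they come from the BWM, which is not reproduced); the weight values, the SME figures and the treatment of repayment period as a cost criterion are assumptions for illustration only.

```python
import math


def topsis(matrix, weights, benefit):
    """Return a closeness score in [0, 1] for each row (alternative).

    matrix  -- one row per alternative, one column per criterion
    weights -- criterion weights (assumed given, e.g. from BWM)
    benefit -- True where larger is better, False where smaller is better
    """
    n = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)]
                for row in matrix]
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_neg = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        # Assumes the alternatives are not all identical (d_pos + d_neg > 0).
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical SMEs: (credit history, cash liquidity, repayment period).
smes = [
    (9.0, 8.0, 12.0),
    (5.0, 4.0, 36.0),
    (7.0, 6.0, 24.0),
]
weights = [0.5, 0.3, 0.2]        # hypothetical weights, summing to 1
benefit = [True, True, False]    # repayment period assumed to be a cost
scores = topsis(smes, weights, benefit)
```

A score near 1 means the SME lies close to the ideal alternative and far from the worst one, so the first hypothetical SME, which is best on every criterion, ranks highest.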
