Explainable Machine Learning Models of Consumer Credit Risk

2022 ◽  
Author(s):  
Randall Davis ◽  
Andrew W. Lo ◽  
Sudhanshu Mishra ◽  
Arash Nourian ◽  
Manish Singh ◽  
...  
2020 ◽  
Vol 34 (08) ◽  
pp. 13396-13401
Author(s):  
Wei Wang ◽  
Christopher Lesner ◽  
Alexander Ran ◽  
Marko Rukonic ◽  
Jason Xue ◽  
...  

Machine learning applied to financial transaction records can predict how likely a small business is to repay a loan. For this purpose we compared a traditional scorecard credit risk model against various machine learning models and found that XGBoost with monotonic constraints outperformed the scorecard model by 7% in the K-S statistic. Before such a machine learning model can be deployed in production for loan application risk scoring, it must comply with lending industry regulations that require lenders to provide understandable and specific reasons for credit decisions. We therefore also developed a loan decision explanation technique based on the ideas of weight of evidence (WoE) and SHAP. Our research was carried out using a historical dataset of tens of thousands of loans and millions of associated financial transactions. The credit risk scoring model described in this paper, based on XGBoost with monotonic constraints and SHAP explanations, has been deployed by QuickBooks Capital to assess incoming loan applications since July 2019.
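
For readers unfamiliar with the K-S statistic mentioned in this abstract, the following is a minimal sketch of how it is commonly computed for a binary credit model: the largest gap between the cumulative score distributions of the two outcome classes. The function name `ks_statistic`, the label convention (1 = default, 0 = repaid), and the toy data are assumptions made for illustration; the abstract does not say how the authors computed the metric.

```python
def ks_statistic(scores, labels):
    """Two-sample Kolmogorov-Smirnov statistic between the score
    distributions of the two classes (assumed: 1 = default, 0 = repaid)."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("need at least one example of each class")

    best = 0.0
    i = j = 0
    # Walk the distinct score values in ascending order and compare the
    # empirical CDFs of the two classes at each one.
    for t in sorted(set(scores)):
        while i < len(pos) and pos[i] <= t:
            i += 1
        while j < len(neg) and neg[j] <= t:
            j += 1
        gap = abs(i / len(pos) - j / len(neg))
        best = max(best, gap)
    return best


# Toy data, purely illustrative.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(ks_statistic(scores, labels))
```

A value of 0 means the two score distributions are indistinguishable and a value of 1 means the scores separate the classes perfectly, so a higher K-S statistic indicates better discrimination between repaid and defaulted loans.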


2021 ◽  
Vol 14 (3) ◽  
pp. 138
Author(s):  
Fisnik Doko ◽  
Slobodan Kalajdziski ◽  
Igor Mishkovski

Data science and machine-learning techniques help banks to optimize enterprise operations, enhance risk analyses and gain competitive advantage. There is a vast amount of research on credit risk, but to our knowledge, none of it uses a credit registry as a data source to model the probability of default for individual clients. The goal of this paper is to evaluate different machine-learning models in order to create an accurate model for credit risk assessment, using data from the real credit registry dataset of the Central Bank of the Republic of North Macedonia. We strongly believe that the model developed in this research will be an additional source of valuable information for commercial banks, because it leverages historical data covering the country's entire population across all commercial banks. In this research, we compare five machine-learning models for classifying credit risk data: logistic regression, decision tree, random forest, support vector machines (SVM) and neural network. We evaluate the five models using different machine-learning metrics, and we propose a model, with a detailed methodology, that is based on credit registry data from the central bank and can predict credit risk from the credit history of the country's population. Our results show that the best accuracy is achieved by the decision tree on imbalanced data, both with and without scaling, followed by random forest and logistic regression.


2013 ◽  
Vol 40 (13) ◽  
pp. 5125-5131
Author(s):  
Jochen Kruppa ◽  
Alexandra Schwarz ◽  
Gerhard Arminger ◽  
Andreas Ziegler
