IMPROVING CLASSIFICATION ACCURACY AND CAUSAL KNOWLEDGE FOR BETTER CREDIT DECISIONS

2011 ◽  
Vol 21 (04) ◽  
pp. 297-309 ◽  
Author(s):  
WEI-WEN WU

Numerous studies have contributed to efforts to boost the accuracy of the credit scoring model. Especially interesting are recent studies which have successfully developed the hybrid approach, which advances classification accuracy by combining different machine learning techniques. However, to achieve better credit decisions, it is not enough merely to increase the accuracy of the credit scoring model. It is necessary to conduct meaningful supplementary analyses in order to obtain knowledge of causal relations, particularly in terms of significant conceptual patterns or structures involving attributes used in the credit scoring model. This paper proposes a solution of integrating data preprocessing strategies and the Bayesian network classifier with the tree augmented Naïve Bayes search algorithm, in order to improve classification accuracy and to obtain improved knowledge of causal patterns, thus enhancing the validity of credit decisions.

Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 194
Author(s):  
Sarah Gonzalez ◽  
Paul Stegall ◽  
Harvey Edwards ◽  
Leia Stirling ◽  
Ho Chit Siu

The field of human activity recognition (HAR) often utilizes wearable sensors and machine learning techniques in order to identify the actions of the subject. This paper considers the activity recognition of walking and running while using a support vector machine (SVM) that was trained on principal components derived from wearable sensor data. An ablation analysis is performed in order to select the subset of sensors that yields the highest classification accuracy. The paper also compares principal components across trials to assess the similarity of the trials. Five subjects were instructed to perform standing, walking, running, and sprinting on a self-paced treadmill, and the data were recorded while using surface electromyography sensors (sEMGs), inertial measurement units (IMUs), and force plates. When all of the sensors were included, the SVM had over 90% classification accuracy using only the first three principal components of the data with the classes of stand, walk, and run/sprint (combined run and sprint class). It was found that sensors that were placed only on the lower leg produced higher accuracies than sensors placed on the upper leg. There was a small decrease in accuracy when the force plates were ablated, but the difference may not be operationally relevant. Using only accelerometers without sEMGs was shown to decrease the accuracy of the SVM.


2018 ◽  
Vol 10 (7) ◽  
pp. 56
Author(s):  
Jie Li ◽  
Zhenyu Sheng

Chinese microfinance institutions need to measure and manage credit risk in a quantitative way in order to improve competitiveness. To establish a credit scoring model (CSM) with sound predictive power, they should examine various models carefully, identify variables, assign values to variables and reduce variable dimensions in an appropriate way. Microfinance institutions could employ both a CSM and loan officers’ subjective appraisals to gradually improve their risk management. The paper sets up a CSM based on the data of a microfinance company operating in Jiangsu province, covering October 2009 to June 2014. To establish the model, the paper uses the Linear Discriminant Analysis (LDA) method, selects 16 initial variables, employs the direct method to assign values to variables and enters all the variables into the model. Ten samples are constructed by randomly selecting records. Based on the samples, the coefficients are determined and the final non-standardized discriminant function is established. It is found that the Bank credit, Education, Old client and Rate variables have the greatest impact on the discriminant effect. Compared with similar international models, this model’s classification performance is adequate. The paper presents the key technical points of building a credit scoring model based on a practical application, which provides help and references for Chinese microfinance institutions to measure and manage credit risk quantitatively.
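
As a rough illustration of how a two-class linear discriminant function can be obtained, the sketch below fits Fisher's discriminant on two features in plain Python. It is not the paper's model: the two features (years of education and loan rate), all data values, and the helper names `fit_lda` and `classify` are hypothetical, and the cutoff assumes equal priors for good and bad borrowers.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(rows):
    return [sum(r[j] for r in rows) / len(rows) for j in range(2)]

def scatter(rows, mu):
    # 2x2 within-class scatter matrix: sum of outer products of deviations.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_lda(good, bad):
    """Return (w, cutoff) with w = Sw^-1 (mu_good - mu_bad) on raw variables."""
    mu_g, mu_b = mean(good), mean(bad)
    sg, sb = scatter(good, mu_g), scatter(bad, mu_b)
    sw = [[sg[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]  # assumed non-zero
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_g[0] - mu_b[0], mu_g[1] - mu_b[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]
    # Equal-prior assumption: cut halfway between the projected class means.
    cutoff = 0.5 * (dot(w, mu_g) + dot(w, mu_b))
    return w, cutoff

# Hypothetical records: (years of education, loan rate in percent).
good = [(16, 10), (14, 11), (15, 9), (12, 12)]
bad = [(9, 18), (10, 16), (8, 17), (11, 19)]

w, cutoff = fit_lda(good, bad)

def classify(x):
    return "good" if dot(w, x) > cutoff else "bad"

print(w, cutoff)
print([classify(x) for x in good + bad])
```

Because the coefficients are computed on the raw variables, they correspond to a non-standardized discriminant function; their magnitudes therefore depend on each variable's units.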


2021 ◽  
pp. 1-16
Author(s):  
Fang He ◽  
Wenyu Zhang ◽  
Zhijia Yan

Credit scoring has become increasingly important for financial institutions. With the advancement of artificial intelligence, machine learning methods, especially ensemble learning methods, have become increasingly popular for credit scoring. However, the problems of imbalanced data distribution and underutilized feature information have not been sufficiently addressed. To make the credit scoring model more adaptable to imbalanced datasets, the original model-based synthetic sampling method is extended herein to balance the datasets by generating appropriate minority samples to alleviate class overlap. To enable the credit scoring model to extract inherent correlations from features, a new bagging-based feature transformation method is proposed, which transforms features using a tree-based algorithm and selects features using the chi-square statistic. Furthermore, a two-layer ensemble method that combines the advantages of dynamic ensemble selection and stacking is proposed to improve the classification performance of the proposed multi-stage ensemble model. Finally, four standardized datasets are used to evaluate the performance of the proposed ensemble model using six evaluation metrics. The experimental results confirm that the proposed ensemble model is effective in improving classification performance and is superior to other benchmark models.
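
The chi-square selection step mentioned above can be sketched in isolation. The example below computes the chi-square statistic between each categorical feature and the class label, then keeps the highest-scoring features. It covers only the selection step, not the tree-based transformation or the sampling method; the feature names, data, and the helpers `chi_square` and `select_top_k` are hypothetical.

```python
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature-by-label contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for l, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            stat += (observed[(f, l)] - expected) ** 2 / expected
    return stat

def select_top_k(features, labels, k):
    """Return the names of the k features with the largest statistic."""
    ranked = sorted(features, key=lambda name: chi_square(features[name], labels),
                    reverse=True)
    return ranked[:k]

# Hypothetical binary features for eight applicants (1 = default).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "late_payment": [1, 1, 1, 1, 0, 0, 0, 0],  # moves with the label
    "has_email": [1, 0, 1, 0, 1, 0, 1, 0],     # unrelated to the label
}

scores = {name: chi_square(col, labels) for name, col in features.items()}
selected = select_top_k(features, labels, k=1)
print(scores, selected)
```

A larger statistic indicates a stronger departure from independence between the feature and the label, which is why the selection keeps the top-ranked features.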


2017 ◽  
Vol 13 (1) ◽  
pp. 51 ◽  
Author(s):  
Oriol Amat ◽  
Raffaele Manini ◽  
Marcos Antón Renart

Purpose: The study herein develops and tests a credit scoring model which can help financial institutions in assessing credit requests. Design/methodology/approach: The empirical study has the objective of answering two questions: (1) Which ratios better discriminate the companies based on their being solvent or insolvent? and (2) What is the relative importance of these ratios? To do this, several statistical techniques with a multifactorial focus have been used (Multivariate Analysis of Variance, Linear Discriminant Analysis, Logit and Probit Models). Several samples of companies have been used in order to obtain and to test the model. Findings: Through the application of several statistical techniques, the credit scoring model has been proved to be effective in discriminating between good and bad creditors. Research limitations: This study focuses on manufacturing, commercial and services companies of all sizes in Spain; therefore, the conclusions may differ for other geographical locations. Practical implications: Because credit is one of the main drivers of growth, a solid credit scoring model can help financial institutions assess to whom to grant credit and to whom not to grant credit. Social implications: Because of the growing importance of credit for our society and the fear of granting it due to the latest financial turmoil, a solid credit scoring model can strengthen trust in financial institutions’ assessments. Originality/value: There is already a stream of literature related to credit scoring. However, this paper focuses on Spanish firms and proves the results of our model based on real data. The application of the model to detect the probability of default in loans is original.


2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Pranith Kumar Roy ◽  
Krishnendu Shaw

Small- and medium-sized enterprises (SMEs) have a crucial influence on the economic development of every nation, but access to formal finance remains a barrier. Similarly, financial institutions encounter challenges in the assessment of SMEs’ creditworthiness for the provision of financing. Financial institutions employ credit scoring models to identify potential borrowers and to determine loan pricing and collateral requirements. SMEs are perceived as unorganized in terms of financial data management compared to large corporations, making the assessment of credit risk based on inadequate financial data a cause for financial institutions’ concern. The majority of existing models are data-driven and have faced criticism for failing to meet their assumptions. To address the issue of limited financial record keeping, this study developed and validated a system to predict SMEs’ credit risk by introducing a multicriteria credit scoring model. The model was constructed using a hybrid best–worst method (BWM) and the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). Initially, the BWM determines the weight criteria, and TOPSIS is applied to score SMEs. A real-life case study was examined to demonstrate the effectiveness of the proposed model, and a sensitivity analysis varying the weight of the criteria was performed to assess robustness against unpredictable financial situations. The findings indicated that SMEs’ credit history, cash liquidity, and repayment period are the most crucial factors in lending, followed by return on capital, financial flexibility, and integrity. The proposed credit scoring model outperformed the existing commercial model in terms of its accuracy in predicting defaults. This model could assist financial institutions, providing a simple means for identifying potential SMEs to grant credit, and advance further research using alternative approaches.
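
The TOPSIS scoring step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's model: the criterion weights are hypothetical stand-ins for BWM output (the BWM optimisation itself is not shown), the three SMEs and their values are invented, and treating repayment period as a cost criterion is an assumption.

```python
import math

def topsis(matrix, weights, benefit):
    """Return one closeness coefficient in [0, 1] per alternative (row)."""
    n_crit = len(weights)
    # Vector-normalise each column so criteria on different scales are comparable.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal point: best value per criterion; anti-ideal point: worst value.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Hypothetical criteria: credit history score, cash liquidity ratio,
# repayment period in months (assumed here to be a cost criterion).
names = ["SME A", "SME B", "SME C"]
matrix = [
    [0.9, 1.2, 24],
    [0.6, 1.8, 12],
    [0.3, 0.9, 36],
]
weights = [0.5, 0.3, 0.2]       # hypothetical; BWM would supply these
benefit = [True, True, False]   # False marks a cost criterion

scores = topsis(matrix, weights, benefit)
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
print(ranking)
```

The sketch assumes no criterion column is all zeros and that the alternatives are not all identical; either case would cause a division by zero. Rerunning `topsis` with perturbed weights gives a simple form of the sensitivity analysis mentioned in the abstract.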


2018 ◽  
Vol 6 (2) ◽  
pp. 129-141
Author(s):  
Anjali Chopra ◽  
Priyanka Bhilare

Loan default is a serious problem in the banking industry. Banking systems have strong processes in place for identifying customers with poor credit risk scores; however, most credit scoring models need to be constantly updated with newer variables and statistical techniques for improved accuracy. While totally eliminating default is almost impossible, loan risk teams can minimize the rate of default, thereby protecting banks from the adverse effects of loan default. Credit scoring models have used logistic regression and linear discriminant analysis for identification of potential defaulters. Newer machine learning techniques have the ability to outperform these classical techniques. This article aims to conduct an empirical analysis on a publicly available bank loan dataset to study banking loan default using a decision tree as the base learner and comparing it with ensemble tree learning techniques such as bagging, boosting, and random forests. The results of the empirical analysis suggest that the gradient boosting model outperforms the base decision tree learner, indicating that ensemble models work better than individual models. The study recommends that risk teams adopt newer techniques to achieve better accuracy, resulting in effective loan recovery strategies.
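
To make the idea of an ensemble of tree learners concrete, the sketch below implements bagging over one-level decision trees (stumps) in plain Python. It illustrates bagging only, not gradient boosting or random forests, and it is not the article's experiment: the single feature (a debt ratio), the data, and the helper names `fit_stump` and `fit_bagging` are hypothetical, and the stump assumes that larger feature values indicate default.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit the rule 'predict 1 if x > t' with the fewest training errors."""
    values = sorted(set(xs))
    if len(values) == 1 or len(set(ys)) == 1:
        # Degenerate bootstrap sample: fall back to the majority class.
        majority = Counter(ys).most_common(1)[0][0]
        return lambda x: majority
    best_t, best_err = None, None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2  # candidate threshold between adjacent values
        err = sum(int(x > t) != y for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return lambda x, t=best_t: int(x > t)

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train stumps on bootstrap resamples and combine them by majority vote."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        votes = Counter(stump(x) for stump in stumps)
        return votes.most_common(1)[0][0]

    return predict

# Hypothetical data: debt ratio per borrower and whether the loan defaulted.
debt_ratio = [0.10, 0.15, 0.20, 0.25, 0.60, 0.65, 0.70, 0.75]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]

predict = fit_bagging(debt_ratio, defaulted)
print([predict(x) for x in debt_ratio])
```

Each stump sees a slightly different resample, so individual thresholds vary; the majority vote averages out that variation, which is the mechanism behind the stability gains usually attributed to bagging.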

