Can System Log Data Enhance the Performance of Credit Scoring?—Evidence from an Internet Bank in Korea

2021 ◽  
Vol 14 (1) ◽  
pp. 130
Author(s):  
Sunghyon Kyeong ◽  
Daehee Kim ◽  
Jinho Shin

The credit scoring model is one of the most important decision-making tools for the sustainability of banking systems. This study is the first to examine whether it can be improved by using system log data that are stored extensively for system operation. We used the log data recorded by the mobile application system of KakaoBank, a leading internet bank used by more than 14 million people in Korea. After generating candidate variables from KakaoBank’s log data, we created a credit scoring model by combining variables with high information values and logistic regression, the most common method for developing credit scoring models in financial institutions. To test our hypothesis that credit scoring model performance improves, we performed an independent-samples t-test on the simulation results of repeated model development and performance measurement based on randomly sampled data. The discrimination power of the proposed model using logistic regression (neural network) improved significantly over the credit bureau-based model, by 1.84 (2.22) percentage points as measured by the Kolmogorov–Smirnov statistic. The results of this study suggest that a bank can use the log data it has already accumulated to improve decision-making systems, including credit scoring, at a low cost.
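The abstract above selects variables by information value but does not give the formula. As a minimal sketch, assuming the standard weight-of-evidence definition and a variable that has already been binned, the calculation could look like this. The function name and the bin counts are hypothetical and are not taken from the paper.

```python
import math


def information_value(bins):
    """Information value of one binned candidate variable.

    `bins` is a list of (n_good, n_bad) counts, one pair per bin.
    Assumes the standard definition:
    IV = sum over bins of (dist_good - dist_bad) * ln(dist_good / dist_bad).
    """
    total_good = sum(n_good for n_good, _ in bins)
    total_bad = sum(n_bad for _, n_bad in bins)
    if total_good == 0 or total_bad == 0:
        raise ValueError("both good and bad borrowers are required")

    iv = 0.0
    for n_good, n_bad in bins:
        # Skip bins with an empty cell to avoid log(0); in practice such
        # bins are usually merged with a neighbour instead.
        if n_good == 0 or n_bad == 0:
            continue
        dist_good = n_good / total_good
        dist_bad = n_bad / total_bad
        weight_of_evidence = math.log(dist_good / dist_bad)
        iv += (dist_good - dist_bad) * weight_of_evidence
    return iv


# Hypothetical counts: a variable that separates the classes well
# and one that does not separate them at all.
strong = information_value([(80, 20), (20, 80)])
useless = information_value([(50, 50), (50, 50)])
```

A variable whose bins have the same good/bad mix as the whole sample gets an information value of zero, so ranking candidates by this number favours variables that shift that mix.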

Author(s):  
Sunghyun Kyeong ◽  
Daehee Kim ◽  
Jinho Shin

This study is the first to examine whether the performance of credit rating, one of the most important data-based decision-making processes of banks, can be improved by using banking system log data that are extensively accumulated inside the bank for system operation. This study uses the log data recorded for the mobile app system of Kakaobank, a leading internet bank used by more than 14 million people in Korea. After generating candidate variables from Kakaobank's vast log data, we develop a credit scoring model by utilizing variables with high information values. The discrimination power of the new model improved significantly over the credit bureau grades, by 1.84 percentage points as measured by the Kolmogorov–Smirnov statistic. The results of this study therefore imply that a bank can efficiently improve decision-making systems, including credit scoring, at a low cost by using log data it has already accumulated.
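The Kolmogorov–Smirnov statistic used above as the measure of discrimination power is the largest gap between the cumulative score distributions of bad and good borrowers. The following is a minimal sketch under that standard definition; the function name, the label coding (1 = bad, 0 = good), and the sample scores are assumptions made for illustration.

```python
def ks_statistic(scores, labels):
    """Maximum gap between the cumulative score distributions of bad
    (label 1) and good (label 0) borrowers, as a value in [0, 1]."""
    n_bad = sum(1 for y in labels if y == 1)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both good and bad borrowers are required")

    pairs = sorted(zip(scores, labels))
    cum_bad = 0
    cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        # Consume every observation that shares the current score so that
        # ties are compared only after the whole group has been counted.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


# Hypothetical scores: perfectly separated classes versus interleaved ones.
separated = ks_statistic([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
interleaved = ks_statistic([1, 2, 3, 4], [0, 1, 0, 1])
```

Multiplying the result by 100 expresses it in percentage points, the unit in which the improvement is reported.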


2021 ◽  
Vol 73 (7) ◽  
pp. 41-44
Author(s):  
Y.S. Zhieru

The final stage of constructing a logistic regression model is checking its validity and testing it on real data. The validity of a logistic regression model is evidenced by its ability to classify borrowers correctly, that is, to distinguish "good" borrowers from "bad" borrowers.
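One way to make this validity check concrete is to score a holdout sample and count how many "good" and "bad" borrowers fall on the correct side of a cutoff. The sketch below assumes a fitted logistic model with known coefficients and a cutoff of 0.5; the function names, the cutoff, and the sample values are hypothetical.

```python
import math


def predict_default_probability(features, coefficients, intercept):
    """Logistic model: probability that a borrower is "bad"."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def classification_rates(probabilities, labels, cutoff=0.5):
    """Share of bad (label 1) and good (label 0) borrowers classified
    correctly at the given cutoff, plus overall accuracy."""
    bad_hits = good_hits = n_bad = n_good = 0
    for p, y in zip(probabilities, labels):
        predicted_bad = p >= cutoff
        if y == 1:
            n_bad += 1
            bad_hits += predicted_bad
        else:
            n_good += 1
            good_hits += not predicted_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both good and bad borrowers are required")
    return {
        "bad_correct": bad_hits / n_bad,
        "good_correct": good_hits / n_good,
        "accuracy": (bad_hits + good_hits) / (n_bad + n_good),
    }


# Hypothetical holdout sample of four borrowers.
rates = classification_rates([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 1])
```

Reporting the two class-specific rates separately matters because overall accuracy can look high even when most "bad" borrowers are missed.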


2019 ◽  
Vol 26 (2) ◽  
pp. 405-429 ◽  
Author(s):  
Feng Shen ◽  
Run Wang ◽  
Yu Shen

Credit scoring is an important process for peer-to-peer (P2P) lending companies as it determines whether loan applicants are likely to default. The aim of most credit scoring models is to minimize the classification error rate, which implies that all classification errors bear the same cost; in reality, however, different errors carry very different costs, so credit scoring is a cost-sensitive problem. Therefore, in this paper, a new cost-sensitive logistic regression credit scoring model based on a multi-objective optimization approach is proposed that has two objectives in the cost-sensitive logistic regression process. The cost-sensitive logistic regression parameters are solved using a multiple objective particle swarm optimization (MOPSO) algorithm. In the empirical analysis, the proposed model was applied to the credit scoring of a well-known Chinese P2P company. Compared with other common credit scoring models, the proposed model effectively reduced type II error rates and total classification error costs, and improved the AUC, the F1 values (the harmonic mean of recall and precision), and the G-means. The proposed model was also compared with other multi-objective optimization algorithms to further demonstrate that MOPSO is the best approach for cost-sensitive logistic regression credit scoring models.
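The evaluation measures named above can be computed from confusion-matrix counts. The sketch below is illustrative only: it treats "bad" borrowers as the positive class, and the misclassification costs (5 for a missed defaulter, 1 for a rejected good borrower) are hypothetical values, not figures from the paper.

```python
import math


def cost_sensitive_metrics(tp, fp, tn, fn, cost_fn=5.0, cost_fp=1.0):
    """F1, G-mean and total misclassification cost from confusion counts.

    tp/fn: bad borrowers flagged / missed; tn/fp: good borrowers
    accepted / rejected. cost_fn and cost_fp are assumed unit costs.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    # G-mean is the geometric mean of recall and specificity.
    g_mean = math.sqrt(recall * specificity)
    # Unequal costs make a missed defaulter weigh more than a false alarm.
    total_cost = cost_fn * fn + cost_fp * fp
    return {"f1": f1, "g_mean": g_mean, "total_cost": total_cost}


# Hypothetical confusion counts for 150 borrowers.
metrics = cost_sensitive_metrics(tp=40, fp=10, tn=90, fn=10)
```

With equal unit costs the total cost reduces to the plain error count, which is the equal-cost assumption the abstract argues against.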


2018 ◽  
Vol 1 (1) ◽  
pp. 43-56
Author(s):  
Rio Hendriadi ◽  
Anne Putri ◽  
Dona Amelia ◽  
Rany Syafrina

Objective – This research was conducted to design and develop a credit scoring model for a conventional bank in order to assess individual loans; the research took place at PT BPR Sungai Puar, Kabupaten Agam. The model tries to evaluate the credit risk of BPR Sungai Puar. Design/methodology – The data are secondary, taken from the BPR Sungai Puar database, and are analyzed with two tools: discriminant analysis and logistic regression. Results – The results are presented in the form of a model and credit scoring perfection at PT BPR Sungai Puar, Kabupaten Agam. Keywords: Credit Scoring Model, Conventional Banks, Individual Loan


2013 ◽  
Vol 7 (18) ◽  
pp. 1791-1805
Author(s):  
Chi Bo Wen ◽  
Hsu Chiun Chieh ◽  
Ho Mei Hung

2019 ◽  
Vol 16 (8) ◽  
pp. 3514-3518
Author(s):  
Kamya Eria ◽  
Preethi Subramanian

Credit scoring plays a vital role in assessing the creditworthiness of loan applicants, thus speeding up the approval process. Credit score models, however, rely on the accuracy of classification models for their performance. This accuracy depends not only on the choice of data mining process; it is also heavily influenced by the quality of the data. Although no technique can be favored over the others, logistic regression has been widely employed in industry for its simplicity and ease of interpretation. This study proposes a SEMMA-based credit scoring model developed with an improved Logistic Regression (LR) model. The improvements come from excluding irrelevant features and adjusting the partition ratios. The model has been compared with the predominant models and produced outstanding results with minimal credit decision errors.

