Fintech Credit Scoring Techniques for Evaluating P2P Loan Applications – A Python Machine Learning Ensemble Approach

Advances in machine learning have improved credit scoring. Ensemble learning, one of the most widely used machine learning methods, has shown significantly higher predictive accuracy than individual models for credit scoring. This study proposes a new multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a multiple K-means-based undersampling method handles the imbalanced data. Then, a selective sampling mechanism adaptively selects the better-performing base classifiers. Finally, a feature-enhanced stacking method builds the ensemble from the shortlisted base classifiers. In the experiments, the proposed model is evaluated on four datasets using four evaluation indicators, and the results indicate that it outperforms the benchmark models.
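The three stages described in this abstract can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: it runs K-means once rather than multiple times, ranks base classifiers by cross-validated AUC as a stand-in for the selective mechanism, and uses `passthrough=True` as a rough analogue of feature-enhanced stacking. The helper names `kmeans_undersample` and `select_top`, the candidate classifiers, and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def kmeans_undersample(X, y, majority_label=0, random_state=0):
    """Cluster the majority class into as many clusters as there are
    minority samples and keep the majority sample nearest each centroid."""
    maj = y == majority_label
    X_maj, X_min, y_min = X[maj], X[~maj], y[~maj]
    k = min(len(X_min), len(X_maj))
    km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X_maj)
    dist = km.transform(X_maj)                 # shape: (n_majority, k)
    keep = np.unique(dist.argmin(axis=0))      # nearest sample per centroid
    X_bal = np.vstack([X_maj[keep], X_min])
    y_bal = np.concatenate([np.full(len(keep), majority_label), y_min])
    return X_bal, y_bal


def select_top(candidates, X, y, top_k=2):
    """Rank (name, estimator) pairs by mean 3-fold AUC and keep the best."""
    scores = {
        name: cross_val_score(est, X, y, cv=3, scoring="roc_auc").mean()
        for name, est in candidates
    }
    ranked = sorted(candidates, key=lambda c: scores[c[0]], reverse=True)
    return ranked[:top_k], scores


# Synthetic imbalanced data standing in for a credit dataset (assumption).
X, y = make_classification(n_samples=600, n_features=10,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Stage 1: balance the training split only, so the test split stays untouched.
X_bal, y_bal = kmeans_undersample(X_tr, y_tr)

# Stage 2: shortlist the better-performing base classifiers.
candidates = [
    ("logit", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=4, random_state=0)),
    ("forest", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("nb", GaussianNB()),
]
selected, scores = select_top(candidates, X_bal, y_bal, top_k=2)

# Stage 3: stack the shortlist; passthrough=True also feeds the original
# features to the meta-learner.
stack = StackingClassifier(estimators=selected,
                           final_estimator=LogisticRegression(max_iter=1000),
                           passthrough=True, cv=3)
stack.fit(X_bal, y_bal)
auc = roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1])
print("selected:", [name for name, _ in selected], "test AUC:", round(auc, 3))
```

Undersampling only the training split keeps the held-out evaluation representative of the original class imbalance.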

Bayesian machine learning ensemble approach to quantify model uncertainty in predicting groundwater storage change

The Science of The Total Environment ◽ 10.1016/j.scitotenv.2020.144715 ◽ 2021 ◽ Vol 769 ◽ pp. 144715

Author(s): Jina Yin ◽ Josué Medellín-Azuara ◽ Alvar Escriva-Bou ◽ Zhu Liu

Keyword(s): Machine Learning ◽ Model Uncertainty ◽ Groundwater Storage ◽ Ensemble Approach ◽ Bayesian Machine Learning ◽ Storage Change

Application of explainable machine learning based on Catboost in credit scoring

Journal of Physics Conference Series ◽ 10.1088/1742-6596/1955/1/012039 ◽ 2021 ◽ Vol 1955 (1) ◽ pp. 012039

Author(s): Ji Qi ◽ Ruicheng Yang ◽ Pucong Wang

Keyword(s): Machine Learning ◽ Credit Scoring

Algorithmic fairness in credit scoring

Oxford Review of Economic Policy ◽ 10.1093/oxrep/grab020 ◽ 2021 ◽ Vol 37 (3) ◽ pp. 585-617

Author(s): Teresa Bono ◽ Karen Croxson ◽ Adam Giles

Keyword(s): Machine Learning ◽ Credit Scoring ◽ Large Data ◽ Error Rates ◽ The Past ◽ Ensemble Machine Learning ◽ Hidden Patterns ◽ Credit Scoring Model ◽ Distributional Impacts ◽ Modelling Approach

Abstract: The use of machine learning as an input into decision-making is on the rise, owing to its ability to uncover hidden patterns in large datasets and improve prediction accuracy. Questions have been raised, however, about the potential distributional impacts of these technologies, with one concern being that they may perpetuate or even amplify human biases from the past. Exploiting detailed credit file data for 800,000 UK borrowers, we simulate a switch from a traditional (logit) credit scoring model to ensemble machine-learning methods. We confirm that machine-learning models are more accurate overall. We also find that they do as well as the simpler traditional model on relevant fairness criteria, where these criteria pertain to overall accuracy and error rates for population subgroups defined along protected or sensitive lines (gender, race, health status, and deprivation). We do observe some differences in the way credit-scoring models perform for different subgroups, but these already manifest under a traditional modelling approach, and switching to machine learning neither exacerbates nor eliminates them. The paper discusses some of the mechanical and data factors that may contribute to statistical fairness issues in the context of credit scoring.
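The fairness criteria mentioned in this abstract (overall accuracy and error rates per population subgroup) can be computed along the following lines. This is a minimal sketch assuming binary labels and predictions; the function name `group_error_rates`, the group labels, and the toy arrays are hypothetical and are not taken from the paper or its data.

```python
import numpy as np


def group_error_rates(y_true, y_pred, group):
    """Return accuracy, false positive rate and false negative rate
    for each subgroup, keyed by the subgroup label."""
    out = {}
    for g in np.unique(group):
        mask = group == g
        t, p = y_true[mask], y_pred[mask]
        neg, pos = t == 0, t == 1
        out[str(g)] = {
            "accuracy": float((t == p).mean()),
            # Share of true negatives predicted positive (NaN if none exist).
            "fpr": float((p[neg] == 1).mean()) if neg.any() else float("nan"),
            # Share of true positives predicted negative (NaN if none exist).
            "fnr": float((p[pos] == 0).mean()) if pos.any() else float("nan"),
        }
    return out


# Toy example: 1 = default, 0 = repaid; two illustrative subgroups.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_pred = np.array([0, 1, 1, 0, 0, 1, 0, 1])
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

rates = group_error_rates(y_true, y_pred, group)
for name, r in rates.items():
    print(name, r)
# Group "a": accuracy 0.5, fpr 0.5, fnr 0.5; group "b": accuracy 1.0, fpr 0.0, fnr 0.0
```

Running the same function on the predictions of a logit model and of an ensemble model gives a simple side-by-side view of how each performs across subgroups.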

An ensemble approach to stabilize the features for multi-domain sentiment analysis using supervised machine learning

Journal Of Big Data ◽ 10.1186/s40537-018-0152-5 ◽ 2018 ◽ Vol 5 (1) ◽ Cited By ~ 4

Author(s): Monalisa Ghosh ◽ Goutam Sanyal

Keyword(s): Machine Learning ◽ Sentiment Analysis ◽ Supervised Machine Learning ◽ Ensemble Approach

A Discretized Enriched Technique to Enhance Machine Learning Performance in Credit Scoring

Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management ◽ 10.5220/0008377702020213 ◽ 2019

Author(s): Roberto Saia ◽ Salvatore Carta ◽ Diego Recupero ◽ Gianni Fenu ◽ Marco Saia

Keyword(s): Machine Learning ◽ Credit Scoring ◽ Learning Performance

Novel machine learning ensemble approach for landslide prediction

2019 International Research Conference on Smart Computing and Systems Engineering (SCSE) ◽ 10.23919/scse.2019.8842762 ◽ 2019

Author(s): C. N. Madawala ◽ B. T. G. S. Kumara ◽ L. Indrathilaka

Keyword(s): Machine Learning ◽ Landslide Prediction ◽ Ensemble Approach

A Unified Definition of Mutual Information with Applications in Machine Learning

Mathematical Problems in Engineering ◽ 10.1155/2015/201874 ◽ 2015 ◽ Vol 2015 ◽ pp. 1-12 ◽ Cited By ~ 11

Author(s): Guoping Zeng

Keyword(s): Machine Learning ◽ Mutual Information ◽ Joint Distribution ◽ Credit Scoring ◽ Random Variables ◽ Joint Probability ◽ Class 1 ◽ Probability Spaces ◽ Definition Of ◽ Joint Probabilities

There are various definitions of mutual information. Essentially, these definitions can be divided into two classes: (1) definitions with random variables and (2) definitions with ensembles. However, there are some mathematical flaws in these definitions. For instance, Class 1 definitions either neglect the probability spaces or assume the two random variables have the same probability space. Class 2 definitions redefine marginal probabilities from the joint probabilities. In fact, the marginal probabilities are given by the ensembles and should not be redefined from the joint probabilities. Both Class 1 and Class 2 definitions assume that a joint distribution exists. Yet they all ignore the important fact that the joint distribution or the joint probability measure is not unique. In this paper, we first present a new unified definition of mutual information to cover all the various definitions and to fix their mathematical flaws. Our idea is to define the joint distribution of two random variables by taking the marginal probabilities into consideration. Next, we establish some properties of the newly defined mutual information. We then propose a method to calculate mutual information in machine learning. Finally, we apply our newly defined mutual information to credit scoring.
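For orientation, the textbook discrete mutual information, I(X;Y) = Σ p(x,y) log[p(x,y) / (p(x) p(y))], can be computed from a joint table as in the sketch below. This is the standard formula, which derives the marginals from the joint table; it is not the unified definition the paper proposes. The function name `mutual_information` and the example tables are hypothetical.

```python
import numpy as np


def mutual_information(joint):
    """Textbook mutual information (in nats) of a discrete joint table.

    `joint` may hold probabilities or raw counts, for example counts of
    (binned feature value, default flag) pairs; it is normalised to sum to 1.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of X, shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, m)
    nz = joint > 0                          # skip zero cells: 0 * log 0 = 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())


# Independent variables carry no information about each other.
mi_indep = mutual_information([[0.25, 0.25], [0.25, 0.25]])
# Perfectly dependent binary variables share log(2) nats.
mi_dep = mutual_information([[0.5, 0.0], [0.0, 0.5]])

print(mi_indep)  # 0.0
print(mi_dep)    # log(2), about 0.693
```

In a credit scoring setting, a feature whose binned values yield higher mutual information with the default flag is a stronger candidate predictor under this measure.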

Credit Scoring Using Ensemble Machine Learning

2009 Ninth International Conference on Hybrid Intelligent Systems ◽ 10.1109/his.2009.264 ◽ 2009 ◽ Cited By ~ 2

Author(s): Ping Yao

Keyword(s): Machine Learning ◽ Credit Scoring ◽ Ensemble Machine Learning

Review of Machine Learning models for Credit Scoring Analysis

Ingeniería solidaria ◽ 10.16925/2357-6014.2020.01.11 ◽ 2020 ◽ Vol 16 (1)

Author(s): Madapuri Rudra Kumar ◽ Vinit Kumar Gunjan

Keyword(s): Machine Learning ◽ Financial Institutions ◽ Profile Analysis ◽ Credit Scoring ◽ Process Models ◽ Machine Learning Techniques ◽ Process Conditions ◽ Learning Models ◽ Credit Score ◽ Machine Learning Models

Introduction: Increased computing power and deeper use of robust computing systems in the financial sector are propelling business growth, improving the operational efficiency of financial institutions, and increasing the effectiveness of the transaction processing solutions that organizations use.
Problem: Although financial institutions rely on credit scoring patterns to analyze the creditworthiness of clients, many aspects of credit score evaluation still need improvement.
Objective: Machine learning offers considerable potential in the Fintech space, including for determining a personal credit score. By applying deep learning and machine learning techniques, organizations can reach individuals who are not served by traditional financial institutions.
Methodology: One major insight is that traditional banking intelligence solutions are predominantly programmed models that align with the information and banking systems used by banks, whereas machine-learning models, which rely on algorithmic systems, require more integral computation that is intrinsic to the model.
Results: Test analysis of the proposed machine learning model indicates a more effective analysis process than non-machine-learning solutions. The model's use of various classifiers indicates potential ways in which the solution can be significant.
Conclusion: If systems can be developed to align with more pragmatic terms for analysis, they can help improve the process conditions of customer profile analysis, for which process models must be developed that support comprehensive analysis and a sustainable solution for credit system management.
Originality: The proposed solution is effective and is conceptualized to improve credit scoring system patterns.
Limitations: The model is tested in isolation and not in comparison with any existing credit scoring patterns.
