Credit scoring analysis using weighted k nearest neighbor

2018 ◽  
Vol 1025 ◽  
pp. 012114
Author(s):  
M A Mukid ◽  
T Widiharih ◽  
A Rusgiyono ◽  
A Prahutama
Author(s):  
Fei-Long Chen ◽  
Feng-Chia Li

Credit scoring is an important topic for businesses and socio-economic establishments that collect huge amounts of data with the intention of avoiding wrong decisions. In this paper, the authors propose four approaches that combine feature selection with four well-known classifiers: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Back-Propagation Network (BPN) and Extreme Learning Machine (ELM). These classifiers are used to find a suitable hybrid combination of classifier and feature selection that retains sufficient information for classification purposes. To this end, different credit scoring combinations are constructed by pairing the four feature selection approaches with the classifiers. Two credit data sets from the University of California, Irvine (UCI), are chosen to evaluate the accuracy of the various hybrid feature selection models. The procedures that make up the proposed approaches are described and then evaluated for their performance.


2021 ◽  
Vol 15 (4) ◽  
pp. 735-744
Author(s):  
Putri Sri Astuti ◽  
Memi Nor Hayati ◽  
Rito Goejantoro

Classification is the process of grouping objects that have the same characteristics into several categories. This study applies a combination of classification algorithms, namely Bootstrap Aggregating K-Nearest Neighbor, to credit scoring analysis. The aim is to classify the credit payment status for electronic goods and furniture at PT KB Finansia Multi Finance in 2020 and to determine the level of accuracy produced. Credit payment status is grouped into 2 categories, namely smooth and not smooth. Seven independent variables are used to describe the characteristics of the debtor: age, number of dependents, length of stay, years of service, income, amount of payment, and payment period. Applying the classification algorithm to credit scoring analysis is expected to assist creditors in deciding whether to accept or reject credit applications from prospective debtors. The results showed that the Bootstrap Aggregating K-Nearest Neighbor algorithm with a proportion of 90:10, m=80%, C=73, and K=5 gave the best accuracy, 92.308%.
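A minimal sketch of bootstrap aggregating (bagging) around a K-nearest-neighbor classifier, in plain Python. It is not the authors' implementation. The reading of the parameters is an assumption: `sample_frac` stands in for m (share of the training data drawn per bootstrap sample), `n_estimators` for C (number of bootstrap replicates), and `k` for K. The toy data and the labels `"smooth"` / `"not_smooth"` are hypothetical.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k training points closest to x (Euclidean)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def bagged_knn_predict(train_X, train_y, x, k=5, n_estimators=73,
                       sample_frac=0.8, seed=0):
    """Fit KNN on bootstrap resamples and majority-vote their predictions."""
    rng = random.Random(seed)
    n = len(train_X)
    m = max(k, int(sample_frac * n))  # size of each bootstrap sample
    ballots = Counter()
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(m)]  # draw with replacement
        sample_X = [train_X[i] for i in idx]
        sample_y = [train_y[i] for i in idx]
        ballots[knn_predict(sample_X, sample_y, x, k)] += 1
    return ballots.most_common(1)[0][0]


# Hypothetical two-feature toy data: two well-separated groups of debtors.
X = [(i * 0.1, i * 0.2) for i in range(10)] + \
    [(10 + i * 0.1, 10 - i * 0.2) for i in range(10)]
y = ["smooth"] * 10 + ["not_smooth"] * 10

print(bagged_knn_predict(X, y, (0.5, 0.5)))
print(bagged_knn_predict(X, y, (10.0, 9.0)))
```

Voting over many resampled neighbor sets tends to make the prediction less sensitive to individual training points than a single KNN fit; features on different scales would normally be standardized first.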


2018 ◽  
pp. 1838-1874
Author(s):  
Abbas Keramati ◽  
Niloofar Yousefi ◽  
Amin Omidvar

Credit scoring has become a very important issue due to the recent growth of the credit industry. As the first objective, this chapter provides an academic database of literature between and proposes a classification scheme to classify the articles. The second objective of this chapter is to suggest employing the Optimally Weighted Fuzzy K-Nearest Neighbor (OWFKNN) algorithm for credit scoring. To show the performance of this method, two real-world datasets from the UCI database are used. In the classification task, the empirical results demonstrate that OWFKNN outperforms the conventional KNN and fuzzy KNN methods as well as other methods. In the predictive accuracy of the probability of default, OWFKNN also shows the best performance among the methods compared. The results in this chapter suggest that the OWFKNN approach is mostly effective in estimating default probabilities and is a promising method for the field of classification.
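For orientation, the sketch below shows the conventional fuzzy KNN baseline that the abstract compares against, not OWFKNN itself, whose weighting scheme is not described here. It assumes crisp training labels and inverse-distance weights with a fuzzifier `m`; the function name, the small `eps` guard against zero distances, and the toy data are all hypothetical.

```python
import math


def fuzzy_knn_memberships(train_X, train_y, x, k=4, m=2.0, eps=1e-9):
    """Return class memberships for x that sum to 1 across classes.

    Each of the k nearest neighbors contributes to its own class with
    weight 1 / d^(2 / (m - 1)), so closer neighbors count for more.
    """
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], x))[:k]
    weights = [1.0 / (math.dist(point, x) ** (2.0 / (m - 1.0)) + eps)
               for point, _ in nearest]
    total = sum(weights)
    memberships = {label: 0.0 for label in set(train_y)}
    for (_, label), w in zip(nearest, weights):
        memberships[label] += w / total
    return memberships


# Hypothetical toy data: three good payers, three defaulters.
X = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
y = ["good", "good", "good", "default", "default", "default"]

result = fuzzy_knn_memberships(X, y, (4, 4))
print(result)
```

The membership of the `"default"` class can be read as a rough default-probability estimate, which is the kind of output the abstract evaluates.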


2019 ◽  
Vol 8 (1) ◽  
pp. 149-160
Author(s):  
Mei Sita Saraswati ◽  
Moch. Abdul Mukid ◽  
Abdul Hoyyi

Unsecured credit is one of the credit facilities provided by banks, under which a prospective debtor can borrow funds from the bank without having to provide collateral. Credit scoring is a process that aims to assess the worthiness of credit applications and to classify applicants into prospective debtors whose applications should be accepted and prospective debtors whose applications should be rejected. One of the statistical methods that can be applied to credit scoring analysis is the Generalized Mean Distance-Based k-Nearest Neighbor (GMDKNN) classifier. The empirical study of this method uses 23,337 records of prospective unsecured-credit debtors from 2018, with the final credit scoring decision as the dependent variable and seven independent variables: age, child dependents, length of employment, age of the company, income, loan proposed, and duration of credit. Based on the feature selection test, all independent variables significantly affect the final credit scoring decision. The best classification model is obtained with the parameters k = 137 and p = -1, with classification performance of APER = 0.2580, accuracy = 74.20%, sensitivity = 0.6083, specificity = 0.8393, AUC = 0.7238, and G-Mean = 0.7146.
Keywords: Unsecured Credit, credit scoring, classification, Generalized Mean Distance-Based k-Nearest Neighbor (GMDKNN).
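The reported metrics are related through the confusion matrix, and a short sketch may make that concrete. The counts below are hypothetical, not the study's data, and which class counts as "positive" is an assumption. APER (apparent error rate) is taken as one minus accuracy and G-Mean as the square root of sensitivity times specificity, both of which agree with the figures above to rounding.

```python
import math


def classification_metrics(tp, fn, fp, tn):
    """Compute common binary-classification metrics from confusion counts."""
    total = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "aper": (fn + fp) / total,  # apparent error rate = 1 - accuracy
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": math.sqrt(sensitivity * specificity),
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=60, fn=40, fp=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With the reported sensitivity and specificity, sqrt(0.6083 × 0.8393) is about 0.7145, in line with the stated G-Mean of 0.7146. The stated AUC of 0.7238 also equals (0.6083 + 0.8393) / 2, which may indicate that it was computed from a single classification threshold.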


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

Brain-computer interfaces (BCI) play a vital role in research in science and technology development. Classification is a data mining technique used to predict group membership for data instances. Analyses of BCI data are challenging because feature extraction and classification of these data are more difficult as compared with those applied to raw data. In this paper, statistical Haralick features are extracted from the raw EEG data. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on the BCI dataset using different techniques, namely Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, the SVM classification algorithm is proposed for the BCI data set.


2020 ◽  
Vol 17 (1) ◽  
pp. 319-328
Author(s):  
Ade Muchlis Maulana Anwar ◽  
Prihastuti Harsani ◽  
Aries Maesya

Population data is individual or aggregate data that is structured as a result of population registration and civil registration activities. A birth certificate is a civil registration deed that records the birth of a baby whose birth is reported, so that the baby is registered on the Family Card and given a Population Identification Number (NIK) as a basis for obtaining other community services. Of the 570,637 integrated birth certificate reports in the 2018 Population Administration Information System (SIAK), 503,946 were reported late and only 66,691 were reported publicly. Clustering is a method for grouping data so that similar items fall into the same group and dissimilar items fall into different groups. K-Nearest Neighbor is a method for classifying objects based on the training data closest in distance to the test data. k-means is a method used to divide a number of objects into groups based on existing categories by looking at the midpoint of each group. In preprocessing, the data is cleaned by filling in blank values with the most frequent value, and attributes are selected using the information gain method. Using 10,000 birth certificate records from 2019, the k-nearest neighbor method was used to predict delays in reporting and the k-means method to classify priority service areas. Performance was reasonably good: the predictions reached an accuracy of 74.00%, and k-means with K = 2 produced a Davies-Bouldin index of 1.179.

