A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Baofeng Shi ◽  
Jing Wang ◽  
Junyan Qi ◽  
Yanqiu Cheng

We introduce an imbalanced data classification approach based on logistic regression significant discriminant and the Fisher discriminant. First, a key-indicator extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Second, a customer scoring model is established by linear weighting, with the weights obtained from the Fisher discriminant. Third, a customer rating model is constructed in which the number of customers in each rating follows a normal distribution. The proposed model and the classical SVM classification method are evaluated on their ability to correctly classify customers as default or nondefault. Empirical results on data from 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model on imbalanced data classification. Moreover, our approach helps banks and bond investors identify qualified customers.
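The scoring step in the abstract above can be illustrated with a minimal sketch of a two-class Fisher discriminant, where the weight vector is w = S_w^{-1}(m_1 - m_0) and a customer's score is the weighted sum of the indicators. This is not the authors' implementation: the function names, the restriction to two indicators, and the toy data are all assumptions made for illustration, and the feature-extraction and rating steps are not covered.

```python
from statistics import mean

def fisher_weights(X0, X1):
    """Fisher direction w = Sw^{-1} (m1 - m0) for two classes with 2 features.

    X0, X1 are lists of [f1, f2] rows (hypothetical indicator values)."""
    def class_mean(X):
        return [mean(row[j] for row in X) for j in range(2)]

    def scatter(X, m):
        # 2x2 scatter matrix: sum over rows of (x - m)(x - m)^T
        s = [[0.0, 0.0], [0.0, 0.0]]
        for row in X:
            d = [row[0] - m[0], row[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    m0, m1 = class_mean(X0), class_mean(X1)
    s0, s1 = scatter(X0, m0), scatter(X1, m1)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    return [inv[i][0] * diff[0] + inv[i][1] * diff[1] for i in range(2)]

def score(w, x):
    """Linear weighted score of one customer's indicators."""
    return w[0] * x[0] + w[1] * x[1]

# Toy data (made up): class 0 = default, class 1 = nondefault.
default_rows = [[1, 1], [0, 2], [1, 3], [2, 2]]
nondefault_rows = [[3, 1], [4, 2], [5, 1], [4, 0]]

w = fisher_weights(default_rows, nondefault_rows)
default_scores = [score(w, x) for x in default_rows]
nondefault_scores = [score(w, x) for x in nondefault_rows]
print(w, default_scores, nondefault_scores)
```

On this toy data every nondefault customer scores higher than every default customer, so a single threshold on the score separates the two classes. Real credit data would rarely separate this cleanly.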

2020 ◽  
Vol 34 (04) ◽  
pp. 6680-6687
Author(s):  
Jian Yin ◽  
Chunjing Gan ◽  
Kaiqi Zhao ◽  
Xuan Lin ◽  
Zhe Quan ◽  
...  

Recently, imbalanced data classification has received much attention owing to its wide range of applications. In the literature, existing studies have attempted to improve classification performance by considering factors such as the imbalanced distribution, cost-sensitive learning, data space improvement, and ensemble learning. Nevertheless, most existing methods focus on only some of these aspects. In this work, we propose a novel imbalanced data classification model that considers all of them. To evaluate the performance of our proposed model, we conducted experiments on 14 public datasets. The results show that our model outperforms the state-of-the-art methods in terms of recall, G-mean, F-measure and AUC.
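Three of the metrics named in the abstract above can be computed from a binary confusion matrix, as in the following minimal sketch. It assumes the minority class is labeled 1, takes G-mean as the geometric mean of recall and specificity, and takes F-measure as F1 (beta = 1). AUC is omitted because it needs ranking scores, not hard labels. The function name and toy labels are hypothetical and do not come from the paper.

```python
import math

def imbalance_metrics(y_true, y_pred):
    """Recall, G-mean, F1 and accuracy for binary labels (1 = minority class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # Guard each ratio against an empty denominator.
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "recall": recall,
        "g_mean": math.sqrt(recall * specificity),
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
    }

# Toy example (made up): 2 minority samples out of 10.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is 0.8 while recall on the minority class is only 0.5, which is why imbalanced-data studies tend to report recall, G-mean and F-measure instead of accuracy alone.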


2017 ◽  
Vol 29 (9) ◽  
pp. 1806-1819 ◽  
Author(s):  
Miho Ohsaki ◽  
Peng Wang ◽  
Kenji Matsuda ◽  
Shigeru Katagiri ◽  
Hideyuki Watanabe ◽  
...  
