A hybrid model with novel feature selection method and enhanced voting method for credit scoring

2021 ◽  
pp. 1-15
Author(s):  
Jianrong Yao ◽  
Zhongyi Wang ◽  
Lu Wang ◽  
Zhebin Zhang ◽  
Hui Jiang ◽  
...  

With the in-depth application of artificial intelligence technology in the financial field, credit scoring models built with machine learning algorithms have become mainstream. However, the high-dimensional and complex attribute features of borrowers challenge the predictive competence of such models. This paper proposes a hybrid model with a novel feature selection method and an enhanced voting method for credit scoring. First, a novel feature selection combined method based on a genetic algorithm (FSCM-GA) is proposed, in which different classifiers, each paired with a genetic algorithm, select candidate features that are then combined into an optimal feature subset. Furthermore, an enhanced voting method (EVM) is proposed to integrate the classifiers, with the aim of improving classification for samples whose predicted probabilities fall close to the decision threshold. Finally, the predictive competence of the proposed model was validated on three public datasets using five evaluation metrics (accuracy, AUC, F-score, log loss and Brier score). Comparative experiments and significance tests confirmed the good performance and robustness of the proposed model.
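The abstract does not give the genetic operators or the subset-combination rule of FSCM-GA, so the sketch below is only a generic GA wrapper around scikit-learn classifiers with cross-validated AUC as the fitness; the population size, one-point crossover, mutation rate, and the union rule in `combined_subset` are illustrative assumptions, and the EVM re-voting step is not shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y, clf):
    """Cross-validated AUC of `clf` restricted to the selected features."""
    if mask.sum() == 0:
        return 0.0
    return cross_val_score(clf, X[:, mask.astype(bool)], y,
                           scoring="roc_auc", cv=3).mean()

def ga_select(X, y, clf, pop_size=20, generations=30, p_mut=0.05):
    """Evolve binary feature masks with one-point crossover and bit-flip mutation."""
    n = X.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, n))
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y, clf) for ind in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]   # keep the fitter half
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                          # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flips = rng.random(n) < p_mut                     # bit-flip mutation
            child[flips] = 1 - child[flips]
            children.append(child)
        pop = np.array(children)
    final = np.array([fitness(ind, X, y, clf) for ind in pop])
    return pop[final.argmax()].astype(bool)

def combined_subset(X, y):
    """Union of the subsets chosen with two different base classifiers
    (the paper's exact combination rule is not given in the abstract)."""
    masks = [ga_select(X, y, clf) for clf in
             (LogisticRegression(max_iter=1000),
              RandomForestClassifier(n_estimators=100))]
    return np.logical_or.reduce(masks)
```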

2018 ◽  
Vol 45 (5) ◽  
pp. 676-690 ◽  
Author(s):  
Ahmet Engin Bayrak ◽  
Faruk Polat

In this study, we investigated feature-based approaches for improving link prediction performance in location-based social networks (LBSNs) and analysed their performance. We developed new features based on the time, common-friend details and place-category information of check-in data in order to exploit information in the data that existing features from the literature cannot capture. We also proposed a feature selection method that determines a feature subset that enhances prediction performance by clustering features to remove redundancy; after clustering, a genetic algorithm chooses which features to select from each cluster. The proposed genetic algorithm ensures a non-monotonic and feasible feature selection. The results show that both the new features and the proposed feature selection method improved link prediction performance for LBSNs.
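As a rough illustration of the cluster-then-select idea (not the authors' exact implementation), the sketch below groups correlated features by hierarchical clustering and decodes a fixed-length chromosome that picks at most one feature per cluster; the correlation-distance clustering, the cluster count, and the gene encoding are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_features(X, n_clusters=10):
    """Group features whose columns are strongly correlated (i.e. redundant)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    condensed = (1.0 - corr)[np.triu_indices_from(corr, k=1)]  # condensed distance matrix
    return fcluster(linkage(condensed, method="average"),
                    t=n_clusters, criterion="maxclust")

def decode(genes, labels):
    """Gene g_c picks one member of cluster c; a negative gene drops the whole cluster.
    A genetic algorithm can then evolve `genes` like any fixed-length chromosome."""
    chosen = []
    for c, g in enumerate(genes, start=1):
        members = np.flatnonzero(labels == c)
        if g >= 0 and members.size:
            chosen.append(members[g % members.size])
    return np.array(chosen, dtype=int)
```

Because every chromosome decodes to at most one feature per cluster, clusters can be dropped or re-added freely while the decoded subset is always valid, which is one way to realise the non-monotonic, feasible search the abstract mentions.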


2021 ◽  
pp. 102448
Author(s):  
Zahid Halim ◽  
Muhammad Nadeem Yousaf ◽  
Muhammad Waqas ◽  
Muhammad Suleman ◽  
Ghulam Abbas ◽  
...  

Author(s):  
Neesha Jothi ◽  
Wahidah Husain ◽  
Nur’Aini Abdul Rashid ◽  
Sharifah Mashita Syed-Mohamad

Author(s):  
ShuRui Li ◽  
Jing Jin ◽  
Ian Daly ◽  
Chang Liu ◽  
Andrzej Cichocki

Brain–computer interface (BCI) systems decode electroencephalogram (EEG) signals to establish a channel for direct interaction between the human brain and the external world without the need for muscle or nerve control. The P300 speller, one of the most widely used BCI applications, presents a selection of characters to the user and performs character recognition by identifying P300 event-related potentials in the EEG. Such P300-based BCI systems can reach good levels of accuracy but remain difficult to use in day-to-day life because of feature redundancy and noisy signals, leaving room for improvement. We propose a novel hybrid feature selection method for P300-based BCI systems that addresses the problem of feature redundancy by combining the Menger curvature and linear discriminant analysis. First, the two strategies are applied separately to a given dataset to estimate the gain of each feature. Then, each resulting score set is ranked in descending order and judged against a predefined criterion for suitability in classification models. The intersection of the two approaches is then taken to identify an optimal feature subset. The proposed method is evaluated on three public datasets, i.e., BCI Competition III dataset II, the BNCI Horizon dataset and the EPFL dataset. Experimental results indicate that, compared with other typical feature selection and classification methods, our proposed method achieves better or comparable performance. Additionally, it achieves the best classification accuracy after all epochs on the three datasets. In summary, the proposed method provides a new way to enhance the performance of the P300-based BCI speller.
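The abstract does not specify how the Menger-curvature and LDA scores are computed per feature or how suitability is judged, so the following is only a hedged sketch of the rank-and-intersect idea: features are scored by the magnitude of their LDA weights and by an assumed per-feature Menger-curvature score, each ranking is cut at a threshold, and the intersection of the two top sets is kept. The curvature definition in `curvature_scores` and the `keep_frac` cutoff are illustrative assumptions, not the paper's definitions.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def menger_curvature(p1, p2, p3):
    """Curvature of the circle through three 2-D points: 4 * area / product of side lengths."""
    a = np.linalg.norm(p2 - p1)
    b = np.linalg.norm(p3 - p2)
    c = np.linalg.norm(p3 - p1)
    area = 0.5 * abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                     - (p2[1] - p1[1]) * (p3[0] - p1[0]))
    return 0.0 if a * b * c == 0 else 4.0 * area / (a * b * c)

def curvature_scores(X):
    """Assumed per-feature score: mean Menger curvature along each feature's
    sorted value curve (index, value); NOT the paper's exact definition."""
    n, d = X.shape
    scores = np.zeros(d)
    for j in range(d):
        pts = np.column_stack([np.arange(n), np.sort(X[:, j])]).astype(float)
        ks = [menger_curvature(pts[i - 1], pts[i], pts[i + 1])
              for i in range(1, n - 1)]
        scores[j] = np.mean(ks) if ks else 0.0
    return scores

def lda_scores(X, y):
    """Rank features by the magnitude of their LDA discriminant weights (binary case)."""
    return np.abs(LinearDiscriminantAnalysis().fit(X, y).coef_).ravel()

def hybrid_select(X, y, keep_frac=0.3):
    """Keep features ranked in the top `keep_frac` by BOTH criteria (their intersection)."""
    k = max(1, int(keep_frac * X.shape[1]))
    top_curv = set(np.argsort(curvature_scores(X))[::-1][:k])
    top_lda = set(np.argsort(lda_scores(X, y))[::-1][:k])
    return np.array(sorted(top_curv & top_lda))
```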


2020 ◽  
Vol 2020 ◽  
pp. 1-14 ◽  
Author(s):  
Yong Liu ◽  
Shenggen Ju ◽  
Junfeng Wang ◽  
Chong Su

Feature selection methods are designed to select representative feature subsets from the original feature set according to different evaluations of feature relevance, with the goal of reducing the dimensionality of the features while maintaining the predictive accuracy of a classifier. In this study, we propose a feature selection method for text classification based on independent feature space search. First, a relative document-term frequency difference (RDTFD) method is proposed to divide the features in all text documents into two independent feature sets according to each feature's ability to discriminate between positive and negative samples. This serves two purposes: it strengthens the correlation between features and their classes while reducing the correlation among the features themselves, and it narrows the search range of the feature space while maintaining an appropriate level of feature redundancy. Second, a feature search strategy is used to find the optimal feature subset within each independent feature space, which improves the performance of text classification. Finally, we report experiments conducted on six benchmark corpora; the results show that the RDTFD method based on independent feature space search is more robust than other feature selection methods.
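A minimal sketch of the two-set split on a dense binary document-term matrix, assuming an RDTFD score of the form df_pos/N_pos − df_neg/N_neg (the exact formula is not given in the abstract); the sign-based split and the helper name are illustrative only.

```python
import numpy as np

def rdtfd_split(X_binary, y):
    """Split terms into two independent feature sets by the sign of an assumed
    relative document-term frequency difference between the two classes:
        score(t) = df(t | positive) / N_pos  -  df(t | negative) / N_neg
    `X_binary` is a dense 0/1 document-term matrix, `y` holds 0/1 class labels."""
    pos, neg = (y == 1), (y == 0)
    df_pos = X_binary[pos].sum(axis=0) / max(pos.sum(), 1)  # fraction of positive docs containing each term
    df_neg = X_binary[neg].sum(axis=0) / max(neg.sum(), 1)  # fraction of negative docs containing each term
    score = df_pos - df_neg
    positive_terms = np.flatnonzero(score >= 0)  # terms leaning toward the positive class
    negative_terms = np.flatnonzero(score < 0)   # terms leaning toward the negative class
    return positive_terms, negative_terms
```

A feature search (for example, a greedy or wrapper search) can then be run inside each of the two sets independently, which is how the split keeps the search range of the feature space small.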

