A novel multi-stage ensemble model with fuzzy clustering and optimized classifier composition for corporate bankruptcy prediction

2020 ◽  
pp. 1-17
Author(s):  
Dongqi Yang ◽  
Wenyu Zhang ◽  
Xin Wu ◽  
Jose H. Ablanedo-Rosas ◽  
Lingxiao Yang ◽  
...  

With the rapid development of commercial credit mechanisms, credit funds have become fundamental in promoting the development of manufacturing corporations. However, large-scale, imbalanced credit application information poses a challenge to accurate bankruptcy prediction. A novel multi-stage ensemble model with fuzzy clustering and optimized classifier composition is proposed herein to predict corporate bankruptcy accurately. It combines a fuzzy clustering-based classifier selection method, a random subspace (RS)-based classifier composition method, and a genetic algorithm (GA)-based classifier compositional optimization method. To overcome the inherent inflexibility of traditional hard clustering methods, a new fuzzy clustering-based classifier selection method is proposed based on the mini-batch k-means algorithm to obtain the best-performing base classifiers for generating classifier compositions. The RS-based classifier composition method was applied to enhance the robustness of candidate classifier compositions by randomly selecting several subspaces of the original feature space. The GA-based classifier compositional optimization method was applied to optimize the parameters of the promising classifier composition through the iterative mechanism of the GA. Finally, six real-world datasets were tested with four evaluation indicators to assess the performance of the proposed model. The experimental results showed that the proposed model outperformed the benchmark models in both predictive accuracy and efficiency.
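
The contrast between hard and fuzzy clustering mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' selection method: it only computes the standard fuzzy c-means membership of one point with respect to fixed centroids, and the function name `fuzzy_memberships` and the fuzzifier default `m=2.0` are assumptions made for illustration.

```python
import math


def fuzzy_memberships(point, centroids, m=2.0):
    """Return fuzzy c-means style memberships of `point` to each centroid.

    Hypothetical helper for illustration; `m` is the fuzzifier (m > 1).
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centroids belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    # u_i is proportional to d_i^(-2/(m-1)), normalized to sum to 1.
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]
```

A hard clustering would assign the point to its single nearest centroid, whereas here every centroid receives a degree of membership and the degrees sum to 1.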

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed more accurately. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets and four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results demonstrate the superiority of the proposed model over other benchmark models.
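
The general idea of cluster-based undersampling can be sketched as follows. This is a generic illustration, not the multiple K-means-based selective method of the paper: it assumes the majority class is clustered into as many groups as there are minority samples and that one representative per cluster is kept. The names `kmeans` and `kmeans_undersample` and all parameters are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # raises ValueError if k > len(points)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids


def kmeans_undersample(majority, minority_size, seed=0):
    """Keep the majority sample closest to each of `minority_size` centroids."""
    centroids = kmeans(majority, minority_size, seed=seed)
    kept = {min(majority, key=lambda p: math.dist(p, c)) for c in centroids}
    return sorted(kept)
```

Keeping real samples rather than the centroids themselves means the reduced majority set contains only observed data points; whether the paper makes the same choice is not stated in the abstract.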


Author(s):  
Junshu Wang ◽  
Guoming Zhang ◽  
Wei Wang ◽  
Ka Zhang ◽  
Yehua Sheng

With the rapid development of hospital informatization and Internet medical services in recent years, most hospitals have launched online appointment registration systems to remove patient queues and improve the efficiency of medical services. However, most patients lack professional medical knowledge and do not know which department to choose when registering. To guide patients in seeking medical care and registering effectively, we proposed CIDRS, an intelligent self-diagnosis and department recommendation framework based on Chinese medical Bidirectional Encoder Representations from Transformers (BERT) in the cloud computing environment. We also established a Chinese BERT model (CHMBERT) trained on a large-scale Chinese medical text corpus. This model was used to optimize self-diagnosis and department recommendation tasks. To address the limited computing power of terminals, we deployed the proposed framework in a cloud computing environment based on container and micro-service technologies. Real-world medical datasets from hospitals were used in the experiments, and the results showed that the proposed model outperformed traditional deep learning models and other pre-trained language models.


2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Wen-Jun Li ◽  
Qiang Dong ◽  
Yan Fu

With the rapid development of mobile Internet and smart devices, more and more online content providers have begun to collect the preferences of their customers through various apps on mobile devices. These preferences are largely reflected in the explicit scores with which users rate online items. Both positive and negative ratings help recommender systems provide relevant items to a target user. Based on an empirical analysis of three real-world movie-rating data sets, we observe that users’ rating criteria change over time, and that past positive and negative ratings have different influences on users’ future preferences. Given this, we propose a recommendation model on a session-based temporal graph that considers the difference between long- and short-term preferences and the different temporal effects of positive and negative ratings. Extensive experimental results validate the significant accuracy improvement of our proposed model over state-of-the-art methods.


Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Ágota Bányai ◽  
Tamás Bányai ◽  
Béla Illés

The globalization of the economy and markets has led to increased networking in manufacturing and services. These manufacturing and service processes, including the supply chain, have become increasingly complex. In many cases, the supply chain includes consignment stores. The design and operation of these complex supply chain processes can be described as NP-hard optimization problems. These problems can be solved using sophisticated models and methods based on metaheuristic algorithms. This research proposes an integrated supply model based on consignment stores. After a careful literature review, this paper introduces a mathematical model to formulate the problem of consignment-store-based supply chain optimization. The integrated model includes facility location and assignment problems to be solved. Next, an enhanced black hole algorithm for the multiobjective supply chain model is presented. A sensitivity analysis of the heuristic black hole optimization method is also described to check how effectively the new operators improve the convergence of the algorithm. Numerical results with different datasets demonstrate how the proposed model supports the efficiency, flexibility, and reliability of the consignment-store-based supply chain.
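
For readers unfamiliar with the black hole heuristic, a minimal single-objective sketch follows. It shows only the basic scheme (stars drift toward the best solution, and stars crossing the event horizon are replaced by random ones), not the enhanced multiobjective variant or the new operators described in the paper. The function name, parameters, and the assumption of a non-negative objective are all illustrative.

```python
import math
import random


def black_hole_minimize(f, bounds, n_stars=20, iters=100, seed=0):
    """Basic black hole heuristic for minimizing a non-negative function f."""
    rng = random.Random(seed)

    def new_star():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    stars = [new_star() for _ in range(n_stars)]
    best = min(stars, key=f)[:]  # the black hole is the best star so far
    history = [f(best)]
    for _ in range(iters):
        # Each star moves a random fraction of the way toward the black hole.
        for i, star in enumerate(stars):
            moved = [x + rng.random() * (b - x) for x, b in zip(star, best)]
            stars[i] = moved
            if f(moved) < f(best):
                best = moved[:]
        # Event horizon radius: black hole fitness over total fitness.
        total = sum(f(s) for s in stars)
        radius = f(best) / total if total > 0 else 0.0
        for i, star in enumerate(stars):
            if star != best and math.dist(star, best) < radius:
                stars[i] = new_star()  # absorbed star is replaced
                if f(stars[i]) < f(best):
                    best = stars[i][:]
        history.append(f(best))
    return best, history
```

Because the black hole is replaced only by a strictly better star, the recorded best value never increases from one iteration to the next.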


2021 ◽  
Vol 4 (5) ◽  
pp. 43-48
Author(s):  
Yu. V. NIKITOCHKINA ◽  

This article proposes a new model for incentivizing distributors: a multi-stage scale that maps the distributor's rating to a percentage discount level for regional coverage and specialization. The list of indicators for calculating discounts by specialization and regional coverage includes a group of indicators that motivate the distributor to improve the quality of their work and a group of indicators that motivate the distributor to expand the scope of action. Using the method of index grouping of expert estimates, the weight of each indicator was determined. The evaluation of distribution results was formulated as a multi-criteria problem, in which the additive optimization method was used to aggregate the partial criteria, after all partial criteria had been checked for additive independence. The developed incentive model can be adapted for any commercial enterprise interested in promoting its product through regional coverage and in supporting the product image, which contributes to stimulating demand and increasing sales.
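
The additive aggregation and the multi-stage discount scale can be sketched as below. The indicator names, weights, thresholds, and discount percentages are invented for illustration and do not come from the article; only the structure (a weighted sum of criteria mapped onto a stepped scale) follows the abstract.

```python
from bisect import bisect_right


def distributor_rating(scores, weights):
    """Additive aggregation: weighted sum of indicator scores in [0, 1]."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("indicator weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


def discount_for(rating, thresholds, discounts):
    """Map a rating to a discount on a stepped scale.

    `thresholds` is sorted ascending; `discounts` has one more entry than
    `thresholds`, and a rating equal to a threshold earns the higher step.
    """
    return discounts[bisect_right(thresholds, rating)]
```

For example, with hypothetical weights of 0.5 each for a quality indicator and a coverage indicator, scores of 0.9 and 0.5 give a rating of about 0.7, which a scale with thresholds `[0.4, 0.8]` and discounts `[0.0, 3.0, 5.0]` maps to the middle step.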


Author(s):  
Robabeh Eslami ◽  
Mohammad Khoveyni

Hitherto, the models presented for measuring the efficiency score of multi-stage decision-making units (DMUs) are either nonlinear or require the weights for combining their divisional efficiencies to be specified. The nonlinearity leads to high computational complexity, especially for problems with enormous dimensions, and assigning different weights to the divisional efficiencies yields different efficiency scores for the same multi-stage network system. To tackle these problems, this study contributes to network data envelopment analysis (DEA) by introducing a novel enhanced Russell graph (ERG) efficiency measure for evaluating general two-stage series network structures. The proposed model is then extended to general multi-stage series network structures. This study also describes the managerial and economic implications of measuring the efficiency score of multi-stage DMUs and provides two numerical and empirical examples to illustrate the use of the proposed model.


2021 ◽  
pp. 1-16
Author(s):  
Fang He ◽  
Wenyu Zhang ◽  
Zhijia Yan

Credit scoring has become increasingly important for financial institutions. With the advancement of artificial intelligence, machine learning methods, especially ensemble learning methods, have become increasingly popular for credit scoring. However, the problems of imbalanced data distribution and underutilized feature information have not been sufficiently addressed. To make the credit scoring model more adaptable to imbalanced datasets, the original model-based synthetic sampling method is extended herein to balance the datasets by generating appropriate minority samples to alleviate class overlap. To enable the credit scoring model to extract inherent correlations from features, a new bagging-based feature transformation method is proposed, which transforms features using a tree-based algorithm and selects features using the chi-square statistic. Furthermore, a two-layer ensemble method that combines the advantages of dynamic ensemble selection and stacking is proposed to improve the classification performance of the proposed multi-stage ensemble model. Finally, four standardized datasets are used to evaluate the performance of the proposed ensemble model using six evaluation metrics. The experimental results confirm that the proposed ensemble model is effective in improving classification performance and is superior to other benchmark models.
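
The chi-square selection step named in this abstract can be sketched in isolation. The example assumes categorical (for instance, already discretized or tree-transformed) features and ranks them by the Pearson chi-square statistic against the class label; it does not reproduce the bagging-based transformation itself. The function names are hypothetical.

```python
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic between a categorical feature and labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count under independence of feature and label.
            expected = f_count * l_count / n
            stat += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return stat


def top_k_features(columns, labels, k):
    """Return the names of the k features with the largest statistic."""
    ranked = sorted(
        columns,
        key=lambda name: chi_square(columns[name], labels),
        reverse=True,
    )
    return ranked[:k]
```

A feature that is independent of the label scores 0, so it is ranked below any feature whose distribution differs between the classes.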

