Study of Financial Warning Ensemble Model for Listed Companies Based on Unbalanced Classification Perspective

2020 ◽  
Vol 16 (1) ◽  
pp. 32-48
Author(s):  
Wei Cong

Using ensemble learning to mine valuable information from the large volume of financial data accumulated in the financial securities market is important for research on data processing. Based on financial data from A-share companies listed on the Shanghai stock market, this article treats the identification of ST stocks as an unbalanced classification problem and constructs a financial warning model for listed companies. The experiment compares three groups of models in terms of both the area under the ROC curve and the F-measure: HDRF (Hellinger Distance based Random Forest); the Bagging, AdaBoost, and Rotation Forest ensembles that take the Hellinger distance decision tree (HDDT) as the base classifier; and the ensemble classification model that takes the C4.5 decision tree as the base classifier. The experimental results show that HDRF and the HDDT-based ensemble classifiers are effective on the financial data of listed companies.
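As a rough illustration of the splitting criterion behind HDDT, the sketch below computes the Hellinger distance between the two class-conditional distributions over the branches of a candidate split. It is a minimal sketch for a binary-class problem and not the article's implementation; the function name `hellinger_split_score` and the toy data are hypothetical.

```python
import math
from collections import Counter


def hellinger_split_score(branches, labels, positive=1):
    """Hellinger distance between the branch distributions of the two classes.

    branches[i] is the branch that sample i falls into under a candidate
    split; labels[i] is its class. Larger values mean the split separates
    the classes better. Branch proportions are normalised within each
    class, so the score does not depend on the class ratio.
    """
    n_pos = sum(1 for y in labels if y == positive)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        return 0.0  # only one class present: nothing to separate

    pos_counts = Counter(b for b, y in zip(branches, labels) if y == positive)
    neg_counts = Counter(b for b, y in zip(branches, labels) if y != positive)

    total = 0.0
    for b in set(branches):
        p = pos_counts[b] / n_pos  # share of positives sent to branch b
        q = neg_counts[b] / n_neg  # share of negatives sent to branch b
        total += (math.sqrt(p) - math.sqrt(q)) ** 2
    return math.sqrt(total)


# Toy example: one positive (e.g. ST) sample among four.
perfect = hellinger_split_score([0, 0, 0, 1], [0, 0, 0, 1])
useless = hellinger_split_score([0, 1, 0, 1], [0, 0, 1, 1])
```

A split that isolates the minority class reaches the maximum score of sqrt(2), while a split that sends both classes to each branch in equal proportions scores 0.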

Author(s):  
N. REN ◽  
M. ZARGHAM ◽  
S. RAHIMI

Stock selection rules are widely used as guidelines for constructing high-performance stock portfolios. However, the predictive performance of rules developed by economic experts in the past has decreased dramatically in the current stock market. In this paper, the C4.5 decision tree classification method is adopted to construct a stock prediction model from fundamental stock data, and a set of stock selection rules is derived from the model. The experimental results show that the generated rules have exceptional predictive performance. They also demonstrate that the C4.5 decision tree classification model works efficiently in the high-noise stock data domain.
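For readers unfamiliar with C4.5, the sketch below shows the gain ratio it uses to choose a split on a categorical attribute. It is a minimal illustration only; the attribute values and labels are invented and are not taken from the paper's data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """C4.5 gain ratio: information gain divided by split information."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)

    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - conditional

    # Split information penalises attributes with many distinct values.
    split_info = entropy(feature)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical fundamental attribute ("hi"/"lo") against a buy (1) / skip (0) label.
informative = gain_ratio(["hi", "hi", "lo", "lo"], [1, 1, 0, 0])
uninformative = gain_ratio(["hi", "lo", "hi", "lo"], [1, 1, 0, 0])
```

Each root-to-leaf path of the resulting tree can be read as one selection rule, which is how a rule set can be derived from the fitted model.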


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Xinchun Liu

Financial supervision plays an important role in the construction of a market economy, but financial data are nonstationary and nonlinear and have a low signal-to-noise ratio, so an effective financial detection method is needed. In this paper, two machine learning algorithms, decision tree and random forest, are used to examine companies' financial data. First, based on the financial data of 100 sample listed companies, this paper makes an empirical study of financial statement fraud by listed companies using machine learning technology. Preliminary results are obtained through empirical analysis with logistic regression, gradient boosting decision tree, and random forest models, and the random forest model is then used for a secondary judgment. This paper constructs an efficient, accurate, and simple comprehensive application model of machine learning. The empirical results show that the comprehensive application model has an accuracy of 96.58% in judging abnormal financial data of listed companies. The paper puts forward an accurate and practical method for capital market participants to identify financial statement fraud by listed companies, which has practical significance for investors and securities research institutions dealing with such fraud.


Author(s):  
Conrad S. Tucker ◽  
Christopher Hoyle ◽  
Harrison M. Kim ◽  
Wei Chen

This paper presents a comparative study of choice modeling and classification techniques that are currently being employed in the engineering design community to understand customer purchasing behavior. An in-depth comparison of two similar but distinctive techniques, the Discrete Choice Analysis (DCA) model and the C4.5 Decision Tree (DT) classification model, is performed, highlighting the strengths and limitations of each approach in relation to modeling customer choice preferences. A vehicle data set from a well-established data repository is used to evaluate each model on several performance metrics: how the models differ in making predictions/classifications, computational complexity (challenges of model generation), ease of model interpretation, robustness of the model with regard to sensitivity analysis, and scale/size of data. The results reveal that both the Discrete Choice Analysis model and the C4.5 Decision Tree classification model can be used at different stages of product design and development to understand and model customer interests and choice behavior. However, we believe that the C4.5 Decision Tree may be better suited to predicting attribute relevance in relation to classifying choice patterns, while the Discrete Choice Analysis model is better suited to quantifying the choice share of each customer choice alternative.
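To make "choice share" concrete, the sketch below computes shares under a multinomial logit model, one commonly used DCA specification. The abstract does not state which specification the paper uses, so the logit form is an assumption here, and the utility values are hypothetical.

```python
import math


def choice_shares(utilities):
    """Multinomial logit shares: share_i = exp(V_i) / sum_j exp(V_j)."""
    # Subtracting the maximum utility keeps exp() numerically stable
    # and leaves the resulting shares unchanged.
    v_max = max(utilities)
    weights = [math.exp(v - v_max) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical deterministic utilities for three vehicle alternatives.
shares = choice_shares([1.2, 0.4, 0.4])
```

The shares always sum to 1, and alternatives with equal utility receive equal shares, which is what lets the model quantify how demand divides among competing alternatives.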


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 526
Author(s):  
Yang Han ◽  
Chunbao Liu ◽  
Lingyun Yan ◽  
Lei Ren

Smart wearable robotic systems, such as exoskeleton assist devices and powered lower limb prostheses, can rapidly and accurately realize man–machine interaction through a locomotion mode recognition system. However, previous locomotion mode recognition studies usually adopted more sensors to achieve higher accuracy and relied on effective intelligent algorithms to recognize multiple locomotion modes simultaneously. To reduce the burden of sensors on users and recognize more locomotion modes, we design a novel decision tree structure (DTS) that uses an improved backpropagation neural network (IBPNN) as its judgment nodes, named IBPNN-DTS. The design follows an analysis of experimental locomotion mode data and uses the original values from a single inertial measurement unit with a 200-ms time window to hierarchically identify nine common locomotion modes (level walking at three speeds, ramp ascent/descent, stair ascent/descent, sit, and stand). In addition, we reduce the number of parameters in the IBPNN for structure optimization and adopt the artificial bee colony (ABC) algorithm to perform a global search for the initial weight and threshold values. This eliminates system uncertainty, because randomly generated initial values tend to result in a failure to converge or in falling into local optima. Experimental results demonstrate that the recognition accuracy of the IBPNN-DTS with ABC optimization (ABC-IBPNN-DTS) was up to 96.71% (97.29% for the IBPNN-DTS). Compared to the IBPNN-DTS without optimization, the number of parameters in the ABC-IBPNN-DTS shrank by 66% with only a 0.58% reduction in accuracy, while the classification model kept high robustness.
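The 200-ms time window can be pictured as cutting the raw IMU stream into fixed-length segments before classification. The sketch below is a minimal illustration under stated assumptions: the abstract does not give the sampling rate or say whether windows overlap, so the 100 Hz rate and the non-overlapping windows are hypothetical, as is the function name `segment_windows`.

```python
def segment_windows(samples, rate_hz, window_ms=200):
    """Split a sample stream into consecutive, non-overlapping windows.

    rate_hz is the sensor sampling rate (an assumed value; the abstract
    does not state it). Trailing samples that do not fill a whole window
    are dropped.
    """
    size = int(round(rate_hz * window_ms / 1000))
    if size <= 0:
        raise ValueError("window is shorter than one sample period")
    return [samples[i:i + size]
            for i in range(0, len(samples) - size + 1, size)]


# One second of a hypothetical single-axis signal sampled at 100 Hz.
signal = list(range(100))
windows = segment_windows(signal, rate_hz=100)
```

At the assumed 100 Hz, each 200-ms window holds 20 raw samples, so one second of data yields five windows that could each be passed to a judgment node.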


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 126-127
Author(s):  
Lucas S Lopes ◽  
Christine F Baes ◽  
Dan Tulpan ◽  
Luis Artur Loyola Chardulo ◽  
Otavio Machado Neto ◽  
...  

The aim of this project is to compare several state-of-the-art machine learning algorithms on the classification of steers finished in feedlots based on performance, carcass, and meat quality traits. The precise classification of animals allows for fast, real-time decision making in the animal food industry, such as culling or retention of herd animals. Beef production presents high variability in its numerous carcass and beef quality traits. Machine learning algorithms and software provide an opportunity to evaluate the interactions between traits to better classify animals. Four different treatment levels of wet distiller's grain were applied to 97 Angus-Nellore animals and used as features for the classification problem. The C4.5 decision tree, Naïve Bayes (NB), Random Forest (RF), and Multilayer Perceptron (MLP) Artificial Neural Network algorithms were used to predict and classify the animals based on recorded trait measurements, which include initial and final weights, shear force, and meat color. The top-performing classifier was the C4.5 decision tree algorithm with a classification accuracy of 96.90%, while the RF, MLP, and NB classifiers had accuracies of 55.67%, 39.17%, and 29.89%, respectively. We observed that the final decision tree model constructed with C4.5 selected only the dry matter intake (DMI) feature as a differentiator. When DMI was removed, no other feature or combination of features was sufficiently strong to provide good prediction accuracies for any of the classifiers. In a follow-up study on a significantly larger sample, we plan to investigate why DMI is a more relevant parameter than the other measurements.


2013 ◽  
Vol 397-400 ◽  
pp. 2296-2300
Author(s):  
Fei Shuai ◽  
Jun Quan Li

At present, there are complex relationships between the assets of information security products. Based on this characteristic, we propose a new asset recognition algorithm (ART) that improves on the C4.5 decision tree algorithm, and we analyze the computational complexity and space complexity of the proposed algorithm. Finally, we demonstrate with an application example that our algorithm is more precise than the C4.5 algorithm in asset recognition; the result verifies the availability of our algorithm.

Keywords: decision tree, information security product, asset recognition, C4.5

