C4.5 Decision Tree Enhanced with AdaBoost Versus Multilayer Perceptron for Credit Scoring Modeling

Data Mining methods can be used to help auditors issue their opinions. Many of these methods have not yet been tested for the purpose of discriminating cases of qualified opinions. In this study, we employ three Data Mining classification techniques to develop models capable of identifying qualified auditors' reports. The techniques used are C4.5 Decision Tree, Multilayer Perceptron Neural Network, and Bayesian Belief Network. The sample contains 450 publicly listed, nonfinancial U.K. and Irish firms. The input vector is composed of one qualitative and several quantitative variables. The three developed models are compared in terms of their performance. Additionally, variables that are associated with qualified reports and can be used as indicators are also revealed. The results of this study can be useful to internal and external auditors and company decision-makers.

Classification of complete blood count and haemoglobin typing data by a C4.5 decision tree, a naïve Bayes classifier and a multilayer perceptron for thalassaemia screening

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2011.03.007 ◽

2012 ◽

Vol 7 (2) ◽

pp. 202-212 ◽

Cited By ~ 17

Author(s):

Damrongrit Setsirichok ◽

Theera Piroonratana ◽

Waranyu Wongseree ◽

Touchpong Usavanarong ◽

Nuttawut Paulkhaolarn ◽

...

Keyword(s):

Decision Tree ◽

Multilayer Perceptron ◽

Blood Count ◽

Naive Bayes ◽

Complete Blood Count ◽

Naïve Bayes ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

C4.5 Decision Tree

Internet Traffic Classification Using C4.5 Decision Tree

Journal of Software ◽

10.3724/sp.j.1001.2009.03444 ◽

2009 ◽

Vol 20 (10) ◽

pp. 2692-2704 ◽

Cited By ~ 26

Author(s):

Peng XU ◽

Sen LIN

Keyword(s):

Decision Tree ◽

Internet Traffic ◽

Traffic Classification ◽

C4.5 Decision Tree ◽

Internet Traffic Classification

174 A comparison of machine learning algorithms in the classification of beef steers finished in feedlot

Journal of Animal Science ◽

10.1093/jas/skaa278.231 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 126-127

Author(s):

Lucas S Lopes ◽

Christine F Baes ◽

Dan Tulpan ◽

Luis Artur Loyola Chardulo ◽

Otavio Machado Neto ◽

...

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Final Decision ◽

Relevant Parameter ◽

Good Prediction ◽

Quality Traits ◽

C4.5 Decision Tree

Abstract: The aim of this project is to compare some of the state-of-the-art machine learning algorithms on the classification of steers finished in feedlots based on performance, carcass and meat quality traits. The precise classification of animals allows for fast, real-time decision making in the animal food industry, such as culling or retention of herd animals. Beef production presents high variability in its numerous carcass and beef quality traits. Machine learning algorithms and software provide an opportunity to evaluate the interactions between traits to better classify animals. Four different treatment levels of wet distiller’s grain were applied to 97 Angus-Nellore animals and used as features for the classification problem. The C4.5 decision tree, Naïve Bayes (NB), Random Forest (RF) and Multilayer Perceptron (MLP) Artificial Neural Network algorithms were used to predict and classify the animals based on recorded trait measurements, which include initial and final weights, shear force and meat color. The top performing classifier was the C4.5 decision tree algorithm with a classification accuracy of 96.90%, while the RF, the MLP and NB classifiers had accuracies of 55.67%, 39.17% and 29.89% respectively. We observed that the final decision tree model constructed with C4.5 selected only the dry matter intake (DMI) feature as a differentiator. When DMI was removed, no other feature or combination of features was sufficiently strong to provide good prediction accuracies for any of the classifiers. In a follow-up study with a significantly larger sample size, we plan to investigate why DMI is a more relevant parameter than the other measurements.
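
As a rough illustration of how a C4.5-style tree can end up relying on a single feature, the sketch below computes the gain ratio, the splitting criterion C4.5 is known for, on a tiny made-up dataset. The feature names (`dmi_level`, `color`) and all values are hypothetical and are not taken from the study; real C4.5 also handles continuous attributes through threshold splits, which this sketch skips by assuming already discretized values.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy of the labels after splitting on the feature.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(feature_values)
    # A constant feature has zero split information and cannot split anything.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: four animals, two treatment classes.
labels = ["A", "A", "B", "B"]
dmi_level = ["low", "low", "high", "high"]   # separates the classes perfectly
color = ["dark", "light", "dark", "light"]   # carries no class information

scores = {
    "dmi_level": gain_ratio(dmi_level, labels),
    "color": gain_ratio(color, labels),
}
best_feature = max(scores, key=scores.get)
```

On this toy data `best_feature` is `dmi_level`, since it is the only feature whose values line up with the class labels; a tree built greedily on this criterion would split on it and stop.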

A DECISION TREE-BASED CLASSIFICATION APPROACH TO RULE EXTRACTION FOR SECURITY ANALYSIS

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622006001824 ◽

2006 ◽

Vol 05 (01) ◽

pp. 227-240 ◽

Cited By ~ 15

Author(s):

N. REN ◽

M. ZARGHAM ◽

S. RAHIMI

Keyword(s):

Decision Tree ◽

High Performance ◽

Security Analysis ◽

Predictive Performance ◽

Classification Model ◽

Stock Selection ◽

Selection Rules ◽

Stock Portfolios ◽

Decision Tree Classification ◽

C4.5 Decision Tree

Stock selection rules are extensively used as guidelines for constructing high-performance stock portfolios. However, the predictive performance of the rules developed by some economic experts in the past has decreased dramatically in the current stock market. In this paper, the C4.5 decision tree classification method was adopted to construct a model for stock prediction based on fundamental stock data, from which a set of stock selection rules was derived. The experimental results showed that the generated rules have exceptional predictive performance. They also demonstrated that the C4.5 decision tree classification model can work efficiently in the high-noise stock data domain.

Network Anomaly Detection by Cascading K-Means Clustering and C4.5 Decision Tree algorithm

Procedia Engineering ◽

10.1016/j.proeng.2012.01.849 ◽

2012 ◽

Vol 30 ◽

pp. 174-182 ◽

Cited By ~ 83

Author(s):

Amuthan Prabakar Muniyandi ◽

R. Rajeswari ◽

R. Rajaram

Keyword(s):

Decision Tree ◽

Anomaly Detection ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

C4.5 Decision Tree ◽

Network Anomaly Detection

Research on Asset Recognition Algorithm of Information Security Product Based on Decision Tree Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.397-400.2296 ◽

2013 ◽

Vol 397-400 ◽

pp. 2296-2300 ◽

Cited By ~ 1

Author(s):

Fei Shuai ◽

Jun Quan Li

Keyword(s):

Computational Complexity ◽

Information Security ◽

Decision Tree ◽

Recognition Algorithm ◽

Space Complexity ◽

Complex Relationship ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

C4.5 Algorithm ◽

C4.5 Decision Tree

Currently, there are complex relationships between the assets of information security products. Based on this characteristic, we propose a new asset recognition algorithm (ART) that improves on the C4.5 decision tree algorithm, and analyze the computational complexity and space complexity of the proposed algorithm. Finally, we demonstrate that our algorithm is more precise than the C4.5 algorithm in asset recognition through an application example whose result verifies the availability of our algorithm. Keywords: decision tree, information security product, asset recognition, C4.5

Credit scoring model based on multilayer perceptron

Proceedings of the 2003 IEEE International Symposium on Intelligent Control ISIC-03 ◽

10.1109/isic.2003.1254686 ◽

2003 ◽

Author(s):

Sulin Pang ◽

Yanming Wang ◽

Yuanhuai Bai

Keyword(s):

Multilayer Perceptron ◽

Credit Scoring ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model

Impact of Unusual Features in Credit Scoring Problem

10.5753/kdmile.2020.11962 ◽

2020 ◽

Author(s):

Luiz Felipe Vercosa ◽

Rodrigo Lira ◽

Rodrigo Monteiro ◽

Kleber Silva ◽

Jailson Magalhaes ◽

...

Keyword(s):

Multilayer Perceptron ◽

Demographic Data ◽

Credit Scoring ◽

Financial Data ◽

Unusual Feature ◽

The Real ◽

Kolmogorov Smirnov ◽

Small Improvement ◽

Better Than

Standard features used for Credit Scoring include mainly registration and financial data from customers. However, exploring new features is of great interest for financial companies, since slight improvements in a person's score directly impact the company's revenue. In this work, we categorize features from open credit scoring datasets and compare them with the features found in a real company dataset. The company dataset contains unusual feature groups such as historical, geolocation, web behavior, and demographic data. We performed bivariate tests using the Kolmogorov-Smirnov metric and features to assess the performance of the particular feature groups. We also generated a good-payer score by using AdaBoost, Multilayer Perceptron, and XGBoost algorithms. Then, we analyzed the results with different metrics and compared them with the real company results. Our main finding was that these features added a small improvement to current datasets. We also identified the most promising feature groups and noticed that the tuned XGBoost performed better than the company solution in three out of four deployed metrics.
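
For readers unfamiliar with the Kolmogorov-Smirnov metric in credit scoring, the sketch below shows one common way it is computed: as the largest gap between the empirical score distributions of good and bad payers. This is a minimal illustration under that assumption, not the authors' procedure; the function name `ks_statistic` and the sample scores are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(scores_good, scores_bad):
    """Two-sample KS statistic: maximum gap between the two empirical CDFs."""
    good = sorted(scores_good)
    bad = sorted(scores_bad)
    if not good or not bad:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The empirical CDFs only change at observed scores, so checking those suffices.
    for threshold in set(good) | set(bad):
        cdf_good = bisect_right(good, threshold) / len(good)
        cdf_bad = bisect_right(bad, threshold) / len(bad)
        largest_gap = max(largest_gap, abs(cdf_good - cdf_bad))
    return largest_gap


# Hypothetical model scores (higher means more likely to be a good payer).
good_payers = [0.2, 0.6, 0.8, 0.9]
bad_payers = [0.1, 0.3, 0.4, 0.7]
ks = ks_statistic(good_payers, bad_payers)
```

A value near 1 means the score separates the two groups almost completely, while a value near 0 means the two score distributions are nearly indistinguishable.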
