Performance Analysis of Predictive Models using Generic Datasets

Today, over 2.5 quintillion bytes of data are created every single day, with the planet's 753 crore people creating 1.7 MB of data each second. More often than not, researchers only scratch the surface when analyzing which algorithm is best suited to their dataset and which one will give the highest efficiency. Sometimes this analysis takes more computational time than the actual execution itself. The aim of this paper is to understand and resolve this dilemma by applying different predictive models, such as Neural Networks, Regression and Decision Tree algorithms, to different datasets and measuring their performance using ROC Index, Average Square Error and Misclassification Rate. A comparative analysis shows where each model performs best under different scopes and conditions. All datasets and results were compared and analyzed using the SAS tool.
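As a rough illustration of the three measures named in this abstract, the sketch below computes them for a binary target in plain Python. It is not the SAS procedure used in the paper: the function names, the 0.5 decision threshold, the toy labels and the simplified single-probability form of average squared error are all assumptions made for the example.

```python
def misclassification_rate(y_true, y_pred):
    """Fraction of predicted labels that differ from the true labels."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def average_squared_error(y_true, scores):
    """Mean squared gap between 0/1 labels and predicted probabilities (simplified form)."""
    return sum((t - s) ** 2 for t, s in zip(y_true, scores)) / len(y_true)


def roc_index(y_true, scores):
    """Area under the ROC curve, computed as the share of (positive, negative)
    pairs in which the positive case gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((1.0 if p > n else 0.5 if p == n else 0.0) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and one model's predicted probabilities.
y_true = [1, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff

print(misclassification_rate(y_true, y_pred))
print(average_squared_error(y_true, scores))
print(roc_index(y_true, scores))
```

Running the same three functions over the scores of each candidate model on the same held-out data gives one simple way to line the models up side by side.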

Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation

Revista Perspectiva Empresarial ◽

10.16967/23898186.667 ◽

2020 ◽

Vol 7 (2-1) ◽

pp. 31-43

Author(s):

Nidia Rodríguez Mazahua ◽

Lisbeth Rodríguez Mazahua ◽

Asdrúbal López Chau ◽

Giner Alor Hernández

Keyword(s):

Data Mining ◽

Comparative Analysis ◽

Decision Tree ◽

Data Warehouse ◽

Data Sets ◽

Star Schema ◽

Tree Algorithms ◽

Horizontal Fragmentation ◽

Roc Area ◽

F Measure

One of the main problems faced by Data Warehouse designers is fragmentation. Several studies have proposed data mining-based horizontal fragmentation methods. However, no horizontal fragmentation technique exists that uses a decision tree. This paper presents an analysis of different decision tree algorithms to select the best one for implementing the fragmentation method. The analysis was performed in Weka version 3.9.4, considering four evaluation metrics (Precision, ROC Area, Recall and F-measure) for different data sets selected using the Star Schema Benchmark. The results showed that the two best algorithms were J48 and Random Forest in most cases; nevertheless, J48 was selected because it is more efficient in building the model.
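For readers unfamiliar with three of the four metrics listed above, the following minimal sketch shows how Precision, Recall and F-measure follow from prediction counts. It is plain Python rather than Weka, and the function name, the `positive` parameter and the toy labels are assumptions for illustration only.

```python
def precision_recall_f_measure(y_true, y_pred, positive=1):
    """Return (precision, recall, F-measure) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Hypothetical labels and predictions for six records.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f_measure = precision_recall_f_measure(y_true, y_pred)
print(precision, recall, f_measure)
```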

The Use of Quantitative Methods in Investment Decisions

Handbook of Research on Global Issues in Financial Communication and Investment Decision Making - Advances in Finance, Accounting, and Economics ◽

10.4018/978-1-5225-9265-5.ch013 ◽

2019 ◽

pp. 256-275 ◽

Cited By ~ 5

Author(s):

Serkan Eti

Keyword(s):

Neural Networks ◽

Monte Carlo Simulation ◽

Monte Carlo ◽

Artificial Neural Networks ◽

Decision Tree ◽

Quantitative Methods ◽

Simulation Analysis ◽

Investment Decision ◽

Tree Algorithms ◽

Artificial Neural

Quantitative methods are mainly preferred in the literature. The main purpose of this chapter is to evaluate the use of quantitative methods in investment decisions. Within this framework, studies on investment decisions that employ quantitative methods are reviewed. The quantitative methods considered are probit, logit, decision tree algorithms, artificial neural networks, Monte Carlo simulation, and MARS. The findings show that the MARS methodology provides more accurate results than the other techniques. It is also concluded that probit and logit methodologies were less preferred than decision tree algorithms, artificial neural networks, and Monte Carlo simulation analysis, especially in recent studies. Therefore, it is recommended that new evaluations of investment analysis be performed with the MARS method, since this approach appears to provide better results.

Comparative analysis of decision tree algorithms: Random forest and C4.5 for airlines customer satisfaction classification

Journal of Physics Conference Series ◽

10.1088/1742-6596/1402/6/066055 ◽

2019 ◽

Vol 1402 ◽

pp. 066055 ◽

Cited By ~ 1

Author(s):

W Baswardono ◽

D Kurniadi ◽

A Mulyani ◽

D M Arifin

Keyword(s):

Comparative Analysis ◽

Random Forest ◽

Customer Satisfaction ◽

Decision Tree ◽

Tree Algorithms

Performance Analysis of Decision Tree Algorithms for Breast Cancer Classification

Indian Journal of Science and Technology ◽

10.17485/ijst/2015/v8i29/84646 ◽

2015 ◽

Vol 8 (1) ◽

pp. 1-8

Author(s):

E. Venkatesan

Keyword(s):

Breast Cancer ◽

Performance Analysis ◽

Decision Tree ◽

Cancer Classification ◽

Breast Cancer Classification ◽

Tree Algorithms

Analysis of Algorithms for Searching Objects in Images Using Convolutional Neural Network

Advances in Cyber-Physical Systems ◽

10.23939/acps2021.02.128 ◽

2021 ◽

Vol 6 (2) ◽

pp. 128-133

Author(s):

Ihor Koval ◽

Keyword(s):

Neural Network ◽

Neural Networks ◽

Comparative Analysis ◽

Analysis Of Algorithms ◽

Network Models ◽

Data Sets ◽

Neural Network Models ◽

Network Algorithms ◽

Modern Computer ◽

The Neural Network

The problem of finding objects in images using modern computer vision algorithms has been considered. The description of the main types of algorithms and methods for finding objects based on the use of convolutional neural networks has been given. A comparative analysis and modeling of neural network algorithms to solve the problem of finding objects in images has been conducted. The results of testing neural network models with different architectures on data sets VOC2012 and COCO have been presented. The results of the study of the accuracy of recognition depending on different hyperparameters of learning have been analyzed. The change in the value of the time of determining the location of the object depending on the different architectures of the neural network has been investigated.

A Comparative Analysis of Machine/Deep Learning Models for Parking Space Availability Prediction

Sensors ◽

10.3390/s20010322 ◽

2020 ◽

Vol 20 (1) ◽

pp. 322 ◽

Cited By ~ 9

Author(s):

Faraz Malik Awan ◽

Yasir Saleem ◽

Roberto Minerva ◽

Noel Crespi

Keyword(s):

Deep Learning ◽

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Multilayer Perceptron ◽

Large Data ◽

Data Sets ◽

Application Domain ◽

Parking Space ◽

Data Set

Machine/Deep Learning (ML/DL) techniques have been applied to large data sets in order to extract relevant information and for making predictions. The performance and the outcomes of different ML/DL algorithms may vary depending upon the data sets being used, as well as on the suitability of algorithms to the data and the application domain under consideration. Hence, determining which ML/DL algorithm is most suitable for a specific application domain and its related data sets would be a key advantage. To respond to this need, a comparative analysis of well-known ML/DL techniques, including Multilayer Perceptron, K-Nearest Neighbors, Decision Tree, Random Forest, and Voting Classifier (or the Ensemble Learning Approach) for the prediction of parking space availability has been conducted. This comparison utilized Santander’s parking data set, initiated while working on the H2020 WISE-IoT project. The data set was used in order to evaluate the considered algorithms and to determine the one offering the best prediction. The results of this analysis show that, regardless of the data set size, the less complex algorithms like Decision Tree, Random Forest, and KNN outperform complex algorithms such as Multilayer Perceptron, in terms of higher prediction accuracy, while providing comparable information for the prediction of parking space availability. In addition, in this paper, we are providing Top-K parking space recommendations on the basis of distance between current position of vehicles and free parking spots.
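The Top-K recommendation step mentioned at the end of this abstract can be pictured with a small sketch. The abstract does not say how distance is measured, so the version below assumes planar coordinates and straight-line distance; the function name, the tuple layout of `spots` and the sample data are hypothetical.

```python
import heapq
import math


def top_k_free_spots(vehicle_xy, spots, k):
    """Return the ids of the k free spots closest to the vehicle.

    `spots` is a list of (spot_id, (x, y), is_free) tuples in planar
    coordinates; straight-line distance is an assumption of this sketch.
    """
    vx, vy = vehicle_xy
    candidates = [
        (math.hypot(x - vx, y - vy), spot_id)
        for spot_id, (x, y), is_free in spots
        if is_free
    ]
    # nsmallest avoids sorting the full candidate list when k is small.
    return [spot_id for _, spot_id in heapq.nsmallest(k, candidates)]


# Hypothetical spots: C is occupied, so it is never recommended.
spots = [
    ("A", (0.0, 1.0), True),
    ("B", (3.0, 4.0), True),
    ("C", (0.5, 0.0), False),
    ("D", (2.0, 0.0), True),
]
nearest = top_k_free_spots((0.0, 0.0), spots, k=2)
print(nearest)
```

In a deployed system the `is_free` flag would presumably come from the availability predictor rather than from fixed data, and road distance may be more appropriate than straight-line distance.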

Assessing Data Mining Approaches for Analyzing Actuarial Student Success Rate

Data Mining ◽

10.4018/978-1-4666-2455-9.ch094 ◽

2013 ◽

pp. 1819-1834

Author(s):

Alan Olinsky ◽

Phyllis A. Schumacher ◽

John Quinn

Keyword(s):

Data Mining ◽

Neural Networks ◽

Logistic Regression ◽

Decision Tree ◽

Student Success ◽

Predictive Models ◽

Drop Out ◽

Predicting Success ◽

Best Fitting ◽

Fitting Model

One way to enhance the likelihood that more students will graduate within the specific major that they begin with is to attract the type of students who have typically (historically) done well in that field of study. This chapter details a study that utilizes data mining techniques to analyze the characteristics of students who enroll as actuarial students and then either drop out of the major or graduate as actuarial students. Several predictive models including logistic regression, neural networks and decision trees are obtained. The models are then compared and the best fitting model is determined. The regression model turns out to be the best predictor. Since this is a very well understood method, it can easily be explained. The decision tree, although its underpinnings are somewhat difficult to explain, gives a clear and well understood output. Not only is the resulting model a good one for predicting success in the major, it also allows us the ability to better counsel students.

Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation

Studies in Computational Intelligence - New Perspectives on Enterprise Decision-Making Applying Artificial Intelligence Techniques ◽

10.1007/978-3-030-71115-3_15 ◽

2021 ◽

pp. 337-363

Author(s):

Nidia Rodríguez-Mazahua ◽

Lisbeth Rodríguez-Mazahua ◽

Asdrúbal López-Chau ◽

Giner Alor-Hernández ◽

S. Gustavo Peláez-Camarena

Keyword(s):

Comparative Analysis ◽

Decision Tree ◽

Data Warehouse ◽

Tree Algorithms

SETTING UP A PROBABILISTIC NEURAL NETWORK FOR CLASSIFICATION OF HIGHWAY VEHICLES

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026805001702 ◽

2005 ◽

Vol 05 (04) ◽

pp. 411-423 ◽

Cited By ~ 5

Author(s):

MAJURA F. SELEKWA ◽

VALERIAN KWIGIZILE ◽

RENATUS N. MUSSA

Keyword(s):

Neural Network ◽

Neural Networks ◽

Decision Tree ◽

Probabilistic Neural Network ◽

Classification Problem ◽

Misclassification Rate ◽

Tree Methods ◽

Decision Tree Methods

Many neural network methods used for efficient classification of populations work only when the population is globally separable. In situ classification of highway vehicles is one of the problems with globally nonseparable populations. This paper presents a systematic procedure for setting up a probabilistic neural network that can classify the globally nonseparable population of highway vehicles. The method is based on a simple concept that any set of classifiable data can be broken down to subclasses of locally separable data. Hence, if these locally separable data can be identified, then the classification problem can be carried out in two hierarchical steps; step one classifies the data according to the local subclasses, and step two classifies the local subclasses into the global classes. The proposed approach was tested on the problem of classifying highway vehicles according to the US Federal Highway Administration standard, which is normally handled by decision tree methods that use vehicle axle information and a set of IF-THEN rules. By using a sample of 3326 vehicles, the proposed method showed improved classification results with an overall misclassification rate of only 2.9% compared to 9.7% of the decision tree methods. A similar setup can be used with different neural networks such as recurrent neural networks, but they were not tested in this study especially since the focus was for in situ applications where a high learning rate is desired.
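The two hierarchical steps described in this abstract can be sketched as follows. This is not the authors' setup: it assumes a basic Gaussian-kernel probabilistic neural network for step one and a plain lookup table for step two, and the feature values, subclass names, `sigma` and all function names are made up for illustration.

```python
import math


def pnn_predict(x, train, sigma=1.0):
    """Basic probabilistic neural network decision rule: choose the label whose
    training points give the largest average Gaussian kernel response at x."""
    totals, counts = {}, {}
    for point, label in train:
        d2 = sum((a - b) ** 2 for a, b in zip(x, point))
        totals[label] = totals.get(label, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
        counts[label] = counts.get(label, 0) + 1
    return max(totals, key=lambda lab: totals[lab] / counts[lab])


def two_step_classify(x, train_sub, sub_to_global, sigma=1.0):
    """Step one assigns a locally separable subclass; step two maps it to its global class."""
    subclass = pnn_predict(x, train_sub, sigma)
    return sub_to_global[subclass]


# Hypothetical data: global classes "A" and "B" each occupy two distant clusters,
# so they are not globally separable, but each cluster is locally separable.
train_sub = [
    ((0.0, 0.0), "A1"), ((0.5, 0.2), "A1"),
    ((4.0, 4.0), "A2"), ((3.8, 4.1), "A2"),
    ((0.0, 4.0), "B1"), ((0.2, 3.9), "B1"),
    ((4.0, 0.0), "B2"), ((4.1, 0.3), "B2"),
]
sub_to_global = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}

print(two_step_classify((3.9, 3.7), train_sub, sub_to_global))
print(two_step_classify((0.1, 3.8), train_sub, sub_to_global))
```

Identifying the local subclasses in the first place is the hard part of the procedure and is not covered here; the sketch takes them as given labels.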

Comparative Analysis of Decision Tree Algorithms: ID3, C4.5 and Random Forest

Computational Intelligence in Data Mining - Volume 1 - Smart Innovation, Systems and Technologies ◽

10.1007/978-81-322-2205-7_51 ◽

2014 ◽

pp. 549-562 ◽

Cited By ~ 13

Author(s):

Shiju Sathyadevan ◽

Remya R. Nair

Keyword(s):

Comparative Analysis ◽

Random Forest ◽

Decision Tree ◽

Tree Algorithms
