An Arithmetic Mean of Information Gain and Correlation Ratio Based Decision Tree Algorithm for Accident Dataset Mining: A Case Study of Accident Dataset of Gombe–Numan–Yola Highway, Nigeria

The decision tree algorithm is a common classification algorithm in data mining, and its results are usually expressed as if-then rules. C4.5 is one such algorithm; it is easy to understand and highly accurate, and it adds the concept of information gain ratio to its predecessor, ID3. After theoretical analysis, the C4.5 algorithm was chosen to analyze performance appraisal results, and the decision tree for performance appraisal is generated by collecting data, preprocessing it, calculating the information gain ratio, determining the splitting attributes, and post-pruning. The system is developed in a browser/server (B/S) architecture, and an R&D project management system and platform capable of performance assessment analysis is built with visualization tools, the decision tree algorithm, and dynamic web pages. The system includes information storage, task management, report generation, role authority control, information visualization, and other management information system modules. Together they provide project management functions such as project establishment and management, task flow, employee information entry and management, performance assessment system setup, report generation along various dimensions, and management cockpit construction. With the decision tree algorithm as its core technology, the system obtains reliable project management information with high accuracy and provides data visualization, which can help enterprises establish a good management system in the era of big data.
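
The gain ratio step named in this abstract can be illustrated with a minimal sketch. It assumes categorical attributes only; the appraisal records, the attribute names (`attendance`, `role`), and the helper names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def gain_ratio(values, labels):
    """Information gain divided by split information, the C4.5 criterion."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after splitting on this attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# Hypothetical appraisal records (illustrative only).
attendance = ["high", "high", "low", "low", "high", "low"]
role = ["dev", "test", "dev", "test", "dev", "test"]
rating = ["good", "good", "poor", "poor", "good", "poor"]

ratios = {
    "attendance": gain_ratio(attendance, rating),
    "role": gain_ratio(role, rating),
}
best = max(ratios, key=ratios.get)  # attribute chosen for the split
```

In this toy data `attendance` separates the ratings perfectly, so it receives the higher ratio and would be selected as the splitting attribute.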

Developing Guidelines for Implementing Transit Signal Priority and Freight Signal Priority Using Simulation Modeling and a Decision Tree Algorithm

Transportation Research Record Journal of the Transportation Research Board ◽ 10.1177/03611981211057528 ◽ 2021 ◽ pp. 036119812110575

Author(s): Shahadat Iqbal ◽ Taraneh Ardalan ◽ Mohammed Hadi ◽ Evangelos Kaisar

Keyword(s): Decision Tree ◽ Signal Control ◽ Signal Timing ◽ Transit Signal Priority ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Traffic Demand ◽ Capacity Ratio ◽ The Impact

Transit signal priority (TSP) and freight signal priority (FSP) allow transportation agencies to prioritize signal service allocations considering the priority of vehicles and, potentially, decrease the impact signal control has on them. However, there have been no studies to develop guidelines for implementing signal control considering both TSP and FSP. This paper reports on a study conducted to provide such guidelines that employed a literature review, a simulation study, and a decision tree algorithm based on the simulation results. The guideline developed provides recommendations in accordance with the signal timing slack time, the proportion of major to minor street hourly volume, hourly truck volume per lane for the major street, hourly truck volume per lane for the minor street, the proportion of major to minor street hourly truck volume, the proportion of major to minor street hourly bus volume, the volume-to-capacity ratio for the major street, and the volume-to-capacity ratio for the minor street. The guideline developed was validated by implementing it for a case study facility. The validation result showed that the guideline works correctly for both high and low traffic demand.

Classifier Ensemble Algorithm for Data Stream with Attribute Uncertainty

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2016.5747 ◽ 2016 ◽ Vol 13 (10) ◽ pp. 7519-7525 ◽ Cited By ~ 1

Author(s): Zhang Xing ◽ Wang MeiLi ◽ Zhang Yang ◽ Ning Jifeng

Keyword(s): Decision Tree ◽ Data Stream ◽ High Speed ◽ Information Gain ◽ Uncertain Data ◽ Classifier Ensemble ◽ Ensemble Classifiers ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Ensemble Algorithm

To build a classifier for uncertain data streams, an Ensemble of Uncertain Decision Tree Algorithm (EDTU) is proposed. First, the decision tree algorithm for uncertain data (DTU) is improved by changing how its information gain is calculated and by improving its efficiency so that it can process high-speed data streams. Then, with this as the base classifier, a dynamic classifier ensemble algorithm is used, and the classifiers that classify effectively are selected to form the ensemble. Experimental results on the SEA and Forest Covertype datasets demonstrate that the proposed EDTU algorithm classifies data streams with uncertain attributes efficiently and that its performance is stable under different parameters.
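
The idea of selecting only the effective classifiers for the ensemble can be sketched as follows. This is not the EDTU algorithm itself: the selection rule (accuracy on the most recent chunk at or above a `min_accuracy` threshold), the majority vote, and the threshold-based base classifiers are all assumptions made for illustration.

```python
from collections import Counter

def accuracy(clf, chunk):
    """Fraction of (x, y) pairs in the chunk that clf labels correctly."""
    return sum(clf(x) == y for x, y in chunk) / len(chunk)

def select_ensemble(pool, recent_chunk, min_accuracy=0.6):
    """Keep the classifiers that still perform well on the newest chunk."""
    chosen = [clf for clf in pool if accuracy(clf, recent_chunk) >= min_accuracy]
    # Fall back to the single best classifier if none passes the threshold.
    return chosen or [max(pool, key=lambda clf: accuracy(clf, recent_chunk))]

def vote(ensemble, x):
    """Majority vote of the selected classifiers."""
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]

# Hypothetical base classifiers standing in for trees trained on earlier chunks.
pool = [
    lambda x: int(x > 0.5),
    lambda x: int(x > 0.4),
    lambda x: int(x < 0.5),  # a stale model that no longer fits the stream
]
recent_chunk = [(0.1, 0), (0.3, 0), (0.45, 0), (0.7, 1), (0.9, 1)]

ensemble = select_ensemble(pool, recent_chunk)
prediction = vote(ensemble, 0.8)
```

Re-running `select_ensemble` on each new chunk lets the ensemble membership change as the stream drifts, which is the sense of "dynamic" assumed here.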

Decision Tree Algorithm for Mining "If Then Else" Rule in Single Slope Basin Solar Still plant

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.a4475.019320 ◽ 2020 ◽ Vol 9 (3) ◽ pp. 405-410

Keyword(s): Decision Making ◽ Decision Tree ◽ Information Gain ◽ Learning Approaches ◽ Decision Tree Algorithm ◽ Rule Mining ◽ Rule Based ◽ Tree Algorithm ◽ C4.5 Decision Tree ◽ Learning Concept

Soft computing is dedicated to supporting decision making. In this domain, a number of techniques are used for prediction, classification, categorization, optimization, and information extraction. Among them, rule mining is one of the essential methodologies. “IF Then Else” statements can serve as rules to classify or predict a real-world event. This is the rule-based learning concept, which is frequently used in data mining applications for decision making and machine learning. Several supervised learning approaches can be used for rule mining, and the decision tree is a helpful algorithm in this context. The algorithm splits the data using entropy and information gain, and the data is mapped into a tree structure from which “IF Then Else” rules are developed. This work presents an application of rule-based learning to the recycling of water in a distillation unit. Using the designed experimental still plant, different attributes are collected along with the observed distillate yield and instantaneous efficiency. The C4.5 decision tree algorithm learns from this observed data and predicts the distillate yield and instantaneous efficiency. Finally, “IF Then Else” rules are prepared to classify and predict the required parameters. The experimental results demonstrate that the proposed C4.5 approach provides higher accuracy than similar state-of-the-art techniques, with an improvement of up to 5-9% in accuracy.
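
How a tree structure maps to rules can be shown with a short sketch. The nested-dict tree, the attribute names (`solar_radiation`, `water_depth`), and the yield classes are hypothetical; the paper's actual attributes and learned tree are not given in the abstract.

```python
def tree_to_rules(node, conditions=()):
    """Return one IF-THEN rule per leaf of a nested-dict decision tree."""
    if not isinstance(node, dict):  # a leaf holds the predicted class
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN yield = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        # Each root-to-leaf path accumulates its tests into one rule premise.
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules

# Hypothetical tree for a solar still (illustrative only).
tree = {
    "attribute": "solar_radiation",
    "branches": {
        "high": {
            "attribute": "water_depth",
            "branches": {"shallow": "high", "deep": "medium"},
        },
        "low": "low",
    },
}

rules = tree_to_rules(tree)
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes one rule; the "Else" part of an "IF Then Else" chain corresponds to trying the remaining rules in order.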

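Analisis Perbandingan Algoritma Decision Tree, kNN, dan Naive Bayes untuk Prediksi Kesuksesan Start-up (Comparative Analysis of the Decision Tree, kNN, and Naive Bayes Algorithms for Predicting Start-up Success)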

JISKA (Jurnal Informatika Sunan Kalijaga) ◽ 10.14421/jiska.2021.6.3.178-188 ◽ 2021 ◽ Vol 6 (3) ◽ pp. 178-188

Author(s): Adhitya Prayoga Permana ◽ Kurniyatul Ainiyah ◽ Khadijah Fahmi Hayati Holle

Keyword(s): Decision Tree ◽ Naive Bayes ◽ Naïve Bayes ◽ Test Results ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Start Up ◽ The Many ◽ Start Ups

Start-ups play a very important role in economic growth, since a start-up can open up many new jobs. However, not every developing start-up becomes successful. Start-ups have a high failure rate: data shows that 75% of start-ups fail during their development. It is therefore important to classify successful and failed start-ups, so that the factors that most influence start-up success can be identified and the success of a start-up can be predicted. Among the many classification methods in data mining, the authors chose the Decision Tree, kNN, and Naïve Bayes algorithms to classify 923 previously obtained start-up data records. The test results using cross-validation and a t-test show that the Decision Tree algorithm is the most appropriate algorithm for classification in this case study. This is evidenced by its accuracy of 79.29%, which is higher than that of the kNN algorithm (66.69%) and Naïve Bayes (64.21%).

An Improved SPRINT Algorithm

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.532-533.1685 ◽ 2012 ◽ Vol 532-533 ◽ pp. 1685-1690 ◽ Cited By ~ 1

Author(s): Zhi Kang Luo ◽ Huai Ying Sun ◽ De Wang

Keyword(s): Machine Learning ◽ Data Mining ◽ Decision Tree ◽ Learning Communities ◽ Information Gain ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Gain Ratio ◽ Information Gain Ratio ◽ Improved Algorithm

This paper presents an improved SPRINT algorithm. The original SPRINT algorithm is a scalable and parallelizable decision tree algorithm, which is a popular algorithm in data mining and machine learning communities. To improve the algorithm's efficiency, we propose an improved algorithm. Firstly, we select the splitting attributes and obtain the best splitting attribute from them by computing the information gain ratio of each attribute. After that, we calculate the best splitting point of the best splitting attribute. Since it avoids a lot of calculations of other attributes, the improved algorithm can effectively reduce the computation.
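
The second stage described here, finding the best splitting point for the one attribute already chosen, might look like the sketch below. It assumes a numeric attribute and the Gini index, which the original SPRINT uses to evaluate splits; the abstract does not say which criterion the improved version applies, and the `ages`/`risk` data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split_point(values, labels):
    """Scan midpoints of the sorted values; return (threshold, weighted Gini)."""
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    best = (None, float("inf"))
    for i in range(1, total):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = len(left) / total * gini(left) + len(right) / total * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical numeric attribute, assumed to be the one already selected.
ages = [23, 35, 41, 52, 60, 28]
risk = ["high", "low", "low", "low", "low", "high"]

threshold, score = best_split_point(ages, risk)
```

Running this scan for a single attribute instead of for every attribute is where the computational saving claimed in the abstract would come from.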

Designing an ERP System by Adopting the Decision Tree Algorithm: A Case Study in the General Company of Oil Products Distribution

TANMIYAT AL-RAFIDAIN ◽ 10.33899/tanra.2021.168687 ◽ 2021 ◽ Vol 40 (130) ◽ pp. 102-134

Author(s): Anmar Aljuboury ◽ Moyassar I. Ahmed Aljuboury

Keyword(s): Decision Tree ◽ Oil Products ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ ERP System

An Improved ID3 Decision Tree Algorithm

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.962-965.2842 ◽ 2014 ◽ Vol 962-965 ◽ pp. 2842-2847 ◽ Cited By ~ 3

Author(s): Xiao Juan Chen ◽ Zhi Gang Zhang ◽ Yue Tong

Keyword(s): Decision Tree ◽ Information Entropy ◽ Information Gain ◽ Classification Algorithm ◽ Learning Ability ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Decision Tree Classification ◽ ID3 Algorithm ◽ Improved Algorithm

As the classical decision tree classification algorithm, ID3 is known for its high classification speed, strong learning ability, and easy construction. When used for classification, however, its tendency to favor attributes that have many values limits its practicality. This paper presents an improved algorithm based on the expectation information entropy and an Association Function in place of the traditional information gain. The improved algorithm modifies the expectation information entropy using the improved Association Function and the number of attribute values. The experimental results show that the improved algorithm produces more reasonable and more effective rules.
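
The bias toward many-valued attributes can be reproduced with plain information gain, as in the sketch below. It shows only the problem, not the paper's Association Function remedy; the `record_id` and `outlook` attributes and the `play` labels are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """ID3 criterion: class entropy minus expected entropy after the split."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical data: record_id is unique per row and carries no real signal.
record_id = [1, 2, 3, 4, 5, 6]
outlook = ["sunny", "sunny", "rain", "rain", "sunny", "rain"]
play = ["yes", "yes", "no", "no", "no", "yes"]

gain_id = information_gain(record_id, play)
gain_outlook = information_gain(outlook, play)
```

Because every `record_id` value isolates a single row, each partition is pure and the attribute attains the maximum possible gain, so ID3 would prefer it even though it cannot generalize to new records.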