An Improved SPRINT Algorithm

This paper presents an improved SPRINT algorithm. The original SPRINT algorithm is a scalable and parallelizable decision tree algorithm that is popular in the data mining and machine learning communities. To improve its efficiency, we first compute the information gain ratio of each candidate splitting attribute and select the attribute with the best ratio. We then calculate the best splitting point for that attribute only. Because this avoids many calculations on the other attributes, the improved algorithm can effectively reduce the computation.
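The two-stage selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the toy data, and the use of the Gini index to score candidate split points are all assumptions, since the abstract does not say how the splitting point is evaluated.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of hashable items."""
    n = len(items)
    return -sum((c / n) * log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of an attribute, treating each distinct value as one branch."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)  # entropy of the attribute's value distribution
    if split_info == 0:
        return 0.0  # a single-valued attribute cannot split the data
    return (entropy(labels) - remainder) / split_info


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split_point(values, labels):
    """Threshold with the lowest weighted Gini (assumed criterion, see lead-in)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_t = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_t, best_score


def choose_split(columns, labels):
    """Stage 1: pick the attribute by gain ratio. Stage 2: scan only that attribute."""
    best_attr = max(columns, key=lambda a: gain_ratio(columns[a], labels))
    return best_attr, best_split_point(columns[best_attr], labels)


# Hypothetical toy data, for illustration only.
columns = {"age": [1, 2, 3, 4, 5, 6], "income": [1, 1, 2, 2, 1, 2]}
labels = ["n", "n", "n", "y", "y", "y"]
attr, (threshold, score) = choose_split(columns, labels)
print(attr, threshold, score)
```

The point of the sketch is that `best_split_point` runs once, on the selected attribute, rather than once per attribute.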

SENTIMENT ANALYSIS ON TWITTER OF PSBB EFFECT USING MACHINE LEARNING

Jurnal Techno Nusa Mandiri ◽

10.33480/techno.v17i2.1635 ◽

2020 ◽

Vol 17 (2) ◽

pp. 143-150

Author(s):

Irwansyah Saputra ◽

Jose Andrean Halomoan ◽

Adam Bagusmugi Raharjo ◽

Cyra Rezky Ananda Syavira

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Tree ◽

Sentiment Analysis ◽

Bayes Classifier ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Processing Data ◽

Twitter Users ◽

Using Data

A collection of tweets from Twitter users about PSBB can be used for sentiment analysis. The data obtained is processed using data mining techniques, including text mining, tokenization, transformation, classification, and stemming. Three algorithms are then compared to find the best accuracy: Decision Tree, K-NN, and Naïve Bayes Classifier. The RapidMiner application is also used to facilitate data processing. The best result in this study came from the Decision Tree algorithm, with an accuracy of 83.3%, a precision of 79%, and a recall of 87.17%.
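For readers unfamiliar with the three reported metrics, the sketch below shows how accuracy, precision, and recall are computed from binary confusion-matrix counts. The counts are hypothetical and are not taken from the study; the function name is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,  # share of all predictions that are correct
        "precision": tp / (tp + fp),    # share of predicted positives that are correct
        "recall": tp / (tp + fn),       # share of actual positives that are found
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=8, fp=2, fn=1, tn=9)
print(metrics)
```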

An Improved ID3 Decision Tree Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.962-965.2842 ◽

2014 ◽

Vol 962-965 ◽

pp. 2842-2847 ◽

Cited By ~ 3

Author(s):

Xiao Juan Chen ◽

Zhi Gang Zhang ◽

Yue Tong

Keyword(s):

Decision Tree ◽

Information Entropy ◽

Information Gain ◽

Classification Algorithm ◽

Learning Ability ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Decision Tree Classification ◽

Id3 Algorithm ◽

Improved Algorithm

As the classical decision tree classification algorithm, ID3 is known for its high classification speed, strong learning ability, and easy construction. However, when it is used for classification, its tendency to choose attributes that have many values limits its practicality. This paper presents an improved algorithm based on the expected information entropy and an Association Function instead of the traditional information gain. The improved algorithm modifies the expected information entropy using the improved Association Function and the number of attribute values. The experimental results show that the improved algorithm yields more reasonable and more effective rules.
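The bias toward many-valued attributes mentioned in the abstract can be seen in a small sketch of the traditional information gain. It does not implement the paper's Association Function, which the abstract does not define; the attribute names and data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Traditional ID3 information gain of one attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: record_id is unique per row and carries no general pattern.
labels = ["yes", "yes", "no", "no"]
record_id = [1, 2, 3, 4]
weather = ["sunny", "sunny", "rain", "sunny"]

gain_id = information_gain(record_id, labels)
gain_weather = information_gain(weather, labels)
print(gain_id, gain_weather)
```

Because every `record_id` value isolates a single row, each branch is pure and the attribute gets the maximum possible gain, even though it cannot generalize to new records.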

Research on Decision Tree Algorithm Based on Information Entropy

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.267.732 ◽

2011 ◽

Vol 267 ◽

pp. 732-737 ◽

Cited By ~ 3

Author(s):

Ming Du ◽

Shu Mei Wang ◽

Gu Gong

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Tree ◽

Information Entropy ◽

Learning Method ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Knowledge Based

Decision trees are an important learning method in machine learning and data mining. This paper discusses the method of choosing the best attribute based on information entropy. It analyzes the process and characteristics of classification and of knowledge discovery based on decision trees, with regard to the application of decision trees in data mining. Through an example, the paper shows in detail the procedure for selecting the decision attribute, and finally it points out the development trends of decision trees.

Enhanced Decision Tree Algorithm for Discovering Intra and Inter Class Exceptions

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1816.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 1539-1548

Keyword(s):

Machine Learning ◽

Data Mining ◽

Decision Tree ◽

Classification Algorithms ◽

Decision Tree Algorithm ◽

High Confidence ◽

Tree Algorithm ◽

General Rules ◽

Tree Algorithms ◽

Small Disjuncts

Decision tree algorithms, being accurate and comprehensible classifiers, have been among the most widely used classifiers in data mining and machine learning. However, like many other classification algorithms, decision tree algorithms focus on extracting patterns with high generality, and in the process they ignore some rare but useful and interesting patterns that may exist in small disjuncts of data. Such extraordinary patterns, with low support and high confidence, capture very specific but exceptional behavior present in the data. This paper proposes a novel Enhanced Decision Tree Algorithm for Discovering Intra and Inter-class Exceptions (EDTADE). Intra-class exceptions cover objects of unique interest within a class, whereas inter-class exceptions capture rare conditions that force us to shift the class of a few unusual objects. For instance, whales and bats are examples of intra-class exceptions, since they have unique characteristics within the class of mammals. Further, most birds are flying creatures, but rare birds such as the penguin and the ostrich fall into the category of non-flying birds. Here, the penguin and the ostrich are inter-class exceptions. In fact, without knowing about such exceptional patterns, our knowledge of a domain is incomplete. We have enhanced the decision tree algorithm by defining a framework for capturing intra and inter-class exceptions at the leaf nodes of a decision tree. The proposed algorithm (EDTADE) is applied to many datasets from the UCI Machine Learning Repository. The results show that EDTADE has been successful in discovering many intra and inter-class exceptions. Decision trees augmented with intra and inter-class exceptions are more accurate, more comprehensible, and more interesting, since they provide additional knowledge in the form of exceptional patterns that deviate from the general rules discovered for classification.
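The notion of a pattern with low support and high confidence can be made concrete with a short sketch. This is not the EDTADE framework itself: the rule representation, the thresholds, and the bird records are all hypothetical, chosen only to mirror the penguin example in the abstract.

```python
def support_confidence(records, condition, target, target_value):
    """Return (support, confidence) of the rule `condition -> target == target_value`."""
    covered = [r for r in records if condition(r)]
    correct = [r for r in covered if r[target] == target_value]
    support = len(correct) / len(records)
    confidence = len(correct) / len(covered) if covered else 0.0
    return support, confidence


def is_exception(support, confidence, max_support=0.25, min_confidence=0.9):
    """Flag rare but reliable rules; both thresholds are assumed values."""
    return support <= max_support and confidence >= min_confidence


# Hypothetical records: 8 flying birds and 2 penguins.
birds = (
    [{"species": "sparrow", "flies": "yes"}] * 4
    + [{"species": "eagle", "flies": "yes"}] * 4
    + [{"species": "penguin", "flies": "no"}] * 2
)

# General rule: any bird flies.
general = support_confidence(birds, lambda r: True, "flies", "yes")
# Candidate exception: penguins do not fly.
penguin = support_confidence(birds, lambda r: r["species"] == "penguin", "flies", "no")

print(general, is_exception(*general))
print(penguin, is_exception(*penguin))
```

The general rule covers most records but is not perfectly reliable, while the penguin rule covers few records with full confidence, which is the profile the abstract associates with exceptions.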

An Analysis of Factors Influencing Foreign Language Self-Efficacy Based on C5.0 Decision Tree Algorithm in Data Mining

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.538.460 ◽

2014 ◽

Vol 538 ◽

pp. 460-464

Author(s):

Xue Li

Keyword(s):

Data Mining ◽

Decision Tree ◽

Foreign Language ◽

Language Learning ◽

Learning Strategies ◽

Self Efficacy ◽

Foreign Language Learning ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

C5.0 Decision Tree

Based on the inter-correlation and permeability among disciplines, the author attempts to apply information science to cognitive linguistics in order to provide a new perspective for the study of foreign languages. The correlation between self-efficacy and four factors (anxiety, learning strategies, motivation, and learners’ past achievement) is analyzed by means of data mining, and the extent to which these factors affect self-efficacy in language learning is explored. The paper employs the decision tree algorithm in SPSS Clementine; the C5.0 decision tree algorithm is adopted to analyze the data in the study. The following results are drawn from the research carried out in this paper. Increased anxiety is bound to weaken learners’ motivation over time. It is evident that learners have low self-efficacy. It is very important to employ strategies in foreign language learning. Neglecting learning strategies may result in unplanned learning with unsatisfactory achievements despite greater effort, and self-efficacy in foreign language learning may be weakened accordingly. Learners’ past achievement is a reference dimension in measuring self-efficacy, with a weaker influence.

The Application of Improved Decision Tree Algorithm in Data Mining of Employment Rate: Evidence from China

2009 First International Workshop on Database Technology and Applications ◽

10.1109/dbta.2009.72 ◽

2009 ◽

Cited By ~ 1

Author(s):

Yuxiang Shao ◽

Qing Chen ◽

Weiming Yin

Keyword(s):

Data Mining ◽

Decision Tree ◽

Employment Rate ◽

Decision Tree Algorithm ◽

Tree Algorithm

Optimization of Management Mode of Small- and Medium-Sized Enterprises Based on Decision Tree Model

Journal of Mathematics ◽

10.1155/2021/2815086 ◽

2021 ◽

Vol 2021 ◽

pp. 1-9

Author(s):

Yuzhu Diao ◽

Qing Zhang

Keyword(s):

Project Management ◽

Decision Tree ◽

Performance Appraisal ◽

Information Gain ◽

Decision Tree Algorithm ◽

Management Information ◽

Report Generation ◽

Tree Algorithm ◽

Gain Rate ◽

C4.5 Algorithm

The decision tree algorithm is a common classification algorithm in data mining, and its results are usually expressed in the form of if-then rules. The C4.5 algorithm is one of the decision tree algorithms; it has the advantages of being easy to understand and highly accurate, and it adds the concept of information gain rate compared with its predecessor, the ID3 algorithm. After theoretical analysis, the C4.5 algorithm is chosen to analyze the performance appraisal results, and the decision tree for performance appraisal is generated by collecting data, preprocessing the data, calculating the information gain rate, determining the splitting attributes, and post-pruning. The system is developed in a B/S architecture, and an R&D project management system and platform that can perform performance assessment analysis is built by means of visualization tools, the decision tree algorithm, and dynamic web pages. The system includes information storage, task management, report generation, role authority control, information visualization, and other functional modules of a management information system. These modules provide project management functions such as project establishment and management, task flow, employee information entry and management, performance assessment system establishment, report generation along various dimensions, and management cockpit construction. With the decision tree algorithm as its core technology, the system obtains scientific and reliable project management information with high accuracy and realizes data visualization, which can help enterprises establish a good management system in the era of big data.
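The statement that decision tree results are usually expressed as if-then rules can be illustrated with a small sketch that walks a tree and emits one rule per leaf. The nested-dict tree layout, the attribute names, and the class labels are hypothetical and are not taken from the paper's performance appraisal data.

```python
def tree_to_rules(node, conditions=()):
    """Yield one if-then rule per leaf of a nested-dict decision tree."""
    if not isinstance(node, dict):  # leaf: the node itself is the class label
        premise = " and ".join(conditions) or "true"
        yield f"if {premise} then {node}"
        return
    attribute = node["attribute"]
    for value, child in node["branches"].items():
        # Extend the path with the test taken on this branch.
        yield from tree_to_rules(child, conditions + (f"{attribute} = {value}",))


# Hypothetical tree, for illustration only.
tree = {
    "attribute": "task_completion",
    "branches": {
        "high": "excellent",
        "low": {
            "attribute": "attendance",
            "branches": {"good": "pass", "poor": "fail"},
        },
    },
}

rules = list(tree_to_rules(tree))
for rule in rules:
    print(rule)
```

Each root-to-leaf path becomes the premise of one rule, so the number of rules equals the number of leaves.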

Decision tree algorithm in locally advanced rectal cancer: an example of over-interpretation and misuse of a machine learning approach

Journal of Cancer Research and Clinical Oncology ◽

10.1007/s00432-019-03102-y ◽

2019 ◽

Vol 146 (3) ◽

pp. 761-765 ◽

Cited By ~ 2

Author(s):

Francesca De Felice ◽

D. Crocetti ◽

M. Parisi ◽

V. Maiuri ◽

E. Moscarelli ◽

...

Keyword(s):

Machine Learning ◽

Rectal Cancer ◽

Decision Tree ◽

Locally Advanced ◽

Advanced Rectal Cancer ◽

Locally Advanced Rectal Cancer ◽

Learning Approach ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Machine Learning Approach

Data Mining for Detecting E-learning Courses Anomalies: An Application of Decision Tree Algorithm

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.8.3.2756 ◽

2018 ◽

Vol 8 (3) ◽

pp. 980 ◽

Cited By ~ 2

Author(s):

Fatiha Elghibari ◽

Rachid Elouahbi ◽

Fatima El Khoukhi

Keyword(s):

Data Mining ◽

Decision Tree ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

E Learning

Classifier Ensemble Algorithm for Data Stream with Attribute Uncertainty

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2016.5747 ◽

2016 ◽

Vol 13 (10) ◽

pp. 7519-7525 ◽

Cited By ~ 1

Author(s):

Zhang Xing ◽

Wang MeiLi ◽

Zhang Yang ◽

Ning Jifeng

Keyword(s):

Decision Tree ◽

Data Stream ◽

High Speed ◽

Information Gain ◽

Uncertain Data ◽

Classifier Ensemble ◽

Ensemble Classifiers ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Ensemble Algorithm

To build a classifier for uncertain data streams, an Ensemble of Uncertain Decision Tree Algorithm (EDTU) is proposed. First, the decision tree algorithm for uncertain data (DTU) is improved by changing the way its information gain is calculated and by increasing its efficiency, so that it can process high-speed data streams. Then, with this as the base classifier, a dynamic classifier ensemble algorithm is used, and the classifiers that classify effectively are selected to form the ensemble. Experimental results on the SEA and Forest Covertype datasets demonstrate that the proposed EDTU algorithm is efficient in classifying data streams with uncertain attributes and that its performance is stable under different parameters.
