Data-driven decision tree learning algorithm based on rough set theory

Author(s):  
Desheng Yin ◽  
Guoyin Wang ◽  
Yu Wu
2020 ◽  
Vol 9 (4) ◽  
pp. 1701-1710
Author(s):  
Saif Ali Alsaidi ◽  
Ahmed T. Sadeq ◽  
Hasanen S. Abdullah

In recent years, Text Mining wasan important topic because of the growth of digital text data from many sources such as government document, Email, Social Media, Website, etc. The English poemsare one of the text data to categorization English Poems will use Text categorization, Text categorization is a method in which classify documents into one or more categories that were predefined the category based on the text content in a document .In this paper we will solve the problem of how to categorize the English poem into one of the English Poems categorizations by using text mining technique and Machine learning algorithm, Our data set consist of seven categorizations for poems the data set is divided into two-part training (learning)and testing data. In the proposed model we apply the text preprocessing for the documents file to reduce the number of feature and reduce dimensionality the preprocessing process converts the text poem to features and remove the irrelevant feature by using text mining process (tokenize,remove stop word and stemming), to reduce the feature vector of the remaining feature we usetwo methods for feature selection and use Rough set theory as machine learning algorithm to perform the categorization, and we get 88% success classification of the proposed model.


2015 ◽  
Vol 743 ◽  
pp. 390-394
Author(s):  
Liu Liu ◽  
Bao Sheng Wang ◽  
Qiu Xi Zhong ◽  
Hai Liang Hu

Rough set theory is a popular mathematical knowledge to resolve problems which are vagueness and uncertainly. And it has been used of solving the redundancy of attribute and data. Decision tree has been widely used in data mining techniques, because it is efficient, fast and easy to be understood in terms of data classification. There are many approaches have been used to generate a decision tree. In this paper, a novel and effective algorithm is introduced for decision tree. This algorithm is based on the core of discernibility matrix on rough set theory and the degree of consistent dependence. This algorithm is to improve the decision tree on node selection. This approach reduces the time complexity of the decision tree production and space complexity compared with ID3.In the end of the article, there is an example of the algorithm can exhibit superiority.


Sign in / Sign up

Export Citation Format

Share Document