English poems categorization using text mining and rough set theory

In recent years, text mining has become an important topic because of the growth of digital text data from many sources, such as government documents, email, social media, and websites. English poems are one such kind of text data, and categorizing them calls for text categorization: a method that classifies documents into one or more predefined categories based on their text content. In this paper we address the problem of assigning an English poem to one of the English poem categories by using text mining techniques and a machine learning algorithm. Our data set consists of seven poem categories and is divided into two parts: training (learning) data and testing data. In the proposed model we apply text preprocessing to the document files to reduce the number of features and the dimensionality. The preprocessing converts each poem's text into features and removes irrelevant features through a text mining process (tokenization, stop-word removal, and stemming). To reduce the feature vector of the remaining features we use two feature selection methods, and we use rough set theory as the machine learning algorithm to perform the categorization. The proposed model achieves 88% classification success.
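The preprocessing steps named in the abstract (tokenization, stop-word removal, stemming) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the stop-word list, the suffix-stripping `crude_stem` helper, and the function names are all assumptions made for the example, and a real system would likely use a full stemmer such as Porter's.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "on", "with"}


def crude_stem(token: str) -> str:
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> List[str]:
    """Tokenize, drop stop words, then stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [t for t in tokens if t not in STOP_WORDS]
    return [crude_stem(t) for t in kept]


features = preprocess("The birds singing in the trees")
print(features)
```

Each poem is reduced to a short list of stems, which is the kind of feature list a later feature selection step would prune further.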


Rough Set Theory Application in Online Course Satisfaction

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.271-273.1239 ◽

2011 ◽

Vol 271-273 ◽

pp. 1239-1242

Author(s):

Shao Jun Chen

Keyword(s):

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Online Courses ◽

Educational Environment ◽

Course Satisfaction ◽

Online Course ◽

High Quality ◽

Proposed Model ◽

Theory Application

The most important issue for online courses is to provide learners with high-quality satisfaction. To resolve this question and evaluate course satisfaction, rough set theory is proposed in this article, by which we reduce 10 attributes to 5 and obtain the index of value assessment. As a result, teachers can use the method to adjust their teaching and achieve better results. The proposed model can be applied not only to a network environment but also to a remote educational environment.


Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

Journal of Integrative Bioinformatics ◽

10.1515/jib-2012-199 ◽

2012 ◽

Vol 9 (3) ◽

pp. 1-17 ◽

Cited By ~ 5

Author(s):

D. Calvo-Dmgz ◽

J. F. Gálvez ◽

D. Glez-Peña ◽

S. Gómez-Meire ◽

F. Fdez-Riverola

Keyword(s):

Gene Expression ◽

Set Theory ◽

Rough Set ◽

Microarray Data ◽

Rough Set Theory ◽

Biological Knowledge ◽

Variable Precision ◽

Gene Sets ◽

Proposed Model ◽

Variable Precision Rough Set

Summary: DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking to diagnose diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated on three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but our model is able to provide a biologically interpretable explanation of how it classifies new samples.


Intuitionistic Fuzzy Neighborhood Rough Set Model for Feature Selection

International Journal of Fuzzy System Applications ◽

10.4018/ijfsa.2018040104 ◽

2018 ◽

Vol 7 (2) ◽

pp. 75-84 ◽

Cited By ~ 3

Author(s):

Shivam Shreevastava ◽

Anoop Kumar Tiwari ◽

Tanmoy Som

Keyword(s):

Feature Selection ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Continuous Data ◽

Feature Subset ◽

Data Set ◽

Intuitionistic Fuzzy ◽

Neighborhood Models ◽

Neighborhood Rough Set

Feature selection is one of the most widely used pre-processing techniques for dealing with large data sets. In this context, rough set theory has been successfully applied to feature selection on discrete data sets, but continuous data sets require discretization, which may cause information loss. Fuzzy rough set theory approaches have also been used successfully to resolve this issue, as they can handle continuous data directly. Moreover, almost all feature selection techniques are designed to handle homogeneous data sets. This article focuses on heterogeneous feature subset reduction. Novel intuitionistic fuzzy neighborhood models are proposed by combining intuitionistic fuzzy sets and neighborhood rough set models, taking an appropriate pair of lower and upper approximations and generalizing it for feature selection, supported by theory and its validation. An appropriate algorithm, along with an application to a data set, is included.
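To illustrate how a neighborhood model avoids discretizing continuous data, here is a minimal sketch of the plain (crisp) neighborhood rough set approximations. It does not implement the intuitionistic fuzzy extension proposed in the article. The Euclidean distance, the `radius` parameter, the function names, and the toy samples are all assumptions chosen for the example.

```python
import math


def neighborhood(samples, i, radius):
    """Indices of all samples within `radius` of sample i (Euclidean distance)."""
    return {
        j for j, other in enumerate(samples)
        if math.dist(samples[i], other) <= radius
    }


def neighborhood_approximations(samples, labels, target_label, radius):
    """Lower and upper approximations of the class `target_label`."""
    target = {i for i, y in enumerate(labels) if y == target_label}
    lower, upper = set(), set()
    for i in range(len(samples)):
        nbrs = neighborhood(samples, i, radius)
        if nbrs <= target:   # whole neighborhood lies inside the class
            lower.add(i)
        if nbrs & target:    # neighborhood touches the class
            upper.add(i)
    return lower, upper


# Hypothetical continuous samples with two features each.
SAMPLES = [(0.1, 0.2), (0.15, 0.25), (0.5, 0.5), (0.55, 0.5), (0.9, 0.8)]
LABELS = ["a", "a", "a", "b", "b"]

lower_a, upper_a = neighborhood_approximations(SAMPLES, LABELS, "a", radius=0.1)
print(lower_a, upper_a)
```

Samples 2 and 3 sit close together but carry different labels, so they fall in the upper approximation of class "a" without reaching its lower approximation; that gap is the boundary region.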


Data-driven decision tree learning algorithm based on rough set theory

Proceedings of the 2005 International Conference on Active Media Technology, 2005. (AMT 2005). ◽

10.1109/amt.2005.1505426 ◽

2005 ◽

Author(s):

Desheng Yin ◽

Guoyin Wang ◽

Yu Wu

Keyword(s):

Decision Tree ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Learning Algorithm ◽

Data Driven ◽

Decision Tree Learning


Construction Of Opinion Models For E-Learning Courses By Rough Set Theory And Text Mining

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8066.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 1107-1111 ◽

Cited By ~ 1

Keyword(s):

Text Mining ◽

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Attribute Reduction ◽

Expressive Power ◽

Machine Learning Techniques ◽

Decision Attributes ◽

E Learning ◽

Lower And Upper Approximations

Knowledge extracted through machine learning techniques generally falls short of perfect prediction with minimal error. Recently, researchers have been using Rough Set Theory (RST) to uncover hidden patterns, drawn by its simplicity and expressive power. In RST, the issue of attribute reduction is mainly tackled through the notion of ‘reducts’, using the lower and upper approximations of rough sets based on a given information table with conditional and decision attributes. Hence, among the many methods researchers have proposed for dimension reduction, the RST approach has been shown to be simple and efficient for text mining tasks. Text mining focuses on patterns in text files or corpora, which are first preprocessed to identify and remove irrelevant and replicated words without inducing any information loss for the classification models that are later generated and tested. In the current work, this hypothesis is taken as the core and tested on feedback for e-learning courses, using RST’s attribute reduction and generating distinct n-gram models; finally, the results are presented for selecting the final efficient model.
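The lower and upper approximations mentioned above can be sketched on a small information table. This is a generic illustration of classical rough set approximations, not the paper's implementation. The table contents, the attribute names (`clear`, `long`, `label`), and the function names are hypothetical.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids that agree on every chosen condition attribute."""
    classes = defaultdict(set)
    for obj_id, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj_id)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= target:   # block certainly belongs to the target
            lower |= block
        if block & target:    # block possibly belongs to the target
            upper |= block
    return lower, upper


# Hypothetical feedback table: two condition attributes, one decision attribute.
TABLE = {
    1: {"clear": "yes", "long": "no", "label": "positive"},
    2: {"clear": "yes", "long": "no", "label": "positive"},
    3: {"clear": "no", "long": "yes", "label": "negative"},
    4: {"clear": "no", "long": "no", "label": "positive"},
    5: {"clear": "no", "long": "no", "label": "negative"},
}
POSITIVE = {o for o, r in TABLE.items() if r["label"] == "positive"}

lower, upper = approximations(TABLE, ["clear", "long"], POSITIVE)
print(lower, upper)
```

Objects 4 and 5 are indiscernible on the condition attributes yet have different decisions, so they appear only in the upper approximation.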


δ-Cut Decision-Theoretic Rough Set Approach: Model and Attribute Reductions

The Scientific World JOURNAL ◽

10.1155/2014/382439 ◽

2014 ◽

Vol 2014 ◽

pp. 1-12 ◽

Cited By ~ 1

Author(s):

Hengrong Ju ◽

Huili Dou ◽

Yong Qi ◽

Hualong Yu ◽

Dongjun Yu ◽

...

Keyword(s):

Set Theory ◽

Rough Set ◽

Rough Set Theory ◽

Original Data ◽

Boundary Region ◽

Research Trends ◽

Data Set ◽

Lower Approximation ◽

New Research ◽

Decision Cost

The decision-theoretic rough set is a useful rough set model that introduces decision cost into the probabilistic approximations of the target. However, Yao’s decision-theoretic rough set is based on the classical indiscernibility relation, which may be too strict in many applications. To solve this problem, a δ-cut decision-theoretic rough set is proposed, which is based on the δ-cut quantitative indiscernibility relation. Furthermore, with respect to the criteria of decision-monotonicity and cost decreasing, two different algorithms are designed to compute reducts. The comparisons between these two algorithms show the following: (1) compared with the original data set, the reducts based on the decision-monotonicity criterion generate more rules supported by the lower approximation region and fewer rules supported by the boundary region, so the uncertainty that comes from the boundary region can be decreased; (2) compared with the reducts based on the decision-monotonicity criterion, the reducts based on the cost minimum criterion obtain the lowest decision costs and the largest approximation qualities. This study suggests potential application areas and new research trends concerning rough set theory.
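As a rough illustration of probabilistic approximations, the sketch below assigns each equivalence class to a positive, boundary, or negative region by comparing the conditional probability of the target with two thresholds. It assumes classical indiscernibility blocks and takes the thresholds `alpha` and `beta` as given inputs; in a decision-theoretic rough set they would be derived from the decision costs, and the δ-cut relation proposed in the paper is not modeled here. The function name and toy data are hypothetical.

```python
from fractions import Fraction


def three_regions(blocks, target, alpha, beta):
    """Split the universe into (positive, boundary, negative) regions.

    A block goes to the positive region when P(target | block) >= alpha,
    to the negative region when P(target | block) <= beta, and to the
    boundary region otherwise. Assumes beta < alpha.
    """
    positive, boundary, negative = set(), set(), set()
    for block in blocks:
        prob = Fraction(len(block & target), len(block))
        if prob >= alpha:
            positive |= block
        elif prob <= beta:
            negative |= block
        else:
            boundary |= block
    return positive, boundary, negative


# Hypothetical equivalence classes and target concept.
BLOCKS = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9, 10}]
TARGET = {1, 2, 3, 5, 7}

pos, bnd, neg = three_regions(
    BLOCKS, TARGET, alpha=Fraction(7, 10), beta=Fraction(3, 10)
)
print(pos, bnd, neg)
```

With `alpha = 1` and `beta = 0` the positive region reduces to the classical lower approximation, which is one way to see how the probabilistic model relaxes the classical one.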


ATTRIBUTE SELECTION USING ROUGH SETS IN SOFTWARE QUALITY CLASSIFICATION

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539309003307 ◽

2009 ◽

Vol 16 (01) ◽

pp. 73-89 ◽

Cited By ~ 16

Author(s):

TAGHI M. KHOSHGOFTAAR ◽

LOFTON A. BULLARD ◽

KEHAN GAO

Keyword(s):

Set Theory ◽

Rough Set ◽

Software Metrics ◽

Rough Set Theory ◽

Training Data ◽

Case Based Reasoning ◽

Classification Models ◽

Data Set ◽

Irrelevant Attributes ◽

Case Based

Finding techniques to reduce software development effort and produce highly reliable software is an extremely vital goal for software developers. One method that has proven quite useful is the application of software metrics-based classification models. Classification models can be constructed to identify faulty components in a software system with high accuracy. Significant research has been dedicated to developing methods for improving the quality of software metrics-based classification models. Several studies have shown that the accuracy of these models improves when irrelevant attributes are identified and eliminated from the training data set. This study presents a rough set theory approach, based on classical set theory, for identifying and eliminating irrelevant attributes from a training data set. Rough set theory is used to find small groups of attributes, determined by the relationships that exist between the objects in a data set, with discernibility comparable to that of larger sets of attributes. This allows for the development of simpler classification models that are easy for analysts to understand and explain to others. We built case-based reasoning models in order to evaluate their classification performance on the smaller subsets of attributes selected using rough set theory. The empirical studies demonstrated that by applying a rough set approach to find small subsets of attributes we can build case-based reasoning models with an accuracy comparable to, and in some cases better than, a case-based reasoning model built with the complete set of attributes.
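One common way to find a small attribute subset with the same discernibility as the full set is a greedy forward search driven by the rough set dependency degree, in the style of QuickReduct. The sketch below shows that generic heuristic; the abstract does not say which search procedure the authors used. The metric names (`loc`, `complexity`, `churn`), the toy rows, and the function names are hypothetical.

```python
from collections import defaultdict


def dependency(rows, attributes, decision):
    """Fraction of rows whose decision value is determined by `attributes`."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[a] for a in attributes)].append(row[decision])
    # A group is consistent when all of its rows share one decision value.
    consistent = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, attributes, decision):
    """Add attributes one at a time until the full dependency is matched."""
    full = dependency(rows, attributes, decision)
    selected = []
    while dependency(rows, selected, decision) < full:
        best = max(
            (a for a in attributes if a not in selected),
            key=lambda a: dependency(rows, selected + [a], decision),
        )
        selected.append(best)
    return selected


# Hypothetical software-module records with discretized metrics.
ROWS = [
    {"loc": "high", "complexity": "high", "churn": "low", "faulty": "yes"},
    {"loc": "high", "complexity": "low", "churn": "low", "faulty": "no"},
    {"loc": "low", "complexity": "high", "churn": "high", "faulty": "yes"},
    {"loc": "low", "complexity": "low", "churn": "high", "faulty": "no"},
    {"loc": "high", "complexity": "high", "churn": "high", "faulty": "yes"},
]

reduct = greedy_reduct(ROWS, ["loc", "complexity", "churn"], "faulty")
print(reduct)
```

The greedy search is not guaranteed to return a minimal reduct, but it avoids enumerating every attribute subset, which matters when the number of metrics is large.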
