ATTRIBUTE SELECTION USING ROUGH SETS IN SOFTWARE QUALITY CLASSIFICATION

Author(s):  
TAGHI M. KHOSHGOFTAAR ◽  
LOFTON A. BULLARD ◽  
KEHAN GAO

Finding techniques that reduce software development effort and produce highly reliable software is a vital goal for software developers. One method that has proven quite useful is the application of software metrics-based classification models. Classification models can be constructed to identify faulty components in a software system with high accuracy. Significant research has been dedicated to improving the quality of software metrics-based classification models. Several studies have shown that the accuracy of these models improves when irrelevant attributes are identified and eliminated from the training data set. This study presents a rough set theory approach, based on classical set theory, for identifying and eliminating irrelevant attributes from a training data set. Rough set theory is used to find small groups of attributes, determined by the relationships that exist between the objects in a data set, whose discernibility is comparable to that of larger sets of attributes. This allows for the development of simpler classification models that are easy for analysts to understand and explain to others. We built case-based reasoning models to evaluate classification performance on the smaller subsets of attributes selected using rough set theory. The empirical studies demonstrated that by applying a rough set approach to find small subsets of attributes, we can build case-based reasoning models with an accuracy comparable to, and in some cases better than, that of a case-based reasoning model built with the complete set of attributes.
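
The idea of finding a small attribute group with the same discernibility as the full set can be illustrated with a minimal sketch. It assumes discretized attributes and uses the standard rough set dependency degree with a greedy search, which is one common heuristic and not necessarily the procedure used in the study. The module data, the attribute names (`loc`, `cx`, `churn`), and the function names are hypothetical.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, attrs, decision):
    """Fraction of rows whose indiscernibility class carries a single decision value."""
    if not rows:
        return 0.0
    consistent = 0
    for block in partition(rows, attrs):
        if len({rows[i][decision] for i in block}) == 1:
            consistent += len(block)
    return consistent / len(rows)

def greedy_reduct(rows, attrs, decision):
    """Add the attribute with the largest dependency gain until the
    dependency of the full attribute set is matched (greedy heuristic)."""
    target = dependency(rows, attrs, decision)
    chosen = []
    while dependency(rows, chosen, decision) < target:
        best = max((a for a in attrs if a not in chosen),
                   key=lambda a: dependency(rows, chosen + [a], decision))
        chosen.append(best)
    return chosen

# Hypothetical, already-discretized software metrics for five modules.
rows = [
    {"loc": "high", "cx": "high", "churn": "low",  "faulty": "yes"},
    {"loc": "high", "cx": "low",  "churn": "low",  "faulty": "no"},
    {"loc": "low",  "cx": "high", "churn": "high", "faulty": "yes"},
    {"loc": "low",  "cx": "low",  "churn": "high", "faulty": "no"},
    {"loc": "high", "cx": "high", "churn": "high", "faulty": "yes"},
]

print(greedy_reduct(rows, ["loc", "cx", "churn"], "faulty"))
```

On this toy table the complexity attribute alone separates faulty from non-faulty modules, so the greedy search stops after one attribute. A greedy search is not guaranteed to return a minimal reduct in general.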

2014 ◽  
Vol 3 (3) ◽  
pp. 285-294 ◽  
Author(s):  
Mohammad Taghi Rezvan ◽  
Ali Zeinal Hamadani ◽  
Babak Saffari ◽  
Ali Shalbafzadeh

2012 ◽  
Vol 170-173 ◽  
pp. 3644-3648
Author(s):  
Chun Fei Yuan ◽  
Jing Cai ◽  
Yi Ming Xu

A modern fault diagnosis system is invariably a dynamic, flexible, and uncertain complex system, so many fault diagnosis methods cannot effectively determine fault causes. Considering that abundant fault diagnosis cases have been accumulated in daily maintenance work, a fault diagnosis method based on case-based reasoning (CBR) and rough set theory is proposed. Rough set theory is employed to reduce the attributes and to determine the weighting coefficients of the case description attributes. This method makes full use of the advantage of "letting the data speak". Finally, the method is verified with an example, and the result shows that it is feasible and effective.
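
A minimal sketch of how rough set theory can supply attribute weights for case retrieval follows. The abstract does not give the weighting formula, so this assumes one common choice: the significance of an attribute is the drop in dependency degree when that attribute is removed, normalized to sum to one. The case base, the attribute names, and the function names are hypothetical.

```python
from collections import defaultdict

def dependency(cases, attrs, decision):
    """Fraction of cases that the attributes classify without contradiction."""
    decisions = defaultdict(set)
    sizes = defaultdict(int)
    for case in cases:
        key = tuple(case[a] for a in attrs)
        decisions[key].add(case[decision])
        sizes[key] += 1
    consistent = sum(sizes[k] for k, d in decisions.items() if len(d) == 1)
    return consistent / len(cases)

def significance_weights(cases, attrs, decision):
    """Weight each attribute by the dependency lost when it is dropped (assumed scheme)."""
    full = dependency(cases, attrs, decision)
    sig = {a: full - dependency(cases, [b for b in attrs if b != a], decision)
           for a in attrs}
    total = sum(sig.values())
    if total == 0:  # no attribute is individually indispensable: fall back to equal weights
        return {a: 1 / len(attrs) for a in attrs}
    return {a: s / total for a, s in sig.items()}

def retrieve(cases, query, weights):
    """Return the stored case with the highest weighted attribute match."""
    def similarity(case):
        return sum(w for a, w in weights.items() if case[a] == query[a])
    return max(cases, key=similarity)

# Hypothetical maintenance cases with discretized symptoms.
cases = [
    {"vibration": "high", "temp": "high", "noise": "yes", "fault": "bearing"},
    {"vibration": "high", "temp": "low",  "noise": "yes", "fault": "imbalance"},
    {"vibration": "low",  "temp": "high", "noise": "yes", "fault": "lubrication"},
    {"vibration": "low",  "temp": "low",  "noise": "no",  "fault": "none"},
]
attrs = ["vibration", "temp", "noise"]

weights = significance_weights(cases, attrs, "fault")
query = {"vibration": "low", "temp": "high", "noise": "no"}
print(weights)
print(retrieve(cases, query, weights)["fault"])
```

With equal weights the query would match two stored cases equally well. The data-derived weights give `noise` no influence here, which breaks the tie.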


2018 ◽  
Vol 7 (2) ◽  
pp. 75-84 ◽  
Author(s):  
Shivam Shreevastava ◽  
Anoop Kumar Tiwari ◽  
Tanmoy Som

Feature selection is one of the widely used pre-processing techniques for dealing with large data sets. In this context, rough set theory has been successfully applied to feature selection on discrete data sets, but continuous data sets require discretization, which may cause information loss. Fuzzy rough set theory approaches have also been used successfully to resolve this issue, as they can handle continuous data directly. Moreover, almost all feature selection techniques are designed to handle homogeneous data sets. In this article, the focus is on heterogeneous feature subset reduction. A novel intuitionistic fuzzy neighborhood model is proposed by combining intuitionistic fuzzy sets and neighborhood rough set models, taking an appropriate pair of lower and upper approximations, and it is generalized for feature selection, supported with theory and its validation. An appropriate algorithm, along with an application to a data set, is also given.


2013 ◽  
Vol 13 (Special-Issue) ◽  
pp. 62-74 ◽  
Author(s):  
Zhong Wu ◽  
Ruixia Yan

To tackle a multi-attribute decision making problem, rough sets and case-based reasoning are often combined. However, reduction in a rough set is always complex. In this paper we provide a new relative importance measure for unitary attribute values by ranking the relative importance of the attributes in rough set theory. A new rough set model based on ranking the relative importance of the attributes is built and its properties are studied. Unitary attribute values are then used to compute the similarity of rules in case-based reasoning, since there may be incomplete matches or missing values. A new multi-attribute decision making method based on case-based reasoning and a rough set based on ranking the relative importance of the attributes is constructed, which obtains rules while avoiding reduction and rule extraction.


2020 ◽  
Vol 9 (4) ◽  
pp. 1701-1710
Author(s):  
Saif Ali Alsaidi ◽  
Ahmed T. Sadeq ◽  
Hasanen S. Abdullah

In recent years, text mining has become an important topic because of the growth of digital text data from many sources, such as government documents, email, social media, and websites. English poems are one such kind of text data, and categorizing them calls for text categorization, a method that classifies documents into one or more predefined categories based on their text content. In this paper we address the problem of assigning an English poem to one of the English poem categories by using text mining techniques and a machine learning algorithm. Our data set consists of seven poem categories and is divided into two parts: training (learning) data and testing data. In the proposed model we apply text preprocessing to the document files to reduce the number of features and the dimensionality. The preprocessing converts each poem's text to features and removes irrelevant features using text mining steps (tokenization, stop word removal, and stemming). To reduce the feature vector of the remaining features we use two feature selection methods, and we use rough set theory as the machine learning algorithm to perform the categorization. The proposed model achieves an 88% classification success rate.
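
The three preprocessing steps named above (tokenization, stop word removal, stemming) can be sketched as follows. This is only an illustration under stated assumptions: the stop word list is a tiny hypothetical sample, and `crude_stem` is a simple suffix stripper standing in for a real stemmer such as Porter's, since the abstract does not say which tools were used.

```python
import re

# Tiny illustrative stop word list (an assumption, not the paper's list).
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "are", "on"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def crude_stem(token):
    """Strip one common suffix, keeping at least three characters.
    A crude stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]

line = "The winds are singing in the darkened hills"
print(preprocess(line))
```

The resulting stems would then be counted per poem to form the feature vectors that feature selection and the classifier operate on.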


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Hengrong Ju ◽  
Huili Dou ◽  
Yong Qi ◽  
Hualong Yu ◽  
Dongjun Yu ◽  
...  

The decision-theoretic rough set is a useful rough set model that introduces decision cost into probabilistic approximations of the target. However, Yao’s decision-theoretic rough set is based on the classical indiscernibility relation, and such a relation may be too strict in many applications. To solve this problem, a δ-cut decision-theoretic rough set is proposed, which is based on the δ-cut quantitative indiscernibility relation. Furthermore, with respect to the criteria of decision-monotonicity and cost decreasing, two different algorithms are designed to compute reducts. The comparisons between these two algorithms show the following: (1) compared with the original data set, the reducts based on the decision-monotonicity criterion can generate more rules supported by the lower approximation region and fewer rules supported by the boundary region, and it follows that the uncertainty that comes from the boundary region can be decreased; (2) compared with the reducts based on the decision-monotonicity criterion, the reducts based on the cost minimum criterion can obtain the lowest decision costs and the largest approximation qualities. This study suggests potential application areas and new research trends concerning rough set theory.
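
For orientation, the probabilistic approximations that decision cost induces in the classical decision-theoretic rough set can be written as below. This is the commonly cited formulation and is not taken from the abstract. The notation is an assumption: $[x]$ is the equivalence class of $x$ under the classical indiscernibility relation, and $\lambda_{PP}, \lambda_{BP}, \lambda_{NP}$ (respectively $\lambda_{PN}, \lambda_{BN}, \lambda_{NN}$) are the costs of placing an object in the positive, boundary, or negative region when it does (respectively does not) belong to the target $X$. The δ-cut variant would replace $[x]$ with the class induced by the δ-cut quantitative indiscernibility relation, whose definition is not given here.

```latex
\begin{align*}
\mathrm{POS}_{(\alpha,\beta)}(X) &= \{\, x \in U : P(X \mid [x]) \ge \alpha \,\},\\
\mathrm{BND}_{(\alpha,\beta)}(X) &= \{\, x \in U : \beta < P(X \mid [x]) < \alpha \,\},\\
\mathrm{NEG}_{(\alpha,\beta)}(X) &= \{\, x \in U : P(X \mid [x]) \le \beta \,\},
\end{align*}
where, assuming $0 \le \beta < \alpha \le 1$,
\begin{align*}
\alpha &= \frac{\lambda_{PN} - \lambda_{BN}}{(\lambda_{PN} - \lambda_{BN}) + (\lambda_{BP} - \lambda_{PP})}, &
\beta &= \frac{\lambda_{BN} - \lambda_{NN}}{(\lambda_{BN} - \lambda_{NN}) + (\lambda_{NP} - \lambda_{BP})}.
\end{align*}
```

Rules supported by the positive region are certain under the chosen costs, while rules from the boundary region carry the uncertainty that the reduct comparison in the abstract refers to.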


Author(s):  
Eleazar Gil-Herrera ◽  
Garrick Aden-Buie ◽  
Ali Yalcin ◽  
Athanasios Tsalatsanis ◽  
Laura E. Barnes ◽  
...  

Fuzzy Systems ◽  
2017 ◽  
pp. 1367-1384
Author(s):  
Noor Akhmad Setiawan

The objective of this research is to develop an evidence-based fuzzy decision support system for the diagnosis of coronary artery disease. The decision support system is developed in three processing stages: rule generation, rule selection, and rule fuzzification. Rough Set Theory (RST) is used to generate the classification rules from the training data set. The training data are obtained from the University of California, Irvine (UCI) data repository. Rule selection is conducted by transforming the rules into a decision table based on an unseen data set. Furthermore, RST attribute reduction is proposed and applied to select the most important rules. The selected rules are transformed into fuzzy rules based on the discretization cuts of the numerical input attributes and simple triangular and trapezoidal membership functions. Fuzzy rule weighting is also proposed and applied based on each rule's support in the training data. The system is validated using UCI heart disease data sets collected from the U.S., Switzerland, and Hungary, and a data set from Ipoh Specialist Hospital, Malaysia. The system is verified by three cardiologists. The results show that the system is able to give the approximate possibility of coronary artery blockage.
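
The triangular and trapezoidal membership functions mentioned above have a standard piecewise-linear form, sketched here. The function names and the breakpoints in the usage lines are hypothetical and are not the cut points derived in the study.

```python
def triangular(x, a, b, c):
    """Membership rising linearly from a to a peak at b, then falling to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Membership rising from a to b, flat at 1 between b and c, falling to d (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical breakpoints for a numerical input attribute.
print(triangular(125, 120, 130, 140))        # halfway up the rising edge
print(trapezoidal(165, 130, 140, 160, 170))  # halfway down the falling edge
```

Placing the breakpoints around the discretization cuts lets a crisp interval condition in a rule become a graded one, so values near a cut partially satisfy the rules on both sides.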

