Rough Set-Based Clustering Utilizing Probabilistic Memberships

Author(s):  
Seiki Ubukata ◽  
Hiroki Kato ◽  
Akira Notsu ◽  
Katsuhiro Honda

Representing the positive, possible, and boundary regions of clusters, rough set-based C-means clustering methods, such as generalized rough C-means (GRCM) and rough set C-means (RSCM), are promising for analyzing vague cluster shapes and realizing reliable classification. In this study, we consider rough set-based clustering approaches that utilize probabilistic memberships as variants of GRCM and RSCM, including π generalized rough C-means (πGRCM), π rough set C-means (πRSCM), and rough membership C-means (RMCM). πGRCM and πRSCM assign equal probabilities of cluster belonging according to Laplace’s principle of indifference, whereas RMCM assigns the probabilities according to rough memberships, which represent conditional probabilities based on the object’s neighborhood derived from a binary relation. In addition, we discuss the theoretical validity of our RMCM approach and compare it with other methods considered in this study. Furthermore, we conducted numerical experiments for evaluating the classification performances of the abovementioned methods. Based on our experimental results, the methods were found to be effective.
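The rough membership used by RMCM, i.e. the conditional probability of cluster belonging given an object's neighborhood under a binary relation, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the distance threshold `delta` inducing the binary relation and the crisp nearest-center assignment are assumptions for the example.

```python
import numpy as np

def rough_memberships(X, centers, delta):
    """Per-object, per-cluster rough memberships.

    A neighborhood N(x) is induced by a binary (distance) relation:
    all objects within `delta` of x. The rough membership of x in
    cluster c is |N(x) ∩ C_c| / |N(x)|, a conditional probability
    of belonging based on x's neighborhood.
    """
    # crisp (HCM-style) assignment of each object to its nearest center
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    nearest = d.argmin(axis=1)
    # binary relation -> neighborhoods (each object is its own neighbor)
    pair = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neighbors = pair <= delta
    C = centers.shape[0]
    mu = np.empty((X.shape[0], C))
    for c in range(C):
        in_c = (nearest == c)
        mu[:, c] = (neighbors & in_c[None, :]).sum(axis=1) / neighbors.sum(axis=1)
    return mu
```

Because the crisp clusters partition the data, each object's memberships sum to one, which is what makes them usable as probabilities.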

Author(s):  
B.K. Tripathy ◽  
Adhir Ghosh

The development of data clustering algorithms has been pursued by researchers since the introduction of the k-means algorithm (MacQueen 1967; Lloyd 1982). These algorithms were subsequently modified to handle categorical data. To handle situations where objects can have memberships in multiple clusters, fuzzy clustering and rough clustering methods were introduced (Lingras et al 2003, 2004a). There are many extensions of these initial algorithms (Lingras et al 2004b; Lingras 2007; Mitra 2004; Peters 2006, 2007). The MMR algorithm (Parmar et al 2007), its extensions (Tripathy et al 2009, 2011a, 2011b), and the MADE algorithm (Herawan et al 2010) use rough set techniques for clustering. In this chapter, the authors focus on rough set based clustering algorithms and provide a comparative study of the fuzzy set based and rough set based clustering algorithms in terms of their efficiency. They also present problems for future study in the direction of the topics covered.


Author(s):  
Rashid Ali ◽  
M. M. Sufyan Beg

Rank aggregation is the process of generating a single aggregated ranking for a given set of rankings. In an industrial environment, there are many applications where rank aggregation must be applied. Rough set based rank aggregation is a user feedback based technique that mines ranking rules for rank aggregation using rough set theory. In this chapter, the authors discuss the rough set based rank aggregation technique in light of Web search evaluation. Since there are many search engines available that industrial houses can use to advertise their products, Web search evaluation is essential for deciding which search engines to rely on. Here, the authors discuss the limitations of rough set based rank aggregation and present an improved version, which is more suitable for aggregating different techniques for Web search evaluation. In the improved version, the authors incorporate the confidence of the rules in predicting a class for a given set of data. They validate the mined ranking rules by comparing the predicted user feedback based ranking with the actual user feedback based ranking. They present experimental results for the evaluation of seven public search engines using the improved version of rough set based aggregation on a set of 37 queries.


Author(s):  
Koichi Yamada

We propose a way to learn probabilistic causal models using conditional causal probabilities (CCPs) to represent the uncertainty of causalities. The CCP is a probability devised by Peng and Reggia representing the uncertainty that a cause actually causes an effect given the cause. The main advantages of using CCPs are that they represent exact probabilities of causalities that people recognize mentally, and that the number of probabilities used in the causal model is far smaller than the number of conditional probabilities over all combinations of possible causes. Peng and Reggia therefore assumed that CCPs are given by human experts as subjective probabilities, and did not discuss how to calculate them from data when a dataset is available. We address this problem, starting from a discussion of the properties of data frequently given in practical problems, and show that the prior probabilities that should be learned may differ from those derived by counting data. We then propose how to learn prior probabilities and CCPs from data, evaluate the proposed method through numerical experiments, and analyze the results to show that the precision of the learned models is satisfactory.
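The naive counting baseline that the abstract argues may misestimate priors can be sketched as follows. The function name and the data layout (sets of causes and effects per record) are illustrative assumptions, not from the paper, and P(effect | cause) here is only a co-occurrence stand-in for the true CCP.

```python
from collections import Counter

def naive_estimates(records):
    """Naive counting estimates of priors P(cause) and of P(effect | cause).

    `records` is a list of (causes_present, effects_present) set pairs.
    The paper argues this plain counting can misestimate the priors in
    practical data; this baseline is what its learning method improves on.
    """
    n = len(records)
    cause_count = Counter()
    joint = Counter()
    for causes, effects in records:
        for c in causes:
            cause_count[c] += 1
            for e in effects:
                joint[(c, e)] += 1
    priors = {c: cause_count[c] / n for c in cause_count}
    # P(effect | cause) by co-occurrence; the true CCP is the probability
    # that the cause *actually produced* the effect, which counting alone
    # can only approximate.
    ccps = {(c, e): joint[(c, e)] / cause_count[c] for (c, e) in joint}
    return priors, ccps
```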


2014 ◽  
Vol 989-994 ◽  
pp. 1551-1554
Author(s):  
Fa Chao Li ◽  
Ming Li ◽  
Shuo Liu

Currently, rough set theory is widely used in many fields. In rough set theory, measuring the significance of the attributes of data is a core problem. To address the problem that existing attribute significance measures usually ignore the interaction among attributes, this paper presents a measure based on the difference degree. Given a set, the proposed method first divides it into several subsets according to the values of the condition attributes, and then computes the difference degree within the subsets. The important attributes are then selected based on the value of the difference degree. The paper further discusses some properties of the difference degree, and the experimental results show the effectiveness of the method.
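The abstract does not define the difference degree, so the following is only one hypothetical reading of the scheme it describes: partition the universe by the values of a condition attribute, then average how mixed the decision values are inside each subset (lower mixing suggesting a more significant attribute). The function name and this interpretation are assumptions; the paper's exact definition may differ.

```python
def attribute_significance(rows, attr, decision):
    """Hypothetical difference-degree-style significance of `attr`.

    rows: list of dicts mapping attribute names to values.
    Partition rows by the value of the condition attribute `attr`,
    then average the fraction of non-majority decision values within
    each subset (0.0 = attr perfectly determines the decision).
    """
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[decision])
    total = 0.0
    for values in groups.values():
        majority = max(values.count(v) for v in set(values))
        total += (len(values) - majority) / len(values)  # mixing within subset
    return total / len(groups)
```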


Author(s):  
DUN LIU ◽  
TIANRUI LI ◽  
DECUI LIANG

By considering the risks in the policy-making procedure, a three-way decision approach based on the decision-theoretic rough set model is applied to risky government decision-making. A three-way decision is made based on a pair of thresholds on conditional probabilities. A positive rule makes a decision of executing, a negative rule makes a decision of non-executing, and a boundary rule makes a decision of deferment. Loss functions are used, via the Bayesian decision procedure, to calculate the two thresholds that describe the decision risk. A case study of government petroleum risk investment demonstrates the proposed method.
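The pair of thresholds derived from the loss functions follows the standard decision-theoretic rough set formulation; the sketch below uses the usual six-loss convention and is illustrative rather than the authors' code.

```python
def thresholds(l_PP, l_BP, l_NP, l_PN, l_BN, l_NN):
    """Thresholds (alpha, beta) from the six loss functions of the
    decision-theoretic rough set model via the Bayesian decision procedure.

    l_XY = loss of taking action X when the true state is Y, with actions
    P (positive/executing), B (boundary/deferment), N (negative).
    Assumes l_PP <= l_BP < l_NP and l_NN <= l_BN < l_PN.
    """
    alpha = (l_PN - l_BN) / ((l_PN - l_BN) + (l_BP - l_PP))
    beta = (l_BN - l_NN) / ((l_BN - l_NN) + (l_NP - l_BP))
    return alpha, beta

def three_way(p, alpha, beta):
    """Decide from the conditional probability p = Pr(X | x)."""
    if p >= alpha:
        return "execute"       # positive rule
    if p <= beta:
        return "non-execute"   # negative rule
    return "defer"             # boundary rule
```

With reasonable losses the two thresholds satisfy 0 <= beta < alpha <= 1, so the three rules cover every probability exactly once.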


2014 ◽  
Vol 513-517 ◽  
pp. 973-977
Author(s):  
Zhi Li Pei ◽  
Jian Hong Qi ◽  
Li Sha Liu ◽  
Qing Hu Wang ◽  
Ming Yang Jiang ◽  
...  

In 2012, Wang Zuofei introduced the granularity function and applied it to the measurement of attribute importance and attribute reduction. On this basis, granularity functions based on the pessimistic and optimistic multi-granularity rough sets are constructed and applied to the calculation of attribute importance and attribute reduction. According to the experimental results, the method can reduce the dimension of features and markedly improve classification accuracy and efficiency.


Author(s):  
Seiki Ubukata ◽  
Sho Sekiya ◽  
Akira Notsu ◽  
Katsuhiro Honda

In the field of cluster analysis, rough set-based extensions of hard C-means (HCM; k-means), including rough C-means (RCM), rough set C-means (RSCM), and rough membership C-means (RMCM), are promising approaches for dealing with the certainty, possibility, and uncertainty of the belonging of objects to clusters. Since C-means-type methods are strongly affected by noise, noise clustering approaches have been proposed, in which noise objects, which are far from every cluster center, are rejected for robust estimation. In this paper, we introduce noise rejection approaches for rough set-based C-means based on probabilistic memberships and propose noise RCM with membership normalization (NRCM-MN), noise RSCM with membership normalization (NRSCM-MN), and noise RMCM (NRMCM). In addition, the cluster boundaries produced by the proposed methods are visualized on the two-dimensional plane to confirm the characteristics of each method. Furthermore, the clustering performance is verified by numerical experiments using real-world datasets.
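The noise-rejection idea described above, flagging objects far from every cluster center so they do not distort the next center update, can be sketched minimally. This is an assumption-laden illustration with a fixed distance threshold, not the authors' NRCM-MN/NRSCM-MN/NRMCM update rules.

```python
import numpy as np

def reject_noise(X, centers, threshold):
    """Flag noise objects for robust C-means estimation.

    An object is treated as noise when its distance to every cluster
    center exceeds `threshold`; flagged objects would be excluded from
    the next center update.
    """
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return d.min(axis=1) > threshold
```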


2018 ◽  
Vol 7 (3) ◽  
pp. 82-85
Author(s):  
A. George Louis Raja ◽  
F. Sagayaraj Francis ◽  
P. Sugumar

The existing semantic methods cluster documents based on unabridged or abridged term comparisons. After clustering, these terms are not preserved, requiring the cluster operation to be repeated in its entirety upon the arrival of new documents. Hence the semantic clustering methods can be considered "on the go" methods, and re-clustering becomes unavoidable in all circumstances in both iterative and incremental clustering methods. It would be more appropriate to build and evolve a lexicon from the derived keywords of the documents and to refer to it in further cluster operations. The rationale is to avoid re-clustering upon the arrival of new documents by referring to the lexicon to formulate clusters while the quality of the clusters is intact; only when the quality degrades beyond the threshold is the cluster operation repeated. Since re-clustering is delayed until a breakeven point, it becomes faster overall. This process may incur additional runtime complexity, but it greatly simplifies and speeds up re-clustering. This paper discusses the construction of lexicons and their applications in clustering. The Keyword based Lexicon Construction Algorithm (KBLCA) is demonstrated for building lexicons, and the breakeven point for re-clustering is proposed and described. The theory of deferring re-clustering is outlined, along with experimental results.


2011 ◽  
Vol 267 ◽  
pp. 46-49
Author(s):  
Ju Li ◽  
Wen Bin Xu ◽  
Wei Yuan Tu ◽  
Xing Wang ◽  
Wei Zhang ◽  
...  

Based on a study of customer relationship management, we first obtained data from the database and transformed it into the corresponding decision table, then further simplified the data in the decision table and generated the final decision rules, with good results. The experimental results showed that the method is of practical value.


Author(s):  
Guilong Liu ◽  
William Zhu

Rough set theory is an important technique for knowledge discovery in databases. Classical rough set theory as proposed by Pawlak is based on equivalence relations, but many interesting and meaningful extensions have been made based on binary relations and coverings, respectively. This paper makes a comparison between covering rough sets and rough sets based on binary relations. It also studies the conditions under which a covering rough set can be generated by a binary relation and a binary relation based rough set can be generated by a covering.
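The binary-relation-based approximations the paper compares against coverings can be sketched directly from the definitions; the function names below are illustrative.

```python
def approximations(universe, relation, X):
    """Lower/upper approximations of X ⊆ universe under a binary relation.

    `relation(x)` returns the successor neighborhood {y : x R y}.
    The lower approximation collects objects whose whole neighborhood
    lies in X; the upper those whose neighborhood meets X. When the
    relation is an equivalence relation, the neighborhoods are the
    equivalence classes and this reduces to Pawlak's classical model.
    """
    X = set(X)
    lower = {x for x in universe if relation(x) <= X}
    upper = {x for x in universe if relation(x) & X}
    return lower, upper
```

A covering-based model would instead start from a family of possibly overlapping subsets of the universe; the paper's question is when the two constructions yield the same approximation operators.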

