Leukemia Prediction from Gene Expression Data—A Rough Set Approach

Cancer Gene Expression Data Analysis Using Rough Based Symmetrical Clustering

Handbook of Research on Computational Intelligence for Engineering, Science, and Business ◽ 10.4018/978-1-4666-2518-1.ch027 ◽ 2013 ◽ pp. 699-715 ◽ Cited By ~ 4

Author(s): Anasua Sarkar ◽ Ujjwal Maulik

Keyword(s): Gene Expression ◽ Data Analysis ◽ Gene Expression Data ◽ Rough Set ◽ Clustering Algorithm ◽ Data Sets ◽ Cancer Gene ◽ Expression Data ◽ Gene Expression Data Analysis ◽ Cancer Subtypes

Identifying cancer subtypes is the central goal of cancer gene expression data analysis. Modified symmetry-based clustering is an unsupervised learning technique for detecting symmetrical clusters, whether convex or non-convex. To enable fast, automatic clustering of cancer tissues (samples), the authors of this chapter propose a rough-set-based hybrid approach to the modified symmetry-based clustering algorithm. A natural basis for analyzing gene expression data with the symmetry-based algorithm is to group together genes with similar symmetrical patterns of microarray expression. Rough set theory speeds convergence and provides an initial automatic optimal classification, thereby addressing the problem that the number of clusters in gene expression data is not known in advance. For rough-set-theoretic decision rule generation, each cluster is classified using heuristically searched optimal reducts to overcome the overlapping-cluster problem. The rough modified symmetry-based clustering algorithm is compared with a newly implemented rough improved symmetry-based clustering algorithm and the existing K-Means algorithm on five benchmark cancer gene expression data sets to demonstrate its superiority in terms of validity. Statistical analyses are also performed to establish the significance of the approach.

A semi-supervised rough set and random forest approach for pattern classification of gene expression data

International Journal of Reasoning-based Intelligent Systems ◽ 10.1504/ijris.2016.082976 ◽ 2016 ◽ Vol 8 (3/4) ◽ pp. 155 ◽ Cited By ~ 1

Author(s): Pradeep Kumar Mallick ◽ Debahuti Mishra ◽ Srikanta Patnaik ◽ Kailash Shaw

Keyword(s): Gene Expression ◽ Random Forest ◽ Pattern Classification ◽ Gene Expression Data ◽ Rough Set ◽ Expression Data

Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data

PLoS ONE ◽ 10.1371/journal.pone.0167504 ◽ 2016 ◽ Vol 11 (12) ◽ pp. e0167504

Author(s): Pugalendhi Ganesh Kumar ◽ Muthu Subash Kavitha ◽ Byeong-Cheol Ahn

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Rough Set ◽ Human Gene ◽ Automated Detection ◽ Expression Data ◽ Fuzzy Rough Set ◽ Human Gene Expression

A Novel and Efficient Rough Set Based Clustering Technique for Gene Expression Data

2014 2nd International Conference on Business and Information Management (ICBIM) ◽ 10.1109/icbim.2014.6970930 ◽ 2014

Author(s): Krishnendu Adhikary ◽ Suman Das ◽ Samir Roy

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Rough Set ◽ Expression Data ◽ Clustering Technique

Rough ACO: A Hybridized Model for Feature Selection in Gene Expression Data

International Journal of Computer and Communication Technology ◽ 10.47893/ijcct.2010.1009 ◽ 2010 ◽ pp. 85-98

Author(s): Debahuti Mishra ◽ Amiya Kumar Rath ◽ Milu Acharya ◽ Tanushree Jena

Keyword(s): Gene Expression ◽ Dimensionality Reduction ◽ Set Theory ◽ Gene Expression Data ◽ Rough Set ◽ Rough Set Theory ◽ Optimality Criteria ◽ Expression Data ◽ Essential Information ◽ Novel Method

Dimensionality reduction of a feature set is a common preprocessing step in pattern recognition, classification applications, and compression schemes. Rough Set Theory is one of the popular methods used, and can be shown to be optimal under different optimality criteria. This paper proposes a novel method for dimensionality reduction that chooses a subset of the original features containing most of the essential information, using the same criteria as Ant Colony Optimization (ACO) hybridized with Rough Set Theory. We call this method Rough ACO. The proposed method is applied successfully to choose the best feature combinations and then to apply the upper and lower approximations to find the reduced set of features from gene expression data.
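
For readers unfamiliar with the upper and lower approximations mentioned in this abstract, the following is a minimal sketch of the standard rough-set definitions, not the Rough ACO method itself. The toy table, the gene names `g1` and `g2`, the class labels, and all function names are hypothetical and chosen only for illustration; real expression values would first need to be discretized.

```python
from collections import defaultdict

# Hypothetical toy decision table: discretized expression of two genes
# plus a class label. Rows 0 and 4 agree on both genes but disagree on
# the class, so the "tumor" concept is only roughly definable.
rows = [
    {"g1": "high", "g2": "low",  "cls": "tumor"},
    {"g1": "high", "g2": "high", "cls": "tumor"},
    {"g1": "low",  "g2": "low",  "cls": "normal"},
    {"g1": "low",  "g2": "high", "cls": "normal"},
    {"g1": "high", "g2": "low",  "cls": "normal"},
]

def equivalence_classes(rows, attrs):
    """Group row indices that are indiscernible on the chosen attributes."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())

def approximations(rows, attrs, target):
    """Return (lower, upper) approximations of the index set `target`."""
    lower, upper = set(), set()
    for block in equivalence_classes(rows, attrs):
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper

def dependency(rows, attrs, decision):
    """Fraction of rows in the positive region of `decision` given `attrs`."""
    positive = set()
    for decision_class in equivalence_classes(rows, [decision]):
        lower, _ = approximations(rows, attrs, decision_class)
        positive |= lower
    return len(positive) / len(rows)

tumor = {i for i, row in enumerate(rows) if row["cls"] == "tumor"}
lower, upper = approximations(rows, ["g1", "g2"], tumor)
gamma_both = dependency(rows, ["g1", "g2"], "cls")
gamma_g1 = dependency(rows, ["g1"], "cls")
```

On this toy table the lower approximation of the tumor class is `{1}` and the upper approximation is `{0, 1, 4}`. The dependency degree is 0.6 with both genes and 0.4 with `g1` alone, so dropping `g2` loses information. A feature-selection search, by ACO or any other heuristic, would typically look for the smallest attribute subset that preserves this dependency degree.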
