Novel gene sets improve set-level classification of prokaryotic gene expression data

Background: Gene set enrichment analysis (GSEA) provides a powerful way to identify differentially expressed gene sets using prior biological knowledge. Several GSEA algorithms have been proposed for enrichment analysis on groups of genes. However, most of these algorithms focus only on identifying gene sets that are differentially expressed in a given phenotype.

Objective: We propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify gene sets enriched for differential expression and highly correlated pathways.

Methods: We apply co-inertia analysis (CIA) to pairwise comparisons of gene sets in gene expression data to measure the co-structure of their expression profiles. CIA is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. Its objective is to find ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. In simulation studies, CIA identified co-relationships between gene sets better than correlation-based gene set methods in all simulation settings.

Results and Conclusion: We combine between-gene-set CIA with GSEA to discover relationships between gene sets that are significantly associated with phenotypes. We also provide a graphical technique for visualizing and exploring associations between and within gene sets, together with their interactions and network. We demonstrate this integration of within- and between-gene-set variation on the p53 gene expression data using the c2 curated gene sets. Ultimately, GSCoA provides an attractive tool for identifying and visualizing novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
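The co-inertia step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes uniform sample weights and identity metrics, in which case the co-inertia axes are the singular vectors of the cross-covariance matrix of the two gene sets. The function name `co_inertia` and its return values are hypothetical and are not taken from the GSCoA implementation.

```python
import numpy as np

def co_inertia(X, Y):
    """Co-inertia analysis of two gene sets measured on the same samples.

    X, Y: arrays of shape (n_samples, n_genes_x) and (n_samples, n_genes_y).
    Assumes uniform sample weights and identity metrics (an illustrative
    simplification, not necessarily the setting used by GSCoA).
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must contain the same samples (rows)")
    n = X.shape[0]

    # Centre each gene so that products of columns are covariances.
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross-covariance between the genes of the two sets.
    C = Xc.T @ Yc / n

    # The SVD yields paired axes (U[:, k], V[:, k]); the covariance between
    # the sample scores Xc @ U[:, k] and Yc @ V[:, k] equals s[k], so the
    # squared covariance on successive axes is maximised.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)

    # RV coefficient: total co-inertia scaled to [0, 1], a global measure
    # of co-structure between the two gene sets.
    total_co_inertia = np.sum(C ** 2)
    inertia_x = np.sum((Xc.T @ Xc / n) ** 2)
    inertia_y = np.sum((Yc.T @ Yc / n) ** 2)
    rv = total_co_inertia / np.sqrt(inertia_x * inertia_y)

    return U, s, Vt.T, rv
```

Under these assumptions, `s ** 2` gives the squared covariance captured by each pair of axes, and an RV coefficient near 1 suggests that the two gene sets share most of their structure across samples.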

A class imbalance-aware Relief algorithm for the classification of tumors using microarray gene expression data

Computational Biology and Chemistry ◽ 10.1016/j.compbiolchem.2019.03.017 ◽ 2019 ◽ Vol 80 ◽ pp. 121-127 ◽ Cited By ~ 3

Author(s): Yuanyu He ◽ Junhai Zhou ◽ Yaping Lin ◽ Tuanfei Zhu

Keyword(s): Gene Expression ◽ Gene Expression Data ◽ Class Imbalance ◽ Microarray Gene Expression Data ◽ Expression Data ◽ Microarray Gene Expression ◽ Relief Algorithm ◽ Classification Of Tumors ◽ Microarray Gene

Improving the Performance of Principal Components for Classification of Gene Expression Data Through Feature Selection

Studies in Classification, Data Analysis, and Knowledge Organization - Data Science and Classification ◽ 10.1007/3-540-34416-0_35 ◽ 2006 ◽ pp. 325-332

Author(s): Edgar Acuña ◽ Jaime Porras

Keyword(s): Gene Expression ◽ Feature Selection ◽ Gene Expression Data ◽ Principal Components ◽ Expression Data

Classification of micro-array gene expression data using neural networks

The 2010 International Joint Conference on Neural Networks (IJCNN) ◽ 10.1109/ijcnn.2010.5596568 ◽ 2010

Author(s): David Tian ◽ Keith Burley

Keyword(s): Gene Expression ◽ Neural Networks ◽ Gene Expression Data ◽ Expression Data ◽ Micro Array

Classification of Microarray Gene Expression Data by MultiBlock Dimension Reduction

Communications for Statistical Applications and Methods ◽ 10.5351/ckss.2006.13.3.567 ◽ 2006 ◽ Vol 13 (3) ◽ pp. 567-576

Author(s): Mi-Ra Oh ◽ Seo-Young Kim ◽ Kyung-Sook Kim ◽ Jang-Sun Baek ◽ Young-Sook Son

Keyword(s): Gene Expression ◽ Dimension Reduction ◽ Gene Expression Data ◽ Microarray Gene Expression Data ◽ Expression Data ◽ Microarray Gene Expression ◽ Microarray Gene

INCORPORATING FEATURE RANKING AND EVOLUTIONARY METHODS FOR THE CLASSIFICATION OF HIGH-DIMENSIONAL DNA MICROARRAY GENE EXPRESSION DATA

Australasian Medical Journal ◽ 10.21767/amj.2013.1641 ◽ 2013 ◽ Vol 06 (05)

Author(s): Mani Abedini ◽ Michael Kirley ◽ Raymond Chiong

Keyword(s): Gene Expression ◽ Dna Microarray ◽ Gene Expression Data ◽ Microarray Gene Expression Data ◽ High Dimensional ◽ Feature Ranking ◽ Expression Data ◽ Microarray Gene Expression ◽ Microarray Gene

COMBINING GENERALIZED NMF AND DISCRIMINATIVE MIXTURE MODELS FOR CLASSIFICATION OF GENE EXPRESSION DATA

International Journal of Pattern Recognition and Artificial Intelligence ◽ 10.1142/s0218001408006892 ◽ 2008 ◽ Vol 22 (08) ◽ pp. 1587-1598 ◽ Cited By ~ 3

Author(s): WEIXIANG LIU ◽ KEHONG YUAN ◽ JIAN WU ◽ DATIAN YE ◽ ZHEN JI ◽ ...

Keyword(s): Gene Expression ◽ Mixture Model ◽ Gene Expression Data ◽ Small Sample Size ◽ Data Classification ◽ Small Sample ◽ Training Data ◽ Microarray Data Analysis ◽ Expression Data

Classification of gene expression samples is a core task in microarray data analysis. Two key issues in gene expression data classification are how to reduce thousands of genes to a manageable number of features and how to select a suitable classifier. This paper introduces a framework that combines feature extraction and classification. To account for the non-negativity, high dimensionality and small sample size of the data, we apply a discriminative mixture model designed for non-negative gene expression data, with non-negative matrix factorization (NMF) used for dimension reduction. To enhance the sparseness of the training data and speed up learning of the mixture model, a generalized NMF is also adopted. Experimental results on several real gene expression datasets show that the generalized method significantly improves classification accuracy, stability and decision quality, and that the proposed method performs better than some previously reported results on the same datasets.
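The idea of reducing non-negative expression data with NMF before classification can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it uses standard NMF with Lee-Seung multiplicative updates rather than the generalized NMF, and a plain nearest-centroid rule stands in for the discriminative mixture model. The function names `nmf`, `encode` and `nearest_centroid_predict` are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (genes x samples) as W @ H.

    Uses standard multiplicative updates for the Frobenius objective
    (an assumption; the abstract's generalized NMF is not reproduced).
    """
    V = np.asarray(V, dtype=float)
    if V.min() < 0:
        raise ValueError("NMF requires a non-negative matrix")
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps   # genes x rank basis ("metagenes")
    H = rng.random((rank, V.shape[1])) + eps   # rank x samples encodings
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def encode(V_new, W, n_iter=200, seed=0, eps=1e-9):
    """Project samples onto a fixed basis W by updating H only."""
    V_new = np.asarray(V_new, dtype=float)
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_new.shape[1])) + eps
    WtV, WtW = W.T @ V_new, W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + eps)
    return H

def nearest_centroid_predict(H_train, y_train, H_test):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    y_train = np.asarray(y_train)
    classes = np.unique(y_train)
    # One centroid per class in the reduced space, shape (n_classes, rank).
    centroids = np.stack([H_train[:, y_train == c].mean(axis=1) for c in classes])
    # Euclidean distances, shape (n_test, n_classes).
    dists = np.linalg.norm(H_test.T[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]
```

In this sketch the basis `W` would be learned on training samples only, and test samples would be passed through `encode` with that fixed `W` before classification, so that no test information leaks into the dimension reduction.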
