Biological networks integration based on dense module identification for gene prioritization from microarray data

We review several commonly used methods for the design and analysis of microarray data. To begin with, some experimental design issues are addressed. Several approaches for pre-processing the data (filtering and normalization) before the statistical analysis stage are then discussed. A common first step in this type of analysis is gene selection based on statistical testing. Two approaches, permutation and model-based methods, are explained, and we emphasize the need to correct for multiple testing. Moreover, powerful approaches based on gene sets are mentioned. Clustering of either genes or samples is frequently performed when analyzing microarray data. We summarize the basics of both unsupervised clustering and supervised clustering (classification). The latter may be of use for creating diagnostic arrays, for example. Construction of biological networks, such as pathways, is a statistically challenging and complex task that is a relatively new development and hence mentioned only briefly. We finish with some remarks on literature and software. The emphasis in this paper is on the philosophy behind several statistical issues and on a critical interpretation of microarray-related analysis methods.
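The abstract stresses correcting for multiple testing but does not name a procedure. As an illustration only, the sketch below assumes the Benjamini-Hochberg adjustment, one widely used choice; the function name `benjamini_hochberg` and the example p-values are hypothetical and not taken from the review.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used here only as a common example of a
    multiple-testing correction; the review does not prescribe it.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is
    # never larger than the one for the next-larger raw p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values from a differential expression test.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
selected = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

Genes whose adjusted value falls below the chosen threshold would be carried forward; the threshold of 0.05 above is likewise only an example.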

Module Identification of Biological Networks via Graph Partition

Recent Advances in Biological Network Analysis ◽

10.1007/978-3-030-57173-3_6 ◽

2020 ◽

pp. 125-150

Author(s):

Yijie Wang

Keyword(s):

Biological Networks ◽

Graph Partition ◽

Module Identification

Biomarker Identification for Prostate Cancer and Lymph Node Metastasis from Microarray Data and Protein Interaction Network Using Gene Prioritization Method

The Scientific World JOURNAL ◽

10.1100/2012/842727 ◽

2012 ◽

Vol 2012 ◽

pp. 1-15 ◽

Cited By ~ 10

Author(s):

Carlos Roberto Arias ◽

Hsiang-Yuan Yeh ◽

Von-Wun Soo

Keyword(s):

Prostate Cancer ◽

Protein Interaction ◽

Microarray Data ◽

Shortest Paths ◽

Gene Prioritization ◽

The Other ◽

Biomarker Identification ◽

Lymph Nodes Metastasis ◽

Prioritization Method ◽

Voting Scheme

Finding a gene related to a genetic disease is not a trivial task. Computational methods are therefore needed to give the biomedical community clues about which genes are more likely to be related to a specific disease and could serve as biomarkers. We address the biomarker identification problem with a gene prioritization method called gene prioritization from microarray data based on shortest paths, extended with structural and biological properties and edge flux using a voting scheme (GP-MIDAS-VXEF). The method finds relevant interactions in protein interaction networks, scores the genes using shortest paths and topological analysis, and integrates the results using a voting scheme and biological boosting. We ran two experiments: one comparing primary prostate tumor samples with normal samples, and the other comparing primary prostate tumors with and without lymph node metastasis. We used 137 known prostate cancer genes as the benchmark. In the first experiment, GP-MIDAS-VXEF outperformed the other state-of-the-art methods in the benchmark by retrieving the largest number of truly related genes from the candidate set among its top 50 scores. We applied the same technique to infer significant biomarkers for prostate cancer with lymph node metastasis, which is not yet well established.
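The general idea of scoring genes by shortest paths and then merging several rankings by vote can be sketched as follows. This is not GP-MIDAS-VXEF: the scoring rule (inverse distance to seed genes), the Borda-style vote, the toy network, and all function names are assumptions chosen for illustration.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closeness_to_seeds(graph, seeds):
    """Assumed score: sum of 1 / (1 + d) over all reachable seed genes."""
    scores = {gene: 0.0 for gene in graph}
    for seed in seeds:
        for gene, d in bfs_distances(graph, seed).items():
            scores[gene] += 1.0 / (1 + d)
    return scores


def borda_vote(score_tables):
    """Merge several score tables: each table awards n points to its top
    gene, n - 1 to the next, and so on; ties are broken by gene name."""
    points = {}
    for table in score_tables:
        ranked = sorted(table, key=lambda g: (-table[g], g))
        n = len(ranked)
        for position, gene in enumerate(ranked):
            points[gene] = points.get(gene, 0) + (n - position)
    return sorted(points, key=lambda g: (-points[g], g))


# Hypothetical interaction network: a simple chain A - B - C - D.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
path_scores = closeness_to_seeds(network, seeds=["A"])
degree_scores = {gene: len(nbrs) for gene, nbrs in network.items()}
ranking = borda_vote([path_scores, degree_scores])
```

Here a path-based score and a simple topological score (node degree) each cast a ranked vote; the published method combines its own, richer set of scores.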

Temporal and structural analysis of biological networks in combination with microarray data

2008 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology ◽

10.1109/cibcb.2008.4675760 ◽

2008 ◽

Author(s):

Chang Hun You ◽

Lawrence B. Holder ◽

Diane J. Cook

Keyword(s):

Structural Analysis ◽

Microarray Data ◽

Biological Networks

Gene Module Identification from Microarray Data Using Nonnegative Independent Component Analysis

Gene Regulation and Systems Biology ◽

10.1177/117762500700100023 ◽

2007 ◽

Vol 1 ◽

Cited By ~ 1

Author(s):

Ting Gong ◽

Jianhua Xuan ◽

Chen Wang ◽

Huai Li ◽

Eric Hoffman ◽

...

Keyword(s):

Independent Component Analysis ◽

Microarray Data ◽

Component Analysis ◽

Independent Component ◽

Gene Module ◽

Module Identification

Identifying communities from multiplex biological networks by randomized optimization of modularity

F1000Research ◽

10.12688/f1000research.15486.1 ◽

2018 ◽

Vol 7 ◽

pp. 1042 ◽

Cited By ~ 3

Author(s):

Gilles Didier ◽

Alberto Valdeolivas ◽

Anaïs Baudot

Keyword(s):

Biological Networks ◽

Disease Genes ◽

Module Identification ◽

GWAS Dataset ◽

Disease Module ◽

Common Operation ◽

Network Modularity ◽

Disease Community

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. Here we implemented several extensions of the MolTi software, which detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure, as well as the consideration of weighted edges and layers, generally increases the number of trait and disease communities detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.
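The quantity being optimized can be illustrated with a short sketch. It computes the Newman-Girvan modularity of a given partition for one unweighted layer and, as an assumption made here for illustration, treats multiplex modularity as a layer-weighted sum. This is not MolTi's code, and the function names and toy network are hypothetical.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition of an unweighted,
    undirected graph given as (u, v) edge pairs without self-loops."""
    m = len(edges)
    degree = {}
    intra = {}  # number of edges inside each community
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            intra[c] = intra.get(c, 0) + 1
    degree_sum = {}  # total degree of each community
    for node, d in degree.items():
        c = community[node]
        degree_sum[c] = degree_sum.get(c, 0) + d
    return sum(
        intra.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


def multiplex_modularity(layers, community, layer_weights):
    """Assumed definition: weighted sum of per-layer modularities."""
    return sum(
        w * modularity(edges, community)
        for edges, w in zip(layers, layer_weights)
    )


# Hypothetical layer: two triangles joined by a single edge.
layer = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q_single = modularity(layer, partition)
q_multi = multiplex_modularity([layer, layer], partition, [1.0, 1.0])
```

An optimizer such as Louvain searches over partitions to increase this score; the sketch only evaluates a partition that is already given.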

An unsupervised disease module identification technique in biological networks using novel quality metric based on connectivity, conductance and modularity

F1000Research ◽

10.12688/f1000research.14258.1 ◽

2018 ◽

Vol 7 ◽

pp. 378 ◽

Cited By ~ 5

Author(s):

Raghvendra Mall ◽

Ehsan Ullah ◽

Khalid Kunji ◽

Michele Ceccarelli ◽

Halima Bensmail

Keyword(s):

Biological Networks ◽

Complex Traits ◽

Genome Wide Association Study ◽

Module Identification ◽

Quality Metric ◽

Genome Wide ◽

Refinement Method ◽

Disease Module ◽

Identification Technique ◽

Evaluation Metric

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is at the core of computational methods in biomedical research. With this motivation, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method for constrained DMI in biological networks, based on merging and splitting the hierarchical tree obtained from any community detection technique. The only constraint was that the size of each community be in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics such as modularity, conductance and connectivity, to determine the quality of a graph partition at a given level of the hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks and prunes insignificant modules to obtain a curated list of candidate disease modules (DM) for a biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false discovery rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for the more stringent FDR cut-offs of 0.025 and 0.01, the proposed method identified 31 DM (rank 9) and 16 DM (rank 10), respectively. In additional analysis, our proposed approach detected a total of 44 DM in the networks, compared with 60 for the winner of the DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better than or competitive with that of the winner.
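Two ingredients mentioned above, conductance as a per-module quality metric and the [3, 100] size constraint, can be sketched in a few lines. The abstract does not give the formula that combines the metrics into the proposed F-score, so that step is deliberately left out; the function names and toy network are hypothetical.

```python
def conductance(edges, module):
    """Conductance of a node set: cut edges divided by the smaller of the
    two volumes (sum of degrees inside vs. outside the module)."""
    module = set(module)
    cut = volume_in = volume_out = 0
    for u, v in edges:
        u_in, v_in = u in module, v in module
        if u_in != v_in:
            cut += 1
        # Each edge endpoint adds one unit of degree to its side.
        volume_in += u_in + v_in
        volume_out += (not u_in) + (not v_in)
    smaller = min(volume_in, volume_out)
    return cut / smaller if smaller else 0.0


def within_size_limits(modules, smallest=3, largest=100):
    """Keep only modules that satisfy the challenge's size constraint."""
    return [m for m in modules if smallest <= len(m) <= largest]


# Hypothetical network: two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
candidates = [["a", "b"], ["a", "b", "c"], ["d", "e", "f"]]
kept = within_size_limits(candidates)
scores = {tuple(m): conductance(edges, m) for m in kept}
```

Lower conductance indicates a module that is better separated from the rest of the network.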

An Empirical Bayesian Method for Estimating Biological Networks from Temporal Microarray Data

Statistical Applications in Genetics and Molecular Biology ◽

10.2202/1544-6115.1513 ◽

2010 ◽

Vol 9 (1) ◽

Cited By ~ 27

Author(s):

Andrea Rau ◽

Florence Jaffrézic ◽

Jean-Louis Foulley ◽

Rebecca W Doerge

Keyword(s):

Microarray Data ◽

Biological Networks ◽

Bayesian Method ◽

Empirical Bayesian ◽

Empirical Bayesian Method

Recursive module extraction using Louvain and PageRank

F1000Research ◽

10.12688/f1000research.15845.1 ◽

2018 ◽

Vol 7 ◽

pp. 1286

Author(s):

Dimitri Perrin ◽

Guido Zuccon

Keyword(s):

Community Detection ◽

Biological Networks ◽

Biological Network ◽

Biological Function ◽

Recursive Method ◽

Number Of Clusters ◽

Module Identification ◽

PageRank Algorithm ◽

Disease Module

Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.
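The two building blocks described above can be sketched separately: a PageRank power iteration, and the recursive control flow that keeps splitting any module larger than k. The abstract does not say how the PageRank weights enter the Louvain step, so the sketch does not connect them, and the community detector is passed in as a hypothetical `detect` callable rather than implemented. The halving detector in the usage lines is a trivial stand-in, not Louvain.

```python
def pagerank(adjacency, damping=0.85, iterations=100):
    """Plain power-iteration PageRank on an adjacency dict
    {node: [neighbours]}; nodes without neighbours spread rank evenly."""
    nodes = list(adjacency)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1 - damping) / n for v in nodes}
        for v in nodes:
            neighbours = adjacency[v]
            if neighbours:
                share = damping * rank[v] / len(neighbours)
                for w in neighbours:
                    new_rank[w] += share
            else:
                for w in nodes:
                    new_rank[w] += damping * rank[v] / n
        rank = new_rank
    return rank


def recursive_modules(nodes, detect, k):
    """Split nodes with detect() until every module has at most k nodes."""
    if len(nodes) <= k:
        return [nodes]
    parts = detect(nodes)
    if len(parts) <= 1:
        # The detector cannot split further; stop to avoid endless recursion.
        return [nodes]
    modules = []
    for part in parts:
        modules.extend(recursive_modules(part, detect, k))
    return modules


def halve(nodes):
    """Stand-in detector (assumption): cut the node list in two."""
    middle = len(nodes) // 2
    return [nodes[:middle], nodes[middle:]]


# Hypothetical star network: hub "c" linked to three leaves.
star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
weights = pagerank(star)
modules = recursive_modules(list(range(10)), halve, k=3)
```

In the published method the detector would be Louvain with the Newman-Girvan modularity criterion, re-applied together with PageRank to each oversized module.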

An ensemble approach to microarray data-based gene prioritization after missing value imputation

Bioinformatics ◽

10.1093/bioinformatics/btm010 ◽

2007 ◽

Vol 23 (6) ◽

pp. 747-754 ◽

Cited By ~ 4

Author(s):

D. Hua ◽

Y. Lai

Keyword(s):

Microarray Data ◽

Gene Prioritization ◽

Missing Value ◽

Missing Value Imputation ◽

Ensemble Approach
