DiME: A Scalable Disease Module Identification Algorithm with Application to Glioma Progression

Yunpeng Liu; Daniel A. Tennant; Zexuan Zhu; John K. Heath; Xin Yao; Shan He

doi:10.1371/journal.pone.0086693

Identifying communities from multiplex biological networks by randomized optimization of modularity

F1000Research, 2018, Vol 7, pp. 1042. Cited by ~3.

doi:10.12688/f1000research.15486.1

Author(s): Gilles Didier; Alberto Valdeolivas; Anaïs Baudot

Keyword(s): Biological Networks; Disease Genes; Module Identification; Gwas Dataset; Disease Module; Common Operation; Network Modularity; Disease Community

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context by testing the association of communities with GWAS-derived common trait and disease genes. Here we implemented several extensions of the MolTi software, which detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure, as well as the consideration of weighted edges and layers, generally increases the number of trait and disease communities detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.
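
As a rough illustration of the quantity being optimized, the sketch below computes the Newman-Girvan modularity of a weighted, undirected network for a given assignment of nodes to communities. The function names and the toy network are hypothetical, and the multiplex combination (a layer-weighted sum of per-layer modularities) is an assumption made for illustration; it is not taken from the MolTi code and may differ from MolTi's exact definition.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a weighted undirected graph.

    edges: iterable of (u, v, weight), each undirected edge listed once.
    community: dict mapping node -> community label.
    """
    total = 0.0                    # total edge weight W
    inside = defaultdict(float)    # weight of edges inside each community
    degree = defaultdict(float)    # summed weighted degree of each community
    for u, v, w in edges:
        total += w
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            inside[community[u]] += w
    # Q = sum over communities of: inside/W - (degree / 2W)^2
    return sum(inside[c] / total - (degree[c] / (2 * total)) ** 2 for c in degree)


def multiplex_modularity(layers, layer_weights, community):
    """ASSUMED combination: layer-weighted sum of per-layer modularities."""
    return sum(wt * modularity(edges, community)
               for edges, wt in zip(layers, layer_weights))


# Toy network (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
             (2, 3, 1.0)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, two_groups))  # 5/14, about 0.357
```

Putting all six nodes into one community gives a modularity of 0, so the two-triangle split scores higher, which is the behaviour a modularity optimizer such as Louvain exploits.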

An unsupervised disease module identification technique in biological networks using novel quality metric based on connectivity, conductance and modularity

F1000Research, 2018, Vol 7, pp. 378. Cited by ~5.

doi:10.12688/f1000research.14258.1

Author(s): Raghvendra Mall; Ehsan Ullah; Khalid Kunji; Michele Ceccarelli; Halima Bensmail

Keyword(s): Biological Networks; Complex Traits; Genome Wide Association Study; Module Identification; Quality Metric; Genome Wide; Refinement Method; Disease Module; Identification Technique; Evaluation Metric

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is at the core of computational methods in biomedical research. In this context, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method, based on merging and splitting the hierarchical tree obtained from any community detection technique, for constrained DMI in biological networks. The only constraint was that the size of each community lies in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics such as modularity, conductance and connectivity, to determine the quality of a graph partition at a given level of the hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks and prunes insignificant modules to obtain a curated list of candidate disease modules (DM) for a biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false detection rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for the more stringent FDR cut-offs of 0.025 and 0.01, the proposed method identified 31 (rank 9) and 16 DM (rank 10), respectively. In additional analysis, our proposed approach detected a total of 44 DM in the networks, compared with 60 for the winner of the DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better than or competitive with that of the winner.
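
Two of the ingredients named above are easy to make concrete: the conductance of a candidate module and the size constraint [3, 100]. The sketch below is illustrative only. It uses the standard definition of conductance (cut edges divided by the smaller of the two volumes), the helper names and toy graph are hypothetical, and it does not reproduce the paper's F-score or Inverse Confidence measures, whose formulas are not given in the abstract.

```python
def conductance(adj, module):
    """Conductance of `module` in an unweighted undirected graph.

    adj: dict mapping node -> set of neighbours (symmetric).
    Returns cut(S) / min(vol(S), vol(V \\ S)), or 0.0 if either side has no edges.
    """
    module = set(module)
    # Edges with exactly one endpoint inside the module.
    cut = sum(1 for u in module for v in adj[u] if v not in module)
    vol_in = sum(len(adj[u]) for u in module)
    vol_out = sum(len(nbrs) for nbrs in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def filter_by_size(modules, lo=3, hi=100):
    """Keep only modules whose size lies in the closed range [lo, hi]."""
    return [m for m in modules if lo <= len(m) <= hi]


# Toy graph (hypothetical): two triangles joined by the edge 2-3.
toy_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
           3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(conductance(toy_adj, {0, 1, 2}))           # 1/7: one cut edge, volume 7
print(filter_by_size([{0, 1}, {0, 1, 2}]))       # only the 3-node module survives
```

Lower conductance indicates a module that is better separated from the rest of the network, which is why it is a natural component of an unsupervised quality score.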

Disease Module Identification challenge

F1000Research Channels, 2019.

doi:10.12688/f1000research.channels.307

Keyword(s): Module Identification; Disease Module

Disease Module Identification Based on Representation Learning of Complex Networks Integrated From GWAS, eQTL Summaries, and Human Interactome

Frontiers in Bioengineering and Biotechnology, 2020, Vol 8.

doi:10.3389/fbioe.2020.00418

Author(s): Tao Wang; Qidi Peng; Bo Liu; Yongzhuang Liu; Yadong Wang

Keyword(s): Complex Networks; Representation Learning; Human Interactome; Module Identification; Disease Module

Recursive module extraction using Louvain and PageRank

F1000Research, 2018, Vol 7, pp. 1286.

doi:10.12688/f1000research.15845.1

Author(s): Dimitri Perrin; Guido Zuccon

Keyword(s): Community Detection; Biological Networks; Biological Network; Biological Function; Recursive Method; Number Of Clusters; Module Identification; Pagerank Algorithm; Disease Module

Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.
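
The recursive control flow described above (keep splitting any module with more than k nodes) can be sketched independently of the clustering step. In the sketch below, `detect` is a pluggable function standing in for the PageRank-plus-Louvain step; the placeholder `split_in_half` is NOT Louvain and exists only so the example runs on its own. The function names and the no-progress guard are assumptions added for illustration, not details from the article.

```python
def recursive_modules(nodes, detect, k=100):
    """Split `nodes` with `detect` until every module has at most k nodes.

    detect: callable taking a set of nodes and returning an iterable of
            node collections (the sub-modules found inside that set).
    """
    nodes = set(nodes)
    if len(nodes) <= k:
        return [nodes]
    parts = [set(p) for p in detect(nodes) if p]
    if len(parts) <= 1:
        # ASSUMED guard: the detector could not split this module further,
        # so stop here rather than recurse forever.
        return [nodes]
    result = []
    for part in parts:
        result.extend(recursive_modules(part, detect, k))
    return result


def split_in_half(nodes):
    """Placeholder detector (NOT Louvain): halves the sorted node list."""
    ordered = sorted(nodes)
    mid = len(ordered) // 2
    return [ordered[:mid], ordered[mid:]]


modules = recursive_modules(range(10), split_in_half, k=3)
print(modules)  # four modules, each with at most 3 nodes
```

A real detector would take the network's edges into account; passing it in as a function keeps the recursion logic separate from the choice of community detection method.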

A gene module identification algorithm and its applications to identify gene modules and key genes of hepatocellular carcinoma

Scientific Reports, 2021, Vol 11 (1).

doi:10.1038/s41598-021-84837-y

Author(s): Yan Zhang; Zhengkui Lin; Xiaofeng Lin; Xue Zhang; Qian Zhao; ...

Keyword(s): Gene Expression; Hepatocellular Carcinoma; Prognostic Significance; Identification Algorithm; Gene Module; Number Of Clusters; Module Identification; Gene Modules; Algorithm Framework; Key Genes

To further improve gene module identification, a new method, the GCNA-Kpca algorithm, was proposed by combining the Newman algorithm for community detection with the K-means algorithm framework. The core idea of the algorithm was first to build a gene co-expression network (GCN) from gene expression data. The Newman algorithm was then used to identify initial gene modules based on the topology of the GCN, which determined the number of clusters and the clustering centers. Finally, the number of clusters and the clustering centers were input into the K-means algorithm framework, and a secondary clustering was performed on the gene expression profiles to obtain the final gene modules. The algorithm took into account the role of modularity in the clustering process and could find the optimal membership module for each gene through multiple iterations. Experimental results showed that the proposed algorithm had the best performance in error rate, biological significance and CNN classification indicators (Precision, Recall and F-score). The gene modules obtained by GCNA-Kpca were used for the task of key gene identification, and these key genes had the highest prognostic significance. Moreover, the GCNA-Kpca algorithm was used to identify 10 key genes in hepatocellular carcinoma (HCC): CDC20, CCNB1, EIF4A3, H2AFX, NOP56, RFC4, NOP58, AURKA, PCNA, and FEN1. According to the validation, it was reasonable to speculate that these 10 key genes could be biomarkers for HCC. NOP56 and NOP58 are key genes for HCC that we discovered for the first time.
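
The final step described above, running K-means from cluster centers supplied by an earlier stage rather than from random seeds, can be sketched as follows. This is a plain Lloyd-style K-means with squared Euclidean distance; the function name, the toy expression profiles, and the distance choice are assumptions for illustration and are not taken from the GCNA-Kpca implementation.

```python
def kmeans_from_seeds(profiles, centers, max_iter=100):
    """K-means started from given centers (e.g. means of initial modules).

    profiles: list of equal-length numeric vectors (one per gene).
    centers:  list of initial center vectors; their count fixes k.
    Returns (labels, centers) after convergence or max_iter rounds.
    """
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in profiles
        ]
        if new_labels == labels:   # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:            # leave an empty cluster's center unchanged
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy expression profiles (hypothetical) and seed centers from a prior step.
toy_profiles = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, centers = kmeans_from_seeds(toy_profiles, [[0.0, 0.0], [10.0, 10.0]])
print(labels)   # [0, 0, 1, 1]
```

Seeding K-means this way removes the need to choose k by hand and makes the result deterministic for a given set of initial modules.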

Parent-child signals identify candidate cancer driver genes

F1000Research, 2021, Vol 10, pp. 67.

doi:10.12688/f1000research.22391.1

Author(s): Emilie Ann Ramsahai; Vrijesh Tripathi; Melford John

Keyword(s): Gene Interaction; Parent Child Relationship; Driver Genes; Module Identification; Cancer Driver; Gene Interaction Networks; Child Relationship; Disease Module; Relationship Of; Parent Child

Background: The DREAM Challenge evaluated methods to identify molecular pathways facilitating the detection of multiple genes affecting critical interactions and processes. Dysregulation of pathways by well-known driver genes is often found in the development and progression of cancer. We used the gene interaction networks provided and the scoring rounds to test disease module identification methods to nominate candidate driver genes in these modules. Method: Our algorithm calculated the proportion of the whole network accessible in two steps from each node in a combined network, which was defined as a 2-reach gene value. Genes with high 2-reach values were used to form the centers of star cover clusters. These clusters were assessed for significant modules. Within these modules we identified novel candidate driver genes by considering the parent-child relationships of well-known driver genes. Disturbance to such driver genes or their upstream parents can lead to disruption of highly regulated signals affecting the normal functions of cells. We explored these parents as a potential source of candidate driver genes. Results: An initial list of 57 candidate driver genes was identified from 13 significant modules. Analysis of the parent-child relationships of well-known driver genes in these modules prioritized PRKDC, YWHAB, GSK3B, and PPP1CB. Conclusion: Our method incorporated the simple m-reach topology metric in disease module identification and its relationship with known driver genes to identify candidate genes. The four genes shortlisted have been highlighted in recent publications in the literature, which supports the need for further wet-lab experimental investigation.
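
The 2-reach value defined in the Method section can be computed directly from an adjacency list. The sketch below is one plausible reading: it counts the distinct nodes reachable in at most two steps and divides by the number of other nodes in the network. Excluding the node itself from both the count and the denominator is an assumption, as are the function name and the toy network.

```python
def two_reach(adj):
    """Proportion of the network reachable within two steps of each node.

    adj: dict mapping node -> set of neighbours.
    ASSUMPTION: the node itself is excluded from numerator and denominator.
    """
    n = len(adj)
    values = {}
    for node, nbrs in adj.items():
        reached = set(nbrs)               # one step away
        for nb in nbrs:
            reached.update(adj[nb])       # two steps away
        reached.discard(node)
        values[node] = len(reached) / (n - 1) if n > 1 else 0.0
    return values


# Toy network (hypothetical): a path 0 - 1 - 2 - 3 - 4.
toy_path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(two_reach(toy_path))
# The middle node 2 reaches every other node (1.0); the end nodes reach half (0.5).
```

Nodes with the highest values would then serve as candidate centers for the star cover clusters mentioned in the abstract.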

Adapting Community Detection Algorithms for Disease Module Identification in Heterogeneous Biological Networks

Frontiers in Genetics, 2019, Vol 10. Cited by ~12.

doi:10.3389/fgene.2019.00164

Author(s): Beethika Tripathi; Srinivasan Parthasarathy; Himanshu Sinha; Karthik Raman; Balaraman Ravindran

Keyword(s): Community Detection; Biological Networks; Module Identification; Detection Algorithms; Disease Module

Cooperative Co-evolutionary Module Identification with Application to Cancer Disease Module Discovery

IEEE Transactions on Evolutionary Computation, 2016, pp. 1-1. Cited by ~11.

doi:10.1109/tevc.2016.2530311

Author(s): Shan He; Zexuan Zhu; Guanbo Jia; Daniel Tennant; Qiang Huang; ...

Keyword(s): Cancer Disease; Module Identification; Module Discovery; Disease Module

Network Module Detection using Recursive Local Graph Sparsification and Clustering

2018.

doi:10.20944/preprints201808.0421.v1

Author(s): Michael Banf

Keyword(s): Biological Networks; Network Module; Identification Methods; Structure Preserving; Module Identification; Wide Range; Module Detection; Disease Module; Graph Sparsification; Local Graph

Here we present a fast and highly scalable network module detection method that preserves community structure by recursively integrating graph sparsification and clustering. Our algorithm, called SparseClust, participated in the most recent DREAM community challenge on disease module identification, an open competition to comprehensively assess module identification methods across a wide range of biological networks.
