scholarly journals DiME: A Scalable Disease Module Identification Algorithm with Application to Glioma Progression

PLoS ONE ◽  
2014 ◽  
Vol 9 (2) ◽  
pp. e86693 ◽  
Author(s):  
Yunpeng Liu ◽  
Daniel A. Tennant ◽  
Zexuan Zhu ◽  
John K. Heath ◽  
Xin Yao ◽  
...  
F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1042 ◽  
Author(s):  
Gilles Didier ◽  
Alberto Valdeolivas ◽  
Anaïs Baudot

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. We implemented here several extensions of the MolTi software that detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure, as well as the consideration of weighted edges and layers generally increases the number of trait and disease community detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 378 ◽  
Author(s):  
Raghvendra Mall ◽  
Ehsan Ullah ◽  
Khalid Kunji ◽  
Michele Ceccarelli ◽  
Halima Bensmail

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is the core of computational methods in biomedical research. With this pretext, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method based on ideas of merging and splitting the hierarchical tree obtained from any community detection technique for constrained DMI in biological networks. The only constraint was that size of community is in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics like modularity, conductance and connectivity to determine the quality of a graph partition at given level of hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks and prune insignificant modules to obtain a curated list of candidate disease modules (DM) for biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false detection rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for stringent FDR cut-offs 0.025 and 0.01, the proposed method identified 31 (rank 9) and 16 DMIs (rank 10) respectively. From additional analysis, our proposed approach detected a total of 44 DM in the networks in comparison to 60 for the winner of DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better or competitive with the winner.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1286
Author(s):  
Dimitri Perrin ◽  
Guido Zuccon

Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yan Zhang ◽  
Zhengkui Lin ◽  
Xiaofeng Lin ◽  
Xue Zhang ◽  
Qian Zhao ◽  
...  

AbstractTo further improve the effect of gene modules identification, combining the Newman algorithm in community detection and K-means algorithm framework, a new method of gene module identification, GCNA-Kpca algorithm, was proposed. The core idea of the algorithm was to build a gene co-expression network (GCN) based on gene expression data firstly; Then the Newman algorithm was used to initially identify gene modules based on the topology of GCN, and the number of clusters and clustering centers were determined; Finally the number of clusters and clustering centers were input into the K-means algorithm framework, and the secondary clustering was performed based on the gene expression profile to obtain the final gene modules. The algorithm took into account the role of modularity in the clustering process, and could find the optimal membership module for each gene through multiple iterations. Experimental results showed that the algorithm proposed in this paper had the best performance in error rate, biological significance and CNN classification indicators (Precision, Recall and F-score). The gene module obtained by GCNA-Kpca was used for the task of key gene identification, and these key genes had the highest prognostic significance. Moreover, GCNA-Kpca algorithm was used to identify 10 key genes in hepatocellular carcinoma (HCC): CDC20, CCNB1, EIF4A3, H2AFX, NOP56, RFC4, NOP58, AURKA, PCNA, and FEN1. According to the validation, it was reasonable to speculate that these 10 key genes could be biomarkers for HCC. And NOP56 and NOP58 are key genes for HCC that we discovered for the first time.


2019 ◽  
Vol 10 ◽  
Author(s):  
Beethika Tripathi ◽  
Srinivasan Parthasarathy ◽  
Himanshu Sinha ◽  
Karthik Raman ◽  
Balaraman Ravindran

F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 67
Author(s):  
Emilie Ann Ramsahai ◽  
Vrijesh Tripathi ◽  
Melford John

Background: The DREAM Challenge evaluated methods to identify molecular pathways facilitating the detection of multiple genes affecting critical interactions and processes. Dysregulation of pathways by well-known driver genes is often found in the development and progression of cancer. We used the gene interaction networks provided and the scoring rounds to test disease module identification methods to nominate candidate driver genes in these modules. Method: Our algorithm calculated the proportion of the whole network accessible in two steps from each node in a combined network, which was defined as a 2-reach gene value. Genes with high 2-reach values were used to form the center of star cover clusters. These clusters were assessed for significant modules. Within these modules we identified novel candidate driver genes, by considering the parent-child relationship of well-known driver genes. Disturbance to such driver genes or their upstream parents, can lead to disruption of highly regulated signals affecting the normal functions of cells. We explored these parents as a potential source for candidate driver genes. Results:  An initial list of 57 candidate driver genes was identified from 13 significant modules. Analysis of the parent-child relationships of well-known driver genes in these modules prioritized PRKDC, YWHAB, GSK3B, and PPP1CB. Conclusion: Our method incorporated the simple m-reach topology metric in disease module identification and its relationship with known driver genes to identify candidate genes. The four genes shortlisted have been highlighted in recent publications in the literature, which supports the need for further wet lab experimental investigation.


Author(s):  
Michael Banf

Here we present a fast and highly scalable community structure preserving network module detection that recursively integrates graph sparsification and clustering. Our algorithm, called SparseClust, participated in the most recent DREAM community challenge on disease module identification, an open competition to comprehensively assess module identification methods across a wide range of biological networks.


Sign in / Sign up

Export Citation Format

Share Document