Adapting Community Detection Algorithms for Disease Module Identification in Heterogeneous Biological Networks

2019 ◽  
Vol 10 ◽  
Author(s):  
Beethika Tripathi ◽  
Srinivasan Parthasarathy ◽  
Himanshu Sinha ◽  
Karthik Raman ◽  
Balaraman Ravindran
F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1286
Author(s):  
Dimitri Perrin ◽  
Guido Zuccon

Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.
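The recursive size cap described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the PageRank weighting step is omitted because the abstract does not say how node weights enter the Louvain step, and the detector is a pluggable callable. The names `recursive_modules`, `induced_subgraph` and `halve_by_name` are hypothetical, and `halve_by_name` is a deliberately trivial stand-in for Louvain that only exercises the recursion.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]                      # undirected graph as adjacency sets
Detector = Callable[[Graph], List[Set[str]]]     # any community detection routine


def induced_subgraph(graph: Graph, nodes: Set[str]) -> Graph:
    """Keep only the given nodes and the edges between them."""
    return {u: graph[u] & nodes for u in nodes}


def recursive_modules(graph: Graph, detect: Detector, k: int) -> List[Set[str]]:
    """Run `detect` once, then re-run it on every module with more than k nodes."""
    final: List[Set[str]] = []
    pending = [p for p in detect(graph) if p]
    while pending:
        nodes = pending.pop()
        if len(nodes) <= k:
            final.append(nodes)
            continue
        parts = [p for p in detect(induced_subgraph(graph, nodes)) if p]
        if len(parts) <= 1:
            # The detector cannot split this module further; keep it as is
            # so that the loop is guaranteed to terminate.
            final.append(nodes)
            continue
        pending.extend(parts)
    return final


def halve_by_name(graph: Graph) -> List[Set[str]]:
    """Placeholder detector (NOT Louvain): split the sorted node names in half."""
    ordered = sorted(graph)
    mid = len(ordered) // 2
    return [set(ordered[:mid]), set(ordered[mid:])]


# Toy input: a path of ten nodes n0 - n1 - ... - n9.
path: Graph = {f"n{i}": set() for i in range(10)}
for i in range(9):
    path[f"n{i}"].add(f"n{i + 1}")
    path[f"n{i + 1}"].add(f"n{i}")

modules = recursive_modules(path, halve_by_name, k=3)
print(sorted(sorted(m) for m in modules))
```

In practice `detect` would wrap a real modularity-based routine; the guard for an unsplittable module matters because a detector may return a module larger than k as a single community.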


Data ◽  
2019 ◽  
Vol 4 (4) ◽  
pp. 149
Author(s):  
Amulyashree Sridhar ◽  
Sharvani GS ◽  
AH Manjunatha Reddy ◽  
Biplab Bhattacharjee ◽  
Kalyan Nagaraj

Exploring gene networks is crucial for identifying significant biological interactions occurring in a disease condition. These interactions can be recognized by modeling the tie structure of networks. Such tie orientations are often detected within embedded community structures. However, most of the prevailing community detection methods are intended to capture information from nodes and their attributes, usually ignoring the ties. In this study, a modularity maximization algorithm is proposed based on a nonlinear representation of local tangent space alignment (LTSA). Initially, the tangent coordinates are computed locally to identify k-nearest neighbors across the genes. These local neighbors are further optimized by generating a nonlinear network embedding function for detecting gene communities based on eigenvector decomposition. Experimental results suggest that this algorithm detects gene modules with a better modularity index, 0.9256, than other traditional community detection algorithms. Furthermore, co-expressed genes across these communities are identified by discovering the characteristic tie structures. These detected ties are known to have substantial biological influence in the progression of schizophrenia, thereby signifying the influence of tie patterns in biological networks. This technique can be extended to other disease networks for detecting substantial gene “hotspots”.


2009 ◽  
Vol 23 (17) ◽  
pp. 2089-2106 ◽  
Author(s):  
ZHONGMIN XIONG ◽  
WEI WANG

Many networks, including social and biological networks, are naturally divided into communities. Community detection is an important task when discovering the underlying structure in networks. The GN algorithm, which is based on the betweenness scores of edges, is one of the most influential detection algorithms, but it is computationally costly because all betweenness scores must be recomputed every time an edge is removed. This paper presents an algorithm that is also based on betweenness scores but can remove more than one edge each time the betweenness scores are computed. The method is motivated by the following observation: many components split off from a network are independent of each other, both in the recalculation of their betweenness scores and in their further division into smaller components. Theoretical analysis and experiments with several real data sets, which have served as test beds in many related works, show that the method is fast and effective. Moreover, a version of this method with minor adjustments allows the communities surrounding a given node to be discovered without computing the full community structure of a graph.
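For reference, the baseline GN (Girvan-Newman) loop that this abstract calls costly can be sketched as below: recompute every edge betweenness score, remove the single highest-scoring edge, and repeat until the graph splits. This is a minimal sketch of the standard algorithm for unweighted, undirected graphs without self-loops, not of the multi-edge variant proposed in the paper; the function names are illustrative.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style shortest-path betweenness for every edge.

    Each unordered pair is counted from both endpoints, so scores are twice
    the usual value; the ranking of edges is unaffected.
    """
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:                      # BFS that counts shortest paths
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for v in reversed(order):         # accumulate dependencies back to s
            for u in preds[v]:
                share = sigma[u] / sigma[v] * (1.0 + delta[v])
                score[frozenset((u, v))] += share
                delta[u] += share
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_split(adj):
    """Remove highest-betweenness edges one at a time until the graph splits."""
    adj = {u: set(vs) for u, vs in adj.items()}   # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)             # full recomputation each round
        if not score:
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy input: two triangles joined by the bridge c-d.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
       "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"}}
split = girvan_newman_split(toy)
print(sorted(sorted(c) for c in split))
```

The full recomputation inside the `while` loop is the cost the paper aims to reduce by removing several edges per betweenness pass.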


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 1042 ◽  
Author(s):  
Gilles Didier ◽  
Alberto Valdeolivas ◽  
Anaïs Baudot

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. We implemented here several extensions of the MolTi software that detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure, as well as the consideration of weighted edges and layers, generally increases the number of trait and disease communities detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.
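For a single-layer (monoplex), unweighted network, the quantity usually meant by modularity is the Newman-Girvan score Q = Σ_c [L_c/m − (d_c/2m)²], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The sketch below computes it in plain Python; it assumes that definition and does not cover the multiplex or weighted forms that MolTi optimizes. The function name `modularity` and the toy graph are illustrative.

```python
def modularity(adj, communities):
    """Newman-Girvan modularity of a partition of an unweighted, undirected graph.

    `adj` maps each node to the set of its neighbours; `communities` is a
    list of disjoint node sets covering the graph.
    """
    two_m = sum(len(neigh) for neigh in adj.values())     # 2 * number of edges
    q = 0.0
    for comm in communities:
        # Each internal edge is seen from both endpoints, giving 2 * L_c.
        internal = sum(1 for u in comm for v in adj[u] if v in comm)
        degree = sum(len(adj[u]) for u in comm)           # d_c
        q += internal / two_m - (degree / two_m) ** 2
    return q


# Toy input: two triangles joined by the bridge c-d.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
       "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"}}

q_split = modularity(toy, [{"a", "b", "c"}, {"d", "e", "f"}])
q_whole = modularity(toy, [set(toy)])
print(round(q_split, 4), round(q_whole, 4))
```

Placing every node in one community always gives Q = 0, which is why a positive score indicates more internal edges than expected at random.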


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 378 ◽  
Author(s):  
Raghvendra Mall ◽  
Ehsan Ullah ◽  
Khalid Kunji ◽  
Michele Ceccarelli ◽  
Halima Bensmail

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is at the core of computational methods in biomedical research. In this context, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method based on ideas of merging and splitting the hierarchical tree obtained from any community detection technique for constrained DMI in biological networks. The only constraint is that the size of a community lies in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics such as modularity, conductance and connectivity, to determine the quality of a graph partition at a given level of the hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks and prunes insignificant modules to obtain a curated list of candidate disease modules (DM) for a biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false discovery rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for the more stringent FDR cut-offs of 0.025 and 0.01, the proposed method identified 31 (rank 9) and 16 DM (rank 10), respectively. In additional analysis, our proposed approach detected a total of 44 DM in the networks, compared with 60 for the winner of the DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better than or competitive with that of the winner.
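Two ingredients named in this abstract are easy to illustrate: the size constraint [3, 100] and conductance, one of the quality metrics said to feed the F-score. The sketch below uses the standard definition of conductance (cut edges divided by the smaller of the two volumes) for an unweighted, undirected graph. How the authors combine the metrics into their F-score is not stated in the abstract and is not attempted here; the function names and the zero-volume convention are assumptions of this sketch.

```python
def conductance(adj, module):
    """Cut size divided by the smaller volume of the module and its complement."""
    cut = sum(1 for u in module for v in adj[u] if v not in module)
    vol_in = sum(len(adj[u]) for u in module)
    vol_out = sum(len(adj[u]) for u in adj if u not in module)
    smaller = min(vol_in, vol_out)
    # Convention chosen for this sketch: an empty side yields 0.0.
    return cut / smaller if smaller else 0.0


def within_size_limits(modules, low=3, high=100):
    """Keep only modules whose size lies in the closed range [low, high]."""
    return [m for m in modules if low <= len(m) <= high]


# Toy input: two triangles joined by the bridge c-d.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
       "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"}}

candidates = [{"a", "b", "c"}, {"d", "e", "f"}, {"c", "d"}]
kept = within_size_limits(candidates)
values = [conductance(toy, m) for m in kept]
print(kept, values)
```

Lower conductance means a module is better separated from the rest of the network, so it is typically minimized where modularity is maximized.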


Author(s):  
S Rao Chintalapudi ◽  
H. M. Krishna Prasad M

Social network analysis is an emerging research area. Using graph theory concepts, its methods can be applied across many sectors, for example to transportation networks, collaboration networks, and biological networks. The most important property of social networks is the community: a collection of nodes with dense connections inside and sparse connections to the outside. Community detection is similar to cluster analysis and has many real-world applications, such as recommendation systems and target marketing. Community detection algorithms are broadly classified into two categories: disjoint community detection algorithms and overlapping community detection algorithms. This chapter reviews overlapping community detection algorithms with their strengths and limitations. To evaluate these algorithms, a popular synthetic network generator, the LFR benchmark generator, and new extended quality measures are discussed in detail.


Author(s):  
Michael Banf

Here we present a fast and highly scalable, community-structure-preserving network module detection method that recursively integrates graph sparsification and clustering. Our algorithm, called SparseClust, participated in the most recent DREAM community challenge on disease module identification, an open competition to comprehensively assess module identification methods across a wide range of biological networks.

