Adapting Community Detection Algorithms for Disease Module Identification in Heterogeneous Biological Networks

Biological networks are highly modular and contain a large number of clusters, which are often associated with a specific biological function or disease. Identifying these clusters, or modules, is therefore valuable, but it is not trivial. In this article we propose a recursive method based on the Louvain algorithm for community detection and the PageRank algorithm for authoritativeness weighting in networks. PageRank is used to initialise the weights of nodes in the biological network; the Louvain algorithm with the Newman-Girvan criterion for modularity is then applied to the network to identify modules. Any identified module with more than k nodes is further processed by recursively applying PageRank and Louvain, until no module contains more than k nodes (where k is a parameter of the method, no greater than 100). This method is evaluated on a heterogeneous set of six biological networks from the Disease Module Identification DREAM Challenge. Empirical findings suggest that the method is effective in identifying a large number of significant modules, although with substantial variability across restarts of the method.
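The recursive loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `detect` is a hypothetical callable standing in for the Louvain step, `halve_by_rank` is a toy placeholder (it is not Louvain), and the damping factor 0.85 and the iteration count are assumed defaults that the abstract does not specify.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets


def pagerank(graph: Graph, damping: float = 0.85, iters: int = 50) -> Dict[int, float]:
    """Power-iteration PageRank; each undirected edge acts in both directions."""
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        # Rank held by isolated nodes is redistributed uniformly.
        dangling = sum(rank[v] for v in graph if not graph[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in graph}
        for v, nbrs in graph.items():
            for u in nbrs:
                new[u] += damping * rank[v] / len(nbrs)
        rank = new
    return rank


def induced(graph: Graph, nodes: Set[int]) -> Graph:
    """Subgraph restricted to the given nodes."""
    return {v: graph[v] & nodes for v in nodes}


def recursive_modules(graph: Graph,
                      detect: Callable[[Graph, Dict[int, float]], List[List[int]]],
                      k: int = 100) -> List[Set[int]]:
    """Re-split every module larger than k nodes until none remains."""
    final: List[Set[int]] = []
    stack: List[Set[int]] = [set(graph)]
    while stack:
        nodes = stack.pop()
        if len(nodes) <= k:
            final.append(nodes)
            continue
        sub = induced(graph, nodes)
        weights = pagerank(sub)  # node weights recomputed for each submodule
        parts = [set(p) for p in detect(sub, weights) if p]
        if len(parts) < 2:
            # Assumption: keep an oversized module if the detector cannot split it.
            final.append(nodes)
            continue
        stack.extend(parts)
    return final


def halve_by_rank(sub: Graph, weights: Dict[int, float]) -> List[List[int]]:
    """Toy placeholder for Louvain: split nodes into two halves by PageRank."""
    order = sorted(sub, key=lambda v: (weights[v], v))
    half = len(order) // 2
    return [order[:half], order[half:]]


# Usage: a 10-node path graph, split until every module has at most 3 nodes.
path = {i: {j for j in (i - 1, i + 1) if 0 <= j < 10} for i in range(10)}
modules = recursive_modules(path, halve_by_rank, k=3)
```

In practice the placeholder would be replaced by a real modularity optimiser; the driver only assumes that the detector returns a list of node groups.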


An adaptive refinement for community detection methods for disease module identification in biological networks using novel metric based on connectivity, conductance & modularity

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2017.8218027 ◽

2017 ◽

Author(s):

Raghvendra Mall ◽

Ehsan Ullah ◽

Khalid Kunji ◽

Halima Bensmail ◽

Michele Ceccarelli

Keyword(s):

Community Detection ◽

Biological Networks ◽

Adaptive Refinement ◽

Detection Methods ◽

Module Identification ◽

Disease Module


The Eminence of Co-Expressed Ties in Schizophrenia Network Communities

Data ◽

10.3390/data4040149 ◽

2019 ◽

Vol 4 (4) ◽

pp. 149

Author(s):

Amulyashree Sridhar ◽

Sharvani GS ◽

AH Manjunatha Reddy ◽

Biplab Bhattacharjee ◽

Kalyan Nagaraj

Keyword(s):

Community Detection ◽

Biological Networks ◽

Gene Networks ◽

Biological Interactions ◽

K Nearest Neighbors ◽

Nonlinear Network ◽

Detection Algorithms ◽

Network Communities ◽

Disease Condition ◽

Modularity Maximization

Exploring gene networks is crucial for identifying significant biological interactions occurring in a disease condition. These interactions can be characterized by modeling the tie structure of networks. Such tie orientations are often detected within embedded community structures. However, most prevailing community detection methods are designed to capture information from nodes and their attributes, usually ignoring the ties. In this study, a modularity maximization algorithm is proposed based on a nonlinear representation of local tangent space alignment (LTSA). Initially, the tangent coordinates are computed locally to identify k-nearest neighbors across the genes. These local neighbors are further optimized by generating a nonlinear network embedding function for detecting gene communities based on eigenvector decomposition. Experimental results suggest that this algorithm detects gene modules with a higher modularity index (0.9256) than other traditional community detection algorithms. Furthermore, co-expressed genes across these communities are identified by discovering the characteristic tie structures. These detected ties are known to have substantial biological influence on the progression of schizophrenia, thereby signifying the influence of tie patterns in biological networks. This technique can be extended to other disease networks for detecting substantial gene “hotspots”.


COMMUNITY DETECTION IN SOCIAL NETWORKS EMPLOYING COMPONENT INDEPENDENCY

Modern Physics Letters B ◽

10.1142/s0217984909020242 ◽

2009 ◽

Vol 23 (17) ◽

pp. 2089-2106 ◽

Cited By ~ 3

Author(s):

ZHONGMIN XIONG ◽

WEI WANG

Keyword(s):

Social Networks ◽

Community Structure ◽

Theoretical Analysis ◽

Community Detection ◽

Biological Networks ◽

Real Data ◽

Important Task ◽

Underlying Structure ◽

Data Sets ◽

Detection Algorithms

Many networks, including social and biological networks, are naturally divided into communities. Community detection is an important task when discovering the underlying structure in networks. The GN algorithm, based on the betweenness scores of edges, is one of the most influential detection algorithms, but it is computationally costly because all betweenness scores must be recomputed each time an edge is removed. This paper presents an algorithm that is also based on betweenness scores but can remove more than one edge after each computation of the betweenness scores. The method is motivated by the following consideration: many components split from a network are independent of each other, both in the recalculation of their betweenness scores and in their further division into smaller components. Theoretical analysis and experiments with several real data sets, which have served as test beds in many related works, show that the method is fast and effective. Moreover, a version of this method with minor adjustments allows the communities surrounding a given node to be discovered without computing the full community structure of a graph.
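For orientation, the baseline GN step that this abstract builds on can be sketched as below. This is an illustrative sketch of the standard procedure (recompute edge betweenness, remove the top-scoring edge, repeat until a component splits), not the multi-edge variant the paper proposes. The function names and the example graph are assumptions, and the graph is assumed to be unweighted, undirected, and free of self-loops.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets, no self-loops


def edge_betweenness(graph: Graph) -> Dict[FrozenSet[int], float]:
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {frozenset((u, v)): 0.0 for v in graph for u in graph[v]}
    for s in graph:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = {v: 0.0 for v in graph}
        sigma[s] = 1.0
        preds: Dict[int, List[int]] = {v: [] for v in graph}
        order: List[int] = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every unordered node pair was counted once from each endpoint.
    return {e: b / 2.0 for e, b in score.items()}


def components(graph: Graph) -> List[Set[int]]:
    """Connected components via depth-first search."""
    seen: Set[int] = set()
    comps: List[Set[int]] = []
    for s in graph:
        if s in seen:
            continue
        comp: Set[int] = set()
        stack = [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(graph[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def gn_split(graph: Graph) -> List[Set[int]]:
    """Remove top-betweenness edges one at a time until a component splits."""
    g = {v: set(nbrs) for v, nbrs in graph.items()}
    start = len(components(g))
    while len(components(g)) == start and any(g.values()):
        scores = edge_betweenness(g)  # full recomputation after each removal
        u, v = max(scores, key=scores.get)
        g[u].discard(v)
        g[v].discard(u)
    return components(g)


# Usage: two triangles joined by the bridge 2-3.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
parts = gn_split(two_triangles)
```

The full recomputation inside the loop is the cost the abstract refers to. Once the graph has split, shortest paths never cross between components, which is the independence the proposed method exploits.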


Identifying communities from multiplex biological networks by randomized optimization of modularity

F1000Research ◽

10.12688/f1000research.15486.1 ◽

2018 ◽

Vol 7 ◽

pp. 1042 ◽

Cited By ~ 3

Author(s):

Gilles Didier ◽

Alberto Valdeolivas ◽

Anaïs Baudot

Keyword(s):

Biological Networks ◽

Disease Genes ◽

Module Identification ◽

Gwas Dataset ◽

Disease Module ◽

Common Operation ◽

Network Modularity ◽

Disease Community

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. Here we implemented several extensions of the MolTi software, which detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure and the consideration of weighted edges and layers generally increase the number of trait and disease communities detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.


An unsupervised disease module identification technique in biological networks using novel quality metric based on connectivity, conductance and modularity

F1000Research ◽

10.12688/f1000research.14258.1 ◽

2018 ◽

Vol 7 ◽

pp. 378 ◽

Cited By ~ 5

Author(s):

Raghvendra Mall ◽

Ehsan Ullah ◽

Khalid Kunji ◽

Michele Ceccarelli ◽

Halima Bensmail

Keyword(s):

Biological Networks ◽

Complex Traits ◽

Genome Wide Association Study ◽

Module Identification ◽

Quality Metric ◽

Genome Wide ◽

Refinement Method ◽

Disease Module ◽

Identification Technique ◽

Evaluation Metric

Disease processes are usually driven by several genes interacting in molecular modules or pathways leading to the disease. The identification of such modules in gene or protein networks is at the core of computational methods in biomedical research. In this context, the Disease Module Identification (DMI) DREAM Challenge was initiated as an effort to systematically assess module identification methods on a panel of 6 diverse genomic networks. In this paper, we propose a generic refinement method, based on merging and splitting the hierarchical tree obtained from any community detection technique, for constrained DMI in biological networks. The only constraint was that the size of each community lie in the range [3, 100]. We propose a novel model evaluation metric, called F-score, computed from several unsupervised quality metrics such as modularity, conductance and connectivity, to determine the quality of a graph partition at a given level of the hierarchy. We also propose a quality measure, namely Inverse Confidence, which ranks and prunes insignificant modules to obtain a curated list of candidate disease modules (DM) for a biological network. The predicted modules are evaluated on the basis of the total number of unique candidate modules that are associated with complex traits and diseases from over 200 genome-wide association study (GWAS) datasets. During the competition, we identified 42 modules, ranking 15th at the official false discovery rate (FDR) cut-off of 0.05 for identifying statistically significant DM in the 6 benchmark networks. However, for the more stringent FDR cut-offs of 0.025 and 0.01, the proposed method identified 31 (rank 9) and 16 (rank 10) modules respectively. In additional analysis, our proposed approach detected a total of 44 DM in the networks, compared with 60 for the winner of the DREAM Challenge. Interestingly, for several individual benchmark networks, our performance was better than or competitive with that of the winner.
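Two of the unsupervised quality metrics named above, together with the [3, 100] size constraint, can be sketched as follows. The definitions used are the standard Newman-Girvan modularity and cut-over-volume conductance. How the paper combines them into its F-score is not stated in the abstract, so that step is deliberately omitted, and the function names and example graph are assumptions.

```python
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]  # undirected, unweighted graph as adjacency sets


def modularity(graph: Graph, partition: List[Set[int]]) -> float:
    """Newman-Girvan modularity: sum over modules of L_c/m - (d_c/2m)^2."""
    m = sum(len(nbrs) for nbrs in graph.values()) / 2.0  # number of edges
    q = 0.0
    for module in partition:
        inside = sum(len(graph[v] & module) for v in module) / 2.0  # L_c
        degree = sum(len(graph[v]) for v in module)                 # d_c
        q += inside / m - (degree / (2.0 * m)) ** 2
    return q


def conductance(graph: Graph, module: Set[int]) -> float:
    """Edges leaving the module divided by the smaller side's total degree."""
    cut = sum(len(graph[v] - module) for v in module)
    vol_in = sum(len(graph[v]) for v in module)
    vol_out = sum(len(nbrs) for nbrs in graph.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 0.0


def within_size(partition: List[Set[int]], lo: int = 3, hi: int = 100) -> List[Set[int]]:
    """Keep only modules whose size lies in the closed range [lo, hi]."""
    return [module for module in partition if lo <= len(module) <= hi]


# Usage: two triangles joined by the bridge 2-3.
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
                 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
partition = [{0, 1, 2}, {3, 4, 5}]
q = modularity(two_triangles, partition)          # 5/14 for this partition
phi = conductance(two_triangles, {0, 1, 2})       # 1/7: one cut edge, volume 7
```

Higher modularity and lower conductance both indicate a better-separated module, so any combined score has to reconcile the two directions.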


A Survey on Overlapping Communities in Large-Scale Social Networks

Modern Technologies for Big Data Classification and Clustering - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-2805-0.ch008 ◽

2018 ◽

pp. 198-215

Author(s):

S Rao Chintalapudi ◽

H. M. Krishna Prasad M

Keyword(s):

Social Networks ◽

Community Detection ◽

Biological Networks ◽

Modern World ◽

Overlapping Community Detection ◽

Overlapping Communities ◽

Detection Algorithms ◽

Research Areas ◽

Network Generator ◽

Overlapping Community

Social network analysis is one of the emerging research areas in the modern world. Using graph theory concepts, it can be adapted to many sectors, such as transportation networks, collaboration networks, and biological networks. The most important property of social networks is community structure: a community is a collection of nodes with dense connections inside and sparse connections to the outside. Community detection is similar to clustering analysis and has many real-world applications, such as recommendation systems and targeted marketing. Community detection algorithms are broadly classified into two categories: disjoint community detection algorithms and overlapping community detection algorithms. This chapter reviews overlapping community detection algorithms with their strengths and limitations. To evaluate these algorithms, a popular synthetic network generator, the LFR benchmark generator, and the new extended quality measures are discussed in detail.


Network Module Detection using Recursive Local Graph Sparsification and Clustering

10.20944/preprints201808.0421.v1 ◽

2018 ◽

Author(s):

Michael Banf

Keyword(s):

Biological Networks ◽

Network Module ◽

Identification Methods ◽

Structure Preserving ◽

Module Identification ◽

Wide Range ◽

Module Detection ◽

Disease Module ◽

Graph Sparsification ◽

Local Graph

Here we present a fast and highly scalable network module detection method that preserves community structure and recursively integrates graph sparsification and clustering. Our algorithm, called SparseClust, participated in the most recent DREAM community challenge on disease module identification, an open competition to comprehensively assess module identification methods across a wide range of biological networks.


Identifying communities from multiplex biological networks by randomized optimization of modularity

F1000Research ◽

10.12688/f1000research.15486.2 ◽

2018 ◽

Vol 7 ◽

pp. 1042 ◽

Cited By ~ 1

Author(s):

Gilles Didier ◽

Alberto Valdeolivas ◽

Anaïs Baudot

Keyword(s):

Biological Networks ◽

Disease Genes ◽

Module Identification ◽

Gwas Dataset ◽

Disease Module ◽

Common Operation ◽

Network Modularity ◽

Disease Community

The identification of communities, or modules, is a common operation in the analysis of large biological networks. The Disease Module Identification DREAM challenge established a framework to evaluate clustering approaches in a biomedical context, by testing the association of communities with GWAS-derived common trait and disease genes. Here we implemented several extensions of the MolTi software, which detects communities by optimizing multiplex (and monoplex) network modularity. In particular, MolTi now runs a randomized version of the Louvain algorithm, can consider edge and layer weights, and performs recursive clustering. On simulated networks, the randomization procedure clearly improves the detection of communities. On the DREAM challenge benchmark, the results strongly depend on the selected GWAS dataset and enrichment p-value threshold. However, the randomization procedure and the consideration of weighted edges and layers generally increase the number of trait and disease communities detected. The new version of MolTi and the scripts used for the DMI DREAM challenge are available at: https://github.com/gilles-didier/MolTi-DREAM.


Topological and functional comparison of community detection algorithms in biological networks

BMC Bioinformatics ◽

10.1186/s12859-019-2746-0 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 4

Author(s):

Sara Rahiminejad ◽

Mano R. Maurya ◽

Shankar Subramaniam

Keyword(s):

Community Detection ◽

Biological Networks ◽

Detection Algorithms
