Differential Community Detection in Paired Biological Networks

Abstract

Motivation: Biological networks reveal the inherent structure of molecular interactions, which can lead to the discovery of driver genes and meaningful pathways, especially in the context of cancer. Gene mutations often change gene expression, and the corresponding gene regulatory network undergoes some amount of localized re-wiring. The ability to identify significant changes in the interaction patterns caused by the progression of the disease can reveal novel relevant signatures.

Methods: The task of identifying differential sub-networks in paired biological networks (A: control, B: case) can be rephrased as one of finding dense communities in a single noisy differential topological (DT) graph, constructed by taking the absolute difference between the topological graphs of A and B. In this paper, we propose a fast two-stage approach, Differential Community Detection (DCD), to identify differential sub-networks as differential communities in a de-noised version of the DT graph. In the first stage, we iteratively re-order the nodes of the DT graph to determine approximate block diagonals in the DT adjacency matrix, using the neighbourhood information of the nodes and Jaccard similarity. In the second stage, the ordered DT adjacency matrix is traversed along the diagonal, and all the edges associated with a node are removed if that node has no immediate edges within a window. We then apply community detection methods to this de-noised DT graph to discover differential sub-networks as communities.

Results: Our proposed DCD approach can effectively locate differential sub-networks in several simulated paired random-geometric networks and in various paired scale-free graphs with different power-law exponents. In simulation studies, the DCD approach easily outperforms both community detection methods applied to the original noisy DT graph and recent statistical techniques. We applied the DCD method to two real datasets: (a) an ovarian cancer dataset, to discover differential DNA co-methylation sub-networks in patients and controls; (b) a glioma dataset, to discover the difference between the regulatory networks of IDH-mutant and IDH-wild-type glioma. We demonstrate the potential benefits of DCD for finding network-inferred bio-markers and pathways associated with a trait of interest.

Conclusion: The proposed DCD approach overcomes the limitations of previous statistical techniques and the issues that arise when community detection methods are applied directly to the noisy DT graph. This is reflected in the superior performance of the DCD method with respect to metrics such as Precision, Accuracy, Kappa and Specificity. The code implementing the proposed DCD method is available at https://sites.google.com/site/raghvendramallmlresearcher/codes.
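
The two stages described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the edge threshold `tau`, the one-pass greedy ordering in `greedy_order` (the abstract describes an iterative re-ordering), and all function names are assumptions made for illustration only.

```python
def dt_graph(a, b):
    """Absolute element-wise difference between two square matrices."""
    n = len(a)
    return [[abs(a[i][j] - b[i][j]) for j in range(n)] for i in range(n)]


def neighbours(dt, i, tau):
    """Nodes joined to i by a DT edge heavier than tau (tau is an assumed threshold)."""
    return {j for j, w in enumerate(dt[i]) if j != i and w > tau}


def jaccard(s, t):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def greedy_order(dt, tau=0.0):
    """Hypothetical one-pass ordering: chain each node to its most similar unplaced node."""
    n = len(dt)
    # Closed neighbourhoods, so that adjacent nodes share at least themselves.
    nbrs = [neighbours(dt, i, tau) | {i} for i in range(n)]
    order = [max(range(n), key=lambda i: len(nbrs[i]))]
    remaining = set(range(n)) - set(order)
    while remaining:
        last = order[-1]
        nxt = max(sorted(remaining), key=lambda j: jaccard(nbrs[last], nbrs[j]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def window_denoise(dt, order, window, tau=0.0):
    """Zero every edge of a node that has no edge to a node within `window` positions."""
    n = len(dt)
    out = [row[:] for row in dt]
    for pos, i in enumerate(order):
        lo, hi = max(0, pos - window), min(n, pos + window + 1)
        nearby = [order[p] for p in range(lo, hi) if p != pos]
        if not any(dt[i][j] > tau for j in nearby):
            for j in range(n):
                out[i][j] = 0.0
                out[j][i] = 0.0
    return out
```

With an empty control network and a case network holding a triangle on nodes 0, 1, 2 plus a stray edge 2-5, `window_denoise(dt, greedy_order(dt), window=1)` keeps the triangle and drops the stray edge, because node 5 has no edge to either of its neighbours in the ordering. A community detection method would then be run on the returned matrix.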

Detection of statistically significant network changes in complex biological networks

10.1101/061515 ◽

2016 ◽

Author(s):

Raghvendra Mall ◽

Luigi Cerulo ◽

Halima Bensmail ◽

Antonio Iavarone ◽

Michele Ceccarelli

Keyword(s):

Biological Networks ◽

Regulatory Networks ◽

Hamming Distance ◽

State Of The Art ◽

Statistical Significance ◽

Complex Structure ◽

The State ◽

Computational Time ◽

Interaction Patterns ◽

Driver Genes

Abstract

Motivation: Biological networks help to unveil the complex structure of molecular interactions and to discover driver genes, especially in the context of cancer. Due to gene mutations, for example as cancer progresses, the gene expression network can undergo some amount of localised re-wiring. The ability to detect statistically relevant changes in the interaction patterns induced by the progression of the disease can lead to the discovery of novel relevant signatures.

Results: Several procedures have recently been proposed to detect sub-network differences in pairwise labeled weighted networks. In this paper, we propose an improvement over the state of the art based on the Generalized Hamming Distance, which is adopted for evaluating the topological difference between two networks and estimating its statistical significance. The proposed procedure exploits a more effective model selection criterion to generate p-values for statistical significance, and it is more efficient in terms of computational time and prediction accuracy than methods from the literature. Moreover, the structure of the proposed algorithm allows for a faster parallelized implementation. In the case of dense random geometric networks, the proposed approach is 10-15x faster and achieves 5-10% higher AUC, Precision/Recall, and Kappa values than the state of the art. We also report the application of the method to dissect the difference between the regulatory networks of IDH-mutant and IDH-wild-type glioma. In this case, our method identifies some recently reported master regulators as well as novel important candidates.

Availability: The scripts implementing the proposed algorithms are available in R at https://sites.google.com/site/raghvendramallmlresearcher/
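
As a rough illustration of the distance this abstract builds on, the sketch below computes a mean-centred squared difference over off-diagonal entries, which is one commonly cited form of the Generalized Hamming Distance, and draws a node-label permutation null for it. The centring step, the permutation scheme and the function names are assumptions for illustration; the paper's own model selection criterion and p-value procedure are not reproduced here.

```python
import random


def centred(m):
    """Subtract the mean of the off-diagonal entries from every entry."""
    n = len(m)
    mean = sum(m[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return [[m[i][j] - mean for j in range(n)] for i in range(n)]


def ghd(a, b):
    """Mean squared difference of the centred off-diagonal entries of a and b."""
    n = len(a)
    ca, cb = centred(a), centred(b)
    total = sum((ca[i][j] - cb[i][j]) ** 2
                for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def permuted_ghd(a, b, n_perm=200, seed=0):
    """Distances obtained after randomly relabelling the nodes of b (assumed null model)."""
    rng = random.Random(seed)
    n = len(a)
    samples = []
    for _ in range(n_perm):
        p = list(range(n))
        rng.shuffle(p)
        pb = [[b[p[i]][p[j]] for j in range(n)] for i in range(n)]
        samples.append(ghd(a, pb))
    return samples
```

Comparing the observed `ghd(a, b)` against the values returned by `permuted_ghd(a, b)` gives an empirical sense of how unusual the observed distance is under random relabelling.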

Hierarchical Hidden Community Detection for Protein Complex Prediction

10.21203/rs.3.rs-116708/v1 ◽

2020 ◽

Author(s):

Chao Li ◽

Kun He ◽

Guang shuai Liu ◽

John E. Hopcroft

Keyword(s):

Community Detection ◽

Hierarchical Structure ◽

Biological Networks ◽

Protein Complexes ◽

Detection Methods ◽

Biological Interactions ◽

Protein Protein Interaction ◽

Protein Complex Prediction ◽

Detection Approach ◽

New Perspective

Abstract

Background: Discovering functional modules in protein-protein interaction networks through optimization remains a longstanding challenge in biology. Traditional algorithms consider only the strong protein complexes found in the original network by optimizing some metric, which can hinder the discovery of weak and hidden complexes that are overshadowed by strong ones. Additionally, protein complexes differ not only in density but also in scale, which makes them extremely difficult to detect. We address these issues and propose a hierarchical hidden community detection approach to accurately predict protein complexes of various strengths and scales.

Results: We propose a meta-method called HirHide (Hierarchical Hidden Community Detection). It is the first combination of hierarchical structure with hidden structure, which provides a new perspective for finding protein complexes of various strengths and scales. We compare the performance of several standard community detection methods with their HirHide versions. Experimental results show that the HirHide versions achieve better performance and sometimes significantly outperform the baselines.

Conclusions: HirHide can adopt any standard community detection method as its base algorithm, enabling it to discover hidden hierarchical communities and boosting the detection of strong hierarchical communities. Some biological networks are too complex for standard community detection algorithms to perform well. In most cases, a better choice is to select an algorithm based on the characteristics of the specific biological network. Under these circumstances, HirHide has clear advantages because of its flexibility. At the same time, given the natural hierarchy of cells, organelles, intracellular compounds and so on, hierarchical structure combined with hidden structure is in line with the characteristics of the data itself, thus helping researchers to study biological interactions more deeply.

Generating Ensembles of Gene Regulatory Networks to Assess Robustness of Disease Modules

10.1101/2020.07.12.198747 ◽

2020 ◽

Author(s):

James T. Lim ◽

Chen Chen ◽

Adam D. Grant ◽

Megha Padi

Keyword(s):

Biological Networks ◽

Regulatory Networks ◽

Network Inference ◽

Disease Onset ◽

Null Distribution ◽

Generative Models ◽

Computational Method ◽

Detection Methods ◽

Biological Research ◽

Consensus Clustering

Abstract: The use of biological networks such as protein-protein interaction and transcriptional regulatory networks is becoming an integral part of biological research in the genomics era. However, these networks are not static, and during phenotypic transitions like disease onset, they can acquire new “communities” of genes that carry out key cellular processes. Changes in community structure can be detected by maximizing a modularity-based score, but because biological systems and network inference algorithms are inherently noisy, it remains a challenge to determine whether these changes represent real cellular responses or whether they appeared by random chance. Here, we introduce Constrained Random Alteration of Network Edges (CRANE), a computational method that samples networks with fixed node strengths to identify a null distribution and assess the robustness of observed changes in network structure. In contrast with other approaches, such as consensus clustering or established network generative models, CRANE produces more biologically realistic results and performs better in simulations. When applied to breast and ovarian cancer networks, CRANE improves the recovery of cancer-relevant GO terms while reducing the signal from non-specific housekeeping processes. CRANE is a general tool that can be applied in tandem with a variety of stochastic community detection methods to evaluate the veracity of their results.
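
To make the "modularity-based score" mentioned above concrete, the sketch below evaluates the standard Newman modularity of a given partition on a symmetric weighted adjacency matrix, where each node's strength is its row sum. It is not CRANE and does not sample any null networks; the function name and the matrix representation are assumptions for illustration.

```python
def modularity(adj, labels):
    """Newman modularity Q of the partition `labels` on a symmetric weighted matrix."""
    n = len(adj)
    strength = [sum(row) for row in adj]   # node strengths (weighted degrees)
    two_m = sum(strength)                  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed weight minus the weight expected from strengths alone
                q += adj[i][j] - strength[i] * strength[j] / two_m
    return q / two_m
```

A community detection method would search over `labels` to maximize this score; a robustness check in the spirit of the abstract would then ask how much the best partition changes when the network is perturbed while node strengths are held fixed.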

An adaptive refinement for community detection methods for disease module identification in biological networks using novel metric based on connectivity, conductance & modularity

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2017.8218027 ◽

2017 ◽

Author(s):

Raghvendra Mall ◽

Ehsan Ullah ◽

Khalid Kunji ◽

Halima Bensmail ◽

Michele Ceccarelli

Keyword(s):

Community Detection ◽

Biological Networks ◽

Adaptive Refinement ◽

Detection Methods ◽

Module Identification ◽

Disease Module

Multiple feedback loop design in the tryptophan regulatory network of Escherichia coli suggests a paradigm for robust regulation of processes in series

Journal of The Royal Society Interface ◽

10.1098/rsif.2005.0103 ◽

2005 ◽

Vol 3 (8) ◽

pp. 383-391 ◽

Cited By ~ 20

Author(s):

Sharad Bhartiya ◽

Nikhil Chaudhary ◽

K.V Venkatesh ◽

Francis J Doyle

Keyword(s):

Escherichia Coli ◽

Biological Networks ◽

Regulatory Networks ◽

Feedback Loop ◽

Feedback Loops ◽

Superior Performance ◽

Uncertain Environments ◽

Loop Design ◽

Tank System ◽

In Series

Biological networks have evolved through adaptation in uncertain environments. Of the different possible design paradigms, some may offer functional advantages over others. These designs can be quantified by the structure of the network resulting from molecular interactions and the parameter values. One may, therefore, like to identify the design motif present in the evolved network that makes it preferable over other alternatives. In this work, we focus on the regulatory networks characterized by serially arranged processes, which are regulated by multiple feedback loops. Specifically, we consider the tryptophan system present in Escherichia coli, which may be conceptualized as three processes in series, namely transcription, translation and tryptophan synthesis. The multiple feedback loop motif results from three distinct negative feedback loops, namely genetic repression, mRNA attenuation and enzyme inhibition. A framework is introduced to identify the key design components of this network responsible for its physiological performance. We demonstrate that the multiple feedback loop motif, as seen in the tryptophan system, enables robust performance to variations in system parameters while maintaining a rapid response to achieve homeostasis. Superior performance, if arising from a design principle, is intrinsic and, therefore, inherent to any similarly designed system, either natural or engineered. An experimental engineering implementation of the multiple feedback loop design on a two-tank system supports the generality of the robust attributes offered by the design.

Algorithmic and Stochastic Representations of Gene Regulatory Networks and Protein-Protein Interactions

Current Topics in Medicinal Chemistry ◽

10.2174/1568026619666190311125256 ◽

2019 ◽

Vol 19 (6) ◽

pp. 413-425 ◽

Cited By ~ 3

Author(s):

Athanasios Alexiou ◽

Stylianos Chatzichronis ◽

Asma Perveen ◽

Abdul Hafeez ◽

Ghulam Md. Ashraf

Keyword(s):

Protein Interactions ◽

Biological Networks ◽

Regulatory Networks ◽

Review Paper ◽

Boolean Networks ◽

Diagnostic Tools ◽

Complex Nature ◽

Cellular Interactions ◽

Protein Protein Interactions ◽

Deterministic Models

Background: Recent studies reveal the importance of protein-protein interactions (PPI) for physiological functions and biological structures. Several stochastic and algorithmic methods have been published to date for modeling the complex nature of biological systems.

Objective: Computational modeling of biological networks is still a challenging task, and the formulation of complex cellular interactions is a research field of great interest. In this review paper, several computational methods for modeling gene regulatory networks (GRN) and PPI are presented analytically.

Methods: Several well-known GRN and PPI models are presented and discussed in this review, such as: graph representations, Boolean Networks, Generalized Logical Networks, Bayesian Networks, Relevance Networks, Graphical Gaussian models, Weight Matrices, the Reverse Engineering Approach, Evolutionary Algorithms, the Forward Modeling Approach, Deterministic models, Static models, Hybrid models, Stochastic models, Petri Nets, BioAmbients calculus and Differential Equations.

Results: GRN and PPI methods have already been applied in various clinical processes with potentially positive results, establishing promising diagnostic tools.

Conclusion: Many stochastic algorithms in the literature focus on the simulation, analysis and visualization of biological networks and their dynamic interactions; these are referenced and described in depth in this review paper.

Modelling community structure and temporal spreading on complex networks

Computational Social Networks ◽

10.1186/s40649-021-00094-z ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Vesa Kuikka

Keyword(s):

Community Structure ◽

Complex Networks ◽

Community Detection ◽

Network Structure ◽

Network Connectivity ◽

Network Models ◽

Building Blocks ◽

Detection Algorithm ◽

Research Area ◽

Detection Methods

Abstract: We present methods for analysing hierarchical and overlapping community structure and spreading phenomena on complex networks. Different models can be developed for describing static connectivity or dynamical processes on a network topology. In this study, classical network connectivity and influence spreading models are used as examples for network models. Analysis of results is based on a probability matrix describing interactions between all pairs of nodes in the network. One popular research area has been detecting communities and their structure in complex networks. The community detection method of this study is based on optimising a quality function calculated from the probability matrix. The same method is proposed for detecting underlying groups of nodes that are building blocks of different sub-communities in the network structure. We present different quantitative measures for comparing and ranking solutions of the community detection algorithm. These measures describe properties of sub-communities: strength of a community, probability of formation and robustness of composition. The main contribution of this study is proposing a common methodology for analysing network structure and dynamics on complex networks. We illustrate the community detection methods with two small network topologies. In the case of network spreading models, time development of spreading in the network can be studied. Two different temporal spreading distributions demonstrate the methods with three real-world social networks of different sizes. The Poisson distribution describes a random response time and the e-mail forwarding distribution describes a process of receiving and forwarding messages.

Some new Pythagorean fuzzy correlation techniques via statistical viewpoint with applications to decision-making problems

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202469 ◽

2021 ◽

pp. 1-13

Author(s):

Paul Augustine Ejegwa ◽

Shiping Wen ◽

Yuming Feng ◽

Wei Zhang ◽

Jia Chen

Keyword(s):

Decision Making ◽

Correlation Coefficient ◽

Fuzzy Set ◽

Intuitionistic Fuzzy Set ◽

Superior Performance ◽

Statistical Techniques ◽

New Techniques ◽

Reliable Technique ◽

Performance Indexes ◽

Fuzzy Correlation

The Pythagorean fuzzy set is a reliable technique for soft computing because of its ability to handle indeterminate data when compared to the intuitionistic fuzzy set. Among the several measuring tools in the Pythagorean fuzzy environment, the correlation coefficient is vital, since it can measure the interdependency and interrelationship between any two arbitrary Pythagorean fuzzy sets (PFSs). Some techniques for calculating the correlation coefficient of PFSs (CCPFSs) from a statistical perspective have been proposed, but with limitations, namely: (i) failure to incorporate all parameters of PFSs, which leads to information loss, (ii) imprecise results, and (iii) lower performance indexes. Consequently, this paper introduces new statistical techniques for computing CCPFSs using Pythagorean fuzzy variance and covariance, which resolve these limitations and yield better performance indexes. The new techniques incorporate the three parameters of PFSs and are defined within the range [-1, 1], so that they show the strength of correlation between the PFSs and indicate whether the PFSs under consideration are negatively or positively related. The validity of the new statistical techniques is tested on some numerical examples, in which the new techniques show superior performance indexes compared with similar existing ones. To demonstrate their applicability, some multi-criteria decision-making (MCDM) problems involving medical diagnosis and pattern recognition are solved using the new techniques.

Criminal Community Detection Based on Isomorphic Subgraph Analytics

Open Computer Science ◽

10.1515/comp-2020-0112 ◽

2020 ◽

Vol 10 (1) ◽

pp. 164-174

Author(s):

Theyvaa Sangkaran ◽

Azween Abdullah ◽

NZ Jhanjhi

Keyword(s):

Law Enforcement ◽

Community Detection ◽

Traditional Method ◽

Detection Methods ◽

Law Enforcement Agencies ◽

Modus Operandi ◽

Research Gap ◽

New Perspective ◽

Key Participants ◽

Investigative Process

Abstract: All highly centralised enterprises run by criminals share similar traits which, if recognised, can help in the criminal investigative process. While conducting a complex confederacy investigation, law enforcement agents should not only identify the key participants but also grasp the nature of the inter-connections between the criminals, in order to understand and determine the modus operandi of an illicit operation. We studied community detection in criminal networks using graph theory and formally introduced an algorithm that opens a new perspective on community detection compared with the traditional methods used to model the relations between objects. Community structure, generally described as densely connected nodes with similar patterns of links, is an important property of complex networks. Our method differs from the traditional method by allowing law enforcement agencies to compare the detected communities and thereby take a different viewpoint of the criminal network. In the paper, we compare our algorithm to the well-known Girvan-Newman algorithm. We consider this method an alternative or an addition to the traditional community detection methods mentioned earlier, as the proposed algorithm allows, and will assist in, the detection of different patterns and structures of the same community for enforcement agencies and researchers. This methodology for community detection has not been extensively researched. Hence, we have identified it as a research gap in this domain and decided to develop a new method of criminal community detection.
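
For readers unfamiliar with the Girvan-Newman baseline named in this abstract, the following is a minimal sketch of one split of that algorithm on an unweighted, undirected graph without self-loops: compute shortest-path edge betweenness, remove the highest-scoring edge, and repeat until the graph breaks into more components. The adjacency-dictionary representation and the function names are assumptions for illustration; this is the baseline, not the method proposed in the paper.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style shortest-path edge betweenness (each edge counted from both ends)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}      # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:                       # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):          # accumulate dependencies back towards s
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the number of components increases."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:                                  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)
```

On two triangles joined by a single bridge edge, the bridge carries every shortest path between the two groups, so it is removed first and the two triangles are returned as separate communities.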

Spectral gap in random bipartite biregular graphs and applications

Combinatorics Probability Computing ◽

10.1017/s0963548321000249 ◽

2021 ◽

pp. 1-39

Author(s):

Gerandy Brito ◽

Ioana Dumitriu ◽

Kameron Decker Harris

Keyword(s):

Community Detection ◽

Adjacency Matrix ◽

Moment Method ◽

Coding Theory ◽

High Probability ◽

Spectral Gap ◽

Matrix Completion ◽

Full Rank ◽

The Moment ◽

Deterministic Matrix

Abstract: We prove an analogue of Alon’s spectral gap conjecture for random bipartite, biregular graphs. We use the Ihara–Bass formula to connect the non-backtracking spectrum to that of the adjacency matrix, employing the moment method to show there exists a spectral gap for the non-backtracking matrix. A by-product of our main theorem is that random rectangular zero-one matrices with fixed row and column sums are full rank with high probability. Finally, we illustrate applications to community detection, coding theory, and deterministic matrix completion.
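
The non-backtracking matrix referred to above is indexed by directed edges: the entry for (u→v, x→y) is 1 when v = x and y ≠ u, so a walk may continue along any edge except the one it just used in reverse. For a graph with n vertices, m edges, adjacency matrix A and degree matrix D, the Ihara–Bass formula states det(I - uB) = (1 - u^2)^(m-n) det(I - uA + u^2 (D - I)). The sketch below only constructs B from an edge list; the function name and the list-of-lists representation are assumptions for illustration.

```python
def non_backtracking_matrix(edges):
    """Return the directed edge list and the 0/1 non-backtracking matrix B."""
    # Each undirected edge {u, v} contributes the directed edges u->v and v->u.
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]
    index = {e: k for k, e in enumerate(directed)}
    size = len(directed)
    b = [[0] * size for _ in range(size)]
    for (u, v), r in index.items():
        for (x, y), c in index.items():
            # u->v may be followed by v->y, provided y is not u (no backtracking)
            if v == x and y != u:
                b[r][c] = 1
    return directed, b
```

In a bipartite biregular graph, every row of B for a directed edge u→v sums to deg(v) - 1, so the rows take only two distinct sums, one for each side of the bipartition.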
