Community Detection in Large-Scale Bipartite Biological Networks

2021 ◽  
Vol 12 ◽  
Author(s):  
Genís Calderer ◽  
Marieke L. Kuijjer

Networks are useful tools to represent and analyze interactions on a large, or genome-wide scale and have therefore been widely used in biology. Many biological networks—such as those that represent regulatory interactions, drug-gene, or gene-disease associations—are of a bipartite nature, meaning they consist of two different types of nodes, with connections only forming between the different node sets. Analysis of such networks requires methodologies that are specifically designed to handle their bipartite nature. Community structure detection is a method used to identify clusters of nodes in a network. This approach is especially helpful in large-scale biological network analysis, as it can find structure in networks that often resemble a “hairball” of interactions in visualizations. Often, the communities identified in biological networks are enriched for specific biological processes and thus allow one to assign drugs, regulatory molecules, or diseases to such processes. In addition, comparison of community structures between different biological conditions can help to identify how network rewiring may lead to tissue development or disease, for example. In this mini review, we give a theoretical basis of different methods that can be applied to detect communities in bipartite biological networks. We introduce and discuss different scores that can be used to assess the quality of these community structures. We then apply a wide range of methods to a drug-gene interaction network to highlight the strengths and weaknesses of these methods in their application to large-scale, bipartite biological networks.
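One widely used quality score for a bipartite community structure is Barber's bipartite modularity. The abstract does not say which scores the review covers, so the sketch below is only an illustration of the idea. It assumes an unweighted network given as (drug, gene) pairs with disjoint name sets; the drug and gene names are hypothetical.

```python
from collections import defaultdict


def barber_modularity(edges, community):
    """Barber's bipartite modularity of a node -> community assignment.

    edges: iterable of (top_node, bottom_node) pairs, e.g. (drug, gene).
    community: dict mapping every node to a community label.
    Assumes an unweighted graph and disjoint top/bottom node names.
    """
    edges = list(set(edges))  # ignore duplicate edges
    m = len(edges)
    if m == 0:
        raise ValueError("the network has no edges")

    # Degree of each node within its own node set.
    deg_top = defaultdict(int)
    deg_bottom = defaultdict(int)
    for u, v in edges:
        deg_top[u] += 1
        deg_bottom[v] += 1

    # Observed fraction of edges that fall inside a community.
    within = sum(1 for u, v in edges if community[u] == community[v])

    # Expected fraction under the bipartite null model:
    # sum over communities of (top degree sum * bottom degree sum) / m^2.
    top_sum = defaultdict(int)
    bottom_sum = defaultdict(int)
    for u, k in deg_top.items():
        top_sum[community[u]] += k
    for v, d in deg_bottom.items():
        bottom_sum[community[v]] += d
    expected = sum(top_sum[c] * bottom_sum[c] for c in top_sum) / (m * m)

    return within / m - expected


# Toy drug-gene network (hypothetical names): two disconnected blocks.
toy_edges = [
    ("drug1", "geneA"), ("drug1", "geneB"),
    ("drug2", "geneA"), ("drug2", "geneB"),
    ("drug3", "geneC"), ("drug3", "geneD"),
    ("drug4", "geneC"), ("drug4", "geneD"),
]
two_blocks = {
    "drug1": 0, "drug2": 0, "geneA": 0, "geneB": 0,
    "drug3": 1, "drug4": 1, "geneC": 1, "geneD": 1,
}
one_block = {node: 0 for node in two_blocks}

print(barber_modularity(toy_edges, two_blocks))  # 0.5
print(barber_modularity(toy_edges, one_block))   # 0.0
```

Splitting the toy network into its two natural blocks scores higher than lumping all nodes together, which is the behaviour a modularity-style score is meant to reward.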

2022 ◽  
Author(s):  
Wei-Zhen Zhou ◽  
Wenke Li ◽  
Huayan Shen ◽  
Ruby W. Wang ◽  
Wen Chen ◽  
...  

Congenital heart disease (CHD) is the most common cause of major birth defects, with a prevalence of 1%. Although an increasing number of studies report on the etiology of CHD, the findings scattered throughout the literature are difficult to retrieve and utilize in research and clinical practice. We therefore developed CHDbase, an evidence-based knowledgebase with CHD-related genes and clinical manifestations manually curated from 1114 publications, linking 1124 susceptibility genes and 3591 variations to more than 300 CHD types and related syndromes. Metadata such as the publication information, the selected population and samples, the study strategy, and the major findings of each study were integrated with each research record. We also integrated functional annotations by parsing ~50 databases/tools to facilitate the interpretation of these genes and variations in disease pathogenicity. We further prioritized the significance of these CHD-related genes with a gene interaction network approach, and extracted a core CHD sub-network with 163 genes. The clear genetic landscape of CHD enables phenotype classification based on shared genetic origin. Overall, CHDbase provides a comprehensive and freely available resource to study CHD susceptibility, supporting a wide range of users in the scientific and medical communities. CHDbase is accessible at http://chddb.fwgenetics.org/.


2022 ◽  
Vol 12 ◽  
Author(s):  
Liya Huang ◽  
Ting Ye ◽  
Jingjing Wang ◽  
Xiaojing Gu ◽  
Ruiting Ma ◽  
...  

Pancreatic adenocarcinoma is one of the leading causes of cancer-related death worldwide. Because few clinical symptoms appear in the early stage of pancreatic adenocarcinoma, most patients are found to carry metastases at diagnosis. The lack of effective diagnostic biomarkers and therapeutic targets makes pancreatic adenocarcinoma difficult to screen for and cure. The fundamental problem is that we know very little about the regulatory mechanisms during carcinogenesis. Here, we employed weighted gene co-expression network analysis (WGCNA) to build a gene interaction network using expression profiles of pancreatic adenocarcinoma from The Cancer Genome Atlas (TCGA). STRING was used for the construction and visualization of biological networks. A total of 22 modules were detected in the network, among which the yellow and pink modules showed the most significant associations with pancreatic adenocarcinoma. Dozens of new genes including PKMYT1, WDHD1, ASF1B, and RAD18 were identified. Further survival analysis indicated their potential value for the diagnosis and treatment of pancreatic adenocarcinoma. Our study pioneered the application of a network-based algorithm to tumor etiology and discovered several promising regulators for pancreatic adenocarcinoma detection and therapy.
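The core step of WGCNA is to turn pairwise gene correlations into a weighted adjacency by soft thresholding, a = |cor|^beta. The sketch below shows only that step in Python (WGCNA itself is an R package) and is not the authors' pipeline. The unsigned network type, beta = 6, and the gene names and expression values are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned soft-thresholded adjacency: |cor(g, h)| ** beta per gene pair.

    expression: dict gene -> list of expression values across samples.
    beta=6 is an assumed power; in practice it is chosen from the data.
    """
    genes = sorted(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** beta
        for g in genes
        for h in genes
        if g < h
    }


# Hypothetical expression profiles over three samples.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [2.0, 4.0, 6.0],
    "geneC": [1.0, 3.0, 2.0],
}
toy_adjacency = soft_threshold_adjacency(toy_expression)
for pair, weight in sorted(toy_adjacency.items()):
    print(pair, weight)
```

Raising the correlation to a power keeps strongly co-expressed pairs near 1 while pushing moderate correlations toward 0, which is what makes the subsequent module detection less sensitive to weak, noisy links.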


2020 ◽  
Vol 15 ◽  
Author(s):  
Dariush Salimi ◽  
Ali Moeini

Objective: A gene interaction network, along with its related biological features, plays an important role in computational biology. A Bayesian network, as an efficient model based on probabilistic concepts, is able to exploit known and novel biological causal relationships between genes. The success of Bayesian networks in predicting these relationships depends greatly on the choice of priors. Methods: K-mers have been applied as prominent features to uncover similarity between genes in a specific pathway, suggesting that this feature can be applied to study gene dependencies. In this study, we propose k-mers (4-, 5-, and 6-mers) highly correlated with epigenetic modifications, including 17 modifications, as a new prior for Bayesian inference in gene interaction networks. Result: Employing this model on a network of 23 human genes and on a network based on 27 genes related to yeast resulted in F-measure improvements in different biological networks. Conclusion: The improvements in the best case are 12%, 36%, and 10% in pathway, co-expression, and physical interaction networks, respectively.
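To make the k-mer feature concrete, the following sketch counts 4-, 5-, and 6-mers in a DNA sequence and compares two sequences by the cosine similarity of their count profiles. The cosine comparison and the example sequences are assumptions for illustration; the abstract does not specify how the authors derive their prior from k-mers.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, ks=(4, 5, 6)):
    """Count all k-mers of the given lengths, skipping non-ACGT windows."""
    seq = seq.upper()
    alphabet = set("ACGT")
    profile = Counter()
    for k in ks:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= alphabet:
                profile[kmer] += 1
    return profile


def cosine_similarity(p, q):
    """Cosine similarity of two k-mer count profiles (0.0 if either is empty)."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Hypothetical sequences, not taken from the study.
profile_a = kmer_profile("ACGTACGTAC")
profile_b = kmer_profile("ACGTACGTTT")
print(cosine_similarity(profile_a, profile_b))
```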


2022 ◽  
Vol 8 ◽  
Author(s):  
Qing Chen ◽  
Ji Zhang ◽  
Banghe Bao ◽  
Fan Zhang ◽  
Jie Zhou

The early clinical symptoms of gastric cancer are not obvious, and metastasis may already have occurred by the time of treatment. Poor prognosis is one of the important reasons for the high mortality of gastric cancer. Therefore, identified gastric cancer-related genes can serve as markers to improve diagnostic precision and guide personalized treatment. To further reveal the pathogenesis of gastric cancer at the gene level, we proposed a method based on Gradient Boosting Decision Tree (GBDT) to identify the susceptibility genes of gastric cancer through a gene interaction network. Starting from the known genes related to gastric cancer, we collected additional genes that interact with them and constructed a gene interaction network. Random Walk was used to extract the network association of each gene, and GBDT was used to identify the gastric cancer-related genes. To evaluate the AUC and AUPR of our algorithm, we performed 10-fold cross-validation. GBDT achieved an AUC of 0.89 and an AUPR of 0.81. We compared GBDT with four other methods and found that GBDT performed best.
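The abstract does not say which random walk variant was used. A common choice for scoring genes by their proximity to known disease genes is the random walk with restart, sketched below on a toy network. The restart probability, the gene names, and the seed set are assumptions for illustration, not the authors' settings.

```python
def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a walker that restarts at the seeds.

    adj: dict node -> set of neighbours (undirected; every neighbour is a key).
    seeds: non-empty set of known disease genes, all present in adj.
    restart: assumed probability of jumping back to a seed at each step.
    """
    nodes = sorted(adj)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    prob = dict(start)
    for _ in range(max_iter):
        nxt = {n: restart * start[n] for n in nodes}
        for u in nodes:
            neighbours = adj[u]
            if not neighbours:
                # Isolated node: send its remaining mass back to the seeds.
                for n in nodes:
                    nxt[n] += (1 - restart) * prob[u] * start[n]
                continue
            share = (1 - restart) * prob[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - prob[n]) for n in nodes)
        prob = nxt
        if change < tol:
            break
    return prob


# Toy path network geneA - geneB - geneC with geneA as the only seed.
toy_adj = {"geneA": {"geneB"}, "geneB": {"geneA", "geneC"}, "geneC": {"geneB"}}
toy_scores = random_walk_with_restart(toy_adj, seeds={"geneA"})
print(toy_scores)
```

The resulting scores decay with network distance from the seed and could serve as per-gene input features for a downstream classifier such as GBDT.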


Author(s):  
Yuanyuan Chen ◽  
Yu Gu ◽  
Zixi Hu ◽  
Xiao Sun

Breast cancer is a highly heterogeneous disease, and there are many forms of categorization for breast cancer based on gene expression profiles. Gene expression profiles are variable and may show differences if measured at different time points or under different conditions. In contrast, biological networks are relatively stable over time and under different conditions. In this study, we used a gene interaction network from a new point of view to explore the subtypes of breast cancer based on individual-specific edge perturbations measured by relative gene expression value. Our study reveals that there are four breast cancer subtypes based on gene interaction perturbations at the individual level. The new network-based subtypes of breast cancer show strong heterogeneity in prognosis, somatic mutations, phenotypic changes and enriched pathways. The network-based subtypes are closely related to the PAM50 subtypes and immunohistochemistry index. This work helps us to better understand the heterogeneity and mechanisms of breast cancer from a network perspective.


2021 ◽  
Author(s):  
Andrew J Kavran ◽  
Aaron Clauset

Background: Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.
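The filtering idea can be illustrated with the simplest possible case: replace each node's measurement with the mean of its own value and its neighbours' values. This is only a sketch for positively correlated neighbours under assumed toy data; it does not handle the anti-correlated case or the module-wise filters the abstract describes, and it is not the authors' implementation.

```python
def mean_network_filter(values, adj):
    """Smooth each measurement by averaging it with its neighbours' values.

    values: dict node -> measured value.
    adj: dict node -> iterable of neighbouring nodes (missing nodes are skipped).
    Assumes neighbouring measurements are positively correlated.
    """
    filtered = {}
    for node, own_value in values.items():
        neighbourhood = [own_value]
        neighbourhood += [values[n] for n in adj.get(node, ()) if n in values]
        filtered[node] = sum(neighbourhood) / len(neighbourhood)
    return filtered


# Hypothetical measurements on a three-node path a - b - c.
toy_values = {"a": 1.0, "b": 3.0, "c": 5.0}
toy_network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
toy_filtered = mean_network_filter(toy_values, toy_network)
print(toy_filtered)
```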


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Hao Yu ◽  
Yang Liu ◽  
Chao Li ◽  
Jianhao Wang ◽  
Bo Yu ◽  
...  

Background. Neuropathic pain (NP) is a devastating complication following nerve injury, and it can be alleviated by regulating neuroimmune direction. We aimed to explore the neuroimmune mechanism and identify some new diagnostic or therapeutic targets for NP treatment via bioinformatic analysis. Methods. The microarray GSE18803 was downloaded and analyzed using R. The Venn diagram was drawn to find neuroimmune-related differentially expressed genes (DEGs) in neuropathic pain. Gene Ontology (GO), pathway enrichment, and protein-protein interaction (PPI) network were used to analyze DEGs, respectively. Besides, the identified hub genes were submitted to the DGIdb database to find relevant therapeutic drugs. Results. A total of 91 neuroimmune-related DEGs were identified. The results of GO and pathway enrichment analyses were closely related to immune and inflammatory responses. PPI analysis showed two important modules and 8 hub genes: PTPRC, CD68, CTSS, RAC2, LAPTM5, FCGR3A, CD53, and HCK. The drug-hub gene interaction network was constructed by Cytoscape, and it included 24 candidate drugs and 3 hub genes. Conclusion. The present study helps us better understand the neuroimmune mechanism of neuropathic pain and provides some novel insights on NP treatment, such as modulation of microglia polarization and targeting bone resorption. Besides, CD68, CTSS, LAPTM5, FCGR3A, and CD53 may be used as early diagnostic biomarkers and the gene HCK can be a therapeutic target.

