Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the prediction task extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is safe to assume that if two proteins interact, they share some subcellular locations. In this regard, we propose ProtFinder, the first deep learning-based model that relies exclusively on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors, such as the cellular component of Gene Ontology, to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the protein annotations are obtained from the Human Protein Atlas. Our model achieves an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations.
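The assumption that interacting proteins share locations can be illustrated with a simple neighbor-vote baseline. The sketch below is not ProtFinder's deep learning model; the function name `neighbor_vote`, the `min_fraction` threshold, and the toy protein identifiers and annotations are all hypothetical and serve only to show how multi-label predictions could be derived from an interaction network.

```python
from collections import Counter


def neighbor_vote(edges, known_locations, protein, min_fraction=0.5):
    """Guess locations for `protein` from its annotated interaction partners.

    Hypothetical baseline for illustration only, not the ProtFinder model.
    """
    # Collect partners of the query protein from the undirected edge list.
    partners = set()
    for a, b in edges:
        if a == protein:
            partners.add(b)
        elif b == protein:
            partners.add(a)

    # Only partners with known annotations can vote.
    annotated = [p for p in partners if p in known_locations]
    if not annotated:
        return set()

    # Count how many annotated partners support each location.
    votes = Counter(loc for p in annotated for loc in known_locations[p])

    # Multi-label output: keep every location backed by enough partners.
    return {loc for loc, n in votes.items() if n / len(annotated) >= min_fraction}


# Toy network and annotations (made up for this example).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
known_locations = {
    "P2": {"nucleus", "cytosol"},
    "P3": {"nucleus"},
    "P4": {"cytosol", "mitochondria"},
}

predicted = neighbor_vote(edges, known_locations, "P1")
print(sorted(predicted))
```

In this toy case, a location is kept when at least half of the annotated partners carry it, so a protein can receive several locations at once. A learned model would replace the fixed threshold with trained parameters.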
The tremendous success of graph neural networks (GNNs) has already had a major impact on systems biology research. For example, GNNs are currently used for drug target recognition in protein-drug interaction networks, for cancer gene discovery, and more. Important aspects whose practical relevance is often underestimated are comprehensibility, interpretability, and explainability. In this work, we present a graph-based deep learning framework for disease subnetwork detection via explainable GNNs. In our framework, each patient is represented by the topology of a protein-protein interaction (PPI) network, and the nodes are enriched by multimodal molecular data, such as gene expression and DNA methylation. Our novel modification of the GNNExplainer for model-wide explanations can therefore detect potential disease subnetworks, which is of high practical relevance. The proposed methods are implemented in the GNN-SubNet Python program, which we have made freely available on our GitHub for the international research community (https://github.com/pievos101/GNN-SubNet).
The likely genetic architecture of complex diseases is that subgroups of patients share variants in genes in specific networks sufficient to express a shared phenotype. We combined high-throughput sequencing with advanced bioinformatic approaches to identify such subgroups of patients with variants in shared networks. We performed targeted sequencing of patients with 2 or 3 generations of preterm birth on genes, gene sets and haplotype blocks that were highly associated with preterm birth. We analyzed the data using a multi-sample, protein–protein interaction (PPI) tool to identify significant clusters of patients associated with preterm birth. We identified shared protein interaction networks among preterm cases in two statistically significant clusters, p < 0.001. We also found two small control-dominated clusters. We replicated these data on an independent, large birth cohort. Separation testing showed significant similarity scores between the clusters from the two independent cohorts of patients. Canonical pathway analysis of the unique genes defining these clusters demonstrated enrichment in inflammatory signaling pathways, the glucocorticoid receptor, the insulin receptor, EGF and B-cell signaling. These results support a genetic architecture defined by subgroups of patients that share variants in genes in specific networks and pathways which are sufficient to give rise to the disease phenotype.
The metabolic processes of organisms are very complex. Each process is crucial and affects the growth, development, and reproduction of organisms. Metabolism-related mechanisms in Octopus ocellatus behaviors have not been widely studied. Brood-care is a common behavior in most organisms, which can improve the survival rate and constitution of larvae. Octopus ocellatus exhibits this behavior, but it has rarely been noted by researchers. In our study, 3,486 differentially expressed genes (DEGs) were identified based on transcriptome analysis of O. ocellatus. We identified metabolism-related DEGs using GO and KEGG enrichment analyses. Then, we constructed protein–protein interaction networks to explore the functional relationships between metabolism-related DEGs. Finally, we identified 10 hub genes related to multiple gene functions or involved in multiple signal pathways and verified them using quantitative real-time polymerase chain reaction (qRT-PCR). This is the first use of protein–protein interaction networks to study the effects of brood-care behavior on metabolism during the growth of O. ocellatus larvae, and the results provide valuable genetic resources for understanding the metabolic processes of invertebrate larvae. The data lay a foundation for further study of the brood-care behavior and metabolic mechanisms of invertebrates.
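One common way to pick hub genes from a protein–protein interaction network is to rank nodes by degree, that is, by their number of distinct interaction partners. The abstract does not state which criterion was used, so the sketch below is an assumption; the function `top_hubs` and the gene names are hypothetical placeholders.

```python
from collections import defaultdict


def top_hubs(edges, k=10):
    """Rank genes by number of distinct interaction partners (degree).

    Degree ranking is assumed here; the study's actual hub criterion is not given.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Sort by descending degree, breaking ties alphabetically for determinism.
    ranked = sorted(neighbors, key=lambda g: (-len(neighbors[g]), g))
    return [(g, len(neighbors[g])) for g in ranked[:k]]


# Toy interaction list (placeholder gene names, not data from the study).
edges = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]

hubs = top_hubs(edges, k=2)
print(hubs)
```

Other centrality measures, such as betweenness, could be substituted for degree without changing the overall shape of this step.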
Multispecies fisheries, particularly those that routinely adapt the timing, location, and methods of fishing to prioritize fishery targets, present a challenge to traditional single-species management approaches. Efforts to develop robust management for multispecies fisheries require an understanding of how priorities drive the network of interactions between catch of different species, especially given the added challenges presented by climate change. Using 35 years of landings data from a southern California recreational fishery, we leveraged empirical dynamic modelling methods to construct causal interaction networks among the main species targeted by the fishery. We found strong evidence for dependencies among species landings time series driven by apparent hierarchical catch preference within the fishery. In addition, by parsing the landings time series into anomalously cool, normal, and anomalously warm regimes (the last reflecting ocean temperatures anticipated by 2040), we found that network complexity was highest during warm periods. Our findings suggest that as ocean temperatures continue to rise, so too will the risk of unintended consequences from single-species management in this multispecies fishery.
Background: The COVID-19 pandemic poses an imminent threat to humanity, especially for those who have comorbidities. Evidence of COVID-19 and COPD comorbidities is accumulating. However, data revealing the molecular mechanism of COVID-19 and COPD comorbid diseases are limited. Methods: We obtained COVID-19- and COPD-related genes from different databases under restrictive screening conditions (top 500 for each disease) and then supplemented them with COVID-19- and COPD-associated genes (FDR < 0.05, |logFC| ≥ 1) from clinical sample data sets. By taking the intersection, 42 comorbid host factors for COVID-19 and COPD were finally obtained. On the basis of the shared host factors, we conducted a series of bioinformatics analyses, including protein-protein interaction analysis, gene ontology and pathway enrichment analysis, transcription factor-gene interaction network analysis, gene-microRNA co-regulatory network analysis, tissue-specific enrichment analysis and candidate drug prediction. Results: We revealed the comorbidity mechanism of COVID-19 and COPD from the perspective of host factor interaction, and obtained the top ten genes and three modules with different biological functions. Furthermore, we obtained the signaling pathways and concluded that dexamethasone, estradiol, progesterone, and nitric oxide are potentially effective interventions. Conclusion: This study revealed host factor interaction networks for COVID-19 and COPD, which could help confirm potential drugs for treating the comorbidity and ultimately enhance the management of the respiratory disease.
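The gene selection described in the Methods (database genes supplemented with differentially expressed genes passing FDR < 0.05 and |logFC| ≥ 1, followed by an intersection across the two diseases) can be sketched as follows. This is a minimal illustration under stated assumptions: the function `significant_genes`, the tuple layout `(gene, fdr, logfc)`, and all gene identifiers and values are hypothetical, not data from the study.

```python
def significant_genes(rows, fdr_max=0.05, min_abs_logfc=1.0):
    """Keep genes with FDR < fdr_max and |logFC| >= min_abs_logfc.

    `rows` is an iterable of (gene, fdr, logfc) tuples (assumed layout).
    """
    return {
        gene
        for gene, fdr, logfc in rows
        if fdr < fdr_max and abs(logfc) >= min_abs_logfc
    }


# Placeholder database gene sets (stand-ins for the "top 500" lists).
covid_db = {"G1", "G2", "G3"}
copd_db = {"G2", "G3", "G4"}

# Placeholder differential expression results: (gene, FDR, logFC).
covid_de = [("G5", 0.01, 2.3), ("G4", 0.20, 1.5), ("G6", 0.03, -1.2)]
copd_de = [("G5", 0.04, 1.1), ("G6", 0.001, -0.4)]

# Supplement each database list with its significant DE genes.
covid_genes = covid_db | significant_genes(covid_de)
copd_genes = copd_db | significant_genes(copd_de)

# Shared host factors are the genes present in both disease sets.
shared = covid_genes & copd_genes
print(sorted(shared))
```

Here G4 is excluded from the COVID-19 set because its FDR is too high, and G6 is excluded from the COPD set because its fold change is too small, so neither reaches the intersection.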