functional annotations
Recently Published Documents


TOTAL DOCUMENTS

363
(FIVE YEARS 201)

H-INDEX

30
(FIVE YEARS 10)

2022 ◽  
Author(s):  
Wenmin Zhang ◽  
Hamed Najafabadi ◽  
Yue Li

Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct genome-wide fine-mapping. Our method has two major innovations: first, by creating a sparse low-dimensional projection of the high-dimensional genotype data, we enable a linear search of causal variants instead of the combinatorial search of causal configurations used in most existing methods; second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We show that, compared to other methods, the causal variants identified by SparsePro are highly enriched for expression quantitative trait loci and explain a larger proportion of trait heritability. We also identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs, and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs. SparsePro software is available on GitHub at https://github.com/zhwm/SparsePro.


2021 ◽  
Author(s):  
Georgie Stephan ◽  
Benjamin Dugdale ◽  
Pradeep Deo ◽  
Rob Harding ◽  
James Dale ◽  
...  

Background: Functional annotation assigns descriptive biological meaning to genetic sequences. Limited availability of manually curated or experimentally validated plant genes from a diverse range of taxa poses a significant challenge for functional annotation in non-model organisms. Accurate computational approaches are required. We argue that recent breakthroughs in deep learning have the potential to not only narrow the functional annotation gap between non-model and model plant organisms, but also annotate and reveal novel functions even for genes with no homologs in public databases. Results: Deep learning models were applied to functionally annotate a set of previously published differentially expressed genes. Predicted protein structures and functional annotations were generated using the AlphaFold protein structure and DeepFRI protein language inference models respectively. The resulting structures and functional annotations were validated using small molecule docking experiments. DeepFRI and AlphaFold models not only correctly annotated differentially expressed genes, but also revealed detailed mechanisms involving protein-protein interactions. Conclusions: Deep learning models are capable of inferring novel functions and achieving high accuracy in functional annotation. Their increased use in plant research will result in major improvements in annotations for non-model plants that are underrepresented in genome databases. We illustrate how integrating protein structure prediction, functional residue prediction, and small molecule docking can infer plausible protein-protein interactions and yield additional mechanistic insights. This approach will aid in the selection of candidate genes for further study from differential expression studies that generate large gene lists.


2021 ◽  
Vol 12 ◽  
Author(s):  
Tertius Alwyn Ras ◽  
Erick Strauss ◽  
Annelise Botes

Mycoplasmas are responsible for a wide range of disease states in both humans and animals, in which their parasitic lifestyle has allowed them to reduce their genome sizes and curtail their biosynthetic capabilities. The subsequent dependence on their host offers a unique opportunity to explore pathways for obtaining and producing cofactors – such as coenzyme A (CoA) – as possible targets for the development of new anti-mycoplasma agents. CoA plays an essential role in energy and fatty acid metabolism and is required for membrane synthesis. However, our current lack of knowledge of the relevance and importance of the CoA biosynthesis pathway in mycoplasmas, and whether it could be bypassed within their pathogenic context, prevents further exploration of the potential of this pathway. In the universal, canonical CoA biosynthesis pathway, five enzymes are responsible for the production of CoA. Given the inconsistent presence of the genes that code for these enzymes across Mycoplasma genomes, this study set out to establish the genetic capacity of mycoplasmas to synthesize their own CoA de novo. Existing functional annotations and sequence, family, motif, and domain analysis of protein products were used to determine the existence of relevant genes in Mycoplasma genomes. We found that most Mycoplasma species do have the genetic capacity to synthesize CoA, but there was a differentiated prevalence of these genes across species. Phylogenetic analysis indicated that the phylogenetic position of a species could not be used to predict its enzyme-encoding gene combinations. Despite this, the final enzyme in the biosynthesis pathway – dephospho-coenzyme A kinase (DPCK) – was found to be the most common among the studied species, suggesting that it has the most potential as a target in the search for new broad-spectrum anti-mycoplasma agents.


2021 ◽  
Vol 8 ◽  
Author(s):  
Abdelmounim Essabbar ◽  
Souad Kartti ◽  
Tarek Alouane ◽  
Mohammed Hakmi ◽  
Lahcen Belyamani ◽  
...  

Ending the COVID-19 pandemic requires a collaborative understanding of SARS-CoV-2 and COVID-19 mechanisms. Yet, the evolving nature of coronaviruses results in a continuous emergence of new variants of the virus. Central to this is the need for a continuous monitoring system able to detect potentially harmful variants of the virus in real time. In this manuscript, we present the International Database of SARS-CoV-2 Variations (IDbSV), the result of ongoing efforts in curating, analyzing, and sharing comprehensive interpretation of SARS-CoV-2's genetic variations and variants. Through user-friendly interactive data visualizations, we aim to provide a novel surveillance tool to the scientific and public health communities. The database is regularly updated with new records through a 4-step workflow: (1) Quality control of curated sequences, (2) Call of variations, (3) Functional annotation, and (4) Metadata association. To the best of our knowledge, IDbSV provides access to the largest repository of SARS-CoV-2 variations and the largest analysis of SARS-CoV-2 genomes, with over 60 thousand annotated variations curated from 1,808,613 genomes alongside their functional annotations, first known appearance, and associated genetic lineages, enabling a robust interpretation tool for SARS-CoV-2 variations to help understand SARS-CoV-2 dynamics across the world.


Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1938
Author(s):  
Andrey V. Khrunin ◽  
Gennady V. Khvorykh ◽  
Alexandra V. Rozhkova ◽  
Evgeniya A. Koltsova ◽  
Elizaveta A. Petrova ◽  
...  

Although there has been great progress in understanding the genetic bases of ischemic stroke (IS), many of its aspects remain underexplored. These include the genetics of outcomes, as well as problems with the identification of real causative loci and their functional annotations. Therefore, analysis of the results obtained from animal models of brain ischemia could be helpful. We have developed a bioinformatic approach exploring single nucleotide polymorphisms (SNPs) in human orthologues of rat genes expressed differentially under conditions of induced brain ischemia. Using this approach, we identified and analyzed nine SNPs in 553 Russian individuals (331 patients with IS and 222 controls). We explored the association of SNPs with both IS outcomes and with the risk of IS. SNP rs66782529 (LGALS3) was associated with negative IS outcomes (p = 0.048). SNPs rs62278647 and rs2316710 (PTX3) were associated significantly with IS (p = 0.000029 and p = 0.0025, respectively). These correlations for rs62278647 and rs2316710 were found only in women, which suggests a sex-specific association of the PTX3 polymorphism. Thus, this research not only reveals some new genetic associations with IS and its outcomes but also shows how exploring variations in genes from a rat model of brain ischemia can be of use in searching for human genetic markers of this disorder.


2021 ◽  
Vol 9 (4) ◽  
pp. 111-122
Author(s):  
Yan Luo ◽  
Si-ting Gao ◽  
Jun-xiong Cheng ◽  
Wei-jian Xiong ◽  
Wen-Fu Cao

Lianhuaqingwen (LH) is widely used in the treatment of coronavirus disease 2019 (COVID-19). However, its mechanisms of action and molecular targets for the treatment of COVID-19 are not clear. The active compounds of LH were collected and their targets were identified through network pharmacology. The multi-component, multi-target mechanism of the compound and its relationship with the disease were analyzed. COVID-19 targets were obtained by analyzing with TCMSP. In total, 282 active ingredients and 510 targets of LH were identified. Twenty-one target genes were associated with both LH and COVID-19. Protein-protein interaction (PPI) data were then obtained, and PPI networks of LH putative targets and COVID-19-related targets were visualized and merged to identify the candidate targets for LH against COVID-19. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analyses were carried out. The gene-pathway network was constructed to screen the crucial target genes. The functional annotations of target genes were found to be related to immune regulation, host defense, inflammatory reaction, and autoimmune diseases, among others. Twenty pathways including immunology, cancer, and cell processing were significantly enriched. Quercetin and luteolin might be the crucial ingredients. IL6 was the core gene, and several other genes including IL1B, STAT1, IFNGR1, and NCF1 were the key genes in the gene-pathway network of LH for treatment of COVID-19. The results indicated that LH’s effects against COVID-19 might relate to regulation of immunological function through the specific biological processes and the related pathways. This study demonstrates the application of network pharmacology in evaluating mechanisms of action and molecular targets of complex herbal formulations.


2021 ◽  
Vol 12 ◽  
Author(s):  
Xiaozhen Zhao ◽  
Kunjiang Yu ◽  
Chengke Pang ◽  
Xu Wu ◽  
Rui Shi ◽  
...  

As an important physiological and reproductive organ, the silique is a determining factor of seed yield and a breeding target trait in rapeseed (Brassica napus L.). Genetic studies of silique-related traits are helpful for rapeseed marker-assisted high-yield breeding. In this study, a recombinant inbred population containing 189 lines was used to perform a quantitative trait loci (QTLs) analysis for five silique-related traits in seven different environments. As a result, 120 consensus QTLs related to five silique-related traits were identified, including 23 for silique length, 25 for silique breadth, 29 for silique thickness, 22 for seed number per silique and 21 for silique volume, which covered all the chromosomes, except C5. Among them, 13 consensus QTLs, one, five, two, four and one for silique length, silique breadth, silique thickness, seed number per silique and silique volume, respectively, were repeatedly detected in multiple environments and explained 4.38–13.0% of the phenotypic variation. On the basis of the functional annotations of Arabidopsis homologous genes and previously reported silique-related genes, 12 potential candidate genes underlying these 13 QTLs were screened and found to be stable in multiple environments by analyzing the re-sequencing results of the two parental lines. These findings provide new insights into the gene networks affecting silique-related traits at the QTL level in rapeseed.


Cells ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 3238
Author(s):  
Giovanna Morello ◽  
Ambra Villari ◽  
Antonio Gianmaria Spampinato ◽  
Valentina La Cognata ◽  
Maria Guarnaccia ◽  
...  

Neuronal apoptosis and survival are regulated at the transcriptional level. To identify key genes and upstream regulators primarily responsible for these processes, we overlaid the temporal transcriptome of cerebellar granule neurons following induction of apoptosis and their rescue by three different neurotrophic factors. We identified a core set of 175 genes showing opposite expression trends at the intersection of apoptosis and survival. Their functional annotations and expression signatures significantly correlated with neurological, psychiatric and oncological disorders. Transcription regulatory network analysis revealed the action of nine upstream transcription factors, converging pro-apoptosis and pro-survival-inducing signals in a highly interconnected functionally and temporally ordered manner. Five of these transcription factors are potential drug targets. Transcriptome-based computational drug repurposing produced a list of drug candidates that may revert the apoptotic core set signature. Besides elucidating early drivers of neuronal apoptosis and survival, our systems biology-based perspective paves the way to innovative pharmacology focused on upstream targets and regulatory networks.


2021 ◽  
Vol 22 (22) ◽  
pp. 12462
Author(s):  
Neha Kaushik ◽  
Soumya Rastogi ◽  
Sonia Verma ◽  
Deepak Pandey ◽  
Ashutosh Halder ◽  
...  

Insulin/IGF-1-like signaling (IIS) plays a crucial, conserved role in development, growth, reproduction, stress tolerance, and longevity. In Caenorhabditis elegans, the enhanced longevity under reduced insulin signaling (rIIS) is primarily regulated by the transcription factors (TFs) DAF-16/FOXO, SKN-1/Nrf-1, and HSF1/HSF-1. The specific and coordinated regulation of gene expression by these TFs under rIIS has not been comprehensively elucidated. Here, using RNA-sequencing analysis, we report a systematic study of the complexity of TF-dependent target gene interactions during rIIS under analogous genetic and experimental conditions. We found that DAF-16 regulates only a fraction of the C. elegans transcriptome but controls a large set of genes under rIIS; SKN-1 and HSF-1 show the opposite trend. Both of the latter TFs function as activators and repressors to a similar extent, while DAF-16 is predominantly an activator. For expression of the genes commonly regulated by TFs under rIIS conditions, DAF-16 is the principal determining factor, dominating over the other two TFs, irrespective of whether they activate or repress these genes. The functional annotations and regulatory networks presented in this study provide novel insights into the complexity of the gene regulatory networks downstream of the IIS pathway that controls diverse phenotypes, including longevity.


2021 ◽  
Author(s):  
Javier Pardo-Diaz ◽  
Philip Poole ◽  
Mariano Beguerisse-Diaz ◽  
Charlotte Deane ◽  
Gesine Reinert

Even within well-studied organisms, many genes lack useful functional annotations. One way to generate such functional information is to infer biological relationships between genes or proteins, using a network of gene coexpression data that includes functional annotations. Signed distance correlation has proved useful for the construction of unweighted gene coexpression networks. However, transforming correlation values into unweighted networks may lead to a loss of important biological information related to the intensity of the correlation. Here we introduce a principled method to construct weighted gene coexpression networks using signed distance correlation. These networks contain weighted edges only between those pairs of genes whose correlation value is higher than a given threshold. We analyse data from different organisms and find that networks generated with our method based on signed distance correlation are more stable and capture more biological information compared to networks obtained from Pearson correlation. Moreover, we show that signed distance correlation networks capture more biological information than unweighted networks based on the same metric. While we use biological data sets to illustrate the method, the approach is general and can be used to construct networks in other domains.
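
As a rough illustration of the thresholded weighted network described in this abstract, the following minimal sketch may help. It is not the authors' implementation. It assumes that signed distance correlation is the sample distance correlation multiplied by the sign of the Pearson correlation, and it takes the threshold as a user-supplied value rather than choosing it in any principled way. All function names are hypothetical.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise |x_i - x_j| matrix with row, column, and grand means removed."""
    d = np.abs(x[:, None] - x[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples (value in [0, 1])."""
    a = _double_centered_distances(np.asarray(x, dtype=float))
    b = _double_centered_distances(np.asarray(y, dtype=float))
    dcov2 = (a * b).mean()          # squared sample distance covariance
    dvar2_x = (a * a).mean()        # squared sample distance variance of x
    dvar2_y = (b * b).mean()        # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0:                  # a constant profile carries no signal
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def signed_distance_correlation(x, y):
    """Assumed definition: distance correlation times the sign of Pearson's r."""
    dcor = distance_correlation(x, y)
    if dcor == 0.0:
        return 0.0
    r = np.corrcoef(x, y)[0, 1]
    return float(np.sign(r) * dcor)


def weighted_coexpression_network(expr, threshold):
    """Symmetric weight matrix for a genes-by-samples expression array.

    An edge is kept, weighted by the signed distance correlation, only
    when that value is higher than `threshold`; all other entries are 0.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[0]
    weights = np.zeros((n_genes, n_genes))
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            s = signed_distance_correlation(expr[i], expr[j])
            if s > threshold:
                weights[i, j] = weights[j, i] = s
    return weights
```

In this sketch, calling `weighted_coexpression_network(expr, threshold=0.5)` on a small genes-by-samples array returns a matrix whose nonzero entries are the retained edge weights. Unlike an unweighted network, the retained edges keep the intensity of the correlation.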

