PSGfinder: fast identification of genes under divergent positive selection using the dynamic windows method

2017 ◽  
Author(s):  
Joël Tuberosa ◽  
Juan I. Montoya-Burgos

Abstract Summary: Orthologous genes evolving under divergent positive selection are those involved in divergent adaptive trajectories between related species. Current methods to identify such genes are complex and conservative or present some imperfections, limiting genome-wide searches. We present a simple method, Dynamic Windows (DWin), to detect regions of protein-coding genes evolving under divergent positive selection. This method is implemented in PSGfinder, a user-friendly and flexible program that allows rapid genome-wide screening of regions with a dN/dS > 1. PSGfinder additionally includes an alignment cleaning procedure and an adapted multiple comparison correction to identify significant signals of positive selection. Availability and Implementation: PSGfinder implements the DWin method, is written in Python, and is freely available with its documentation at https://genev.unige.ch/research/laboratory/Juan-Montoya or at https://github.com/joel-tuberosa/

2017 ◽  
Author(s):  
Morgan N. Price ◽  
Adam P. Arkin

Abstract Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources that link protein sequences to scientific articles (Swiss-Prot, GeneRIF, and EcoCyc). PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/.


2017 ◽  
Author(s):  
Cristina Cruz ◽  
Monica Della Rosa ◽  
Christel Krueger ◽  
Qian Gao ◽  
Lucy Field ◽  
...  

Abstract Transcription of protein coding genes is accompanied by recruitment of COMPASS to promoter-proximal chromatin, which deposits di- and tri-methylation on histone H3 lysine 4 (H3K4) to form H3K4me2 and H3K4me3. Here we determine the importance of COMPASS in maintaining gene expression across lifespan in budding yeast. We find that COMPASS mutations dramatically reduce replicative lifespan and cause widespread gene expression defects. Known repressive functions of H3K4me2 are progressively lost with age, while hundreds of genes become dependent on H3K4me3 for full expression. Induction of these H3K4me3 dependent genes is also impacted in young cells lacking COMPASS components including the H3K4me3-specific factor Spp1. Remarkably, the genome-wide occurrence of H3K4me3 is progressively reduced with age despite widespread transcriptional induction, minimising the normal positive correlation between promoter H3K4me3 and gene expression. Our results provide clear evidence that H3K4me3 is required to attain normal expression levels of many genes across organismal lifespan.


2019 ◽  
Vol 9 (12) ◽  
pp. 6821-6832 ◽  
Author(s):  
Jacob Njaramba Ngatia ◽  
Tian Ming Lan ◽  
Thi Dao Dinh ◽  
Le Zhang ◽  
Ahmed Khalid Ahmed ◽  
...  

2020 ◽  
Vol 49 (D1) ◽  
pp. D962-D968 ◽  
Author(s):  
Zhao Li ◽  
Lin Liu ◽  
Shuai Jiang ◽  
Qianpeng Li ◽  
Changrui Feng ◽  
...  

Abstract Expression profiles of long non-coding RNAs (lncRNAs) across diverse biological conditions provide significant insights into their biological functions, interacting targets as well as transcriptional reliability. However, a comprehensive resource that systematically characterizes the expression landscape of human lncRNAs by integrating their expression profiles across a wide range of biological conditions is lacking. Here, we present LncExpDB (https://bigd.big.ac.cn/lncexpdb), an expression database of human lncRNAs that is devoted to providing comprehensive expression profiles of lncRNA genes, exploring their expression features and capacities, identifying featured genes with potentially important functions, and building interactions with protein-coding genes across various biological contexts/conditions. Based on comprehensive integration and stringent curation, LncExpDB currently houses expression profiles of 101 293 high-quality human lncRNA genes derived from 1977 samples of 337 biological conditions across nine biological contexts. Consequently, LncExpDB estimates lncRNA genes’ expression reliability and capacities, identifies 25 191 featured genes, and further obtains 28 443 865 lncRNA-mRNA interactions. Moreover, user-friendly web interfaces enable interactive visualization of expression profiles across various conditions and easy exploration of featured lncRNAs and their interacting partners in specific contexts. Collectively, LncExpDB features comprehensive integration and curation of lncRNA expression profiles and thus will serve as a fundamental resource for functional studies on human lncRNAs.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 464 ◽  
Author(s):  
Leos G. Kral ◽  
Sara Watson

Background: Mitochondrial DNA of vertebrates contains genes for 13 proteins involved in oxidative phosphorylation. Some of these genes have been shown to undergo adaptive evolution in a variety of species. This study examines all mitochondrial protein coding genes in 11 darter species to determine if any of these genes show evidence of positive selection. Methods: The mitogenomes of four darter species were sequenced and annotated. Mitogenome sequences for another seven species were obtained from GenBank. Alignments of each of the protein coding genes were subjected to codon-based identification of positive selection by Selecton, MEME and FEL. Results: Evidence of positive selection was obtained for six of the genes by at least one of the methods. CYTB was identified as having evolved under positive selection by all three methods at the same codon location. Conclusions: Given the evidence for positive selection of mitochondrial protein coding genes in darters, a more extensive analysis of mitochondrial gene evolution in all the extant darter species is warranted.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Yibo Dong ◽  
Shichao Chen ◽  
Shifeng Cheng ◽  
Wenbin Zhou ◽  
Qing Ma ◽  
...  

Although geographic isolation is a leading driver of speciation, the tempo and pattern of divergence at the genomic level remain unclear. We examine genome-wide divergence of putatively single-copy orthologous genes (POGs) in 20 allopatric species/variety pairs from diverse angiosperm clades, with 16 pairs reflecting the classic eastern Asia-eastern North America floristic disjunction. In each pair, >90% of POGs are under purifying selection, and <10% are under positive selection. A set of POGs are under strong positive selection, 14 of which are shared by 10–15 pairs, and one shared by all pairs; 15 POGs are annotated to biological processes responding to various stimuli. The relative abundance of POGs under different selective forces exhibits a repeated pattern among pairs despite an ~10 million-year difference in divergence time. Species divergence times are positively correlated with abundance of POGs under moderate purifying selection, but negatively correlated with abundance of POGs under strong purifying selection.


2020 ◽  
Author(s):  
Yura Kim ◽  
Mariam Naghavi ◽  
Ying-Tao Zhao

Abstract The human genome contains more than 4000 genes that are longer than 100 kb. These long genes require more time and resources to make a transcript than shorter genes do. Long genes have also been linked to various human diseases. Specific mechanisms are utilized by long genes to facilitate their transcription and co-transcriptional processes. This results in unique features in their multi-omics profiles. Although these unique profiles are important to understand long genes, a database that provides an integrated view and easy access to the multi-omics profiles of long genes does not exist. We leveraged the publicly accessible multi-omics data and systematically analyzed the genomic conservation, histone modifications, chromatin organization, tissue-specific transcriptome, and single cell transcriptome of 992 protein-coding genes that are longer than 200 kb in the mouse genome. We also examined the evolutionary history of their gene lengths in 15 species that belong to six Classes and 11 Orders. To share the multi-omics profiles of long genes, we developed a user-friendly and easy-to-use database, LongGeneDB (https://longgenedb.com), for users to search, browse, and download these profiles. LongGeneDB will be a useful data hub for the biomedical research community to understand long genes.


2021 ◽  
Author(s):  
Marc-André Legault ◽  
Louis-Philippe Lemieux Perreault ◽  
Marie-Pierre Dubé

Structured Abstract Motivation: The relationship between protein coding genes and phenotypes has the potential to inform on the underlying molecular function in disease etiology. We conducted a phenome-wide association study (pheWAS) of protein coding genes using a principal components analysis-based approach in the UK Biobank. Results: We tested the association between 19,114 protein coding gene regions and 1,210 phenotypes including anthropometric measurements, laboratory biomarkers, cancer registry data, hospitalization and death record codes and algorithmically-defined cardiovascular outcomes. We report the pheWAS results in a user-friendly web-based browser. Taking atrial fibrillation, a common cardiac arrhythmia, as an example, ExPheWas identified genes that are known drug targets for the treatment of arrhythmias and genes involved in biological processes implicated in cardiac muscle function. We also identified MYOT as a possible atrial fibrillation gene. Availability and implementation: The ExPheWas browser and API are available at http://exphewas.statgen.org/


2019 ◽  
Vol 47 (W1) ◽  
pp. W106-W113 ◽  
Author(s):  
Jana Marie Schwarz ◽  
Daniela Hombach ◽  
Sebastian Köhler ◽  
David N Cooper ◽  
Markus Schuelke ◽  
...  

Abstract RegulationSpotter is a web-based tool for the user-friendly annotation and interpretation of DNA variants located outside of protein-coding transcripts (extratranscriptic variants). It is designed for clinicians and researchers who wish to assess the potential impact of the considerable number of non-coding variants found in Whole Genome Sequencing runs. It annotates individual variants with underlying regulatory features in an intuitive way by assessing over 100 genome-wide annotations. Additionally, it calculates a score, which reflects the regulatory potential of the variant region. Its dichotomous classifications, ‘functional’ or ‘non-functional’, and a human-readable presentation of the underlying evidence allow a biologically meaningful interpretation of the score. The output shows key aspects of every variant and allows rapid access to more detailed information about its possible role in gene regulation. RegulationSpotter can either analyse single variants or complete VCF files. Variants located within protein-coding transcripts are automatically assessed by MutationTaster as well as by RegulationSpotter to account for possible intragenic regulatory effects. RegulationSpotter offers the possibility of using phenotypic data to focus on known disease genes or genomic elements interacting with them. RegulationSpotter is freely available at https://www.regulationspotter.org.

