RegulationSpotter: annotation and interpretation of extratranscriptic DNA variants

2019 ◽  
Vol 47 (W1) ◽  
pp. W106-W113 ◽  
Author(s):  
Jana Marie Schwarz ◽  
Daniela Hombach ◽  
Sebastian Köhler ◽  
David N Cooper ◽  
Markus Schuelke ◽  
...  

Abstract RegulationSpotter is a web-based tool for the user-friendly annotation and interpretation of DNA variants located outside of protein-coding transcripts (extratranscriptic variants). It is designed for clinicians and researchers who wish to assess the potential impact of the considerable number of non-coding variants found in Whole Genome Sequencing runs. It annotates individual variants with underlying regulatory features in an intuitive way by assessing over 100 genome-wide annotations. Additionally, it calculates a score, which reflects the regulatory potential of the variant region. Its dichotomous classifications, ‘functional’ or ‘non-functional’, and a human-readable presentation of the underlying evidence allow a biologically meaningful interpretation of the score. The output shows key aspects of every variant and allows rapid access to more detailed information about its possible role in gene regulation. RegulationSpotter can either analyse single variants or complete VCF files. Variants located within protein-coding transcripts are automatically assessed by MutationTaster as well as by RegulationSpotter to account for possible intragenic regulatory effects. RegulationSpotter offers the possibility of using phenotypic data to focus on known disease genes or genomic elements interacting with them. RegulationSpotter is freely available at https://www.regulationspotter.org.

Author(s):  
Naveen K. Bansal ◽  
Mehdi Maadooliat ◽  
Steven J. Schrodi

Abstract We consider a multiple hypotheses problem with directional alternatives in a decision theoretic framework. We obtain an empirical Bayes rule subject to a constraint on mixed directional false discovery rate (mdFDR≤α) under the semiparametric setting where the distribution of the test statistic is parametric, but the prior distribution is nonparametric. We propose separate priors for the left tail and right tail alternatives, as may be required in many applications. The proposed Bayes rule is compared through simulation against rules proposed by Benjamini and Yekutieli and by Efron. We illustrate the proposed methodology on two sets of data from biological experiments: HIV-transfected cell-line mRNA expression data, and a quantitative trait genome-wide SNP data set. We have developed a user-friendly web-based Shiny app for the proposed method, which is available at https://npseb.shinyapps.io/npseb/. The HIV and SNP data can be directly accessed, and the analyses presented in this paper can be rerun.


2021 ◽  
Author(s):  
Marc-André Legault ◽  
Louis-Philippe Lemieux Perreault ◽  
Marie-Pierre Dubé

Structured Abstract Motivation The relationship between protein coding genes and phenotypes has the potential to inform on the underlying molecular function in disease etiology. We conducted a phenome-wide association study (pheWAS) of protein coding genes using a principal components analysis-based approach in the UK Biobank. Results We tested the association between 19,114 protein coding gene regions and 1,210 phenotypes including anthropometric measurements, laboratory biomarkers, cancer registry data, hospitalization and death record codes and algorithmically-defined cardiovascular outcomes. We report the pheWAS results in a user-friendly web-based browser. Taking atrial fibrillation, a common cardiac arrhythmia, as an example, ExPheWas identified genes that are known drug targets for the treatment of arrhythmias and genes involved in biological processes implicated in cardiac muscle function. We also identified MYOT as a possible atrial fibrillation gene. Availability and implementation The ExPheWas browser and API are available at http://exphewas.statgen.org/.


2017 ◽  
Vol 137 (12) ◽  
pp. 2544-2551 ◽  
Author(s):  
Hong Liu ◽  
Zhenzhen Wang ◽  
Yi Li ◽  
Gongqi Yu ◽  
Xi’an Fu ◽  
...  

2014 ◽  
Vol 18 (1) ◽  
pp. 86-91 ◽  
Author(s):  
Aniket Mishra ◽  
Stuart Macgregor

Gene-based tests such as the versatile gene-based association study (VEGAS) are commonly used following per-single nucleotide polymorphism (SNP) analysis in genome-wide association studies (GWAS). Two limitations of VEGAS were that the HapMap2 reference set was used to model the correlation between SNPs and that only autosomal genes were considered. HapMap2 has now been superseded by the 1,000 Genomes reference set, and whereas early GWASs frequently ignored the X chromosome, it is now commonly included. Here we have developed VEGAS2, an extension that uses 1,000 Genomes data to model SNP correlations across the autosomes and chromosome X. VEGAS2 allows greater flexibility when defining gene boundaries. VEGAS2 offers both a user-friendly, web-based front end and a command line Linux version. The online version of VEGAS2 can be accessed through https://vegas2.qimrberghofer.edu.au/. The command line version can be downloaded from https://vegas2.qimrberghofer.edu.au/zVEGAS2offline.tgz. The command line version is developed in Perl, R and shell scripting languages; source code is available for further development.


2016 ◽  
Author(s):  
Valentina Iotchkova ◽  
Graham R.S. Ritchie ◽  
Matthias Geihs ◽  
Sandro Morganella ◽  
Josine L. Min ◽  
...  

Loci discovered by genome-wide association studies (GWAS) predominantly map outside protein-coding genes. The interpretation of functional consequences of non-coding variants can be greatly enhanced by catalogs of regulatory genomic regions in cell lines and primary tissues. However, robust and readily applicable methods are still lacking to systematically evaluate the contribution of these regions to genetic variation implicated in diseases or quantitative traits. Here we propose a novel approach that leverages GWAS findings with regulatory or functional annotations to classify features relevant to a phenotype of interest. Within our framework, we account for major sources of confounding that current methods do not address. We further assess enrichment statistics for 27 GWAS traits within regulatory regions from the ENCODE and Roadmap projects. We characterise unique enrichment patterns for traits and annotations, driving novel biological insights. The method is implemented as standalone software and an R package to facilitate its application by the research community.


2018 ◽  
Author(s):  
Philippe Henry

Abstract Cannabis can elicit various reactions in different consumers. In order to shed light on the mechanisms underlying the human-cannabis relationship, we begin to investigate the genetic basis of this differential response. The web-based platform OpenSNP was used to collect self-reported genetic and phenotypic data. Participants reported either a positive or a negative affinity to cannabis. A total of 26 individuals were retained, 10 of whom indicated several negative responses and the remaining 16 a strong affinity for cannabis. A total of 325,895 single nucleotide polymorphisms (SNPs) were retained. The software TASSEL 5 was used to run a genome-wide association study (GWAS) with a generalized linear model (GLM) and 1000 permutations. The analysis yielded a set of 45 SNPs that were significantly associated with the reported affinity to cannabis, including one strong outlier found in the MYO16 gene. A diagnostic process is proposed by which individuals can be assessed for their affinity to cannabis. We believe this type of tool may be helpful in alleviating some of the stigma associated with cannabis use in individuals sensitive to THC and other cannabis constituents such as myrcene, which may potentiate negative responses.
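To illustrate what a permutation-based association test on a single SNP can look like, here is a minimal sketch. It is not TASSEL's GLM: it swaps in a simpler statistic (the absolute difference in mean genotype between the two response groups), and the function names, the 0/1/2 genotype coding, and the toy data are assumptions made for illustration only.

```python
import random


def mean_difference(genotypes, labels):
    """Absolute difference in mean genotype between the two label groups."""
    positive = [g for g, is_pos in zip(genotypes, labels) if is_pos]
    negative = [g for g, is_pos in zip(genotypes, labels) if not is_pos]
    return abs(sum(positive) / len(positive) - sum(negative) / len(negative))


def permutation_p_value(genotypes, labels, n_permutations=1000, seed=0):
    """Empirical p-value for one SNP obtained by shuffling phenotype labels.

    genotypes: allele counts per individual (assumed coding: 0, 1 or 2)
    labels:    True for positive affinity, False for negative affinity
    """
    rng = random.Random(seed)
    observed = mean_difference(genotypes, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        # Shuffling keeps the group sizes fixed and breaks any real association.
        rng.shuffle(shuffled)
        if mean_difference(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Toy data shaped like the study (10 negative, 16 positive responders).
toy_genotypes = [2] * 10 + [0] * 16
toy_labels = [False] * 10 + [True] * 16
print(permutation_p_value(toy_genotypes, toy_labels))
```

With only 26 individuals and several hundred thousand SNPs, per-SNP p-values of this kind would still need correction for multiple testing before any SNP is called significant.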


JAMIA Open ◽  
2021 ◽  
Vol 4 (3) ◽  
Author(s):  
Elias DeVoe ◽  
Gavin R Oliver ◽  
Roman Zenka ◽  
Patrick R Blackburn ◽  
Margot A Cousin ◽  
...  

Abstract Motivation Genomic data are prevalent, leading to frequent encounters with uninterpreted variants or mutations with unknown mechanisms of effect. Researchers must manually aggregate data from multiple sources and across related proteins, mentally translating effects between the genome and proteome, to attempt to understand mechanisms. Materials and methods P2T2 presents diverse data and annotation types in a unified protein-centric view, facilitating the interpretation of coding variants and hypothesis generation. Information from primary sequence, domain, motif, and structural levels is presented and also organized into the first Paralog Annotation Analysis across the human proteome. Results Our tool assists research efforts to interpret genomic variation by aggregating diverse, relevant, and proteome-wide information into a unified interactive web-based interface. Additionally, we provide a REST API enabling automated data queries, or repurposing data for other studies. Conclusion The unified protein-centric interface presented in P2T2 will help researchers interpret novel variants identified through next-generation sequencing. Code and server link available at github.com/GenomicInterpretation/p2t2.


PLoS Genetics ◽  
2021 ◽  
Vol 17 (1) ◽  
pp. e1008761
Author(s):  
Laura Natalia Balarezo-Cisneros ◽  
Steven Parker ◽  
Marcin G. Fraczek ◽  
Soukaina Timouma ◽  
Ping Wang ◽  
...  

Non-coding RNAs (ncRNAs), including the more recently identified Stable Unannotated Transcripts (SUTs) and Cryptic Unstable Transcripts (CUTs), are increasingly being shown to play pivotal roles in the transcriptional and post-transcriptional regulation of genes in eukaryotes. Here, we carried out a large-scale screening of ncRNAs in Saccharomyces cerevisiae, and provide evidence for SUT and CUT function. Phenotypic data on 372 ncRNA deletion strains in 23 different growth conditions were collected, identifying ncRNAs responsible for significant cellular fitness changes. Transcriptome profiles were assembled for 18 haploid ncRNA deletion mutants and 2 essential ncRNA heterozygous deletants. Guided by the resulting RNA-seq data we analysed the genome-wide dysregulation of protein coding genes and non-coding transcripts. Novel functional ncRNAs, SUT125, SUT126, SUT035 and SUT532 that act in trans by modulating transcription factors were identified. Furthermore, we described the impact of SUTs and CUTs in modulating coding gene expression in response to different environmental conditions, regulating important biological processes such as respiration (SUT125, SUT126, SUT035, SUT432), steroid biosynthesis (CUT494, SUT053, SUT468) or rRNA processing (SUT075 and snR30). Overall, these data capture and integrate the regulatory and phenotypic network of ncRNAs and protein-coding genes, providing genome-wide evidence of the impact of ncRNAs on cellular homeostasis.


2021 ◽  
Author(s):  
Philipp Schönnenbeck ◽  
Tilman Schell ◽  
Susanne Gerber ◽  
Markus Pfenninger

Abstract Motivation The question of determining whether a Single-Nucleotide Polymorphism (SNP) or a variant in general leads to a change in the amino acid sequence of a protein coding gene is often a laborious and time-consuming challenge. Here, we introduce the tbg file format for storing genomic data and tbg-tools, a user-friendly toolbox for the faster analysis of SNPs. The file format stores information for each nucleotide in each gene, making it possible to predict which change in the amino acid sequence will be caused by a variant in the nucleotide sequence. Our new tool therefore has the potential to make biological sense of the unprecedented amount of genome-wide genetic variation that researchers currently face. Results The new tab-separated file for storing the nucleotide data can be easily analyzed and used for a wide variety of biological research. It is also possible to automate some of these analyses using the additional analysis tools from tbg-tools. Availability tbg-tools is written in Python and can be installed from the command line. It can be found at https://github.com/Croxa/[email protected]
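The underlying idea, predicting the amino acid change caused by a single-nucleotide variant in a coding sequence, can be sketched in a few lines of Python. This is not tbg-tools code and does not read the tbg format: the function name `snp_effect`, its 0-based position argument, and the assumption that the input is an in-frame forward-strand coding sequence are all hypothetical choices for illustration. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def snp_effect(cds, pos, alt):
    """Classify a single-base substitution in a coding sequence.

    cds: in-frame coding sequence on the forward strand (assumption)
    pos: 0-based index of the substituted base within cds
    alt: the alternate base
    Returns (reference amino acid, alternate amino acid, effect label).
    """
    offset = pos % 3
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"
    elif ref_aa == "*":
        effect = "stop-lost"
    else:
        effect = "missense"
    return ref_aa, alt_aa, effect


# ATG GAA GTT encodes Met-Glu-Val; changing GAA to GTA replaces Glu with Val.
print(snp_effect("ATGGAAGTT", 4, "T"))  # ('E', 'V', 'missense')
```

Storing per-nucleotide information for every gene, as the abstract describes, would let a lookup like this run without re-deriving codon positions for each variant.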


2019 ◽  
Author(s):  
Katarzyna Murat ◽  
Björn Grüning ◽  
Paulina Wiktoria Poterlowicz ◽  
Gillian Westgate ◽  
Desmond J Tobin ◽  
...  

Abstract Background Epigenome-wide association studies (EWAS) analyse genome-wide activity of epigenetic marks in cohorts of different individuals to find associations between epigenetic variation and phenotype. One of the most common techniques used in EWAS is the Infinium Methylation Assay, which quantifies the DNA methylation level of over 450k loci. Although a number of bioinformatics tools have been developed to analyse the assay, they require some programming skills and experience to use. Results We have developed a collection of user-friendly tools for the Galaxy platform, aimed at DNA methylation analysis using the Infinium Methylation Assay by those without programming experience. Our tool suite is integrated into Galaxy (http://galaxyproject.org), a web-based platform. This allows users to analyse data from the Infinium Methylation Assay in the easiest possible way. Conclusions The EWAS suite provides a group of integrated tools that combine analytical methods into a range of handy analysis pipelines. Our tool suite is available from the Galaxy test toolshed, a GitHub repository and also as a Docker image. The aim of this project is to make EWAS analysis more flexible and accessible to everyone.

