MaveRegistry: a collaboration platform for multiplexed assays of variant effect

Abstract

Summary: Multiplexed assays of variant effect (MAVEs) are capable of experimentally testing all possible single nucleotide or amino acid variants in selected genomic regions, generating ‘variant effect maps’, which provide biochemical insight and functional evidence to enable more rapid and accurate clinical interpretation of human variation. Because the international community applying MAVE approaches is growing rapidly, we developed the online MaveRegistry platform to catalyze collaboration, reduce redundant efforts, allow stakeholders to nominate targets and enable tracking and sharing of progress on ongoing MAVE projects.

Availability and implementation: MaveRegistry service: https://registry.varianteffect.org. MaveRegistry source code: https://github.com/kvnkuang/maveregistry-front-end.


MaveRegistry: a collaboration platform for multiplexed assays of variant effect

2020. DOI: 10.1101/2020.10.14.339499

Author(s): Da Kuang, Jochen Weile, Nishka Kishore, Alan F. Rubin, Stanley Fields, ...

Keyword(s): Amino Acid, International Community, Human Variation, Single Nucleotide, Clinical Interpretation, Link Type, Variant Effect, Genomic Regions, Amino Acid Variants

Abstract

Summary: Multiplexed assays of variant effect (MAVEs) are capable of experimentally testing all possible single nucleotide or amino acid variants in selected genomic regions, generating ‘variant effect maps’, which provide biochemical insight and functional evidence to enable more rapid and accurate clinical interpretation of human variation. Because the international community applying MAVE approaches is growing rapidly, we developed the online MaveRegistry platform to catalyze collaboration, reduce redundant efforts, allow stakeholders to nominate targets, and enable tracking and sharing of progress on ongoing MAVE projects.

Availability and implementation: https://[email protected]


Comprehensive variant effect predictions of single nucleotide variants in model organisms

2018. DOI: 10.1101/313031. Cited By ~ 3

Author(s): Omar Wagih, Bede Busby, Marco Galardini, Danish Memon, Athanasios Typas, ...

Keyword(s): Amino Acid, Protein Complex, Model Organisms, Single Nucleotide Variants, Cellular Mechanisms, Single Nucleotide, Post Translational Modifications, Variant Effect, Coding Variants, The Impact

Abstract: The effect of single nucleotide variants (SNVs) in coding and non-coding regions is of great interest in genetics. Although many computational methods aim to elucidate the effects of SNVs on cellular mechanisms, it is not straightforward to comprehensively cover different molecular effects. To address this, we compiled and benchmarked sequence and structure-based variant effect predictors and we analyzed the impact of nearly all possible amino acid and nucleotide variants in the reference genomes of H. sapiens, S. cerevisiae and E. coli. Studied mechanisms include protein stability, interaction interfaces, post-translational modifications and transcription factor binding sites. We apply this resource to the study of natural and disease coding variants. We also show how variant effects can be aggregated to generate protein complex burden scores that uncover protein complex to phenotype associations based on a set of newly generated growth profiles of 93 sequenced S. cerevisiae strains in 43 conditions. This resource is available through mutfunc, a tool by which users can query precomputed predictions by providing amino acid or nucleotide-level variants.


quasitools: A Collection of Tools for Viral Quasispecies Analysis

2019. DOI: 10.1101/733238. Cited By ~ 2

Author(s): Eric Marinier, Eric Enns, Camy Tran, Matthew Fogel, Cole Peters, ...

Keyword(s): Amino Acid, Genetic Distance, Open Source, Source Code, Viral Quasispecies, Consensus Sequences, Link Type, Version 2.0, Amino Acid Variants

Abstract

Summary: quasitools is a collection of newly developed, open-source tools for analyzing viral quasispecies data. The application suite includes tools with the ability to create consensus sequences, call nucleotide, codon, and amino acid variants, calculate the complexity of a quasispecies, and measure the genetic distance between two similar quasispecies. These tools may be run independently or in user-created workflows.

Availability: The quasitools suite is a freely available application licensed under the Apache License, Version 2.0. The source code, documentation, and file specifications are available at: https://phac-nml.github.io/quasitools/ [email protected]


KEC: unique sequence search by K-mer exclusion

Bioinformatics, 2021. DOI: 10.1093/bioinformatics/btab196

Author(s): Pavel Beran, Dagmar Stehlíková, Stephen P Cohen, Vladislav Čurn

Keyword(s): Amino Acid, Nucleic Acid, Source Code, Unique Sequence, Supplementary Information, Supplementary Data, Laptop Computers, Sequence Search, Target Sequences, Cross Reference

Abstract

Summary: Searching for amino acid or nucleic acid sequences unique to one organism may be challenging, depending on the size of the available datasets. K-mer elimination by cross-reference (KEC) allows users to quickly and easily find unique sequences by providing target and non-target sequences. Due to its speed, it can be used for datasets of genomic size and can be run on desktop or laptop computers with modest specifications.

Availability and implementation: KEC is freely available for non-commercial purposes. Source code and executable binary files compiled for Linux, Mac and Windows can be downloaded from https://github.com/berybox/KEC.

Supplementary information: Supplementary data are available at Bioinformatics online.
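The general idea of k-mer exclusion can be illustrated with a minimal Python sketch. This is not KEC's actual implementation or interface; the function names `kmers` and `unique_regions`, the choice of `k`, and the rule for merging adjacent k-mers are all assumptions made for illustration, and reverse complements are ignored for brevity.

```python
def kmers(seq, k):
    """Yield (start, kmer) for every k-length window of seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def unique_regions(target, non_targets, k):
    """Return stretches of `target` built only from k-mers absent in all non-targets.

    Hypothetical helper for illustration: k-mers found in any non-target
    sequence are excluded, and surviving k-mers at consecutive positions
    are merged into one region.
    """
    # Cross-reference set: every k-mer seen in any non-target sequence.
    excluded = {kmer for seq in non_targets for _, kmer in kmers(seq, k)}

    regions = []
    start = prev = None
    for i, kmer in kmers(target, k):
        if kmer in excluded:
            continue
        if prev is not None and i == prev + 1:
            prev = i  # extend the current run of unique k-mers
        else:
            if start is not None:
                regions.append(target[start:prev + k])
            start = prev = i  # begin a new run
    if start is not None:
        regions.append(target[start:prev + k])
    return regions


if __name__ == "__main__":
    print(unique_regions("AAACCCGGGTTT", ["AAACCC", "GGGTTT"], k=4))
```

Holding the non-target k-mers in a hash set makes each lookup constant time on average, which is one plausible reason an exclusion approach of this kind can scale to genome-sized inputs.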


Genetic Variants in Smoking-Related Genes in Two Smoking Cessation Programs: A Cross-Sectional Study

International Journal of Environmental Research and Public Health, 2021, Vol 18 (12), pp. 6597. DOI: 10.3390/ijerph18126597

Author(s): Gloria Pérez-Rubio, Luis Alberto López-Flores, Ana Paula Cupertino, Francisco Cartujano-Barrera, Luz Myriam Reynales-Shigematsu, ...

Keyword(s): Smoking Cessation, Cross Sectional Study, Nucleotide Polymorphisms, Sectional Study, Cross Sectional, Single Nucleotide, Genotype Frequencies, Quitting Smoking, Genes Encoding, Genomic Regions

Previous studies have identified variants in genes encoding proteins associated with the degree of addiction, smoking onset, and cessation. We aimed to describe thirty-one single nucleotide polymorphisms (SNPs) in seven candidate genomic regions spanning six genes associated with tobacco smoking in a cross-sectional study of two different interventions for quitting smoking: (1) thirty-eight smokers were recruited via multimedia to participate in the e-Decídete! program (e-Dec) and (2) ninety-four attended an institutional smoking cessation program on-site. SNP genotyping was performed by real-time PCR using TaqMan probes. The analysis of alleles and genotypes was carried out using EpiInfo v7. On-site subjects had more years of smoking and a higher tobacco index than e-Dec smokers (p < 0.05 for both). In CYP2A6, we found differences in rs28399433 (p < 0.01): the e-Dec group had a higher frequency of the TT genotype (0.78 vs. 0.35), while the TG genotype frequency was higher in the on-site group (0.63 vs. 0.18), as was that of the GG genotype (0.03 vs. 0.02). Moreover, three SNPs in NRXN1, two in CHRNA3, and two in CHRNA5 had differences in genotype frequencies (p < 0.01). Cigarettes per day differed (p < 0.05) across the metabolizer classes defined by CYP2A6 alleles. In conclusion, subjects attending a mobile smoking cessation intervention smoked fewer cigarettes per day, had smoked for fewer years, and had fewer cumulative pack-years. There were differences in the genotype frequencies of SNPs in genes related to nicotine metabolism and nicotine dependence. Slow metabolizers smoked more cigarettes per day than intermediate and normal metabolizers.


Candidate Genes for the High-Altitude Adaptations of Two Mountain Pine Taxa

International Journal of Molecular Sciences, 2021, Vol 22 (7), pp. 3477. DOI: 10.3390/ijms22073477

Author(s): Julia Zaborowska, Bartosz Łabiszak, Annika Perry, Stephen Cavers, Witold Wachowiak

Keyword(s): Population Structure, Candidate Genes, Genetic Basis, Redox Homeostasis, Snp Markers, Regulation Of Transcription, Single Nucleotide, Homeostasis Regulation, Genomic Regions, Mountain Plants

Mountain plants, challenged by shortened vegetation periods and dynamic changes in environmental conditions, have developed adaptations that help them balance their growth, reproduction, survival, and regeneration. However, knowledge regarding the genetic basis of species adaptation to higher altitudes remains scarce for most plant species. Here, we attempted to identify genomic regions of high evolutionary importance in two closely related European pines, Pinus mugo and P. uncinata, contrasting them with a reference lowland relative, P. sylvestris. We genotyped 438 samples at thousands of single nucleotide polymorphism (SNP) markers and tested their genetic differentiation and population structure, followed by outlier detection and gene ontology annotations. Markers clearly differentiated the species and uncovered patterns of population structure in two of them. In P. uncinata, three Pyrenean sites were grouped together, while two outlying populations constituted a separate cluster. In P. sylvestris, the Spanish population appeared distinct from the remaining four European sites. Between the mountain pines and the reference species, 35 candidate genes for altitude-dependent selection were identified, including those encoding proteins responsible for photosynthesis, photorespiration and cell redox homeostasis, regulation of transcription, and mRNA processing. In the comparison between the two mountain pines, 75 outlier SNPs were found in proteins involved mainly in gene expression and metabolism.


Modeling Linkage Disequilibrium and Identifying Recombination Hotspots Using Single-Nucleotide Polymorphism Data

Genetics, 2003, Vol 165 (4), pp. 2213-2233. DOI: 10.1093/genetics/165.4.2213. Cited By ~ 41

Author(s): Na Li, Matthew Stephens

Keyword(s): Linkage Disequilibrium, Recombination Rate, Population Sample, Simulated Data, Region Of Interest, Population Data, Recombination Rates, Single Nucleotide, Recombination Hotspots, Genomic Regions

Abstract: We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a “block-like” structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of current available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.


The Role of Arg13 in Protein Phosphatase M tPphA from Thermosynechococcus elongatus

Enzyme Research, 2012, Vol 2012, pp. 1-7. DOI: 10.1155/2012/272706. Cited By ~ 1

Author(s): Jiyong Su, Karl Forchhammer

Keyword(s): Amino Acid, Protein Phosphatase, Arginine Residue, Side Chain, Wild Type, Catalytic Center, Nitrophenyl Phosphate, Reduced Activity, Amino Acid Variants

A highly conserved arginine residue is close to the catalytic center of PPM/PP2C-type protein phosphatases. Different crystal structures of PPM/PP2C homologues revealed that the guanidinium side chain of this arginine residue can adopt variable conformations and may bind ligands, suggesting an important role of this residue during catalysis. In this paper, we randomly mutated Arginine 13 of tPphA, a PPM/PP2C-type phosphatase from Thermosynechococcus elongatus, and obtained 18 different amino acid variants. The generated variants were tested towards p-nitrophenyl phosphate and various phosphopeptides. Towards p-nitrophenyl phosphate as substrate, twelve variants showed 3–7 times higher Km values than wild-type tPphA and four variants (R13D, R13F, R13L, and R13W) completely lost activity. Strikingly, these variants were still able to dephosphorylate phosphopeptides, although with strongly reduced activity. The specific inability of some Arg-13 variants to hydrolyze p-nitrophenyl phosphate highlights the importance of additional substrate interactions apart from the substrate phosphate for catalysis. The properties of the R13 variants indicate that this residue assists in substrate binding.


Breaking the Stereo Barrier of Amino Acid Attachment to tRNA by a Single Nucleotide

Journal of Molecular Biology, 2005, Vol 348 (3), pp. 513-521. DOI: 10.1016/j.jmb.2005.02.023. Cited By ~ 13

Author(s): Svetlana Shitivelband, Ya-Ming Hou

Keyword(s): Amino Acid, Single Nucleotide


A NOTE ON PHASING LONG GENOMIC REGIONS USING LOCAL HAPLOTYPE PREDICTIONS

Journal of Bioinformatics and Computational Biology, 2006, Vol 04 (03), pp. 639-647. DOI: 10.1142/s0219720006002272. Cited By ~ 6

Author(s): Eleazar Eskin, Roded Sharan, Eran Halperin

Keyword(s): Large Scale, Computational Cost, Nucleotide Polymorphisms, Single Nucleotide, Novel Approach, Maximum Likelihood Criterion, The Common, Genomic Regions, High Computational Cost, Combining Information

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled heuristically because of the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical distance between neighboring single nucleotide polymorphisms. The approach is very efficient, has been applied to several large-scale datasets, and has been shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .
