Predicted Succinate Dehydrogenase Subunit Variant Pathogenicity: Why Are SDHB Variants “Bad”?

2021 ◽  
Vol 5 (Supplement_1) ◽  
pp. A71-A72
Author(s):  
Lucinda Gruber ◽  
L James Maher

Abstract Variants in the 4 genes encoding subunits A-D of succinate dehydrogenase (SDH) are associated with paraganglioma and pheochromocytoma. Intuitively, loss-of-function variants affecting any of the subunits should equally diminish SDH function leading to succinate accumulation and tumorigenesis after loss of heterozygosity. However, variants in SDHB are associated with a higher prevalence of metastatic disease and a more aggressive clinical course. Evaluation of the SDH protein structure shows the fraction of amino acids in contact with other subunits or essential prosthetic groups to be: 13% (SDHA), 40% (SDHB), 28% (SDHC), and 28% (SDHD). We therefore hypothesized that SDHB missense variants are more penetrant because a larger fraction alter sensitive interfaces with other SDH subunits or essential molecular features (e.g. the three SDHB iron-sulfur clusters). We also wondered if truncating variants are more common for SDHB than other subunits. To test these hypotheses, we combined three databases (Genome Aggregation Database, ClinVar-NCBI-NIH, and Leiden Open Variant Database) and our institution’s data to create a pool of all known SDH variants. We categorized variants as truncating or missense and evaluated missense variants in the context of the SDH protein structure, scoring each variant in relation to important structures/interfaces and the severity of the amino acid change. This provided an ad hoc impact score for each variant, where a higher score predicts a more deleterious effect. We compared these scores to those obtained using the “Sorting Intolerant from Tolerant” (SIFT) tool that predicts impacts of amino acid changes based on evolutionary sequence conservation. SIFT scores of 0 to 0.05 predict deleterious effects. Both mean impact and SIFT scores could be weighted for the prevalence of each variant in the population. Our database included 2333 total SDH variants: SDHA (838, 36%), SDHB (703, 30%), SDHC (381, 16%), and SDHD (412, 18%). 
The fractions of truncating variants were 38%, 50%, 51%, and 53% for A-D subunits, respectively. When weighted for prevalence, these fractions were 0.39%, 6.8%, 8.2%, and 0.2%. The number of truncating variants per coding region length and the distribution of locations were similar between subunits. Ad hoc impact scores for A-D subunits were 3.08, 14.9, 9.93, and 11.0, respectively, and, when weighted for prevalence, were 0.28, 3.25, 6.32, and 1.15. Mean SIFT scores for subunits A-D were 0.185, 0.162, 0.238, and 0.410, respectively, and, when weighted for prevalence, were 0.58, 0.70, 0.22, and 0.018. Our results do not support the hypothesis that SDHB variants predict a worse clinical outcome because average SDHB variants are, by chance, more biochemically severe. This suggests that SDHB loss may uniquely impact SDH biochemical function.
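
The abstract reports both plain and prevalence-weighted mean scores but does not give the weighting formula. As a minimal sketch only, the Python below assumes a normalized weighted mean in which each variant's score is weighted by its population allele count. The `Variant` class, its field names, and the toy numbers are hypothetical and are not taken from the study; only the 0 to 0.05 SIFT cutoff comes from the abstract.

```python
from dataclasses import dataclass
from typing import Sequence

# From the abstract: SIFT scores of 0 to 0.05 predict deleterious effects.
SIFT_DELETERIOUS_MAX = 0.05


@dataclass
class Variant:
    """Hypothetical record: one missense variant with a score and a prevalence."""
    score: float        # ad hoc impact score or SIFT score
    allele_count: int   # assumed prevalence measure (e.g. observed allele count)


def mean_score(variants: Sequence[Variant]) -> float:
    """Unweighted mean: every distinct variant counts once."""
    return sum(v.score for v in variants) / len(variants)


def prevalence_weighted_mean(variants: Sequence[Variant]) -> float:
    """Assumed weighting: each score is weighted by how often the variant is seen."""
    total = sum(v.allele_count for v in variants)
    if total == 0:
        raise ValueError("total prevalence is zero; weighted mean is undefined")
    return sum(v.score * v.allele_count for v in variants) / total


def is_sift_deleterious(sift_score: float) -> bool:
    """Apply the 0 to 0.05 cutoff quoted in the abstract."""
    return 0.0 <= sift_score <= SIFT_DELETERIOUS_MAX


# Toy data (made up): a rare high-impact variant and a common low-impact one.
toy = [Variant(score=10.0, allele_count=1), Variant(score=2.0, allele_count=9)]
print(mean_score(toy))                # 6.0
print(prevalence_weighted_mean(toy))  # 2.8
```

Under this assumed weighting, a rare severe variant contributes little to the weighted mean, which is one way the weighted and unweighted values for a subunit can differ widely.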

AMB Express ◽  
2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Neeraja Punde ◽  
Jennifer Kooken ◽  
Dagmar Leary ◽  
Patricia M. Legler ◽  
Evelina Angov

Abstract Codon usage frequency influences protein structure and function. The frequency with which codons are used potentially impacts primary, secondary and tertiary protein structure. Poor expression, loss of function, insolubility, or truncation can result from species-specific differences in codon usage. “Codon harmonization” more closely aligns native codon usage frequencies with those of the expression host, particularly within putative inter-domain segments where slower rates of translation may play a role in protein folding. Heterologous expression of Plasmodium falciparum genes in Escherichia coli has been a challenge due to their AT-rich codon bias and highly repetitive DNA sequences. Here, codon harmonization was applied to the malarial antigen CelTOS (Cell-traversal protein for ookinetes and sporozoites). CelTOS is a highly conserved P. falciparum protein involved in cellular traversal through mosquito and vertebrate host cells. It reversibly refolds after thermal denaturation, making it a desirable malarial vaccine candidate. Protein expressed in E. coli from a codon harmonized sequence of P. falciparum CelTOS (CH-PfCelTOS) was compared with protein expressed from the native codon sequence (N-PfCelTOS) to assess the impact of codon usage on protein expression levels, solubility, yield, stability, structural integrity, recognition by CelTOS-specific mAbs and immunogenicity in mice. While the translated proteins were expected to be identical, the products translated from the codon-harmonized sequence differed in helical content and showed a smaller distribution of polypeptides in mass spectra, indicating lower heterogeneity of the codon harmonized version and fewer amino acid misincorporations. Hydrophobic-to-hydrophobic amino acid substitutions were observed more commonly than any other.
CH-PfCelTOS induced significantly higher antibody levels compared with N-PfCelTOS; however, no significant differences in either IFN-γ or IL-4 cellular responses were detected between the two antigens.
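
The idea of codon harmonization described above can be illustrated with a small Python sketch. It is not the authors' procedure: it assumes that each native codon is replaced by the synonymous codon whose relative usage in the expression host is closest to the native codon's relative usage in the source organism. The function name `harmonize` and the usage tables are hypothetical, and the frequencies are made-up toy values rather than real P. falciparum or E. coli data.

```python
from typing import Dict, List

# Toy tables (made-up values, lysine codons only), for illustration.
TOY_CODON_TO_AA: Dict[str, str] = {"AAA": "K", "AAG": "K"}
TOY_NATIVE_USAGE: Dict[str, float] = {"AAA": 0.8, "AAG": 0.2}  # source organism
TOY_HOST_USAGE: Dict[str, float] = {"AAA": 0.3, "AAG": 0.7}    # expression host


def harmonize(
    codons: List[str],
    native_usage: Dict[str, float],
    host_usage: Dict[str, float],
    codon_to_aa: Dict[str, str],
) -> List[str]:
    """Pick, for each codon, the synonymous host codon with the closest usage.

    The encoded amino acid never changes; only the codon choice does.
    """
    harmonized = []
    for codon in codons:
        amino_acid = codon_to_aa[codon]
        target = native_usage[codon]  # how common this codon is natively
        synonyms = [c for c, aa in codon_to_aa.items() if aa == amino_acid]
        best = min(synonyms, key=lambda c: abs(host_usage[c] - target))
        harmonized.append(best)
    return harmonized


native = ["AAA", "AAG", "AAA"]
print(harmonize(native, TOY_NATIVE_USAGE, TOY_HOST_USAGE, TOY_CODON_TO_AA))
# ['AAG', 'AAA', 'AAG']
```

In the toy tables the common native codon maps to the common host codon and the rare one to the rare one, so positions that are translated slowly in the source organism would stay slow in the host. That is the property the abstract links to protein folding.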


2020 ◽  
Vol 117 (45) ◽  
pp. 28201-28211
Author(s):  
Sumaiya Iqbal ◽  
Eduardo Pérez-Palma ◽  
Jakob B. Jespersen ◽  
Patrick May ◽  
David Hoksza ◽  
...  

Interpretation of the colossal number of genetic variants identified from sequencing applications is one of the major bottlenecks in clinical genetics, with the inference of the effect of amino acid-substituting missense variations on protein structure and function being especially challenging. Here we characterize the three-dimensional (3D) amino acid positions affected in pathogenic and population variants from 1,330 disease-associated genes using over 14,000 experimentally solved human protein structures. By measuring the statistical burden of variations (i.e., point mutations) from all genes on 40 3D protein features, accounting for the structural, chemical, and functional context of the variations’ positions, we identify features that are generally associated with pathogenic and population missense variants. We then perform the same amino acid-level analysis individually for 24 protein functional classes, which reveals unique characteristics of the positions of the altered amino acids: We observe up to 46% divergence of the class-specific features from the general characteristics obtained by the analysis on all genes, which is consistent with the structural diversity of essential regions across different protein classes. We demonstrate that the function-specific 3D features of the variants match the readouts of mutagenesis experiments for BRCA1 and PTEN, and positively correlate with an independent set of clinically interpreted pathogenic and benign missense variants. Finally, we make our results available through a web server to foster accessibility and downstream research. Our findings represent a crucial step toward translational genetics, from highlighting the impact of mutations on protein structure to rationalizing the variants’ pathogenicity in terms of the perturbed molecular mechanisms.


Genetics ◽  
2001 ◽  
Vol 158 (4) ◽  
pp. 1491-1503 ◽  
Author(s):  
Thomas J Fowler ◽  
Michael F Mitton ◽  
Lisa J Vaillancourt ◽  
Carlene A Raper

Abstract Schizophyllum commune has thousands of mating types defined in part by numerous lipopeptide pheromones and their G-protein-coupled receptors. These molecules are encoded within multiple versions of two redundantly functioning B mating-type loci, Bα and Bβ. Compatible combinations of pheromones and receptors, produced by individuals of different B mating types, trigger a pathway of fertilization required for sexual development. Analysis of the Bβ2 mating-type locus revealed a large cluster of genes encoding a single pheromone receptor and eight different pheromones. Phenotypic effects of mutations within these genes indicated that small changes in both types of molecules could significantly alter their specificity of interaction. For example, a conservative amino acid substitution in a pheromone resulted in a gain of function toward one receptor and a loss of function with another. A two-amino-acid deletion from a receptor precluded the mutant pheromone from activating the mutant receptor, yet this receptor was activated by other pheromones. Sequence comparisons provided clues toward understanding how so many variants of these multigenic loci could have evolved through duplication and mutational divergence. A three-step model for the origin of new variants comparable to those found in nature is presented.


Genetics ◽  
1995 ◽  
Vol 139 (2) ◽  
pp. 921-939 ◽  
Author(s):  
J Callis ◽  
T Carpenter ◽  
C W Sun ◽  
R D Vierstra

Abstract The Arabidopsis thaliana ecotype Columbia ubiquitin gene family consists of 14 members that can be divided into three types of ubiquitin genes; polyubiquitin genes, ubiquitin-like genes and ubiquitin extension genes. The isolation and characterization of eight ubiquitin sequences, consisting of four polyubiquitin genes and four ubiquitin-like genes, are described here, and their relationships to each other and to previously identified Arabidopsis ubiquitin genes were analyzed. The polyubiquitin genes, UBQ3, UBQ10, UBQ11 and UBQ14, contain tandem repeats of the 228-bp ubiquitin coding region. Together with a previously described polyubiquitin gene, UBQ4, they differ in synonymous substitutions, number of ubiquitin coding regions, number and nature of nonubiquitin C-terminal amino acid(s) and chromosomal location, dividing into two subtypes; the UBQ3/UBQ4 and UBQ10/UBQ11/UBQ14 subtypes. Ubiquitin-like genes, UBQ7, UBQ8, UBQ9 and UBQ12, also contain tandem repeats of the ubiquitin coding region, but at least one repeat per gene encodes a protein with amino acid substitutions. Nucleotide comparisons, Ks value determinations and neighbor-joining analyses were employed to determine intra- and intergenic relationships. In general, the rate of synonymous substitution is too high to discern related repeats. Specific exceptions provide insight into gene relationships. The observed nucleotide relationships are consistent with previously described models involving gene duplications followed by both unequal crossing-over and gene conversion events.


2020 ◽  
Vol 48 (W1) ◽  
pp. W132-W139
Author(s):  
Sumaiya Iqbal ◽  
David Hoksza ◽  
Eduardo Pérez-Palma ◽  
Patrick May ◽  
Jakob B Jespersen ◽  
...  

Abstract Human genome sequencing efforts have greatly expanded, and a plethora of missense variants identified both in patients and in the general population is now publicly accessible. Interpretation of the molecular-level effect of missense variants, however, remains challenging and requires a particular investigation of amino acid substitutions in the context of protein structure and function. Answers to questions like ‘Is a variant perturbing a site involved in key macromolecular interactions and/or cellular signaling?’, or ‘Is a variant changing an amino acid located at the protein core or part of a cluster of known pathogenic mutations in 3D?’ are crucial. Motivated by these needs, we developed MISCAST (missense variant to protein structure analysis web suite; http://miscast.broadinstitute.org/). MISCAST is an interactive and user-friendly web server to visualize and analyze missense variants in protein sequence and structure space. Additionally, a comprehensive set of protein structural and functional features have been aggregated in MISCAST from multiple databases, and displayed on structures alongside the variants to provide users with the biological context of the variant location in an integrated platform. We further made the annotated data and protein structures readily downloadable from MISCAST to foster advanced offline analysis of missense variants by a wide biological community.


2021 ◽  
Author(s):  
Anne Rovelet-Lecrux ◽  
Sebastien Feuillette ◽  
Laetitia Miguel ◽  
Catherine Schramm ◽  
Segolene Pernet ◽  
...  

The SorLA protein, encoded by the SORL1 gene, is a major player in Alzheimer disease (AD) pathophysiology. Functional and genetic studies demonstrated that SorLA deficiency results in increased production of Aβ peptides, and thus a higher risk of AD. A large number of SORL1 missense variants have been identified in AD patients, but their functional consequences remain largely undefined. Here, we identified a new pathophysiological mechanism, by which rare SORL1 missense variants identified in AD patients result in altered maturation and trafficking of SorLA protein. An initial screening, based on the overexpression of 71 SorLA variants in HEK293 cells, revealed that 15 of them (S114R, R332W, G543E, S564G, S577P, R654W, R729W, D806N, Y934C, D1535N, D1545E, P1654L, Y1816C, W1862C, P1914S) induced a maturation and trafficking-deficient phenotype. Three of these variations (R332W, S577P, and R654W) and two maturation-competent variations (S124R and N371T) were further studied in detail in CRISPR/Cas9-modified hiPSCs. When expressed at endogenous levels, the R332W, S577P, and R654W SorLA variants also showed a maturation-defective profile. We further demonstrated that these variants were largely retained in the endoplasmic reticulum, resulting in a reduction in the delivery of mature SorLA protein to the plasma membrane and to the endosomal system. Importantly, expression of the R332W and R654W variants in hiPSCs was associated with a clear increase in Aβ secretion, demonstrating a loss-of-function effect of these SorLA variants regarding this ultimate readout, and a direct link with AD pathophysiology. Finally, structural analysis of the impact of missense variations on SorLA protein structure indicated that impaired cellular trafficking of SorLA protein could be due to subtle variations of the protein structure resulting from changes in the interatomic interactions.


2020 ◽  
Author(s):  
Adhideb Ghosh ◽  
Alexander A. Navarini

Abstract Functional interpretation is crucial when facing on average 20,000 missense variants per human exome, as the great majority are not associated with any underlying disease. In silico bioinformatics tools can predict the deleteriousness of variants or assess their functional impact by assigning scores, but they cannot predict whether the variant in question results in gain or loss of function at the protein level. Here, we show that machine learning can effectively predict this biological function polarity of missense variants. The new method adapts a weighted gradient boosting machine approach on a set of damaging variants (1,288 loss of function and 218 gain of function variants) as annotated by the tools SIFT, PolyPhen2 and CADD. An area under the ROC curve of 0.85 illustrates the high discriminative power of the classifier. Predictive performance of the classifier remains consistent against an independent set of damaging variants, as highlighted by the area under the ROC curve of 0.83. This new approach may help to guide biological experiments on the clinical relevance of damaging genetic variants.
Author summary A missense variant occurs when a single genetic alteration in DNA takes place and as a result a new amino acid is translated into the protein. This amino acid change can inactivate the existing protein function, causing loss-of-function, or produce a new function, causing gain-of-function. Therefore, it is very important to interpret these functional consequences of missense variants as they often turn out to be disease causing. Each individual’s genome sequence has thousands of missense variants, out of which very few are actually associated with any underlying disease. Various computational tools have been developed to predict whether missense variants are damaging or not, but none of them can actually predict whether the damaging missense variants cause gain-of-function or loss-of-function. We have developed a new ensemble classifier to predict this biological function polarity at the protein level. The classifier combines the prediction scores of three existing bioinformatics tools and applies machine learning to make effective predictions. We have validated our classifier against an independent data set to show its high predictive power and robustness. The predictions made by our machine learning tool can be used as indicators of biological function polarity, but with further evidence on pathogenicity.
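
The abstract summarizes classifier performance as an area under the ROC curve (AUC). As a hedged illustration of what that number measures, and not a reconstruction of the authors' classifier, the Python sketch below computes AUC as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The label coding (1 for gain of function, 0 for loss of function), the function name `roc_auc`, and the toy scores are assumptions made for this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores.

    labels: 1 for the positive class, 0 for the negative class (assumed coding).
    scores: classifier outputs, where higher means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy data (made up): two positives and two negatives.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. Because the measure compares ranks within each pair, it remains interpretable when the two classes are imbalanced, as with the 1,288 loss-of-function and 218 gain-of-function variants mentioned above.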


2020 ◽  
Author(s):  
Emmanuelle Masson ◽  
Vinciane Rebours ◽  
Louis Buscail ◽  
Frédérique Frete ◽  
Mael Pagenault ◽  
...  

ABSTRACT A gain-of-function missense variant in the CELA3B gene, p.Arg90Cys (c.268C>T), has recently been reported to cause pancreatitis in an extended pedigree. Herein, we sequenced the CELA3B gene in 644 genetically unexplained French chronic pancreatitis (CP) patients (all unrelated) and 566 controls. No predicted loss-of-function variants were identified. None of the six low frequency or common missense variants detected showed significant association with CP. Nor did the aggregate rare/very rare missense variants (n=14) show any significant association with CP. However, p.Arg90Leu (c.269G>T), which was found in 4 patients but no controls and affects the same amino acid as p.Arg90Cys, serves to revert p.Arg90 to the human elastase ancestral allele. Since p.Arg90Leu has previously been shown to exert a similar functional effect to p.Arg90Cys, our findings not only confirm the involvement of CELA3B in the etiology of CP but also pinpoint a new evolutionarily adaptive site in the human genome.


2019 ◽  
Vol 26 (2) ◽  
pp. 108-131 ◽  
Author(s):  
Norio Matsushima ◽  
Shintaro Takatsuka ◽  
Hiroki Miyashita ◽  
Robert H. Kretsinger

Mutations in the genes encoding Leucine Rich Repeat (LRR) containing proteins are associated with over sixty human diseases; these include high myopia, mitochondrial encephalomyopathy, and Crohn’s disease. These mutations occur frequently within the LRR domains and within the regions that shield the hydrophobic core of the LRR domain. The amino acid sequences of fifty-five LRR proteins have been published. They include Nod-Like Receptors (NLRs) such as NLRP1, NLRP3, NLRP14, and Nod-2, Small Leucine Rich Repeat Proteoglycans (SLRPs) such as keratocan, lumican, fibromodulin, PRELP, biglycan, and nyctalopin, and F-box/LRR-repeat proteins such as FBXL2, FBXL4, and FBXL12. For example, 363 missense mutations have been identified. Replacement of arginine, proline, or cysteine by another amino acid, or the reverse, is frequently observed. The diverse effects of the mutations are discussed based on the known structures of LRR proteins. These mutations influence protein folding, aggregation, oligomerization, stability, protein-ligand interactions, disulfide bond formation, and glycosylation. Most of the mutations cause loss of function and a few, gain of function.

