Identification of pathogenic missense mutations using protein stability predictors

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Lukas Gerasimavicius ◽  
Xin Liu ◽  
Joseph A. Marsh

Abstract Attempts at using protein structures to identify disease-causing mutations have been dominated by the idea that most pathogenic mutations are disruptive at a structural level. Therefore, computational stability predictors, which assess whether a mutation is likely to be stabilising or destabilising to protein structure, have been commonly used when evaluating new candidate disease variants, despite not having been developed specifically for this purpose. We therefore tested 13 different stability predictors for their ability to discriminate between pathogenic and putatively benign missense variants. We find that one method, FoldX, significantly outperforms all other predictors in the identification of disease variants. Moreover, we demonstrate that employing predicted absolute energy change scores improves performance of nearly all predictors in distinguishing pathogenic from benign variants. Importantly, however, we observe that the utility of computational stability predictors is highly heterogeneous across different proteins, and that they are all inferior to the best performing variant effect predictors for identifying pathogenic mutations. We suggest that this is largely due to alternate molecular mechanisms other than protein destabilisation underlying many pathogenic mutations. Thus, better ways of incorporating protein structural information and molecular mechanisms into computational variant effect predictors will be required for improved disease variant prioritisation.
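The effect of switching from signed to absolute energy change scores can be illustrated with a small sketch. It assumes that a positive ΔΔG means destabilisation, that each variant carries a binary pathogenic/benign label, and that discrimination is measured by ROC AUC; the ΔΔG values below are made up for illustration and are not taken from the study.

```python
def roc_auc(scores, labels):
    """ROC AUC as the probability that a randomly chosen pathogenic variant
    scores higher than a randomly chosen benign one (ties count one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted ddG values in kcal/mol (illustrative only).
# Assumed convention: positive = destabilising, negative = stabilising.
ddg = [3.1, 2.4, -2.8, 1.9, 0.2, -0.3, 0.5, -0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = pathogenic, 0 = putatively benign

auc_raw = roc_auc(ddg, labels)                    # signed scores
auc_abs = roc_auc([abs(x) for x in ddg], labels)  # absolute scores
print(auc_raw, auc_abs)
```

In this toy set the strongly stabilising pathogenic variant (-2.8) ranks last under the signed score but near the top once the magnitude is used, which is the intuition behind scoring by absolute energy change.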

2020 ◽  
Author(s):  
Lukas Gerasimavicius ◽  
Xin Liu ◽  
Joseph A Marsh

Abstract Attempts at using protein structures to identify disease-causing mutations have been dominated by the idea that most pathogenic mutations are disruptive at a structural level. Therefore, computational stability predictors, which assess whether a mutation is likely to be stabilising or destabilising to protein structure, have been commonly used when evaluating new candidate disease variants, despite not having been developed specifically for this purpose. We therefore tested 12 different stability predictors for their ability to discriminate between pathogenic and putatively benign missense variants. We find that one method, FoldX, considerably outperforms all others in the identification of disease variants. Moreover, we demonstrate that employing absolute energy change scores improves performance of nearly all predictors. Importantly, however, we observe that the utility of computational stability predictors is highly heterogeneous across different proteins, and that they are all inferior to the best performing variant effect predictors for identifying pathogenic mutations. We suggest that this is largely due to alternate molecular mechanisms other than protein destabilisation underlying many pathogenic mutations. Thus, better ways of incorporating protein structural information and molecular mechanisms into computational variant effect predictors will be required for improved disease variant prioritisation.


2021 ◽  
Author(s):  
Lukas Gerasimavicius ◽  
Benjamin J Livesey ◽  
Joseph A Marsh

Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Taking protein structure into account has therefore provided great insight into the molecular mechanisms underlying human genetic disease. While there has been much focus on how mutations can disrupt protein structure and thus cause a loss of function (LOF), alternative mechanisms, specifically dominant-negative (DN) and gain-of-function (GOF) effects, are less understood. Here, we have investigated the protein-level effects of pathogenic missense mutations associated with different molecular mechanisms. We observe striking differences between recessive vs dominant, and LOF vs non-LOF mutations, with dominant, non-LOF disease mutations having much milder effects on protein structure, and DN mutations being highly enriched at protein interfaces. We also find that nearly all computational variant effect predictors underperform on non-LOF mutations, even those based solely on sequence conservation. However, we do find that non-LOF mutations could potentially be identified by their tendency to cluster in space. Overall, our work suggests that many pathogenic mutations that act via DN and GOF mechanisms are likely being missed by current variant prioritisation strategies, but that there is considerable scope to improve computational predictions through consideration of molecular disease mechanisms.


2020 ◽  
Vol 117 (11) ◽  
pp. 5977-5986 ◽  
Author(s):  
Greg Slodkowicz ◽  
Nick Goldman

Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to colocalize on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens previously unexplored strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: Components of immune response and metabolic enzymes. This gives a coherent picture of pathogens and xenobiotics as important drivers of adaptive evolution of mammals.


2021 ◽  
Author(s):  
Sandeep Kaur ◽  
Neblina Sikta ◽  
Andrea Schafferhans ◽  
Nicola Bordin ◽  
Mark J. Cowley ◽  
...  

Abstract Motivation Variant analysis is a core task in bioinformatics that requires integrating data from many sources. This process can be helped by using 3D structures of proteins, which provide a spatial context that can give insight into how variants affect function. Many available tools can help with mapping variants onto structures, but each has specific restrictions, with the result that many researchers fail to benefit from valuable insights that could be gained from structural data. Results To address this, we have created a streamlined system for incorporating 3D structures into variant analysis. Variants can be specified via URLs that are easy to read and write, and that use the notation recommended by the Human Genome Variation Society (HGVS). For example, ‘https://aquaria.app/SARS-CoV-2/S/?N501Y’ specifies the N501Y variant of SARS-CoV-2 S protein. In addition to mapping variants onto structures, our system provides summary information from multiple external resources, including COSMIC, CATH-FunVar, and PredictProtein. Furthermore, our system identifies and summarizes structures containing the variant, as well as the variant-position. Our system supports essentially any mutation for any well-studied protein, and uses all available structural data, including models inferred via very remote homology, integrated into a system that is fast and simple to use. By giving researchers easy, streamlined access to a wealth of structural information during variant analysis, our system will help in revealing novel insights into the molecular mechanisms underlying protein function in health and disease. Availability Our resource is freely available at the project home page (https://aquaria.app). After peer review, the code will be openly available via a GPL version 2 license at https://github.com/ODonoghueLab/Aquaria. PSSH2, the database of sequence-to-structure alignments, is also freely available for download at https://zenodo.org/record/[email protected] information None.
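The URL convention described in this abstract can be sketched as a small helper. The pattern (organism, then protein, then the variant as the query string) is inferred only from the single example given, ‘https://aquaria.app/SARS-CoV-2/S/?N501Y’; the function name `variant_url` and the restriction to simple one-letter substitutions are assumptions made for illustration, and the service may accept other forms not covered here.

```python
import re
from urllib.parse import quote

BASE = "https://aquaria.app"

# Simple protein-level substitution such as N501Y:
# reference residue, 1-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^[A-Z]\d+[A-Z]$")


def variant_url(organism, protein, variant):
    """Build a variant URL following the pattern of the abstract's example.

    Hypothetical helper: only single-residue substitutions are accepted,
    because that is the only form the abstract demonstrates.
    """
    if not _SUBSTITUTION.match(variant):
        raise ValueError(f"unsupported variant notation: {variant!r}")
    return f"{BASE}/{quote(organism)}/{quote(protein)}/?{variant}"


print(variant_url("SARS-CoV-2", "S", "N501Y"))
```

Validating the variant string before building the URL keeps malformed input from producing a link that looks plausible but points nowhere.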


2019 ◽  
Author(s):  
Greg Slodkowicz ◽  
Nick Goldman

Abstract Understanding the molecular basis of adaptation to the environment is a central question in evolutionary biology, yet linking detected signatures of positive selection to molecular mechanisms remains challenging. Here we demonstrate that combining sequence-based phylogenetic methods with structural information assists in making such mechanistic interpretations on a genomic scale. Our integrative analysis shows that positively selected sites tend to co-localise on protein structures and that positively selected clusters are found in functionally important regions of proteins, indicating that positive selection can contravene the well-known principle of evolutionary conservation of functionally important regions. This unexpected finding, along with our discovery that positive selection acts on structural clusters, opens new strategies for the development of better models of protein evolution. Remarkably, proteins where we detect the strongest evidence of clustering belong to just two functional groups: components of immune response and metabolic enzymes. This gives a coherent picture of immune response and xenobiotic metabolism as the drivers of adaptive evolution of mammals.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Truong Khanh Linh Dang ◽  
Thach Nguyen ◽  
Michael Habeck ◽  
Mehmet Gültas ◽  
Stephan Waack

Abstract Background Conformational transitions are implicated in the biological function of many proteins. Structural changes in proteins can be described approximately as the relative movement of rigid domains against each other. Despite previous efforts, there is a need to develop new domain segmentation algorithms that are capable of analysing the entire structure database efficiently and do not require the choice of protein-dependent tuning parameters such as the number of rigid domains. Results We develop a graph-based method for detecting rigid domains in proteins. Structural information from multiple conformational states is represented by a graph whose nodes correspond to amino acids. Graph clustering algorithms allow us to reduce the graph and run the Viterbi algorithm on the associated line graph to obtain a segmentation of the input structures into rigid domains. In contrast to many alternative methods, our approach does not require knowledge about the number of rigid domains. Moreover, we identified default values for the algorithmic parameters that are suitable for a large number of conformational ensembles. We test our algorithm on examples from the DynDom database and illustrate our method on various challenging systems whose structural transitions have been studied extensively. Conclusions The results strongly suggest that our graph-based algorithm forms a novel framework to characterize structural transitions in proteins via detecting their rigid domains. The web server is available at http://azifi.tz.agrar.uni-goettingen.de/webservice/.
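The idea of representing several conformational states as a graph over amino acids can be illustrated with a deliberately crude sketch. It is not the authors' algorithm: instead of graph clustering followed by the Viterbi algorithm on a line graph, it simply links two residues when their distance stays nearly constant across conformations and then reports connected components as candidate rigid bodies. The function name, the one-coordinate-per-residue input, and the 1.0 Å tolerance are all assumptions.

```python
import math


def rigid_components(conformations, tolerance=1.0):
    """Group residues whose pairwise distances are conserved across states.

    conformations: list of conformations, each a list of (x, y, z) tuples,
    one per residue, in the same residue order.
    Returns a list of sorted residue-index lists (connected components).
    """
    n = len(conformations[0])
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            dists = [math.dist(c[i], c[j]) for c in conformations]
            # Link residues whose distance barely changes between states.
            if max(dists) - min(dists) <= tolerance:
                adjacency[i].add(j)
                adjacency[j].add(i)

    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Toy input: residues 0-1 stay put, residues 2-3 move together.
state_a = [(0, 0, 0), (3.8, 0, 0), (20, 0, 0), (23.8, 0, 0)]
state_b = [(0, 0, 0), (3.8, 0, 0), (10, 0, 0), (13.8, 0, 0)]
print(rigid_components([state_a, state_b]))
```

Connected components over-merge when hinge residues bridge two domains, which is one reason a dedicated clustering and segmentation step, as the abstract describes, is needed in practice.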


2014 ◽  
Vol 70 (a1) ◽  
pp. C491-C491
Author(s):  
Jürgen Haas ◽  
Alessandro Barbato ◽  
Tobias Schmidt ◽  
Steven Roth ◽  
Andrew Waterhouse ◽  
...  

Computational modeling and prediction of three-dimensional macromolecular structures and complexes from their sequence has been a long-standing goal in structural biology. Over the last two decades, a paradigm shift has occurred: starting from a large "knowledge gap" between the huge number of protein sequences and the small number of experimentally known structures, today, some form of structural information – either experimental or computational – is available for the majority of amino acids encoded by common model organism genomes. Methods for structure modeling and prediction have made substantial progress over the last decades, and template-based homology modeling techniques have matured to a point where they are now routinely used to complement experimental techniques. However, computational modeling and prediction techniques often fall short in accuracy compared to high-resolution experimental structures, and it is often difficult to convey the expected accuracy and structural variability of a specific model. Retrospectively assessing the quality of blind structure prediction in comparison to experimental reference structures allows benchmarking the state-of-the-art in structure prediction and identifying areas which need further development. The Critical Assessment of Structure Prediction (CASP) experiment has for the last 20 years assessed the progress in the field of protein structure modeling based on predictions for ca. 100 blind prediction targets per experiment, which are carefully evaluated by human experts. The "Continuous Model EvaluatiOn" (CAMEO) project aims to provide a fully automated blind assessment for prediction servers based on weekly pre-released sequences of the Protein Data Bank (PDB). CAMEO has been made possible by the development of novel scoring methods such as lDDT, which are robust against domain movements to allow for automated continuous structure comparison without human intervention.
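Why a local distance score such as lDDT tolerates domain movements can be seen in a simplified sketch. The version below is an assumption-laden reduction, not the reference implementation: it uses one coordinate per residue (for example Cα atoms), a 15 Å inclusion radius, and the tolerance thresholds 0.5, 1, 2 and 4 Å, and it reports one global fraction of preserved distances rather than per-residue scores.

```python
import math


def simplified_lddt(reference, model, radius=15.0,
                    thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Fraction of local reference distances preserved in the model.

    reference, model: lists of (x, y, z) tuples, one per residue, same order.
    Only pairs closer than `radius` in the reference are considered, so no
    superposition is needed and distant domains do not affect each other.
    """
    if len(reference) != len(model):
        raise ValueError("reference and model must have the same length")
    preserved = 0
    considered = 0
    for i in range(len(reference)):
        for j in range(i + 1, len(reference)):
            d_ref = math.dist(reference[i], reference[j])
            if d_ref >= radius:
                continue  # not a local contact in the reference
            deviation = abs(d_ref - math.dist(model[i], model[j]))
            considered += len(thresholds)
            preserved += sum(deviation < t for t in thresholds)
    return preserved / considered if considered else 0.0


# Two rigid three-residue "domains" that lie more than 15 A apart.
ref = [(0, 0, 0), (3.8, 0, 0), (7.6, 0, 0),
       (30, 0, 0), (33.8, 0, 0), (37.6, 0, 0)]
# Model: second domain displaced as a rigid body by 20 A along y.
moved = ref[:3] + [(x, y + 20, z) for x, y, z in ref[3:]]
print(simplified_lddt(ref, moved))
```

Because every within-domain distance is unchanged and cross-domain pairs fall outside the inclusion radius, the displaced model still scores as fully preserved, whereas a superposition-based measure would penalise the shifted domain.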


2021 ◽  
Vol 12 ◽  
Author(s):  
Alongkorn Kurilung ◽  
Vincent Perreten ◽  
Nuvee Prapasarakul

Leptospira weilii belongs to the pathogenic Leptospira group and is a causal agent of human and animal leptospirosis in many world regions. L. weilii can produce varied clinical presentations, from asymptomatic through acute to chronic infections, and occupy several ecological niches. Nevertheless, the genomic features and genetic basis behind the host adaptability of L. weilii remain elusive due to limited information. Therefore, this study aimed to examine the complete circular genomes of two new L. weilii serogroup Mini strains (CUDO6 and CUD13) recovered from the urine of asymptomatic dogs in Thailand and to compare them with the 17 genomes available for L. weilii. Variant calling analysis (VCA) was also undertaken to gain potential insight into the missense mutations, focusing on the known pathogenesis-related genes. Whole genome sequences revealed that the CUDO6 and CUD13 strains each contained two chromosomes and one plasmid, with average genome size and G+C content of 4.37 Mbp and 40.7%, respectively. Both strains harbored almost all the confirmed pathogenesis-related genes in Leptospira. Two novel plasmid sequences, pDO6 and pD13, were identified in the strains CUDO6 and CUD13. Both plasmids contained genes responsible for stress response that may play important roles in bacterial adaptation during persistence in the kidneys. The core-single nucleotide polymorphisms phylogeny demonstrated that both strains had a close genetic relationship. Amongst the 19 L. weilii strains analyzed, the pan-genome analysis showed an open pan-genome structure, correlated with their high genetic diversity. VCA identified missense mutations in genes involved in endoflagella, lipopolysaccharide (LPS) structure, mammalian cell entry protein, and hemolytic activities, which may be associated with host-adaptation in the strains. Missense mutations of the endoflagella genes of CUDO6 and CUD13 were associated with loss of motility. These findings extend the knowledge about the pathogenic molecular mechanisms and genomic evolution of this important zoonotic pathogen.


2020 ◽  
Author(s):  
Andreas Schedlbauer ◽  
Idoia Iturrioz ◽  
Borja Ochoa-Lizarralde ◽  
Tammo Diercks ◽  
Jorge Pedro López-Alonso ◽  
...  

While a structural description of the molecular mechanisms guiding ribosome assembly in eukaryotic systems is emerging, bacteria employ an unrelated core set of assembly factors for which high-resolution structural information is still missing. To address this, we used single-particle cryo-EM to visualize the effects of bacterial ribosome assembly factors RimP, RbfA, RsmA, and RsgA on the conformational landscape of the 30S ribosomal subunit and obtained eight snapshots representing late steps in the folding of the decoding center. Analysis of these structures identifies a conserved secondary structure switch in the 16S rRNA central to decoding site maturation, and suggests both a sequential order of action and molecular mechanisms for the assembly factors in coordinating and controlling this switch. Structural and mechanistic parallels between bacterial and eukaryotic systems indicate common folding features inherent to all ribosomes.

