Ancestral sequence reconstruction: accounting for structural information by averaging over replacement matrices

2018 ◽  
Vol 35 (15) ◽  
pp. 2562-2568
Author(s):  
Asher Moshe ◽  
Tal Pupko

Abstract Motivation: Ancestral sequence reconstruction (ASR) is widely used to understand protein evolution, structure and function. Current ASR methodologies do not fully consider differences in evolutionary constraints among positions imposed by the three-dimensional (3D) structure of the protein. Here, we developed an ASR algorithm that allows different protein sites to evolve according to different mixtures of replacement matrices. We show that assigning replacement matrices to protein positions based on their solvent accessibility leads to ASR with higher log-likelihoods compared to naïve models that assume a single replacement matrix for all sites. Improved ASR log-likelihoods are also demonstrated when solvent accessibility is predicted from protein sequences rather than inferred from a known 3D structure. Finally, we show that using such structure-aware mixture models results in substantial differences in the inferred ancestral sequences. Availability and implementation: http://fastml.tau.ac.il. Supplementary information: Supplementary data are available at Bioinformatics online.
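
The idea of averaging over replacement matrices can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the likelihood of each site under each replacement matrix has already been computed elsewhere (for example by a standard pruning pass over the tree), and that each site carries an accessibility class with its own mixture weights. The function names, the class labels "buried" and "exposed", and all numbers are hypothetical.

```python
import math


def site_mixture_likelihood(per_matrix_likelihoods, weights):
    """Weighted average of one site's likelihood over replacement matrices."""
    if len(per_matrix_likelihoods) != len(weights):
        raise ValueError("need one weight per replacement matrix")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * lik for w, lik in zip(weights, per_matrix_likelihoods))


def alignment_log_likelihood(sites, weights_by_class):
    """Sum of per-site log mixture likelihoods.

    sites: list of (accessibility_class, per_matrix_likelihoods) pairs.
    weights_by_class: mixture weights for each accessibility class.
    """
    total = 0.0
    for accessibility_class, per_matrix_likelihoods in sites:
        weights = weights_by_class[accessibility_class]
        total += math.log(site_mixture_likelihood(per_matrix_likelihoods, weights))
    return total


# Hypothetical numbers: two replacement matrices, two accessibility classes.
weights_by_class = {"buried": [0.9, 0.1], "exposed": [0.2, 0.8]}
sites = [("buried", [0.02, 0.005]), ("exposed", [0.001, 0.01])]
print(alignment_log_likelihood(sites, weights_by_class))
```

A single-matrix model is the special case in which every class puts weight 1 on the same matrix, which is why the two kinds of model can be compared by their log-likelihoods.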

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Ryutaro Furukawa ◽  
Wakako Toma ◽  
Koji Yamazaki ◽  
Satoshi Akanuma

Abstract Enzymes have high catalytic efficiency and low environmental impact, and are therefore potentially useful tools for various industrial processes. Crucially, however, natural enzymes do not always have the properties required for specific processes. It may be necessary, therefore, to design, engineer, and evolve enzymes with properties that are not found in natural enzymes. In particular, the creation of enzymes that are thermally stable and catalytically active at low temperature is desirable for processes involving both high and low temperatures. In the current study, we designed two ancestral sequences of 3-isopropylmalate dehydrogenase by an ancestral sequence reconstruction technique based on a phylogenetic analysis of extant homologous amino acid sequences. Genes encoding the designed sequences were artificially synthesized and expressed in Escherichia coli. The reconstructed enzymes were found to be slightly more thermally stable than the extant thermophilic homologue from Thermus thermophilus. Moreover, they had considerably higher low-temperature catalytic activity as compared with the T. thermophilus enzyme. Detailed analyses of their temperature-dependent specific activities and kinetic properties showed that the reconstructed enzymes have catalytic properties similar to those of mesophilic homologues. Collectively, our study demonstrates that ancestral sequence reconstruction can produce a thermally stable enzyme with catalytic properties adapted to low-temperature reactions.


2020 ◽  
Vol 36 (11) ◽  
pp. 3372-3378
Author(s):  
Alexander Gress ◽  
Olga V Kalinina

Abstract Motivation: In proteins, solvent accessibility of individual residues is a factor contributing to their importance for protein function and stability. Hence one might wish to calculate solvent accessibility in order to predict the impact of mutations, their pathogenicity and for other biomedical applications. A direct computation of solvent accessibility is only possible if all atoms of a protein three-dimensional structure are reliably resolved. Results: We present SphereCon, a new precise measure that can estimate residue relative solvent accessibility (RSA) from limited data. The measure is based on calculating the volume of intersection of a sphere with a cone cut out in the direction opposite of the residue with surrounding atoms. We propose a method for estimating the position and volume of residue atoms in cases when they are not known from the structure, or when the structural data are unreliable or missing. We show that in cases of reliable input structures, SphereCon correlates almost perfectly with the directly computed RSA, and outperforms other previously suggested indirect methods. Moreover, SphereCon is the only measure that yields accurate results when the identities of amino acids are unknown. A significant novel feature of SphereCon is that it can estimate RSA from inter-residue distance and contact matrices, without any information about the actual atom coordinates. Availability and implementation: https://github.com/kalininalab/spherecon. Supplementary information: Supplementary data are available at Bioinformatics online.


2016 ◽  
Vol 474 (1) ◽  
pp. 1-19 ◽  
Author(s):  
Yosephine Gumulya ◽  
Elizabeth M.J. Gillam

A central goal in molecular evolution is to understand the ways in which genes and proteins evolve in response to changing environments. In the absence of intact DNA from fossils, ancestral sequence reconstruction (ASR) can be used to infer the evolutionary precursors of extant proteins. To date, ancestral proteins belonging to eubacteria, archaea, yeast and vertebrates have been inferred that have been hypothesized to date from between several million to over 3 billion years ago. ASR has yielded insights into the early history of life on Earth and the evolution of proteins and macromolecular complexes. Recently, however, ASR has developed from a tool for testing hypotheses about protein evolution to a useful means for designing novel proteins. The strength of this approach lies in the ability to infer ancestral sequences encoding proteins that have desirable properties compared with contemporary forms, particularly thermostability and broad substrate range, making them good starting points for laboratory evolution. Developments in technologies for DNA sequencing and synthesis and computational phylogenetic analysis have led to an escalation in the number of ancient proteins resurrected in the last decade and greatly facilitated the use of ASR in the burgeoning field of synthetic biology. However, the primary challenge of ASR remains in accurately inferring ancestral states, despite the uncertainty arising from evolutionary models, incomplete sequences and limited phylogenetic trees. This review will focus, firstly, on the use of ASR to uncover links between sequence and phenotype and, secondly, on the practical application of ASR in protein engineering.


Author(s):  
Milos Musil ◽  
Rayyan Tariq Khan ◽  
Andy Beier ◽  
Jan Stourac ◽  
Hannes Konegger ◽  
...  

Abstract There is great interest in increasing proteins’ stability to widen their usability in numerous biomedical and biotechnological applications. However, native proteins usually cannot withstand harsh industrial environments, since they evolved to function under mild conditions. Ancestral sequence reconstruction is a well-established method for deducing the evolutionary history of genes. Besides its applicability to discovering the most probable evolutionary ancestors of modern proteins, ancestral sequence reconstruction has proven to be a useful approach for the design of highly stable proteins. Recently, several computational tools were developed that make ancestral reconstruction algorithms accessible to the community, while leaving the most crucial steps of preparing the input data to the user. FireProtASR aims to overcome this obstacle by constructing a fully automated workflow, allowing even inexperienced users to obtain ancestral sequences based on a sequence query as the only input. FireProtASR is complemented with an interactive, easy-to-use web interface and is freely available at https://loschmidt.chemi.muni.cz/fireprotasr/.


2021 ◽  
Vol 69 ◽  
pp. 131-141
Author(s):  
Matthew A. Spence ◽  
Joe A. Kaczmarski ◽  
Jake W. Saunders ◽  
Colin J. Jackson

2019 ◽  
Vol 36 (1) ◽  
pp. 96-103 ◽  
Author(s):  
Jinfang Zheng ◽  
Xu Hong ◽  
Juan Xie ◽  
Xiaoxue Tong ◽  
Shiyong Liu

Abstract Motivation: The main function of protein–RNA interaction is to regulate the expression of genes. Therefore, studying protein–RNA interactions is of great significance. The information of three-dimensional (3D) structures reveals that atomic interactions are particularly important. The calculation method for modeling a 3D structure of a complex mainly includes two strategies: free docking and template-based docking. These two methods are complementary in protein–protein docking. Therefore, integrating these two methods may improve the prediction accuracy. Results: In this article, we compare the difference between the free docking and the template-based algorithm. Then we show the complementarity of these two methods. Based on the analysis of the calculation results, the transition point is confirmed and used to integrate two docking algorithms to develop P3DOCK. P3DOCK holds the advantages of both algorithms. The results of the three docking benchmarks show that P3DOCK is better than those two non-hybrid docking algorithms. The success rate of P3DOCK is also higher (3–20%) than state-of-the-art hybrid and non-hybrid methods. Finally, the hierarchical clustering algorithm is utilized to cluster the P3DOCK’s decoys. The clustering algorithm improves the success rate of P3DOCK. For ease of use, we provide a P3DOCK webserver, which can be accessed at www.rnabinding.com/P3DOCK/P3DOCK.html. An integrated protein–RNA docking benchmark can be downloaded from http://rnabinding.com/P3DOCK/benchmark.html. Availability and implementation: www.rnabinding.com/P3DOCK/P3DOCK.html. Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 35 (7) ◽  
pp. 1783-1797 ◽  
Author(s):  
Ricardo Assunção Vialle ◽  
Asif U Tamuri ◽  
Nick Goldman

2019 ◽  
Vol 400 (3) ◽  
pp. 367-381 ◽  
Author(s):  
Kristina Straub ◽  
Mona Linde ◽  
Cosimo Kropp ◽  
Samuel Blanquart ◽  
Patrick Babinger ◽  
...  

Abstract For evolutionary studies, but also for protein engineering, ancestral sequence reconstruction (ASR) has become an indispensable tool. The first step of every ASR protocol is the preparation of a representative sequence set containing at most a few hundred recent homologs whose composition determines decisively the outcome of a reconstruction. A common approach for sequence selection consists of several rounds of manual recompilation that is driven by embedded phylogenetic analyses of the varied sequence sets. For ASR of a geranylgeranylglyceryl phosphate synthase, we additionally utilized FitSS4ASR, which replaces this time-consuming protocol with an efficient and more rational approach. FitSS4ASR applies orthogonal filters to a set of homologs to eliminate outlier sequences and those bearing only a weak phylogenetic signal. To demonstrate the usefulness of FitSS4ASR, we determined experimentally the oligomerization state of eight predecessors, which is a delicate and taxon-specific property. Corresponding ancestors deduced in a manual approach and by means of FitSS4ASR had the same dimeric or hexameric conformation; this concordance testifies to the efficiency of FitSS4ASR for sequence selection. FitSS4ASR-based results of two other ASR experiments were added to the Supporting Information. Program and documentation are available at https://gitlab.bioinf.ur.de/hek61586/FitSS4ASR.

