Ancestral sequence reconstruction: accounting for structural information by averaging over replacement matrices

Asher Moshe; Tal Pupko

doi:10.1093/bioinformatics/bty1031

Bioinformatics ◽

10.1093/bioinformatics/bty1031 ◽

2018 ◽

Vol 35 (15) ◽

pp. 2562-2568

Author(s):

Asher Moshe ◽

Tal Pupko

Keyword(s):

Structural Information ◽

Solvent Accessibility ◽

3D Structure ◽

Three Dimensional ◽

Ancestral Sequence ◽

Supplementary Information ◽

Ancestral Sequence Reconstruction ◽

Ancestral Sequences ◽

Sequence Reconstruction ◽

And Function

Abstract

Motivation: Ancestral sequence reconstruction (ASR) is widely used to understand protein evolution, structure and function. Current ASR methodologies do not fully consider differences in evolutionary constraints among positions imposed by the three-dimensional (3D) structure of the protein. Here, we developed an ASR algorithm that allows different protein sites to evolve according to different mixtures of replacement matrices. We show that assigning replacement matrices to protein positions based on their solvent accessibility leads to ASR with higher log-likelihoods compared to naïve models that assume a single replacement matrix for all sites. Improved ASR log-likelihoods are also demonstrated when solvent accessibility is predicted from protein sequences rather than inferred from a known 3D structure. Finally, we show that using such structure-aware mixture models results in substantial differences in the inferred ancestral sequences.

Availability and implementation: http://fastml.tau.ac.il.

Supplementary information: Supplementary data are available at Bioinformatics online.
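
As a rough illustration of the mixture idea described in the abstract above, the following sketch averages a site's likelihood over several replacement matrices, with mixture weights chosen by the site's solvent-accessibility class. The function names, the two-class buried/exposed split, the weight values, and the per-matrix likelihoods are all hypothetical and chosen for illustration; they are not taken from the paper or from the FastML implementation.

```python
import math

# Hypothetical mixture weights over two replacement matrices
# (index 0: a "buried" matrix, index 1: an "exposed" matrix).
# The values are illustrative assumptions, not figures from the paper.
MIXTURE_WEIGHTS = {
    "buried": (0.8, 0.2),
    "exposed": (0.2, 0.8),
}


def site_log_likelihood(per_matrix_likelihoods, weights):
    """Return log(sum_k w_k * L_k) for a single alignment site."""
    if len(per_matrix_likelihoods) != len(weights):
        raise ValueError("one weight is needed per replacement matrix")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("mixture weights must sum to 1")
    mixed = sum(w * lk for w, lk in zip(weights, per_matrix_likelihoods))
    return math.log(mixed)


def alignment_log_likelihood(sites):
    """Sum site log-likelihoods; each site is (accessibility_class, likelihoods)."""
    return sum(
        site_log_likelihood(likelihoods, MIXTURE_WEIGHTS[cls])
        for cls, likelihoods in sites
    )


# Toy input: each site's likelihood under each matrix is assumed to have
# been computed beforehand (the tree-based computation is not shown here).
toy_sites = [
    ("buried", (1e-3, 2e-4)),
    ("exposed", (5e-4, 4e-3)),
]
print(alignment_log_likelihood(toy_sites))
```

A single-matrix model corresponds to giving one matrix a weight of 1 at every site, so the two kinds of model can be compared on the same log-likelihood scale.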

Ancestral sequence reconstruction produces thermally stable enzymes with mesophilic enzyme-like catalytic properties

Scientific Reports ◽

10.1038/s41598-020-72418-4 ◽

2020 ◽

Vol 10 (1) ◽

Cited By ~ 2

Author(s):

Ryutaro Furukawa ◽

Wakako Toma ◽

Koji Yamazaki ◽

Satoshi Akanuma

Keyword(s):

Low Temperature ◽

Catalytic Properties ◽

Amino Acid Sequences ◽

Ancestral Sequence ◽

Kinetic Properties ◽

Reconstruction Technique ◽

Thermally Stable ◽

Ancestral Sequence Reconstruction ◽

Ancestral Sequences ◽

Sequence Reconstruction

Abstract Enzymes have high catalytic efficiency and low environmental impact, and are therefore potentially useful tools for various industrial processes. Crucially, however, natural enzymes do not always have the properties required for specific processes. It may be necessary, therefore, to design, engineer, and evolve enzymes with properties that are not found in natural enzymes. In particular, the creation of enzymes that are thermally stable and catalytically active at low temperature is desirable for processes involving both high and low temperatures. In the current study, we designed two ancestral sequences of 3-isopropylmalate dehydrogenase by an ancestral sequence reconstruction technique based on a phylogenetic analysis of extant homologous amino acid sequences. Genes encoding the designed sequences were artificially synthesized and expressed in Escherichia coli. The reconstructed enzymes were found to be slightly more thermally stable than the extant thermophilic homologue from Thermus thermophilus. Moreover, they had considerably higher low-temperature catalytic activity as compared with the T. thermophilus enzyme. Detailed analyses of their temperature-dependent specific activities and kinetic properties showed that the reconstructed enzymes have catalytic properties similar to those of mesophilic homologues. Collectively, our study demonstrates that ancestral sequence reconstruction can produce a thermally stable enzyme with catalytic properties adapted to low-temperature reactions.

SphereCon—a method for precise estimation of residue relative solvent accessible area from limited structural information

Bioinformatics ◽

10.1093/bioinformatics/btaa159 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3372-3378

Author(s):

Alexander Gress ◽

Olga V Kalinina

Keyword(s):

Protein Function ◽

Structural Information ◽

Solvent Accessibility ◽

Three Dimensional ◽

Structural Data ◽

Supplementary Information ◽

Dimensional Structure ◽

Relative Solvent Accessibility ◽

Precise Measure ◽

The Impact

Abstract

Motivation: In proteins, solvent accessibility of individual residues is a factor contributing to their importance for protein function and stability. Hence one might wish to calculate solvent accessibility in order to predict the impact of mutations and their pathogenicity, and for other biomedical applications. A direct computation of solvent accessibility is only possible if all atoms of a protein three-dimensional structure are reliably resolved.

Results: We present SphereCon, a new precise measure that can estimate residue relative solvent accessibility (RSA) from limited data. The measure is based on calculating the volume of intersection of a sphere with a cone cut out in the direction opposite of the residue with surrounding atoms. We propose a method for estimating the position and volume of residue atoms in cases when they are not known from the structure, or when the structural data are unreliable or missing. We show that in cases of reliable input structures, SphereCon correlates almost perfectly with the directly computed RSA, and outperforms other previously suggested indirect methods. Moreover, SphereCon is the only measure that yields accurate results when the identities of amino acids are unknown. A significant novel feature of SphereCon is that it can estimate RSA from inter-residue distance and contact matrices, without any information about the actual atom coordinates.

Availability and implementation: https://github.com/kalininalab/spherecon.

Supplementary information: Supplementary data are available at Bioinformatics online.
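
For context on the quantity being estimated: relative solvent accessibility is conventionally the ratio of a residue's accessible surface area (ASA) to a reference maximum ASA for that residue type. The sketch below computes that ratio and applies a burial cutoff. It shows the direct definition only, not SphereCon's sphere-and-cone estimate; the function names, the 0.2 cutoff, and the numeric inputs are assumptions made for illustration.

```python
def relative_solvent_accessibility(asa, max_asa):
    """Return ASA divided by the reference maximum ASA for the residue type."""
    if max_asa <= 0:
        raise ValueError("reference maximum ASA must be positive")
    if asa < 0:
        raise ValueError("ASA cannot be negative")
    return asa / max_asa


def is_buried(rsa, threshold=0.2):
    """Classify a residue as buried when its RSA falls below the cutoff.

    The default cutoff of 0.2 is an illustrative assumption; published
    analyses use a range of thresholds.
    """
    return rsa < threshold


# Hypothetical inputs in square angstroms; not measurements from any structure.
rsa = relative_solvent_accessibility(asa=30.0, max_asa=200.0)
print(rsa, is_buried(rsa))
```

The reference maximum depends on the residue type and on the scale chosen, so it is passed in explicitly here instead of being hard-coded.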

Exploring the past and the future of protein evolution with ancestral sequence reconstruction: the ‘retro’ approach to protein engineering

Biochemical Journal ◽

10.1042/bcj20160507 ◽

2016 ◽

Vol 474 (1) ◽

pp. 1-19 ◽

Cited By ~ 39

Author(s):

Yosephine Gumulya ◽

Elizabeth M.J. Gillam

Keyword(s):

Protein Engineering ◽

Protein Evolution ◽

Phylogenetic Trees ◽

Ancestral Sequence ◽

Novel Proteins ◽

Ancestral Sequence Reconstruction ◽

Ancestral Sequences ◽

Sequence Reconstruction ◽

Ancient Proteins ◽

Life On Earth

A central goal in molecular evolution is to understand the ways in which genes and proteins evolve in response to changing environments. In the absence of intact DNA from fossils, ancestral sequence reconstruction (ASR) can be used to infer the evolutionary precursors of extant proteins. To date, ancestral proteins belonging to eubacteria, archaea, yeast and vertebrates have been inferred that have been hypothesized to date from between several million to over 3 billion years ago. ASR has yielded insights into the early history of life on Earth and the evolution of proteins and macromolecular complexes. Recently, however, ASR has developed from a tool for testing hypotheses about protein evolution to a useful means for designing novel proteins. The strength of this approach lies in the ability to infer ancestral sequences encoding proteins that have desirable properties compared with contemporary forms, particularly thermostability and broad substrate range, making them good starting points for laboratory evolution. Developments in technologies for DNA sequencing and synthesis and computational phylogenetic analysis have led to an escalation in the number of ancient proteins resurrected in the last decade and greatly facilitated the use of ASR in the burgeoning field of synthetic biology. However, the primary challenge of ASR remains in accurately inferring ancestral states, despite the uncertainty arising from evolutionary models, incomplete sequences and limited phylogenetic trees. This review will focus, firstly, on the use of ASR to uncover links between sequence and phenotype and, secondly, on the practical application of ASR in protein engineering.

FireProtASR: A Web Server for Fully Automated Ancestral Sequence Reconstruction

Briefings in Bioinformatics ◽

10.1093/bib/bbaa337 ◽

2020 ◽

Author(s):

Milos Musil ◽

Rayyan Tariq Khan ◽

Andy Beier ◽

Jan Stourac ◽

Hannes Konegger ◽

...

Keyword(s):

Ancestral Sequence ◽

Reconstruction Algorithms ◽

Ancestral Reconstruction ◽

Web Interface ◽

Computational Tools ◽

Ancestral Sequence Reconstruction ◽

Ancestral Sequences ◽

Native Proteins ◽

Sequence Reconstruction ◽

History Of

Abstract There is great interest in increasing proteins’ stability to widen their usability in numerous biomedical and biotechnological applications. However, native proteins cannot usually withstand the harsh industrial environment, since they have evolved to function under mild conditions. Ancestral sequence reconstruction is a well-established method for deducing the evolutionary history of genes. Besides its applicability to discovering the most probable evolutionary ancestors of modern proteins, ancestral sequence reconstruction has proven to be a useful approach for the design of highly stable proteins. Recently, several computational tools were developed that make ancestral reconstruction algorithms accessible to the community, while leaving the most crucial steps of preparing the input data to the user. FireProtASR aims to overcome this obstacle by constructing a fully automated workflow, allowing even inexperienced users to obtain ancestral sequences based on a sequence query as the only input. FireProtASR is complemented with an interactive, easy-to-use web interface and is freely available at https://loschmidt.chemi.muni.cz/fireprotasr/.

Faculty Opinions recommendation of An experimental phylogeny to benchmark ancestral sequence reconstruction.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726740118.793524093 ◽

2016 ◽

Author(s):

Reinhard Sterner ◽

Rainer Merkl

Keyword(s):

Ancestral Sequence ◽

Ancestral Sequence Reconstruction ◽

Sequence Reconstruction

Ancestral sequence reconstruction for protein engineers

Current Opinion in Structural Biology ◽

10.1016/j.sbi.2021.04.001 ◽

2021 ◽

Vol 69 ◽

pp. 131-141

Author(s):

Matthew A. Spence ◽

Joe A. Kaczmarski ◽

Jake W. Saunders ◽

Colin J. Jackson

Keyword(s):

Ancestral Sequence ◽

Ancestral Sequence Reconstruction ◽

Sequence Reconstruction

P3DOCK: a protein–RNA docking webserver based on template-based and template-free docking

Bioinformatics ◽

10.1093/bioinformatics/btz478 ◽

2019 ◽

Vol 36 (1) ◽

pp. 96-103 ◽

Cited By ~ 4

Author(s):

Jinfang Zheng ◽

Xu Hong ◽

Juan Xie ◽

Xiaoxue Tong ◽

Shiyong Liu

Keyword(s):

Success Rate ◽

Clustering Algorithm ◽

Hybrid Methods ◽

3D Structure ◽

Three Dimensional ◽

Protein Docking ◽

Ease Of Use ◽

Supplementary Information ◽

Main Function ◽

Rna Interaction

Abstract

Motivation: The main function of protein–RNA interaction is to regulate the expression of genes. Therefore, studying protein–RNA interactions is of great significance. The information of three-dimensional (3D) structures reveals that atomic interactions are particularly important. The calculation method for modeling a 3D structure of a complex mainly includes two strategies: free docking and template-based docking. These two methods are complementary in protein–protein docking. Therefore, integrating these two methods may improve the prediction accuracy.

Results: In this article, we compare the difference between the free docking and the template-based algorithm. Then we show the complementarity of these two methods. Based on the analysis of the calculation results, the transition point is confirmed and used to integrate two docking algorithms to develop P3DOCK. P3DOCK holds the advantages of both algorithms. The results of the three docking benchmarks show that P3DOCK is better than those two non-hybrid docking algorithms. The success rate of P3DOCK is also higher (3–20%) than state-of-the-art hybrid and non-hybrid methods. Finally, the hierarchical clustering algorithm is utilized to cluster P3DOCK’s decoys. The clustering algorithm improves the success rate of P3DOCK. For ease of use, we provide a P3DOCK webserver, which can be accessed at www.rnabinding.com/P3DOCK/P3DOCK.html. An integrated protein–RNA docking benchmark can be downloaded from http://rnabinding.com/P3DOCK/benchmark.html.

Availability and implementation: www.rnabinding.com/P3DOCK/P3DOCK.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
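
One plausible reading of the "transition point" in the abstract above is a cutoff on template quality: at or above it the template-based result is used, and below it the workflow falls back to free docking. The sketch below illustrates only that switching logic. The score, the cutoff value, and every name in it are hypothetical; the abstract does not specify how P3DOCK defines or applies the transition point.

```python
from typing import Callable, List


def hybrid_dock(
    template_score: float,
    transition_point: float,
    template_based: Callable[[], List[str]],
    free_docking: Callable[[], List[str]],
) -> List[str]:
    """Pick a docking strategy by comparing a template score to a cutoff.

    Assumption: a higher template_score means a more reliable template.
    """
    if template_score >= transition_point:
        return template_based()
    return free_docking()


# Stand-in strategies that return labelled decoy names instead of real models.
def fake_template_docking() -> List[str]:
    return ["template_decoy_1"]


def fake_free_docking() -> List[str]:
    return ["free_decoy_1", "free_decoy_2"]


print(hybrid_dock(0.9, 0.5, fake_template_docking, fake_free_docking))
print(hybrid_dock(0.1, 0.5, fake_template_docking, fake_free_docking))
```

Passing the two strategies as callables means only the selected one is run, which matters when each docking run is expensive.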

Alignment Modulates Ancestral Sequence Reconstruction Accuracy

Molecular Biology and Evolution ◽

10.1093/molbev/msy055 ◽

2018 ◽

Vol 35 (7) ◽

pp. 1783-1797 ◽

Cited By ~ 25

Author(s):

Ricardo Assunção Vialle ◽

Asif U Tamuri ◽

Nick Goldman

Keyword(s):

Ancestral Sequence ◽

Reconstruction Accuracy ◽

Ancestral Sequence Reconstruction ◽

Sequence Reconstruction

Ancestral sequence reconstruction as a tool to understand natural history and guide synthetic biology: realizing and extending the vision of Zuckerkandl and Pauling

Ancestral Sequence Reconstruction ◽

10.1093/acprof:oso/9780199299188.003.0002 ◽

2007 ◽

pp. 20-33 ◽

Cited By ~ 5

Author(s):

Eric A. Gaucher

Keyword(s):

Synthetic Biology ◽

Natural History ◽

Ancestral Sequence ◽

Ancestral Sequence Reconstruction ◽

Sequence Reconstruction

Sequence selection by FitSS4ASR alleviates ancestral sequence reconstruction as exemplified for geranylgeranylglyceryl phosphate synthase

Biological Chemistry ◽

10.1515/hsz-2018-0344 ◽

2019 ◽

Vol 400 (3) ◽

pp. 367-381 ◽

Cited By ~ 1

Author(s):

Kristina Straub ◽

Mona Linde ◽

Cosimo Kropp ◽

Samuel Blanquart ◽

Patrick Babinger ◽

...

Keyword(s):

Phylogenetic Signal ◽

Phylogenetic Analyses ◽

Specific Property ◽

Ancestral Sequence ◽

Rational Approach ◽

Ancestral Sequence Reconstruction ◽

Representative Sequence ◽

Sequence Selection ◽

Sequence Reconstruction ◽

Phosphate Synthase

Abstract For evolutionary studies, but also for protein engineering, ancestral sequence reconstruction (ASR) has become an indispensable tool. The first step of every ASR protocol is the preparation of a representative sequence set containing at most a few hundred recent homologs whose composition determines decisively the outcome of a reconstruction. A common approach for sequence selection consists of several rounds of manual recompilation that is driven by embedded phylogenetic analyses of the varied sequence sets. For ASR of a geranylgeranylglyceryl phosphate synthase, we additionally utilized FitSS4ASR, which replaces this time-consuming protocol with an efficient and more rational approach. FitSS4ASR applies orthogonal filters to a set of homologs to eliminate outlier sequences and those bearing only a weak phylogenetic signal. To demonstrate the usefulness of FitSS4ASR, we determined experimentally the oligomerization state of eight predecessors, which is a delicate and taxon-specific property. Corresponding ancestors deduced in a manual approach and by means of FitSS4ASR had the same dimeric or hexameric conformation; this concordance testifies to the efficiency of FitSS4ASR for sequence selection. FitSS4ASR-based results of two other ASR experiments were added to the Supporting Information. Program and documentation are available at https://gitlab.bioinf.ur.de/hek61586/FitSS4ASR.
