Error-estimation-guided rebuilding of de novo models increases the success rate of ab initio phasing

Recent advancements in computational methods for protein-structure prediction have made it possible to generate the high-quality de novo models required for ab initio phasing of crystallographic diffraction data using molecular replacement. Despite those encouraging achievements in ab initio phasing using de novo models, its success is limited only to those targets for which high-quality de novo models can be generated. In order to increase the scope of targets to which ab initio phasing with de novo models can be successfully applied, it is necessary to reduce the errors in the de novo models that are used as templates for molecular replacement. Here, an approach is introduced that can identify and rebuild the residues with larger errors, which subsequently reduces the overall Cα root-mean-square deviation (CA-RMSD) from the native protein structure. The error in a predicted model is estimated from the average pairwise geometric distance per residue computed among selected lowest energy coarse-grained models. This score is subsequently employed to guide a rebuilding process that focuses on more error-prone residues in the coarse-grained models. This rebuilding methodology has been tested on ten protein targets that were unsuccessful using previous methods. The average CA-RMSD of the coarse-grained models was improved from 4.93 to 4.06 Å. For those models with CA-RMSD less than 3.0 Å, the average CA-RMSD was improved from 3.38 to 2.60 Å. These rebuilt coarse-grained models were then converted into all-atom models and refined to produce improved de novo models for molecular replacement. Seven diffraction data sets were successfully phased using rebuilt de novo models, indicating the improved quality of these rebuilt de novo models and the effectiveness of the rebuilding process. Software implementing this method, called MORPHEUS, can be downloaded from http://www.riken.jp/zhangiru/software.html.
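The per-residue error estimate described in this abstract can be illustrated with a short sketch. It is not the MORPHEUS implementation: it assumes each model is given as a list of Cα coordinates already superimposed in a common frame, and the function names and the `fraction` cutoff used to pick residues for rebuilding are hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def per_residue_error(models):
    """Average pairwise CA-CA distance per residue across a set of models.

    models: list of models, each a list of (x, y, z) CA coordinates.
    Assumption: all models have the same length and are already
    superimposed in a common reference frame.
    """
    if len(models) < 2:
        raise ValueError("need at least two models")
    n_res = len(models[0])
    if any(len(m) != n_res for m in models):
        raise ValueError("all models must have the same number of residues")

    # Every unordered pair of models contributes one distance per residue.
    pairs = list(combinations(range(len(models)), 2))
    errors = []
    for i in range(n_res):
        total = sum(math.dist(models[a][i], models[b][i]) for a, b in pairs)
        errors.append(total / len(pairs))
    return errors


def residues_to_rebuild(errors, fraction=0.2):
    """Indices of the highest-error residues.

    The fraction cutoff is a hypothetical parameter, not taken from the paper.
    """
    n_pick = max(1, round(fraction * len(errors)))
    ranked = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    return sorted(ranked[:n_pick])


if __name__ == "__main__":
    # Three toy 3-residue models that disagree only at the last residue.
    toy = [
        [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
        [(0, 0, 0), (1, 0, 0), (2, 3, 0)],
        [(0, 0, 0), (1, 0, 0), (2, 0, 4)],
    ]
    scores = per_residue_error(toy)
    print(scores, residues_to_rebuild(scores))
```

Residues where the low-energy models agree receive a score near zero, while residues where they diverge receive a large score, so the score can rank positions for targeted rebuilding.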


Error estimation guided rebuilding of de novo models for ab initio phasing

Acta Crystallographica Section A Foundations and Advances ◽

10.1107/s2053273314085519 ◽

2014 ◽

Vol 70 (a1) ◽

pp. C1448-C1448

Author(s):

Rojan Shrestha ◽

David Simoncini ◽

Kam Zhang

Keyword(s):

Protein Structure ◽

Ab Initio ◽

Structure Prediction ◽

De Novo ◽

Coarse Grained ◽

Molecular Replacement ◽

High Quality ◽

Protein Targets ◽

Current State

Recent advances in computational methods for protein structure prediction have made it possible to generate the high-quality de novo models required for ab initio phasing of crystallographic diffraction data using molecular replacement. Despite these encouraging achievements, ab initio phasing with de novo models succeeds only for those targets for which high-quality de novo models can be generated. Here, an approach is introduced that can identify and rebuild the residues with larger errors, which subsequently reduces the overall C-alpha root-mean-square deviation (CA-RMSD) from the native protein structure. The error in a predicted model is estimated by the average pairwise geometric distance per residue computed among selected lowest energy coarse-grained models. This score is subsequently employed to guide a rebuilding process that focuses on more error-prone residues in the coarse-grained models. These rebuilt coarse-grained models were then converted into all-atom models and refined to produce improved de novo models for molecular replacement. This rebuilding methodology has been tested on ten protein targets that were unsuccessful with the current state-of-the-art methods. Seven diffraction datasets were successfully phased using rebuilt de novo models, indicating the improved quality of these rebuilt de novo models and the effectiveness of this rebuilding process.
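For readers unfamiliar with the CA-RMSD metric used to judge the rebuilt models, a minimal sketch follows. It assumes the model and the native structure are supplied as equal-length lists of C-alpha coordinates that have already been optimally superimposed; the superposition step itself is omitted, and the function name is hypothetical.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-square deviation over matched CA atoms.

    model, native: lists of (x, y, z) CA coordinates, one per residue.
    Assumption: both structures are already superimposed; no fitting
    is performed here.
    """
    if not model or len(model) != len(native):
        raise ValueError("structures must be non-empty and of equal length")
    # Sum of squared coordinate differences over all residues.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    native = [(3.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
    print(ca_rmsd(model, native))
```

A lower value means the model's backbone trace lies closer to the native structure, which is why a reduction in CA-RMSD is used as evidence that rebuilding improved the models.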


FALCON2: a web server for high-quality prediction of protein tertiary structures

BMC Bioinformatics ◽

10.1186/s12859-021-04353-8 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Lupeng Kong ◽

Fusong Ju ◽

Haicang Zhang ◽

Shiwei Sun ◽

Dongbo Bu

Keyword(s):

Protein Structure ◽

Ab Initio ◽

Protein Structure Prediction ◽

Structure Prediction ◽

Web Server ◽

High Quality ◽

Tertiary Structures ◽

Target Proteins ◽

High Quality Protein ◽

Protein Functions

Background: Accurate prediction of protein tertiary structures is highly desired, as knowledge of protein structures provides invaluable insights into protein functions. We have designed two approaches to protein structure prediction: a template-based modeling approach (called ProALIGN) and an ab initio prediction approach (called ProFOLD). Briefly, ProALIGN aligns a target protein with templates by exploiting the patterns of context-specific alignment motifs and then builds the final structure with reference to the homologous templates. In contrast, ProFOLD uses an end-to-end neural network to estimate inter-residue distances of target proteins and builds structures that satisfy these distance constraints. These two approaches emphasize different characteristics of target proteins: ProALIGN exploits structure information of homologous templates of target proteins, while ProFOLD exploits the co-evolutionary information carried by homologous protein sequences. Recent progress has shown that the combination of template-based modeling and ab initio approaches is promising.
Results: In this study, we present FALCON2, a web server that integrates ProALIGN and ProFOLD to provide a high-quality protein structure prediction service. For a target protein, FALCON2 executes ProALIGN and ProFOLD simultaneously to predict possible structures and selects the most likely one as the final prediction result. We evaluated FALCON2 on widely used benchmarks, including 104 CASP13 (the 13th Critical Assessment of protein Structure Prediction) targets and 91 CASP14 targets. In-depth examination suggests that when high-quality templates are available, ProALIGN is superior to ProFOLD, and in other cases ProFOLD shows better performance. By integrating these two approaches with different emphases, the FALCON2 server outperforms the two individual approaches and also achieves state-of-the-art performance compared with existing approaches.
Conclusions: By integrating template-based modeling and ab initio approaches, FALCON2 provides an easy-to-use and high-quality protein structure prediction service for the community, and we expect it to enable insights into a deep understanding of protein functions.


A coarse-grained Langevin molecular dynamics approach to de novo protein structure prediction

Biochemical and Biophysical Research Communications ◽

10.1016/j.bbrc.2008.02.048 ◽

2008 ◽

Vol 369 (2) ◽

pp. 500-506 ◽

Cited By ~ 7

Author(s):

Takeshi N. Sasaki ◽

Hikmet Cetin ◽

Masaki Sasai

Keyword(s):

Molecular Dynamics ◽

Protein Structure ◽

Protein Structure Prediction ◽

Structure Prediction ◽

De Novo ◽

Coarse Grained


A fragmentation and reassembly method for ab initio phasing

Acta Crystallographica Section D Biological Crystallography ◽

10.1107/s1399004714025449 ◽

2015 ◽

Vol 71 (2) ◽

pp. 304-312 ◽

Cited By ~ 14

Author(s):

Rojan Shrestha ◽

Kam Y. J. Zhang

Keyword(s):

Ab Initio ◽

Structure Determination ◽

Model Building ◽

De Novo ◽

Sequence Information ◽

Molecular Replacement ◽

Protein Targets ◽

Current State ◽

Viable Approach ◽

Automated Model Building

Ab initio phasing with de novo models has become a viable approach for structural solution from protein crystallographic diffraction data. This approach takes advantage of the known protein sequence information, predicts de novo models and uses them for structure determination by molecular replacement. However, even the current state-of-the-art de novo modelling method has a limit as to the accuracy of the model predicted, which is sometimes insufficient to be used as a template for successful molecular replacement. A fragment-assembly phasing method has been developed that starts from an ensemble of low-accuracy de novo models, disassembles them into fragments, places them independently in the crystallographic unit cell by molecular replacement and then reassembles them into a whole structure that can provide sufficient phase information to enable complete structure determination by automated model building. Tests on ten protein targets showed that the method could solve structures for eight of these targets, although the predicted de novo models cannot be used as templates for successful molecular replacement since the best model for each target is on average more than 4.0 Å away from the native structure. The method has extended the applicability of the ab initio phasing by de novo models approach. The method can be used to solve structures when the best de novo models are still of low accuracy.
