Decline in direction in CASP experiments

2020 ◽  
Author(s):  
Sergey Feranchuk

BACKGROUND. The CASP experiment ("Critical Assessment of Structure Prediction") is intended to track advances in the ability of research groups to predict the structure of an unknown protein from its sequence. The target sequences to be folded are chosen in each round. Folding a CASP target is a difficult challenge, and the structures of CASP targets differ in some way from the overall pool of known protein structures. The purpose of this study was to detect and quantify the difference between CASP targets and typical structures from the Protein Data Bank (PDB). METHODS. The averaged local complexity of a protein fold was measured in units of entropy using several metrics that reduce a fragment of a fold to a binary distribution. Complexity was measured for targets from previous rounds of CASP. A subset of PDB structures was prepared, and the averaged complexity of these PDB structures was estimated. The metrics were chosen to simulate some of the approaches used to predict structures in the CASP competition. A modified complexity was also measured, based on averaged distributions for fold fragments in common PDB structures. RESULTS. A difference in CASP targets was detected by a metric that hashes distances between closely located residues. A modified version of this metric, which emulates wide-range distance maps, was shown to be the most easily adjusted to exploit the difference between CASP targets and typical PDB structures. This means that methods trained on PDB templates with similar metrics will guess the template structures in a new round of CASP more successfully, with an increased gap in their ability to predict neutrally selected protein structures. In other words, software that relies on inter-residue distances and performs well in CASP will perform poorly in general-purpose structure prediction.
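The abstract does not give the exact metrics, so the following is only an illustrative sketch of the general idea: reduce each fold fragment to a binary distribution and average its Shannon entropy along the chain. The window length, the 8 Å cutoff, the use of C-alpha coordinates, and the function names `binary_entropy` and `local_complexity` are all assumptions made for this example, not the author's definitions.

```python
import math


def binary_entropy(bits):
    """Shannon entropy (in bits) of a list of 0/1 values."""
    n = len(bits)
    if n == 0:
        return 0.0
    p = sum(bits) / n
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def local_complexity(coords, window=8, cutoff=8.0):
    """Average binary entropy over sliding fragments of a chain.

    coords: list of (x, y, z) tuples, assumed to be C-alpha positions.
    Each fragment is reduced to bits: 1 if a residue pair separated by
    at least two positions is closer than `cutoff`, else 0.
    (Hypothetical reduction chosen for illustration.)
    """
    entropies = []
    for start in range(len(coords) - window + 1):
        frag = coords[start:start + window]
        bits = [
            1 if math.dist(frag[i], frag[j]) < cutoff else 0
            for i in range(window)
            for j in range(i + 2, window)
        ]
        entropies.append(binary_entropy(bits))
    return sum(entropies) / len(entropies) if entropies else 0.0
```

Under these assumptions, comparing `local_complexity` averaged over a set of CASP targets with the same quantity averaged over a PDB subset would give one number per set; the study's actual metrics (including the distance-hashing one) may differ substantially.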

2016 ◽  
Vol 2 (11) ◽  
pp. e1601274 ◽  
Author(s):  
Alberto Perez ◽  
Joseph A. Morrone ◽  
Emiliano Brini ◽  
Justin L. MacCallum ◽  
Ken A. Dill

We report a key proof of principle of a new acceleration method [Modeling Employing Limited Data (MELD)] for predicting protein structures by molecular dynamics simulation. It shows that such Boltzmann-satisfying techniques are now sufficiently fast and accurate to predict native protein structures in a limited test within the Critical Assessment of Structure Prediction (CASP) community-wide blind competition.


2021 ◽  
Author(s):  
Hongyi Xu ◽  
Xiaodong Zou ◽  
Martin Högbom ◽  
Hugo Lebrette

Microcrystal electron diffraction (MicroED) has the potential to considerably impact the field of structural biology. Indeed, the method can solve atomic structures of a wide range of molecules, beyond the reach of single particle cryo-electron microscopy, exploiting crystals too small for X-ray diffraction (XRD) even using X-ray free-electron lasers. However, until the first unknown protein structure, an R2-like ligand binding oxidase from Sulfolobus acidocaldarius (SaR2lox), was recently solved at 3.0 Å resolution, MicroED had only been used to study known protein structures previously obtained by XRD. Here, after adapting sample preparation protocols, the structure of the SaR2lox protein originally solved by MicroED was redetermined by XRD at 2.1 Å resolution. In light of the higher resolution XRD data and taking into account experimental differences between the methods, the quality of the MicroED structure is examined. The analysis demonstrates that MicroED provided an overall accurate model, revealing biologically relevant information specific to SaR2lox, such as the absence of an ether cross-link, but did not allow detection of a ligand visible by XRD in the protein binding pocket. Furthermore, strengths and weaknesses of MicroED compared to XRD are discussed from the perspective of this real-life protein example. The study provides a foundation to help MicroED become a method of choice for solving novel protein structures.


2018 ◽  
Author(s):  
Jianfu Zhou ◽  
Alexandra E. Panaitiu ◽  
Gevorg Grigoryan

The ability to routinely design functional proteins, in a targeted manner, would have enormous implications for biomedical research and therapeutic development. Computational protein design (CPD) offers the potential to fulfill this need, and though recent years have brought considerable progress in the field, major limitations remain. Current state-of-the-art approaches to CPD aim to capture the determinants of structure from physical principles. While this has led to many successful designs, it does have strong limitations associated with inaccuracies in physical modeling, such that a robust general solution to CPD has yet to be found. Here we propose a fundamentally novel design framework, one based on identifying and applying patterns of sequence-structure compatibility found in known proteins, rather than approximating them from models of inter-atomic interactions. Specifically, we systematically decompose the target structure to be designed into structural building blocks we call TERMs (tertiary motifs) and use rapid structure search against the Protein Data Bank (PDB) to identify sequence patterns associated with each TERM from known protein structures that contain it. These results are then combined to produce a sequence-level pseudo-energy model that can score any sequence for compatibility with the target structure. This model can then be used to extract the optimal-scoring sequence via combinatorial optimization or otherwise sample the sequence space predicted to be well compatible with folding to the target.
Here we carry out extensive computational analyses, showing that our method, which we dub dTERMen (design with TERM energies): 1) produces native-like sequences given native crystallographic or NMR backbones, 2) produces sequence-structure compatibility scores that correlate with thermodynamic stability, 3) is able to predict experimental success of designed sequences generated with other methods, and 4) designs sequences that are found to fold to the desired target by structure prediction more frequently than sequences designed with an atomistic method. As an experimental validation of dTERMen, we perform a total surface redesign of the red fluorescent protein mCherry, marking a total of 64 residues as variable. The single sequence identified as optimal by dTERMen harbors 48 mutations relative to mCherry, but nevertheless folds, is monomeric in solution, exhibits similar stability to chemical denaturation as mCherry, and even preserves the fluorescence property. Our results strongly argue that the PDB is now sufficiently large to enable proteins to be designed by using only examples of structural motifs from unrelated proteins. This is highly significant, given that the structural database will only continue to grow, and signals the possibility of a whole host of novel data-driven CPD methods. Because such methods are likely to have orthogonal strengths relative to existing techniques, they could represent an important step towards removing remaining barriers to robust CPD.


2005 ◽  
Vol 03 (04) ◽  
pp. 837-860 ◽  
Author(s):  
TIANSHOU ZHOU ◽  
LUONAN CHEN ◽  
YUN TANG ◽  
XIANGSUN ZHANG

Protein structure alignment plays a key role in protein structure prediction and fold family classification. An efficient, mathematically formulated method for multiple protein structure alignment is presented, based on a deterministic annealing technique. The alignment problem is mapped onto a nonlinear continuous optimization problem (NCOP) with the common consensus chain, the matching assignment matrices, and the atomic coordinates as variables. At each step in the annealing procedure, the NCOP is decomposed into as many subproblems as there are protein chains; each subproblem is an independent pairwise structure alignment between a protein chain and the consensus chain and hence can be solved efficiently by parallel computation. The proposed method is robust with respect to the choice of iteration parameters for a wide range of proteins, and performs well in both multiple and pairwise structure alignment cases, compared with existing alignment methods.
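To illustrate what a matching assignment matrix under deterministic annealing can look like, here is a minimal sketch assuming a common soft-assignment form in which each residue of a chain is matched to consensus positions with weights proportional to exp(-beta * d²). The function name `soft_assignment` and this particular functional form are assumptions for illustration; the paper's formulation (for example, its handling of gaps) is not given in the abstract and may differ.

```python
import math


def soft_assignment(sq_dists, beta):
    """Row-normalised soft matching matrix.

    sq_dists[i][j]: squared distance between residue i of a chain and
    position j of the consensus chain (after superposition).
    beta: inverse temperature; it is raised gradually during annealing
    so that the soft matches sharpen toward a one-to-one assignment.
    """
    matrix = []
    for row in sq_dists:
        lowest = min(row)  # subtract the row minimum for numerical stability
        weights = [math.exp(-beta * (d - lowest)) for d in row]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix
```

At small `beta` every consensus position receives nearly equal weight, which keeps the optimisation smooth; as `beta` grows, the weight concentrates on the nearest position. Because each chain's matrix depends only on that chain and the consensus, the per-chain computations are independent, which is what makes the parallel decomposition described above possible.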


2014 ◽  
Vol 11 (95) ◽  
pp. 20131147 ◽  
Author(s):  
Agnel Praveen Joseph ◽  
Alexandre G. de Brevern

Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully understood. The huge amount of available sequence and structural information provides hints to identify the putative fold for a given sequence. Indeed, protein structures prefer a limited number of local backbone conformations, some being characterized by preferences for certain amino acids. These preferences largely depend on the local structural environment. The prediction of local backbone conformations has become an important factor in correctly identifying the global protein fold. Here, we review the developments in the field of local structure prediction and especially their implications for protein fold recognition.


Author(s):  
Ivan Anishchenko ◽  
Tamuka M. Chidyausiku ◽  
Sergey Ovchinnikov ◽  
Samuel J. Pellock ◽  
David Baker

There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences, and input them into the trRosetta structure prediction network to predict starting distance maps, which as expected are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all alpha, all beta sheet, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
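The sampling loop described in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: `score_fn` is a hypothetical stand-in for "run the structure prediction network and return the KL-divergence contrast for this sequence", and the single-residue mutation move, the Metropolis acceptance rule, and the temperature handling are assumptions.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def metropolis_step(seq, score_fn, temperature, rng):
    """Propose one random point mutation and accept or reject it.

    score_fn(seq) -> float is a placeholder for the network-derived
    contrast; higher is better. Improvements are always accepted;
    worse moves are accepted with probability exp(delta / temperature).
    """
    pos = rng.randrange(len(seq))
    candidate = seq[:pos] + rng.choice(AMINO_ACIDS) + seq[pos + 1:]
    delta = score_fn(candidate) - score_fn(seq)
    if delta >= 0 or rng.random() < math.exp(delta / temperature):
        return candidate
    return seq
```

In a real setting, `score_fn` would average `kl_divergence` between the predicted and background distance distributions over residue pairs, and its value for the current sequence would be cached rather than recomputed at every step.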


2021 ◽  
Vol 22 (11) ◽  
pp. 6032
Author(s):  
Donghyuk Suh ◽  
Jai Woo Lee ◽  
Sun Choi ◽  
Yoonji Lee

Recent advances in deep learning methods have influenced many aspects of scientific research, including the study of protein systems. The prediction of proteins' 3D structural components is now heavily dependent on machine learning techniques that interpret how protein sequences and their homology govern inter-residue contacts and structural organization. In particular, methods employing deep neural networks had a significant impact on the recent CASP13 and CASP14 competitions. Here, we explore recent applications of deep learning methods in the protein structure prediction area. We also look at the potential opportunities for deep learning methods to identify unknown protein structures and functions and to help guide drug–target interactions. Although significant problems still need to be addressed, we expect these techniques to play crucial roles in the near future in protein structural bioinformatics as well as in drug discovery.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Wendy M. Billings ◽  
Connor J. Morris ◽  
Dennis Della Corte

The prediction of amino acid contacts from protein sequence is an important problem, as protein contacts are a vital step towards the prediction of folded protein structures. We propose that a powerful concept from deep learning, called ensembling, can increase the accuracy of protein contact predictions by combining the outputs of different neural network models. We show that ensembling the predictions made by different groups at the recent Critical Assessment of Protein Structure Prediction (CASP13) outperforms all individual groups. Further, we show that contacts derived from the distance predictions of three additional deep neural networks (AlphaFold, trRosetta, and ProSPr) can be substantially improved by ensembling all three networks. We also show that ensembling these recent deep neural networks with the best CASP13 group creates a superior contact prediction tool. Finally, we demonstrate that two ensembled networks can successfully differentiate between the folds of two highly homologous sequences. In order to build further on these findings, we propose the creation of a better protein contact benchmark set and additional open-source contact prediction methods.
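As a rough illustration of the two ideas in this abstract (deriving contacts from distance predictions, then ensembling), the sketch below sums the probability mass of distance bins under a cutoff and takes a weighted average of several contact-probability maps. The 8 Å cutoff is the conventional contact definition; the simple averaging scheme and the function names are assumptions, and the paper's actual ensembling procedure may be different.

```python
def contact_probability(bin_probs, bin_upper_edges, cutoff=8.0):
    """Probability that a residue pair is in contact, from a distance histogram.

    bin_probs[k] is the predicted probability of distance bin k, whose
    upper edge (in angstroms) is bin_upper_edges[k].
    """
    return sum(p for p, edge in zip(bin_probs, bin_upper_edges) if edge <= cutoff)


def ensemble_contacts(prob_maps, weights=None):
    """Weighted average of equally sized L x L contact-probability maps."""
    if not prob_maps:
        raise ValueError("at least one probability map is required")
    if weights is None:
        weights = [1.0] * len(prob_maps)
    total = sum(weights)
    size = len(prob_maps[0])
    averaged = [[0.0] * size for _ in range(size)]
    for prob_map, weight in zip(prob_maps, weights):
        for i in range(size):
            for j in range(size):
                averaged[i][j] += weight * prob_map[i][j] / total
    return averaged
```

Non-uniform `weights` would let better-performing models contribute more, though how such weights should be chosen is not something the abstract specifies.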


2019 ◽  
Vol 50 (4) ◽  
pp. 693-702 ◽  
Author(s):  
Christine Holyfield ◽  
Sydney Brooks ◽  
Allison Schluterman

Purpose: Augmentative and alternative communication (AAC) is an intervention approach that can promote communication and language in children with multiple disabilities who are beginning communicators. While a wide range of AAC technologies are available, little is known about the comparative effects of specific technology options. Given that engagement can be low for beginning communicators with multiple disabilities, the current study provides initial information about the comparative effects of 2 AAC technology options (high-tech visual scene displays [VSDs] and low-tech isolated picture symbols) on engagement. Method: Three elementary-age beginning communicators with multiple disabilities participated. The study used a single-subject, alternating treatment design with each technology serving as a condition. Participants interacted with their school speech-language pathologists using each of the 2 technologies across 5 sessions in a block randomized order. Results: According to visual analysis and nonoverlap of all pairs calculations, all 3 participants demonstrated more engagement with the high-tech VSDs than the low-tech isolated picture symbols as measured by their seconds of gaze toward each technology option. Despite the difference in engagement observed, there was no clear difference across the 2 conditions in engagement toward the communication partner or use of the AAC. Conclusions: Clinicians can consider measuring engagement when evaluating AAC technology options for children with multiple disabilities and should consider evaluating high-tech VSDs as 1 technology option for them. Future research must explore the extent to which differences in engagement to particular AAC technologies result in differences in communication and language learning over time as might be expected.


2009 ◽  
Vol 19 (2) ◽  
pp. 217-226
Author(s):  
S. M. Minhaz Ud-Dean ◽  
Mahdi Muhammad Moosa

Protein structure prediction and evaluation is one of the major fields of computational biology. Estimation of dihedral angles can provide information about the acceptability of both theoretically predicted and experimentally determined structures. Here we report on the sequence specific dihedral angle distribution of high resolution protein structures available in PDB and have developed Sasichandran, a tool for sequence specific dihedral angle prediction and structure evaluation. This tool will allow evaluation of a protein structure in PDB format from the sequence specific distribution of Ramachandran angles. Additionally, it will allow retrieval of the most probable Ramachandran angles for a given sequence along with the sequence specific data. Key words: Torsion angle, φ-ψ distribution, sequence specific Ramachandran plot, Ramasekharan, protein structure appraisal. D.O.I. 10.3329/ptcb.v19i2.5439. Plant Tissue Cult. & Biotech. 19(2): 217-226, 2009 (December)
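For readers unfamiliar with how Ramachandran angles are obtained from coordinates, the following sketch computes a dihedral angle from four atom positions using the standard projection-and-atan2 construction. It is independent of the tool described above and is not its code; the function names are chosen for this example. For residue i, φ is the dihedral over C(i−1), N(i), CA(i), C(i), and ψ is the dihedral over N(i), CA(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return [a[0] - b[0], a[1] - b[1], a[2] - b[2]]


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees, in (-180, 180], about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = [b0[k] - _dot(b0, b1) * b1[k] for k in range(3)]
    w = [b2[k] - _dot(b2, b1) * b1[k] for k in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Collecting (φ, ψ) pairs computed this way over many high-resolution structures, grouped by local sequence, is one way a sequence specific angle distribution of the kind described above could be tabulated.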

