PROBABILISTIC ENSEMBLES FOR IMPROVED INFERENCE IN PROTEIN-STRUCTURE DETERMINATION

2012 ◽  
Vol 10 (01) ◽  
pp. 1240009 ◽  
Author(s):  
AMEET SONI ◽  
JUDE SHAVLIK

Protein X-ray crystallography — the most popular method for determining protein structures — remains a laborious process requiring a great deal of manual crystallographer effort to interpret low-quality protein images. Automating this process is critical in creating a high-throughput protein-structure determination pipeline. Previously, our group developed ACMI, a probabilistic framework for producing protein-structure models from electron-density maps produced via X-ray crystallography. ACMI uses a Markov Random Field to model the three-dimensional (3D) location of each non-hydrogen atom in a protein. Calculating the best structure in this model is intractable, so ACMI uses approximate inference methods to estimate the optimal structure. While previous results have shown ACMI to be the state-of-the-art method on this task, its approximate inference algorithm remains computationally expensive and susceptible to errors. In this work, we develop Probabilistic Ensembles in ACMI (PEA), a framework for leveraging multiple, independent runs of approximate inference to produce estimates of protein structures. Our results show statistically significant improvements in the accuracy of inference resulting in more complete and accurate protein structures. In addition, PEA provides a general framework for advanced approximate inference methods in complex problem domains.
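The idea of combining several independent runs of approximate inference can be illustrated with a minimal sketch. The abstract does not state how PEA aggregates its runs, so the plain averaging used here, along with the names `average_marginals` and `most_likely_state` and the toy probabilities, is an assumption made purely for illustration.

```python
def average_marginals(runs):
    """Average per-run marginal distributions over the same discrete states.

    Each run is a list of probabilities for one variable (for example,
    candidate locations of one atom). Simple averaging is an illustrative
    assumption, not the aggregation rule described by the PEA authors.
    """
    if not runs:
        raise ValueError("need at least one run")
    n_states = len(runs[0])
    if any(len(run) != n_states for run in runs):
        raise ValueError("all runs must cover the same states")
    averaged = [sum(run[i] for run in runs) / len(runs) for i in range(n_states)]
    total = sum(averaged)
    # Renormalise to guard against rounding drift in the inputs.
    return [p / total for p in averaged]


def most_likely_state(marginal):
    """Return the index of the highest-probability state."""
    return max(range(len(marginal)), key=marginal.__getitem__)


# Three hypothetical runs that disagree about the best of three states.
runs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
ensemble = average_marginals(runs)
best = most_likely_state(ensemble)
```

In this toy case the first run on its own would pick state 0, while the averaged ensemble picks state 1, which is the kind of single-run error an ensemble is meant to smooth out.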

2021 ◽  
Vol 11 (Suppl_1) ◽  
pp. S13-S13
Author(s):  
Valery Novoseletsky ◽  
Mikhail Lozhnikov ◽  
Grigoriy Armeev ◽  
Aleksandr Kudriavtsev ◽  
Alexey Shaytan ◽  
...  

Background: Protein structure determination using an X-ray free-electron laser (XFEL) involves analyzing and merging a large number of snapshot diffraction patterns. Convolutional neural networks (CNNs) are widely used to solve numerous computer vision problems, e.g. image classification, and can be used for diffraction pattern analysis. However, protein structure determination using CNNs alone has not yet been solved. Methods: We simulated the diffraction patterns using the Condor software library and obtained more than 1000 diffraction patterns for each structure with simulation parameters resembling real ones. To classify diffraction patterns, we tried two approaches that are widely known in the area of image classification: a classic VGG network and residual networks. Results: 1. Recognition of a protein class (GPCRs vs globins). Globins and GPCR-like proteins are typical α-helical proteins. Each of these protein families has a large number of representatives (including those with known structure), but we used only 8 structures from each family. We used 12,000 diffraction patterns for training and 4,000 patterns for testing. Results indicate that all considered networks are able to recognize the protein family type with high accuracy. 2. Recognition of the number of protein molecules in the liposome. We considered the use of liposomes as carriers of membrane or globular proteins for sample delivery in XFEL experiments in order to improve the X-ray beam hit rate. Three sets of diffractograms for liposomes of various radii were calculated: diffractograms for empty liposomes, for liposomes loaded with 5 bacteriorhodopsin molecules, and for liposomes loaded with 10 bacteriorhodopsin molecules. The training set consisted of 23,625 diffraction patterns and the test set of 7,875 patterns. We found that all networks used in our study were able to identify the number of protein molecules in liposomes independent of the liposome radius. 
Our findings make this approach promising for the use of liposomes as protein carriers in XFEL experiments. Conclusion: The numerical experiments performed show that the use of neural network algorithms for the recognition of diffraction images from single macromolecular particles makes it possible to determine changes in the structure at the angstrom scale.


2021 ◽  
Vol 8 (3) ◽  
pp. 103-111
Author(s):  
Krishna R Gupta ◽  
Uttam Patle ◽  
Uma Kabra ◽  
P. Mishra ◽  
Milind J Umekar

Three-dimensional protein structure prediction from amino acid sequence has been a challenging task for decades, but it is of pivotal importance because it provides a better understanding of protein function. In recent years, the methods for prediction of protein structures have advanced considerably. Computational techniques and the growth of protein sequence and structure databases have influenced the laborious protein structure determination process. Still, there is no single method that can predict all protein structures. In this review, we describe the four stages of protein structure determination. We also explore the current techniques used to uncover protein structure and highlight the most suitable method for a given protein.


2014 ◽  
Author(s):  
Anders S Christensen

In this thesis, a method for protein structure determination using chemical shifts is presented. The method is implemented in the open source PHAISTOS protein simulation framework. The method combines sampling from a generative model with a coarse-grained force field and an energy function that includes chemical shifts. The method is benchmarked on folding simulations of five small proteins. In four cases the resulting structures are in excellent agreement with experimental data; the fifth case fails, likely due to inaccuracies in the energy function. For the Chymotrypsin Inhibitor protein, a structure is determined using only chemical shifts recorded and assigned through automated processes. The CA-RMSD to the experimental X-ray structure for this structure is 1.1 Å. Additionally, the method is combined with very sparse NOE restraints and evolutionary distance restraints and tested on several protein structures of more than 100 residues. For Rhodopsin (225 residues) a structure is found at 2.5 Å CA-RMSD from the experimental X-ray structure, and a structure is determined for the Savinase protein (269 residues) with 2.9 Å CA-RMSD from the experimental X-ray structure.
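The CA-RMSD values quoted above measure the root-mean-square deviation between matching alpha-carbon positions in two structures. A minimal sketch of that formula follows. It assumes the two coordinate sets are already superimposed and listed in the same residue order; in practice an optimal superposition step (for example the Kabsch algorithm) is normally applied first, and this sketch omits it. The function name `ca_rmsd` and the toy coordinates are hypothetical and not taken from the thesis.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired C-alpha coordinates (in Å).

    Assumes both lists hold (x, y, z) tuples in the same residue order and
    that the structures have already been superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: three C-alpha atoms, with the model shifted 1 Å along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
rmsd = ca_rmsd(model, reference)
```

Because every atom in the toy model is displaced by exactly 1 Å, the resulting RMSD is 1 Å.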


2019 ◽  
Author(s):  
Xian Wei ◽  
Zhicheng Li ◽  
Shijian Li ◽  
Xubiao Peng ◽  
Qing Zhao

Abstract: Protein nuclear magnetic resonance (NMR) structure determination is one of the most extensively studied problems due to its increasing importance in biological function analysis. We adopt a novel method, based on one of the matrix completion (MC) techniques, the Riemannian approach, to solve the protein structure determination problem. We formulate the protein structure in terms of a low-rank matrix, which can be recovered by solving an optimization problem on the Riemannian spectrahedron manifold whose objective function has been delimited with the derived boundary condition. Two efficient algorithms in the Riemannian approach, the trust-region (Tr) algorithm and the conjugate gradient (Cg) algorithm, are used to reconstruct protein structures. We first use the two algorithms in a toy model and show that the Tr algorithm is more robust. Afterwards, we rebuild the protein structure from the NOE distance information deposited in the NMR Restraints Grid (http://restraintsgrid.bmrb.wisc.edu/NRG/MRGridServlet). A dataset with both X-ray crystallographic structures and NMR structures deposited in the Protein Data Bank (PDB) is used to statistically evaluate the performance of our method. By comparing both our rebuilt structures and their NMR counterparts with the “standard” X-ray structures, we conclude that our rebuilt structures have similar (sometimes even smaller) RMSDs relative to the “standard” X-ray structures compared with the reference NMR structures. In addition, we validate our method by comparing the Z-scores of our rebuilt structures with those of the reference structures using the Protein Structure Validation Software suite. All the validation scores indicate that the Riemannian approach in MC techniques is valid for reconstructing protein structures from NOE distance information. 
The software based on the Riemannian approach is freely available at https://github.com/xubiaopeng/Protein_Recon_MCRiemman. Author summary: Matrix completion is a technique widely used in many areas, such as global positioning in sensor networks, collaborative filtering in recommendation systems, and face recognition. In biology, distance geometry used to be a popular method for reconstructing protein structures from NMR experiments. However, due to the low quality of the reconstructed results, those methods were replaced by other dynamic methods such as ARIA, CYANA and UNIO. Recently, a new MC technique named the Riemannian approach was introduced and proved mathematically, which prompted us to apply it to protein structure determination from NMR measurements. In this paper, by combining the Riemannian approach with some post-processing procedures, we reconstruct protein structures from the incomplete distance information measured by NMR. By evaluating our results and comparing them with the corresponding PDB NMR deposits, we show that the current Riemannian approach method is valid and at least comparable with (if not better than) the state-of-the-art methods in NMR structure determination.
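To see why a protein structure can be expressed as a low-rank matrix, it helps to look at the standard double-centering step from classical multidimensional scaling. Applied to a complete matrix of squared distances, it yields the Gram matrix of the centred coordinates, and a Gram matrix of points in 3D has rank at most 3. The sketch below only checks that identity on complete toy data. It is not the Riemannian matrix completion algorithm from the paper, which deals with incomplete distances, and all function names and coordinates here are hypothetical.

```python
def squared_distances(points):
    """Full matrix of squared Euclidean distances between points."""
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for q in points]
        for p in points
    ]


def gram_from_squared_distances(d2):
    """Double centering: G[i][j] = -0.5 * (D[i][j] - r_i - r_j + m).

    r_i is the mean of row i and m the overall mean; because D is symmetric,
    the column means equal the row means.
    """
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]


def centered_inner_products(points):
    """Gram matrix computed directly from centroid-centred coordinates."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    centred = [[p[k] - centroid[k] for k in range(3)] for p in points]
    return [[sum(a * b for a, b in zip(u, v)) for v in centred] for u in centred]


# Toy "structure": four points in 3D (arbitrary illustrative coordinates).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
gram_from_d = gram_from_squared_distances(squared_distances(points))
gram_direct = centered_inner_products(points)
max_diff = max(
    abs(gram_from_d[i][j] - gram_direct[i][j])
    for i in range(len(points))
    for j in range(len(points))
)
```

The two Gram matrices agree up to rounding error, so the information in the distance matrix is carried by a matrix of rank at most 3. That low-rank structure is what makes recovering missing distances by matrix completion plausible.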


2019 ◽  
Vol 20 (10) ◽  
pp. 2442 ◽  
Author(s):  
Teppei Ikeya ◽  
Peter Güntert ◽  
Yutaka Ito

To date, in-cell NMR has elucidated various aspects of protein behaviour by associating structures in physiological conditions. However, most current studies using this method have deduced protein states in cells exclusively from ‘indirect’ structural information, such as peak patterns and chemical shift changes, rather than from ‘direct’ data that explicitly include interatomic distances and angles. To fully understand the functions and physical properties of proteins inside cells, it is indispensable to obtain explicit structural data or determine three-dimensional (3D) structures of proteins in cells. Whilst the short lifetime of cells in a sample tube, low sample concentrations, and massive background signals make it difficult to observe NMR signals from proteins inside cells, several methodological advances help to overcome these problems. Paramagnetic effects have an outstanding potential for in-cell structural analysis. The combination of a limited amount of experimental in-cell data with software for ab initio protein structure prediction opens an avenue to visualise 3D protein structures inside cells. Conventional nuclear Overhauser effect spectroscopy (NOESY)-based structure determination is advantageous for elucidating the conformations of side-chain atoms of proteins as well as global structures. In this article, we review current progress in the structure analysis of proteins in living systems and discuss the feasibility of future work in this area.

