PROBABILISTIC ENSEMBLES FOR IMPROVED INFERENCE IN PROTEIN-STRUCTURE DETERMINATION

Protein X-ray crystallography, the most popular method for determining protein structures, remains a laborious process that requires a great deal of manual effort from crystallographers to interpret low-quality protein images. Automating this process is critical to creating a high-throughput protein-structure determination pipeline. Previously, our group developed ACMI, a probabilistic framework for producing protein-structure models from electron-density maps produced via X-ray crystallography. ACMI uses a Markov Random Field to model the three-dimensional (3D) location of each non-hydrogen atom in a protein. Calculating the best structure in this model is intractable, so ACMI uses approximate inference methods to estimate the optimal structure. While previous results have shown ACMI to be the state-of-the-art method on this task, its approximate inference algorithm remains computationally expensive and susceptible to errors. In this work, we develop Probabilistic Ensembles in ACMI (PEA), a framework for leveraging multiple, independent runs of approximate inference to produce estimates of protein structures. Our results show statistically significant improvements in the accuracy of inference, resulting in more complete and accurate protein structures. In addition, PEA provides a general framework for advanced approximate inference methods in complex problem domains.
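
The abstract does not say how PEA combines its independent inference runs. As a minimal illustration of the general ensemble idea only, the Python sketch below assumes that each run yields a discrete marginal distribution over the same candidate locations for one residue, and that the runs are combined by simple averaging. The function names, the `site_a`/`site_b` labels, the probabilities, and the averaging rule are all hypothetical and are not taken from ACMI or PEA.

```python
from typing import Dict, List

# Hypothetical representation: candidate location label -> probability.
Marginal = Dict[str, float]


def average_marginals(runs: List[Marginal]) -> Marginal:
    """Average several marginal distributions and renormalise the result."""
    if not runs:
        raise ValueError("need at least one inference run")
    # Collect every candidate location seen in any run.
    keys = set().union(*runs)
    averaged = {k: sum(run.get(k, 0.0) for run in runs) / len(runs) for k in keys}
    total = sum(averaged.values())
    return {k: v / total for k, v in averaged.items()}


def most_likely(marginal: Marginal) -> str:
    """Return the candidate location with the highest probability."""
    return max(marginal, key=marginal.get)


# Three made-up runs that disagree about one residue's location.
runs = [
    {"site_a": 0.6, "site_b": 0.4},
    {"site_a": 0.3, "site_b": 0.7},
    {"site_a": 0.7, "site_b": 0.3},
]
combined = average_marginals(runs)
best = most_likely(combined)
```

In this toy case a single run (the second) would pick `site_b`, while the averaged estimate favours `site_a`, which is the kind of error an ensemble is meant to smooth out.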

Faculty Opinions recommendation of Comparisons of NMR spectral quality and success in crystallization demonstrate that NMR and X-ray crystallography are complementary methods for small protein structure determination.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1029453.344422 ◽

2005 ◽

Author(s):

Deyou Zheng

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Protein Structure Determination ◽

Spectral Quality ◽

Small Protein ◽

X Ray ◽

Complementary Methods ◽

X Ray Crystallography

Comparisons of NMR Spectral Quality and Success in Crystallization Demonstrate that NMR and X-ray Crystallography Are Complementary Methods for Small Protein Structure Determination

Journal of the American Chemical Society ◽

10.1021/ja053564h ◽

2005 ◽

Vol 127 (47) ◽

pp. 16505-16511 ◽

Cited By ~ 51

Author(s):

David A. Snyder ◽

Yang Chen ◽

Natalia G. Denissova ◽

Thomas Acton ◽

James M. Aramini ◽

...

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Protein Structure Determination ◽

Spectral Quality ◽

Small Protein ◽

X Ray ◽

Complementary Methods ◽

X Ray Crystallography

Abstract P-5: Neural Network Approaches to Classify 3D Protein Structures from the Data of X-ray Laser Radiation Diffraction from Single Particles

International Journal of Biomedicine ◽

10.21103/ijbm.11.suppl_1.p5 ◽

2021 ◽

Vol 11 (Suppl_1) ◽

pp. S13-S13

Author(s):

Valery Novoseletsky ◽

Mikhail Lozhnikov ◽

Grigoriy Armeev ◽

Aleksandr Kudriavtsev ◽

Alexey Shaytan ◽

...

Keyword(s):

Neural Network ◽

Protein Structure ◽

Image Classification ◽

Structure Determination ◽

Protein Structures ◽

Protein Structure Determination ◽

Family Type ◽

X Ray ◽

Protein Molecules ◽

Diffraction Patterns

Background: Protein structure determination using X-ray free-electron lasers (XFEL) includes the analysis and merging of a large number of snapshot diffraction patterns. Convolutional neural networks (CNNs) are widely used to solve numerous computer vision problems, e.g. image classification, and can be used for diffraction pattern analysis. However, the task of protein structure determination using CNNs alone is not yet solved. Methods: We simulated the diffraction patterns using the Condor software library and obtained more than 1,000 diffraction patterns for each structure with simulation parameters resembling real ones. To classify diffraction patterns, we tried two approaches that are widely known in the area of image classification: a classic VGG network and residual networks. Results: 1. Recognition of a protein class (GPCRs vs globins). Globins and GPCR-like proteins are typical α-helical proteins. Each of these protein families has a large number of representatives (including those with known structure), but we used only 8 structures from each family. 12,000 diffraction patterns were used for training and 4,000 patterns for testing. Results indicate that all considered networks are able to recognize the protein family type with high accuracy. 2. Recognition of the number of protein molecules in the liposome. We considered the use of liposomes as carriers of membrane or globular proteins for sample delivery in XFEL experiments in order to improve the X-ray beam hit rate. Three sets of diffractograms for liposomes of various radii were calculated, including diffractograms for empty liposomes, liposomes loaded with 5 bacteriorhodopsin molecules, and liposomes loaded with 10 bacteriorhodopsin molecules. The training set consisted of 23,625 diffraction patterns, and the test set of 7,875 patterns. We found that all networks used in our study were able to identify the number of protein molecules in liposomes independent of the liposome radius. Our findings make this approach rather promising for the use of liposomes as protein carriers in XFEL experiments. Conclusion: The numerical experiments performed show that the use of neural network algorithms for the recognition of diffraction images from single macromolecular particles makes it possible to determine changes in the structure at the angstrom scale.

Analytical tools in protein structure determination

International Journal of Pharmaceutical Chemistry and Analysis ◽

10.18231/j.ijpca.2021.021 ◽

2021 ◽

Vol 8 (3) ◽

pp. 103-111

Author(s):

Krishna R Gupta ◽

Uttam Patle ◽

Uma Kabra ◽

P. Mishra ◽

Milind J Umekar

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Structure Prediction ◽

Protein Structures ◽

Three Dimensional ◽

Protein Structure Determination ◽

Computational Techniques ◽

Determination Process ◽

Structure Databases ◽

Analytical Tools

Three-dimensional protein structure prediction from amino acid sequence has been a thought-provoking task for decades, but it is of pivotal importance because it provides a better understanding of the protein's function. In recent years, methods for predicting protein structures have advanced considerably. Computational techniques and the growth of protein sequence and structure databases have influenced the laborious protein structure determination process. Still, there is no single method that can predict all protein structures. In this review, we describe the four stages of protein structure determination. We also explore the current techniques used to uncover protein structure and highlight the most suitable method for a given protein.

Protein structure determination using chemical shifts

10.7287/peerj.preprints.374v1 ◽

2014 ◽

Author(s):

Anders S Christensen

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Energy Function ◽

Chemical Shifts ◽

Protein A ◽

Protein Structures ◽

Protein Structure Determination ◽

Coarse Grained ◽

X Ray ◽

Small Proteins

In this thesis, a method for protein structure determination using chemical shifts is presented. The method is implemented in the open source PHAISTOS protein simulation framework. It combines sampling from a generative model with a coarse-grained force field and an energy function that includes chemical shifts. The method is benchmarked on folding simulations of five small proteins. In four cases the resulting structures are in excellent agreement with experimental data; the fifth case fails, likely because of inaccuracies in the energy function. For the Chymotrypsin Inhibitor protein, a structure is determined using only chemical shifts recorded and assigned through automated processes. The CA-RMSD to the experimental X-ray structure is 1.1 Å. Additionally, the method is combined with very sparse NOE restraints and evolutionary distance restraints and tested on several protein structures of more than 100 residues. For Rhodopsin (225 residues) a structure is found at 2.5 Å CA-RMSD from the experimental X-ray structure, and a structure is determined for the Savinase protein (269 residues) with 2.9 Å CA-RMSD from the experimental X-ray structure.
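
The CA-RMSD values quoted above are root-mean-square distances between matched alpha-carbon (CA) positions in two structures. The Python sketch below shows that calculation under a simplifying assumption: the two coordinate sets are already superimposed and listed in the same residue order. Real comparisons usually perform an optimal superposition first (for example with the Kabsch algorithm), which is omitted here. The function name and the coordinates are made up for illustration and are not from the thesis or from PHAISTOS.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]  # x, y, z in angstroms


def ca_rmsd(model: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSD between matched CA atoms, assuming prior superposition."""
    if not model or len(model) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared deviations over all atoms and all three axes.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Two toy CA traces: the model is shifted by 1 angstrom along z.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
rmsd = ca_rmsd(model, reference)
```

Because every atom in the toy model is displaced by exactly 1 Å, the RMSD here is 1 Å.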

Faculty Opinions recommendation of Comparisons of NMR spectral quality and success in crystallization demonstrate that NMR and X-ray crystallography are complementary methods for small protein structure determination.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.1029453.358543 ◽

2006 ◽

Author(s):

Thomas Szyperski

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Protein Structure Determination ◽

Spectral Quality ◽

Small Protein ◽

X Ray ◽

Complementary Methods ◽

X Ray Crystallography

Protein structure determination using Riemannian approach

10.1101/599761 ◽

2019 ◽

Author(s):

Xian Wei ◽

Zhicheng Li ◽

Shijian Li ◽

Xubiao Peng ◽

Qing Zhao

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Matrix Completion ◽

Protein Structures ◽

Protein Structure Determination ◽

Nmr Structure ◽

Distance Information ◽

Nmr Structure Determination ◽

X Ray ◽

Link Type

Abstract: Protein structure determination by nuclear magnetic resonance (NMR) is one of the most extensively studied problems because of its increasing importance in biological function analysis. We adopt a novel method, based on a matrix completion (MC) technique known as the Riemannian approach, to solve the protein structure determination problem. We formulate the protein structure as a low-rank matrix, which can be recovered by solving an optimization problem on the Riemannian spectrahedron manifold whose objective function is delimited by the derived boundary condition. Two efficient algorithms in the Riemannian approach, the trust-region (Tr) algorithm and the conjugate gradient (Cg) algorithm, are used to reconstruct protein structures. We first apply the two algorithms to a toy model and show that the Tr algorithm is more robust. Afterwards, we rebuild the protein structure from the NOE distance information deposited in the NMR Restraints Grid (http://restraintsgrid.bmrb.wisc.edu/NRG/MRGridServlet). A dataset of proteins with both an X-ray crystallographic structure and an NMR structure deposited in the Protein Data Bank (PDB) is used to statistically evaluate the performance of our method. By comparing both our rebuilt structures and their NMR counterparts with the “standard” X-ray structures, we conclude that our rebuilt structures have RMSDs relative to the “standard” X-ray structures that are similar to (and sometimes even smaller than) those of the reference NMR structures. We also validate our method by comparing the Z-scores of our rebuilt structures with those of the reference structures using the Protein Structure Validation Software suite. All the validation scores indicate that the Riemannian approach to MC is valid for reconstructing protein structures from NOE distance information. The software based on the Riemannian approach is freely available at https://github.com/xubiaopeng/Protein_Recon_MCRiemman.

Author summary: Matrix completion is a technique widely used in many areas, such as global positioning in sensor networks, collaborative filtering in recommendation systems, and face recognition. In biology, distance geometry used to be a popular method for reconstructing protein structures from NMR experiments. However, because of the low quality of the reconstructed results, those methods were replaced by other dynamic methods such as ARIA, CYANA and UNIO. Recently, a new MC technique named the Riemannian approach was introduced and proved mathematically, which prompted us to apply it to protein structure determination from NMR measurements. In this paper, by combining the Riemannian approach with some post-processing procedures, we reconstruct protein structures from the incomplete distance information measured by NMR. By evaluating our results and comparing them with the corresponding PDB NMR deposits, we show that the current Riemannian approach is valid and at least comparable with (if not better than) the state-of-the-art methods in NMR structure determination.
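
To see why distance data leads to a low-rank matrix problem, recall a standard fact: for points in 3D, the Gram matrix G of the centred coordinates has rank at most 3, and squared distances satisfy d_ij^2 = G_ii + G_jj - 2 G_ij. The Python sketch below only checks this identity on made-up coordinates. It is not the Riemannian trust-region or conjugate gradient algorithm described above, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def gram_matrix(points: Sequence[Coord]) -> List[List[float]]:
    """Gram matrix of the centred coordinates (rank at most 3 in 3D)."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - centroid[k] for k in range(3)] for p in points]
    # Entry (i, j) is the dot product of centred points i and j.
    return [[sum(a * b for a, b in zip(u, v)) for v in centered] for u in centered]


def squared_distance_from_gram(gram: List[List[float]], i: int, j: int) -> float:
    """Recover the squared distance between points i and j from the Gram matrix."""
    return gram[i][i] + gram[j][j] - 2.0 * gram[i][j]


# Four made-up atom positions.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
gram = gram_matrix(points)
d12_squared = squared_distance_from_gram(gram, 1, 2)  # equals 3^2 + 4^2
```

A matrix completion method works in the opposite direction: given only some of the distances, it searches for a low-rank positive semidefinite G consistent with them.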

Protein structure determination using chemical shifts

10.7287/peerj.preprints.374 ◽

2014 ◽

Author(s):

Anders S Christensen

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Energy Function ◽

Chemical Shifts ◽

Protein A ◽

Protein Structures ◽

Protein Structure Determination ◽

Coarse Grained ◽

X Ray ◽

Small Proteins

Protein Structure Determination by X-Ray Crystallography

Bioinformatics - Methods in Molecular Biology™ ◽

10.1007/978-1-60327-159-2_3 ◽

2008 ◽

pp. 63-87 ◽

Cited By ~ 42

Author(s):

Andrea Ilari ◽

Carmelinda Savino

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Protein Structure Determination ◽

X Ray ◽

X Ray Crystallography

Protein Structure Determination in Living Cells

International Journal of Molecular Sciences ◽

10.3390/ijms20102442 ◽

2019 ◽

Vol 20 (10) ◽

pp. 2442 ◽

Cited By ~ 2

Author(s):

Teppei Ikeya ◽

Peter Güntert ◽

Yutaka Ito

Keyword(s):

Protein Structure ◽

Structure Determination ◽

Structure Prediction ◽

Structural Information ◽

Nuclear Overhauser Effect ◽

Protein Structures ◽

Three Dimensional ◽

Structural Data ◽

Sample Tube ◽

In Cells

To date, in-cell NMR has elucidated various aspects of protein behaviour by associating structures in physiological conditions. However, current studies using this method have mostly deduced protein states in cells based exclusively on ‘indirect’ structural information from peak patterns and chemical shift changes, not on ‘direct’ data that explicitly include interatomic distances and angles. To fully understand the functions and physical properties of proteins inside cells, it is indispensable to obtain explicit structural data or determine three-dimensional (3D) structures of proteins in cells. Whilst the short lifetime of cells in a sample tube, low sample concentrations, and massive background signals make it difficult to observe NMR signals from proteins inside cells, several methodological advances help to overcome these problems. Paramagnetic effects have outstanding potential for in-cell structural analysis. The combination of a limited amount of experimental in-cell data with software for ab initio protein structure prediction opens an avenue to visualise 3D protein structures inside cells. Conventional nuclear Overhauser effect spectroscopy (NOESY)-based structure determination is advantageous for elucidating the conformations of side-chain atoms of proteins as well as global structures. In this article, we review current progress in the structure analysis of proteins in living systems and discuss the feasibility of future work.
