DERIVING TOPOLOGY AND SEQUENCE ALIGNMENT FOR THE HELIX SKELETON IN LOW-RESOLUTION PROTEIN DENSITY MAPS

2008 ◽  
Vol 06 (01) ◽  
pp. 183-201 ◽  
Author(s):  
YONGGANG LU ◽  
JING HE ◽  
CHARLIE E. M. STRAUSS

Cryoelectron microscopy (cryoEM) is an experimental technique to determine the three-dimensional (3D) structure of large protein complexes. Currently, this technique is able to generate protein density maps at 6–9 Å resolution, at which the skeleton of the structure (which is composed of α-helices and β-sheets) can be visualized. As a step towards predicting the entire backbone of the protein from the protein density map, we developed a method to predict the topology and sequence alignment for the skeleton helices. Our method combines the geometrical information of the skeleton helices with the Rosetta ab initio structure prediction method to derive a consensus topology and sequence alignment for the skeleton helices. We tested the method with 60 proteins. For 45 proteins, the majority of the skeleton helices were assigned a correct topology from one of our top ten predictions. The offsets of the alignment for most of the assigned helices were within ±2 amino acids in the sequence. We also analyzed the use of the skeleton helices as a clustering tool for the decoy structures generated by Rosetta. Our comparison suggests that the topology clustering is a better method than a general overlap clustering method to enrich the ranking of decoys, particularly when the decoy pool is small.

Author(s):  
Badri Adhikari

Abstract: Protein structure prediction continues to stand as an unsolved problem in bioinformatics and biomedicine. Deep learning algorithms and the availability of metagenomic sequences have led to the development of new approaches to predict inter-residue distances, the key intermediate step. Unlike recently successful methods, which frame the problem as a multi-class classification problem, this article introduces a real-valued distance prediction method, REALDIST. Using a representative set of 43 thousand protein chains, a variant of deep ResNet is trained to predict real-valued distance maps. The contacts derived from the real-valued distance maps predicted by this method, on the most difficult CASP13 free-modeling protein datasets, demonstrate a long-range top-L precision of 52%, which is 17% higher than the top CASP13 predictor Raptor-X and slightly higher than the more recent trRosetta method. Similar improvements are observed on the CAMEO ‘hard’ and ‘very hard’ datasets. Three-dimensional (3D) structure prediction guided by real-valued distances reveals that for short proteins the mean accuracy of the 3D models is slightly higher than that of the top human predictor AlphaFold and the server predictor Quark in the CASP13 competition.
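How contacts can be derived from a real-valued distance map and scored by long-range top-L precision can be sketched as follows. This is a minimal illustration, not the REALDIST implementation: the function name, the 8 Å contact threshold, and the minimum sequence separation of 24 residues are assumptions based on common CASP-style conventions, and the abstract does not state which definitions were used.

```python
def top_l_long_range_precision(pred_dist, true_dist, min_sep=24, threshold=8.0):
    """Precision of the L most confident long-range contacts.

    pred_dist, true_dist: square (L x L) nested lists of distances in angstroms.
    min_sep:   minimum sequence separation for a pair to count as long-range
               (assumed convention, not taken from the abstract).
    threshold: distance below which a residue pair counts as a contact
               (assumed convention, not taken from the abstract).
    """
    n = len(pred_dist)
    # Collect every residue pair (i, j) with j - i >= min_sep.
    pairs = [
        (pred_dist[i][j], i, j)
        for i in range(n)
        for j in range(i + min_sep, n)
    ]
    if not pairs:
        return float("nan")  # no long-range pairs in a chain this short
    # A smaller predicted distance is treated as a more confident contact.
    pairs.sort()
    top = pairs[:n]  # keep at most L pairs, where L is the chain length
    hits = sum(1 for _, i, j in top if true_dist[i][j] < threshold)
    return hits / len(top)
```

Ranking pairs by predicted distance is one simple way to turn real-valued output into a contact ranking; a method that also predicts an uncertainty could rank differently.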


Genes ◽  
2018 ◽  
Vol 9 (9) ◽  
pp. 432 ◽  
Author(s):  
Chandran Nithin ◽  
Pritha Ghosh ◽  
Janusz Bujnicki

RNA-protein (RNP) interactions play essential roles in many biological processes, such as regulation of co-transcriptional and post-transcriptional gene expression, RNA splicing, transport, storage and stabilization, as well as protein synthesis. An increasing number of RNP structures would aid in a better understanding of these processes. However, due to the technical difficulties associated with experimental determination of macromolecular structures by high-resolution methods, studies on RNP recognition and complex formation present significant challenges. As an alternative, computational prediction of RNP interactions can be carried out. Structural models obtained by theoretical predictive methods are, in general, less reliable than models based on experimental measurements, but they can be sufficiently accurate to serve as a basis for formulating functional hypotheses. In this article, we present an overview of computational methods for 3D structure prediction of RNP complexes. We discuss currently available methods for macromolecular docking and for scoring 3D structural models of RNP complexes in particular. We also review benchmarks that have been developed to assess the accuracy of these methods.


Author(s):  
Raghunath Satpathy

Proteins play a vital molecular role in all living organisms. Determining protein structure experimentally is difficult, so theoretical prediction methods offer an alternative. Predicting the 3D structure of proteins is important in biology because it can lead to the discovery of useful drugs and enzymes, and it is currently considered an important research domain. Protein structure prediction here refers to identifying a protein's tertiary structure. From the computational point of view, different models (protein representations) have been developed, along with efficient optimization methods, to predict protein structure. Bio-inspired computation is used mostly for the optimization step in solving protein structures. These algorithms have received great interest and attention in the literature. This chapter discusses the key features of five recently developed types of bio-inspired computational algorithms applied to protein structure prediction problems.


Sequencing ◽  
2013 ◽  
Vol 2013 ◽  
pp. 1-10 ◽  
Author(s):  
Amitava Moulick ◽  
Debashis Mukhopadhyay ◽  
Shonima Talapatra ◽  
Nirmalya Ghoshal ◽  
Sarmistha Sen Raychaudhuri

Plantago ovata Forsk is a medicinally important plant. Metallothioneins (MTs) are cysteine-rich proteins involved in the detoxification of heavy metals. Molecular cloning and modeling of MT from P. ovata have not yet been reported. The present investigation describes the isolation, structure prediction, characterization, and expression under copper stress of type 2 metallothionein (MT2) from this species. The gene encoding the protein comprises three exons and two introns. The deduced protein sequence contains 81 amino acids with a calculated molecular weight of about 8.1 kDa and a theoretical pI value of 4.77. The transcript level of this protein increased in response to copper stress. Homology modeling was used to construct a three-dimensional structure of P. ovata MT2. The 3D structure model of P. ovata MT2 will provide a significant clue for further structural and functional study of this protein.


2020 ◽  
Vol 36 (11) ◽  
pp. 3385-3392
Author(s):  
Zi-Lin Liu ◽  
Jing-Hao Hu ◽  
Fan Jiang ◽  
Yun-Dong Wu

Motivation: High-throughput sequencing discovers many naturally occurring disulfide-rich peptides or cystine-rich peptides (CRPs) with diversified bioactivities. However, their structure information, which is very important to peptide drug discovery, is still very limited.
Results: We have developed a CRP-specific structure prediction method called Cystine-Rich peptide Structure Prediction (CRiSP), based on a customized template database with cystine-specific sequence alignment and three machine-learning predictors. The modeling accuracy is significantly better than that of several popular general-purpose structure modeling methods, and CRiSP also provides useful model quality estimations.
Availability and implementation: The CRiSP server is freely available at http://wulab.com.cn/CRISP.
Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Oluwatosin Oluwadare

Sixteen years after the Human Genome Project (HGP) sequenced the human genome, and 17 years after the introduction of Chromosome Conformation Capture (3C) technologies, three-dimensional (3D) inference and big data remain problematic in genomics, and specifically in 3C data analysis. Three-dimensional inference involves reconstructing a genome's 3D structure or, in some cases, an ensemble of structures from contact interaction frequencies extracted from a variant of the 3C technology called Hi-C. Further questions remain about chromosome topology and structure; enhancer-promoter interactions; the location of genes, gene clusters, and transcription factors; the relationship between gene expression and epigenetics; and chromosome visualization at a higher scale, among others. This dissertation describes four major contributions: first, 3DMax, a tool for chromosome and genome 3D structure prediction from Hi-C data using an optimization algorithm; second, GSDB, a comprehensive and common repository that contains 3D structures for Hi-C datasets from novel 3D structure reconstruction tools developed over the years; third, ClusterTAD, a method for extracting topologically associating domains (TADs) from Hi-C data using an unsupervised learning algorithm; and finally, GenomeFlow, a comprehensive graphical tool that facilitates the entire process of modeling and analysis of 3D genome organization. It is worth noting that GenomeFlow and GSDB are the first of their kind in the 3D chromosome and genome research field. All the methods are implemented as software tools that are freely available to the scientific community.


2020 ◽  
Author(s):  
Jin Li ◽  
Jinbo Xu

Abstract: Inter-residue distance prediction by deep ResNet (convolutional residual neural network) has greatly advanced protein structure prediction. Currently the most successful structure prediction methods predict distance by discretizing it into dozens of bins. Here we study how well real-valued distance can be predicted and how useful it is for 3D structure modeling by comparing it with discrete-valued prediction based upon the same deep ResNet. Unlike recent methods that predict only a single real value for the distance of an atom pair, we predict both the mean and standard deviation of a distance and then employ a novel method to fold a protein by the predicted mean and deviation. Our findings include: 1) tested on the CASP13 FM (free-modeling) targets, our real-valued distance prediction obtains 81% precision on top L/5 long-range contact prediction, much better than the best CASP13 results (70%); 2) our real-valued prediction can predict correct folds for the same number of CASP13 FM targets as the best CASP13 group, despite generating only 20 decoys for each target; 3) our method greatly outperforms a very recent real-valued prediction method, DeepDist, in both contact prediction and 3D structure modeling; and 4) when the same deep ResNet is used, our real-valued distance prediction has 1-6% higher contact and distance accuracy than our own discrete-valued prediction, but yields less accurate 3D structure models.


2021 ◽  
Author(s):  
Marina A Pak ◽  
Karina A Markhieva ◽  
Mariia S Novikova ◽  
Dmitry S Petrov ◽  
Ilya S Vorobyev ◽  
...  

AlphaFold changed the field of structural biology by achieving three-dimensional (3D) structure prediction from protein sequence at experimental quality. The astounding success even led to claims that the protein folding problem is "solved". However, the protein folding problem is more than just structure prediction from sequence. Presently, it is unknown whether the AlphaFold-triggered revolution could help to solve other problems related to protein folding. Here we assess the ability of AlphaFold to predict the impact of single mutations on protein stability (ΔΔG) and function. To study the question, we extracted metrics from AlphaFold predictions before and after a single mutation in a protein and correlated the predicted change with the experimentally known ΔΔG values. Additionally, we correlated the AlphaFold predictions of the impact of a single mutation on structure with a large-scale dataset of single mutations in GFP with experimentally assayed levels of fluorescence. We found a very weak or no correlation between AlphaFold output metrics and change of protein stability or fluorescence. Our results imply that AlphaFold cannot be immediately applied to other problems or applications in protein folding.
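The correlation analysis described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which AlphaFold output metrics or which correlation coefficient were used, so the Pearson coefficient, the function name pearson_r, and the toy numbers below are all assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values only, NOT data from the study: each entry stands for one
    # single-point mutation.
    metric_change = [0.4, -1.1, 0.2, 1.8, -0.6]      # hypothetical mutant-minus-wild-type metric
    experimental_ddg = [1.2, 0.3, -0.5, 0.9, 2.1]    # hypothetical measured ΔΔG (kcal/mol)
    print(f"Pearson r = {pearson_r(metric_change, experimental_ddg):.3f}")
```

A coefficient near zero on real data would correspond to the "very weak or no correlation" outcome the abstract reports; a rank-based coefficient such as Spearman's would be a reasonable alternative if the relationship is not expected to be linear.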

