Metallothionein: Protein structure prediction and sequence analyses in pigeon pea (Cajanus cajan)

2017 ◽  
Vol 4 (04) ◽  
Author(s):  
Sakshi Chaudhary ◽  
Anil Kumar Singh ◽  
Jeshima Khan Yasin

Metallothioneins are a special group of small proteins capable of detoxifying non-essential metal ions present in excess within a plant cell. They form diverse classes of cysteine-rich, heavy-metal-binding proteins that are essential for plant growth. These proteins are present in all taxa except eubacteria. Similarity between protein sequences provides the basis for methods that predict the structural features of a protein from a known protein structure. Structural similarity across an entire sequence or a large sequence fragment enables prediction and modeling of an entire structural domain, while the distribution of local features in known protein structures makes it possible to predict such features in the structures of unknown or uncharacterised proteins. In this study, a pigeon pea metallothionein was identified from available genomic resources, and its structure was predicted and validated. We present a step-wise methodology to model a given protein and to validate the resulting structures.

2021 ◽  
Author(s):  
Ho-min Park ◽  
Yunseol Park ◽  
Joris Vankerschaver ◽  
Arnout Van Messem ◽  
Wesley De Neve ◽  
...  

Protein therapeutics play an important role in controlling the functions and activities of disease-causing proteins in modern medicine. Despite protein therapeutics having several advantages over traditional small-molecule therapeutics, further development has been hindered by drug complexity and delivery issues. However, recent progress in deep learning-based protein structure prediction approaches such as AlphaFold opens new opportunities to exploit the complexity of these macro-biomolecules for highly-specialised design to inhibit, regulate or even manipulate specific disease-causing proteins. Anti-CRISPR proteins are small proteins from bacteriophages that counter-defend against the prokaryotic adaptive immunity of CRISPR-Cas systems. They are unique examples of natural protein therapeutics that have been optimized by the host-parasite evolutionary arms race to inhibit a wide variety of host proteins. Here, we show that these Anti-CRISPR proteins display diverse inhibition mechanisms through accurate structural prediction and functional analysis. We find that these phage-derived proteins are extremely distinct in structure, some of which have no homologues in the current protein structure domain. Furthermore, we find a novel family of Anti-CRISPR proteins which are structurally homologous to the recently-discovered mechanism of manipulating host proteins through enzymatic activity, rather than through direct inference. Using highly accurate structure prediction, we present a wide variety of protein-manipulating strategies of anti-CRISPR proteins for future protein drug design.


2014 ◽  
Author(s):  
Lars A Bratholm ◽  
Anders Steen Christensen ◽  
Thomas Hamelryck ◽  
Jan H Jensen

Protein chemical shifts are routinely used to augment molecular mechanics force fields in protein structure simulations, with the weights of the chemical shift restraints determined empirically. These weights, however, might not be an optimal descriptor of a given protein structure and predictive model, and they introduce a bias which might result in incorrect structures. In the inferential structure determination framework, both the unknown structure and the disagreement between experimental and back-calculated data are formulated as a joint probability distribution, thus utilizing the full information content of the data. Here, we present the formulation of such a probability distribution where the error in chemical shift prediction is described by either a Gaussian or a Cauchy distribution. The methodology is demonstrated and compared to a set of empirically weighted potentials through Markov chain Monte Carlo simulations of three small proteins (ENHD, Protein G and the SMN Tudor Domain) using the PROFASI force field and the chemical shift predictor CamShift. Using a clustering criterion for identifying the best structure, together with the addition of a solvent exposure scoring term, the simulations suggest that sampling both the structure and the uncertainties in chemical shift prediction leads to more accurate structures than conventional methods using empirically determined weights. The Cauchy distribution, using either sampled uncertainties or predetermined weights, did, however, result in overall better convergence to the native fold, suggesting that both types of distribution might be useful in different aspects of protein structure prediction.
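The difference between the two error models can be illustrated with a minimal sketch. It is not the authors' implementation: the function names and the fixed scale parameters `sigma` and `gamma` are assumptions made for illustration, whereas the abstract describes sampling the uncertainties alongside the structure. Each function returns the negative log-likelihood of a list of differences between experimental and back-calculated chemical shifts.

```python
import math


def gaussian_nll(deltas, sigma):
    """Negative log-likelihood of shift errors under a zero-mean Gaussian.

    deltas: experimental minus predicted chemical shifts (hypothetical input).
    sigma:  standard deviation of the error model (held fixed in this sketch).
    """
    n = len(deltas)
    norm = n * math.log(sigma * math.sqrt(2.0 * math.pi))
    return norm + sum(d * d for d in deltas) / (2.0 * sigma ** 2)


def cauchy_nll(deltas, gamma):
    """Negative log-likelihood of shift errors under a zero-centred Cauchy.

    gamma: scale parameter of the Cauchy density 1 / (pi*gamma*(1 + (d/gamma)^2)).
    """
    return sum(math.log(math.pi * gamma * (1.0 + (d / gamma) ** 2)) for d in deltas)
```

With equal scale parameters, a single large error is penalised far less by the heavy-tailed Cauchy model than by the Gaussian one, which is one plausible reason the two distributions behave differently in sampling.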


2021 ◽  
Author(s):  
Janani Durairaj ◽  
Mehmet Akdel ◽  
Dick de Ridder ◽  
Aalt D.J. van Dijk

The growing prevalence and popularity of protein structure data, both experimental and computationally modelled, necessitates fast tools and algorithms to enable exploratory and interpretable structure-based machine learning. Alignment-free approaches have been developed for divergent proteins, but proteins sharing functional and structural similarity are often better understood via structural alignment, which has typically been too computationally expensive for larger datasets. Here, we introduce the concept of rotation-invariant shape-mers to multiple structure alignment, creating a structure aligner that scales well with the number of proteins and allows for aligning over a thousand structures in 20 minutes. We demonstrate how alignment-free shape-mer counts and aligned structural features, when used in machine learning tasks, can adapt to different levels of functional hierarchy in protein kinases, pinpointing residues and structural fragments that play a role in catalytic activity.


2021 ◽  
Author(s):  
Chunxiang Peng ◽  
Xiaogen Zhou ◽  
Yuhao Xia ◽  
Yang Zhang ◽  
Guijun Zhang

With the development of protein structure prediction methods and experimental structure determination techniques, the structures of single-domain proteins have become relatively easy to model or solve experimentally. However, more than 80% of eukaryotic proteins and 67% of prokaryotic proteins contain multiple domains. Constructing a unified multi-domain protein structure database will promote research on multi-domain proteins, especially the modeling of multi-domain protein structures. In this work, we develop a unified multi-domain protein structure database (MPDB). Based on MPDB, we also develop a server with two functional modules: (1) the culling module, which filters the whole MPDB according to input criteria; and (2) the detection module, which identifies structural analogues of the full chain according to the structural similarity between input domain models and the proteins in MPDB. The detection module can discover potential analogue structures, which will contribute to high-quality multi-domain protein structure modeling.


2021 ◽  
Author(s):  
Konstantin Weissenow ◽  
Michael Heinzinger ◽  
Burkhard Rost

All state-of-the-art (SOTA) protein structure predictions rely on evolutionary information captured in multiple sequence alignments (MSAs), primarily on evolutionary couplings (co-evolution). Such information is not available for all proteins and is computationally expensive to generate. Prediction models based on Artificial Intelligence (AI) using only single sequences as input are easier and cheaper but perform so poorly that speed becomes irrelevant. Here, we described the first competitive AI solution that exclusively feeds embeddings extracted from pre-trained protein Language Models (pLMs), namely from the transformer pLM ProtT5, computed from single sequences, into a relatively shallow (few free parameters) convolutional neural network (CNN) trained on inter-residue distances, i.e. protein structure in 2D. The major advance originated from processing the attention heads learned by ProtT5. Although these models never required an MSA, they matched the performance of methods relying on co-evolution. Although not reaching the very top, our lean approach came close at substantially lower cost, thereby speeding up development and each future prediction. By generating protein-specific rather than family-averaged predictions, these new solutions could distinguish between structural features differentiating members of the same family of proteins with similar structure, which all other top methods predicted alike.
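The phrase "protein structure in 2D" refers to a matrix of pairwise inter-residue distances. A minimal sketch of how such a matrix can be computed from 3D coordinates follows. The function name and the choice of one representative coordinate per residue (for example the C-alpha atom) are assumptions made for illustration; the abstract does not state which atoms or distance binning the authors used.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    coords: one (x, y, z) tuple per residue, e.g. C-alpha positions
            (an assumption of this sketch, not taken from the abstract).
    """
    return [[math.dist(a, b) for b in coords] for a in coords]


# Three hypothetical residue positions, in arbitrary units.
example = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dmap = distance_map(example)
```

The matrix is symmetric with a zero diagonal, so a network trained on it only needs to predict one triangle.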


2021 ◽  
Author(s):  
Mariana Hoyer Moreira ◽  
Fabio C. L. Almeida ◽  
Tatiana Domitrovic ◽  
Fernando L. Palhano

Defensins are small, amphipathic, disulfide-rich proteins, usually ranging from 4 to 6 kDa, with a small or even absent hydrophobic core. Since a hydrophobic core is generally found in globular proteins that fold in an aqueous solvent, the peculiar fold of defensins can challenge tertiary protein structure predictors. We performed a PDB-wide survey of small proteins (4-6 kDa) to understand the similarities of defensins with other small disulfide-rich proteins. We found no differences between defensins and non-defensins in the proportion and solvent exposure of apolar, polar, and charged residues. We then divided all small proteins (4-6 kDa) deposited in the PDB into two groups: one with at least one disulfide bond (bonded, defensins included) and another without any disulfide bond (unbonded). The bonded proteins presented apolar residues that were more exposed to the solvent than those of the unbonded group. The ab initio algorithm for tertiary protein structure prediction Robetta was more accurate in predicting unbonded than bonded proteins. Our work highlights one more layer of complexity for tertiary protein structure prediction: the ability of small disulfide-rich proteins to fold even with a poor hydrophobic core.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Siyuan Liu ◽  
Tong Wang ◽  
Qijiang Xu ◽  
Bin Shao ◽  
Jian Yin ◽  
...  

Background: Fragment libraries play a key role in fragment-assembly based protein structure prediction, where protein fragments are assembled to form a complete three-dimensional structure. Rich and accurate structural information embedded in fragment libraries has not been systematically extracted and used beyond fragment assembly. Methods: To better leverage the valuable structural information for protein structure prediction, we extracted seven types of structural information from fragment libraries. We broadened the usage of such structural information by transforming fragment libraries into protein-specific potentials for gradient-descent based protein folding and encoding fragment libraries as structural features for protein property prediction. Results: Fragment libraries improved the accuracy of protein folding and outperformed state-of-the-art algorithms with respect to predicted properties, such as torsion angles and inter-residue distances. Conclusion: Our work implies that the rich structural information extracted from fragment libraries can complement sequence-derived features to help protein structure prediction.


2018 ◽  
Author(s):  
Daniel R. F. Bonetti ◽  
Gesiel Rios Lopes ◽  
Alexandre C. B. Delbem ◽  
Paulo S. L. Souza ◽  
Kalinka C. Branco ◽  
...  

This paper compares the runtime of three distinct parallel algorithms for the evaluation of an ab initio, full-atom approach based on GA and the cell-list technique, in order to minimize the van der Waals energy. The three parallel algorithms are developed in C and each uses one of these programming models: MPI, OpenMP or hybrid (MPI+OpenMP). Our preliminary results show that the van der Waals energy is evaluated faster and with better speedups when using hybrid and more flexible parallel algorithms to predict the structure of larger proteins. We also show that for small proteins MPI communication imposes a high overhead on parallel execution, so OpenMP offers a better cost-benefit ratio in such cases.
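The cell-list technique mentioned above can be sketched as follows. This is a serial Python illustration, not the authors' C/MPI/OpenMP code: the 12-6 Lennard-Jones form, the reduced units (`epsilon = sigma = 1`), and all function names are assumptions made for illustration. Atoms are binned into cubic cells whose edge equals the cutoff, so each atom only needs to be compared with atoms in its own cell and the 26 neighbouring cells rather than with every other atom.

```python
import math
from collections import defaultdict
from itertools import product


def lj(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy (assumed functional form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def energy_brute(coords, cutoff):
    """Reference O(N^2) sum over all pairs closer than the cutoff."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            if r < cutoff:
                total += lj(r)
    return total


def energy_cell_list(coords, cutoff):
    """Same sum, but only pairs in the same or adjacent cells are examined."""
    # Bin atom indices by the integer cell that contains them.
    cells = defaultdict(list)
    for idx, p in enumerate(coords):
        cells[tuple(math.floor(c / cutoff) for c in p)].append(idx)

    total = 0.0
    for cell, members in cells.items():
        for offset in product((-1, 0, 1), repeat=3):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            # .get avoids creating empty cells while iterating the dict.
            for i in members:
                for j in cells.get(neighbour, ()):
                    if i < j:  # count every pair exactly once
                        r = math.dist(coords[i], coords[j])
                        if r < cutoff:
                            total += lj(r)
    return total


# A small hypothetical 3x3x3 lattice of atoms with spacing 1.1.
atoms = [(1.1 * x, 1.1 * y, 1.1 * z) for x, y, z in product(range(3), repeat=3)]
e_brute = energy_brute(atoms, cutoff=2.5)
e_cells = energy_cell_list(atoms, cutoff=2.5)
```

Both functions return the same energy up to floating-point summation order; the cell-list version pays off as the number of atoms grows, because the work per atom stays roughly constant. In a parallel setting, cells are a natural unit of work to distribute.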

