In-silico prediction and modeling of the Entamoeba histolytica proteins: Serine-rich Entamoeba histolytica protein and peroxiredoxin

Author(s):  
Kumar Manochitra ◽  
Subhash Chandra Parija

Background: Amoebiasis is the third most common parasitic cause of morbidity and mortality, particularly in countries with poor hygienic settings. The diagnosis of amoebiasis remains ambiguous, hence the need for a better diagnostic approach. Serine-rich Entamoeba histolytica protein (SREHP), peroxiredoxin and Gal/GalNAc lectin are pivotal in E. histolytica virulence and are extensively studied as diagnostic and vaccine targets. Elucidating the cellular function of these proteins requires details of their respective quaternary structures, which are not available to date. Hence, this study was carried out to predict the structures of these target proteins and to characterize them structurally as well as functionally using relevant in-silico methods. Methods: The amino acid sequences of the proteins were retrieved from the National Centre for Biotechnology Information database and aligned using ClustalW. Bioinformatic tools were employed for secondary and tertiary structure prediction. The predicted structures were validated, and final refinement was carried out. Results: The protein structures predicted by i-TASSER were found to be more accurate than those predicted by Phyre2, based on validation using the SAVES server. The prediction suggests that SREHP is an extracellular protein and peroxiredoxin a peripheral membrane protein, while Gal/GalNAc lectin was found to be a cell-wall protein. Signal peptides were found in the amino acid sequences of SREHP and Gal/GalNAc lectin but not in the peroxiredoxin sequence. Gal/GalNAc lectin showed better antigenicity than the other two proteins studied. All three proteins exhibited similarity in their structures and were mostly composed of loops. Discussion: The structures of SREHP and peroxiredoxin were predicted successfully, while the structure of Gal/GalNAc lectin could not be predicted because it is a complex protein composed of three sub-units. This protein also showed less similarity with the available structural homologs. The quaternary structures predicted in this study would provide better structural and functional insights into these proteins and may aid in the development of newer diagnostic assays or the enhancement of available treatment modalities.

2016 ◽  
Author(s):  
Kumar Manochitra ◽  
Subhash Chandra Parija



PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3160 ◽  
Author(s):  
Kumar Manochitra ◽  
Subhash Chandra Parija

Background: Amoebiasis is the third most common parasitic cause of morbidity and mortality, particularly in countries with poor hygienic settings. There exists an ambiguity in the diagnosis of amoebiasis, and hence there arises a necessity for a better diagnostic approach. Serine-rich Entamoeba histolytica protein (SREHP), peroxiredoxin and Gal/GalNAc lectin are pivotal in E. histolytica virulence and are extensively studied as diagnostic and vaccine targets. For elucidating the cellular function of these proteins, details regarding their respective quaternary structures are essential. However, studies in this aspect are scant. Hence, this study was carried out to predict the structure of these target proteins and characterize them structurally as well as functionally using appropriate in-silico methods. Methods: The amino acid sequences of the proteins were retrieved from the National Centre for Biotechnology Information database and aligned using ClustalW. Bioinformatic tools were employed in the secondary structure and tertiary structure prediction. The predicted structure was validated, and final refinement was carried out. Results: The protein structures predicted by i-TASSER were found to be more accurate than Phyre2 based on the validation using the SAVES server. The prediction suggests SREHP to be an extracellular protein and peroxiredoxin a peripheral membrane protein, while Gal/GalNAc lectin was found to be a cell-wall protein. Signal peptides were found in the amino acid sequences of SREHP and Gal/GalNAc lectin, whereas they were not present in the peroxiredoxin sequence. Gal/GalNAc lectin showed better antigenicity than the other two proteins studied. All three proteins exhibited similarity in their structures and were mostly composed of loops. Discussion: The structures of SREHP and peroxiredoxin were predicted successfully, while the structure of Gal/GalNAc lectin could not be predicted as it was a complex protein composed of sub-units. Also, this protein showed less similarity with the available structural homologs. The quaternary structures of SREHP and peroxiredoxin predicted from this study would provide better structural and functional insights into these proteins and may aid in development of newer diagnostic assays or enhancement of the available treatment modalities.


2017 ◽  
Vol 15 (03) ◽  
pp. 1750009 ◽  
Author(s):  
Bruno Grisci ◽  
Márcio Dorn

The development of computational methods to accurately model three-dimensional protein structures from sequences of amino acid residues is becoming increasingly important to the structural biology field. This paper addresses the challenge of predicting the tertiary structure of a given amino acid sequence, which has been reported to belong to the NP-Complete class of problems. We present a new method, namely NEAT–FLEX, based on NeuroEvolution of Augmenting Topologies (NEAT) to extract structural features from (ABS) proteins that are determined experimentally. The proposed method manipulates structural information from the Protein Data Bank (PDB) and predicts the conformational flexibility (FLEX) of residues of a target amino acid sequence. This information may be used in three-dimensional structure prediction approaches as a way to reduce the conformational search space. The proposed method was tested with 24 different amino acid sequences. Evolving neural networks were compared against a traditional error back-propagation algorithm; results show that the proposed method is a powerful way to extract and represent structural information from protein molecules that are determined experimentally.


2011 ◽  
Vol 8 (3) ◽  
pp. 158-175
Author(s):  
Gualberto Asencio Cortés ◽  
Jesús A. Aguilar-Ruiz

Summary: The prediction of protein structures is a current issue of great significance in structural bioinformatics. More specifically, the prediction of the tertiary structure of a protein consists in determining its three-dimensional conformation based solely on its amino acid sequence. This study proposes a method in which protein fragments are assembled according to their physicochemical similarities, using information extracted from known protein structures. Many approaches cited in the literature use the physicochemical properties of amino acids, generally hydrophobicity, polarity and charge, to predict structure. In our method, implemented with parallel multithreading, we used a set of 30 physicochemical amino acid properties selected from the AAindex database. Several protein tertiary structure prediction methods produce a contact map. Our proposed method produces a distance map, which provides more information about the structure of a protein than a contact map. We performed several preliminary analyses of the protein physicochemical data distributions using 3D surfaces. Three main pattern types were found in the 3D surfaces, so it is possible to extract rules for predicting distances between amino acids according to their physicochemical properties. We performed an experimental validation of our method using five non-homologous protein sets, and we showed the generality of this method and its prediction quality using the amino acid properties considered. Finally, we included a study of the algorithm's efficiency according to the number of most similar fragments considered, and we notably improved the precision with the studied protein sets.
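The difference between a distance map and a contact map can be illustrated with a minimal sketch. It is not the authors' method: the coordinates are made-up C-alpha positions, and the 8 Å cutoff is an assumed, commonly used threshold rather than a value taken from the abstract. The sketch only shows that a contact map is a thresholded, and therefore less informative, version of a distance map.

```python
import math


def distance_map(coords):
    """Return the full matrix of pairwise Euclidean distances between residues."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_map(dmap, cutoff=8.0):
    """Threshold a distance map into a binary contact map.

    The cutoff (in angstroms) is an assumption made for this illustration.
    """
    return [[1 if d <= cutoff else 0 for d in row] for row in dmap]


# Hypothetical C-alpha coordinates (angstroms) for four residues on a straight line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

dmap = distance_map(coords)   # real-valued distances
cmap = contact_map(dmap)      # 0/1 values only

for d_row, c_row in zip(dmap, cmap):
    print([round(d, 1) for d in d_row], c_row)
```

Residues 1 and 3 (7.6 Å apart) and residues 1 and 2 (3.8 Å apart) both appear as the same "1" in the contact map, whereas the distance map keeps the two values distinct.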


Author(s):  
Rahmat Eko Sanjaya ◽  
Kartika Dwi Asni Putri ◽  
Anita Kurniati ◽  
Ali Rohman ◽  
Ni Nyoman Tri Puspaningsih

Abstract Background: Hydrolysis of cellulose-based biomass by cellulases produces fermentable sugars for making biofuels, such as bioethanol. Cellulases hydrolyze the β-1,4-glycosidic linkage of cellulose and can be obtained from cultured and uncultured microorganisms. Uncultured microorganisms are a source for exploring novel cellulase genes through the metagenomic approach. Metagenomics concerns the extraction, cloning, and analysis of the entire genetic complement of a habitat without cultivating microbes. The glycoside hydrolase 5 family (GH5) is a cellulase family and the largest group of glycoside hydrolases. Numerous variants of the GH5-cellulase family have been identified through the metagenomic approach, including CelGH5 in this study. The University-CoE-Research Center for Biomolecule Engineering, Universitas Airlangga, successfully isolated CelGH5 from soil containing decomposing oil palm empty fruit bunch (OPEFB) waste using a metagenomic approach. The properties and structural characteristics of GH5-cellulases from uncultured microorganisms can be studied using computational tools and software. Results: The GH5-cellulase family from uncultured microorganisms was characterized using standard computational tools. The amino acid sequences and 3D protein structures were retrieved from the GenBank database and the Protein Data Bank. The physicochemical analysis revealed sequence lengths of roughly 332–751 amino acids and molecular weights of around 37–83 kDa, with predominantly negative charges and pI values below 7. Alanine was the most abundant amino acid in the GH5-cellulase family, and the percentage of hydrophobic amino acids was higher than that of hydrophilic amino acids. Interestingly, ten endopeptidases with the highest average number of cleavage sites were found. Another notable finding was a difference in stability between the in silico and wet-lab results: the instability index (II) values indicated CelGH5 and ACA61162.1 as unstable enzymes, while wet-lab experiments showed they were stable over a broad pH range. SOPMA, PDBsum, ProSA, and SAVES were used for the secondary and tertiary structure analysis. The predominant secondary structure was the random coil, and the tertiary structures met the quality criteria of QMEAN4, ERRAT, the Ramachandran plot, and the Z-score. Conclusion: This study offers new insights into the physicochemical and structural properties of the GH5-cellulase family from uncultured microorganisms. Furthermore, in silico analysis could be valuable in selecting highly efficient cellulases for enhanced enzyme production.
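As a rough illustration of the kind of sequence-level analysis described above (most abundant amino acid, share of hydrophobic residues), the following sketch computes both from a one-letter sequence. The sequence is made up, and the hydrophobic residue set is an assumption chosen for the example; classifications differ between tools, and this is not the software used in the study.

```python
from collections import Counter

# Assumed hydrophobic residue set for illustration; other classifications exist.
HYDROPHOBIC = set("AVLIMFWP")


def composition(seq):
    """Return the percentage of each amino acid in a one-letter sequence."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def hydrophobic_fraction(seq):
    """Return the fraction of residues that fall in the assumed hydrophobic set."""
    seq = seq.upper()
    return sum(1 for aa in seq if aa in HYDROPHOBIC) / len(seq)


seq = "MAALAKDAGA"  # made-up toy sequence, not a real cellulase

comp = composition(seq)
most_abundant = max(comp, key=comp.get)

print(most_abundant, comp[most_abundant])   # most abundant residue and its percentage
print(hydrophobic_fraction(seq))            # share of hydrophobic residues
```

For real sequences, properties such as molecular weight, pI, and the instability index require residue-specific parameter tables and are better taken from an established tool than reimplemented by hand.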


Author(s):  
Ivan Anishchenko ◽  
Tamuka M. Chidyausiku ◽  
Sergey Ovchinnikov ◽  
Samuel J. Pellock ◽  
David Baker

Abstract: There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting distance maps, which, as expected, are quite featureless. We then carried out Monte Carlo sampling in amino acid sequence space, optimizing the contrast (KL-divergence) between the distance distributions predicted by the network and the background distribution. Optimization from different random starting points resulted in a wide range of proteins with diverse sequences and all-alpha, all-beta sheet, and mixed alpha-beta structures. We obtained synthetic genes encoding 129 of these network-hallucinated sequences, expressed and purified the proteins in E. coli, and found that 27 folded to monomeric stable structures with circular dichroism spectra consistent with the hallucinated structures. Thus deep networks trained to predict native protein structures from their sequences can be inverted to design new proteins, and such networks and methods should contribute, alongside traditional physically based models, to the de novo design of proteins with new functions.
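The sampling loop described above can be sketched in a few lines, under heavy assumptions. The `toy_predictor` function below is a hypothetical stand-in that returns a single distribution over four bins; it is not trRosetta, which predicts distributions for every residue pair. The temperature, step count, and uniform background are likewise illustrative choices. The sketch only shows the shape of the idea: mutate one position, score the KL-divergence against a background, and accept or reject with a Metropolis criterion.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def toy_predictor(seq, bins=4):
    """Hypothetical stand-in for a structure-prediction network (NOT trRosetta).

    Maps a sequence to one distribution over `bins` distance bins.
    """
    counts = [1.0] * bins  # pseudocounts keep every bin strictly positive
    for aa in seq:
        counts[ALPHABET.index(aa) % bins] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def hallucinate(length=20, steps=500, temperature=0.01, seed=0):
    """Monte Carlo search in sequence space that maximizes KL from a background."""
    rng = random.Random(seed)
    background = [0.25] * 4  # assumed featureless background distribution
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    score = kl_divergence(toy_predictor(seq), background)
    start_score = score
    best_seq, best_score = list(seq), score
    for _ in range(steps):
        candidate = list(seq)
        candidate[rng.randrange(length)] = rng.choice(ALPHABET)  # single-site mutation
        cand_score = kl_divergence(toy_predictor(candidate), background)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        if cand_score >= score or rng.random() < math.exp((cand_score - score) / temperature):
            seq, score = candidate, cand_score
            if score > best_score:
                best_seq, best_score = list(seq), score
    return "".join(best_seq), start_score, best_score


best, start, final = hallucinate()
print(best, round(start, 4), round(final, 4))
```

Because the best sequence seen so far is tracked separately, the returned score can never be lower than the starting score, even when the Metropolis step accepts a worse move.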


2021 ◽  
Author(s):  
Akhil Padarti ◽  
Ofek Belkin ◽  
Johnathan Abou-Fadel ◽  
Jun Zhang

Purpose: The objective of this study is to validate the existence of dual cores within the typical phosphotyrosine binding (PTB) domain and to identify potentially damaging and pathogenic nonsynonymous coding single nucleotide polymorphisms (nsSNPs) in the canonical PTB domain of the CCM2 gene that causes cerebral cavernous malformations (CCMs). Methods: The nsSNPs within the coding sequence for the PTB domain of the human CCM2 gene, retrieved from exclusive database search, were analyzed for their functional and structural impact using a series of bioinformatic tools. The effects of the mutations on the tertiary structure of the PTB domain in human CCM2 protein were predicted in order to examine how the nsSNPs affect the PTB cores. Results: Our mutation analysis, through alignment of protein structures between wildtype and mutant CCM2, indicated that the structural impacts of pathogenic nsSNPs are biophysically limited to the spatially adjacent substituted amino acid site, with minimal structural influence on the adjacent core of the PTB domain, suggesting that both cores are independently functional and essential for proper CCM2 function. Conclusion: Utilizing a combination of protein conservation and structure-based analysis, we analyzed the structural effects of inherited pathogenic mutations within the CCM2 PTB domain. Our results indicated that the pathogenic amino acid substitutions lead to only subtle changes, locally confined to the surrounding tertiary structure of the PTB core within which they reside, while no structural disturbance to the neighboring PTB core was observed, reaffirming the presence of dual functional cores in the PTB domain.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Kun Tian ◽  
Xin Zhao ◽  
Xiaogeng Wan ◽  
Stephen S.-T. Yau

Abstract: Protein structure can provide insights that help biologists to predict and understand protein functions and interactions. However, the number of known protein structures has not kept pace with the number of protein sequences determined by high-throughput sequencing. Current techniques used to determine the structure of proteins are complex and require a lot of time to analyze the experimental results, especially for large protein molecules. The limitations of these methods have motivated us to create a new approach for protein structure prediction. Here we describe a new approach to predict protein structures and structure classes from amino acid sequences. Our prediction model performs well in comparison with previous methods when applied to the structural classification of two CATH datasets with more than 5000 protein domains. The average accuracy is 92.5% for structure classification, which is higher than that of previous research. We also used our model to predict four known protein structures with a single amino acid sequence, while many other existing methods could only obtain one possible structure for a given sequence. The results show that our method provides a new, effective and reliable tool for protein structure prediction research.


2020 ◽  
Author(s):  
Chittaranjan Baruah ◽  
Papari Devi ◽  
Dhirendra K Sharma

BACKGROUND Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense, single-stranded RNA coronavirus. The virus is the causative agent of coronavirus disease 2019 (COVID-19) and is contagious through human-to-human transmission. The RNA genome of SARS-CoV-2 encodes 29 proteins, though one may not be expressed. Fifteen proteins do not yet have experimental structures for investigating possible drug targets. OBJECTIVE The present study reports sequence analysis, complete coordinate tertiary structure prediction, and in silico sequence-based and structure-based functional characterization of the full SARS-CoV-2 proteome based on the NCBI reference sequence NC_045512 (29903 bp ss-RNA). METHODS A total of 25 polypeptides were analyzed, of which 15 do not yet have experimental structures and only 10 have experimental structures with known PDB IDs. Of the 15 newly predicted structures, six were predicted using comparative modeling, and nine proteins with no significant similarity to the PDB structures available so far were modeled using ab initio modeling. QMEANDisCo 4.0.0 and ProQ3 were used for structure verification, providing global and local (per-residue) quality estimates. RESULTS The all-atom tertiary structure models are of high quality and may be useful as targets for structure-based drug design. Along with nine major targets, the study identified sixteen nonstructural proteins (NSPs), which may be equally important from the drug design angle. Tunnel analysis revealed a large number of tunnels in NSP3, ORF 6 protein and membrane glycoprotein, indicating a large number of transport pathways for small ligands influencing their reactivity. CONCLUSIONS The 15 theoretical structures may be useful to the scientific community for advanced computational analysis of the interactions of each protein, for detailed functional analysis of active sites towards structure-based drug design, or for the study of potential vaccines, if at all, towards preventing epidemics and pandemics in the absence of complete experimental structures. CLINICALTRIAL The protein structures have been deposited to ModelArchive.

