In silico Proteome analysis of Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Preprint)

2020 ◽  
Author(s):  
Chittaranjan Baruah ◽  
Papari Devi ◽  
Dhirendra K. Sharma

BACKGROUND Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense, single-stranded RNA coronavirus. The virus is the causative agent of coronavirus disease 2019 (COVID-19) and spreads through human-to-human transmission. The RNA genome of SARS-CoV-2 encodes 29 proteins, though one may not be expressed. Fifteen proteins do not yet have experimental structures that could be investigated as possible drug targets. OBJECTIVE The present study reports sequence analysis, complete coordinate tertiary structure prediction, and in silico sequence-based and structure-based functional characterization of the full SARS-CoV-2 proteome, based on the NCBI reference sequence NC_045512 (29903 bp ss-RNA). METHODS A total of 25 polypeptides were analyzed, of which 15 do not yet have experimental structures and only 10 have experimental structures with known PDB IDs. Of the 15 newly predicted structures, six were predicted using comparative modeling, and nine proteins with no significant similarity to currently available PDB structures were modeled ab initio. QMEANDisCo 4.0.0 and ProQ3 were used for structure verification, providing global and local (per-residue) quality estimates. RESULTS The all-atom tertiary structure models are of high quality and may be useful for structure-based drug design. Along with nine major targets, the study identified sixteen nonstructural proteins (NSPs), which may be equally important from a drug design perspective. Tunnel analysis revealed a large number of tunnels in NSP3, ORF6 protein and membrane glycoprotein, indicating a large number of transport pathways for small ligands influencing their reactivity. CONCLUSIONS The 15 theoretical structures may be useful to the scientific community for advanced computational analysis of the interactions of each protein, for detailed functional analysis of active sites toward structure-based drug design, or for the study of potential vaccines, if at all, toward preventing epidemics and pandemics in the absence of complete experimental structures. CLINICALTRIAL The protein structures have been deposited to ModelArchive.

Author(s):  
Chittaranjan Baruah ◽  
Papari Devi ◽  
Dhirendra K. Sharma

ABSTRACT Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2; 2019-nCoV) is a positive-sense, single-stranded RNA coronavirus. The virus is the causative agent of coronavirus disease 2019 (COVID-19) and spreads through human-to-human transmission. The present study reports sequence analysis, complete coordinate tertiary structure prediction, and in silico sequence-based and structure-based functional characterization of the full SARS-CoV-2 proteome, based on the NCBI reference sequence NC_045512 (29903 bp ss-RNA), which is identical to GenBank entries MN908947 and MT415321. The proteome includes 12 major proteins, namely orf1ab polyprotein (includes 15 proteins), surface glycoprotein, ORF3a protein, envelope protein, membrane glycoprotein, ORF6 protein, ORF7a protein, orf7b, ORF8, nucleocapsid phosphoprotein and ORF10 protein. Each protein of the orf1ab polyprotein group was studied separately. A total of 25 polypeptides were analyzed, of which 15 do not yet have experimental structures and only 10 have experimental structures with known PDB IDs. Of the 15 newly predicted structures, six were predicted using comparative modeling, and nine proteins with no significant similarity to currently available PDB structures were modeled ab initio. Structure verification with the recent tools QMEANDisCo 4.0.0 and ProQ3, which give global and local (per-residue) quality estimates, indicates that the all-atom tertiary structure models are of high quality and may be useful for structure-based drug design. The study identified nine major targets (spike protein, envelope protein, membrane protein, nucleocapsid protein, 2’-O-ribose methyltransferase, endoRNAse, 3’-to-5’ exonuclease, RNA-dependent RNA polymerase and helicase) that could be considered for drug design. There are 16 other nonstructural proteins (NSPs), which may also be considered from a drug design perspective. The protein structures have been deposited to ModelArchive. Tunnel analysis revealed a large number of tunnels in NSP3, ORF6 protein and membrane glycoprotein, indicating a large number of transport pathways for small ligands influencing their reactivity.


2021 ◽  
Vol 12 (3) ◽  
pp. 3259-3304

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which was transmitted from animals to humans, became a life-threatening pandemic in 2020. Scientists are currently testing several drugs to end the COVID-19 outbreak. However, no 100% effective drug or vaccine against SARS-CoV-2 has been discovered so far. In this study, we explored structure prediction and functional analysis of the structural and accessory proteins of 75 Malaysian SARS-CoV-2 strains in the absence of experimental models. Physicochemical analysis, secondary structure analysis, structure prediction, functional characterization, active site identification, and evolutionary analysis were performed on the amino acid sequences retrieved from the National Centre for Biotechnology Information (NCBI). Three-dimensional (3-D) protein structures were built using SWISS-MODEL. The quality of the protein models was verified with the ERRAT, PROCHECK, and Verify 3D tools. Active site prediction revealed high-potential active sites where an antiviral drug or vaccine may bind and inhibit viral activity. Molecular phylogenetic analysis of the ORF10, ORF8, and ORF6 proteins from five different species was performed. The results showed that SARS-CoV-2 from Homo sapiens has high genetic similarity with the bat coronavirus. These analyses may help in designing structure-based antiviral drugs or in developing potential vaccines for SARS-CoV-2.
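The abstract does not say which physicochemical properties were computed or with which tool. As a minimal sketch of what such an analysis can look like, the Python below derives two commonly reported sequence-level properties: amino acid composition and the grand average of hydropathy (GRAVY) on the Kyte-Doolittle scale. The function names and the example peptide are hypothetical and are not taken from the study.

```python
from collections import Counter
from typing import Dict

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq: str) -> Dict[str, float]:
    """Fraction of each residue type present in the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


# Hypothetical peptide used only for illustration (not a SARS-CoV-2 sequence).
example = "MKLVAAGG"
print(composition(example))
print(round(gravy(example), 3))
```

A positive GRAVY value suggests an overall hydrophobic sequence and a negative value a hydrophilic one; dedicated tools report further properties (molecular weight, isoelectric point, instability index) that this sketch does not attempt.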


2016 ◽  
Author(s):  
Kumar Manochitra ◽  
Subhash Chandra Parija

Background: Amoebiasis is the third most common parasitic cause of morbidity and mortality, particularly in countries with poor hygienic settings. There is ambiguity in the diagnosis of amoebiasis, and hence a better diagnostic approach is needed. Serine-rich Entamoeba histolytica protein (SREHP), peroxiredoxin and Gal/GalNAc lectin are pivotal in E. histolytica virulence and are extensively studied as diagnostic and vaccine targets. To elucidate the cellular function of these proteins, details of their respective quaternary structures are essential, but these are not available to date. Hence, this study was carried out to predict the structure of these target proteins and to characterize them structurally as well as functionally using relevant in silico methods. Methods: The amino acid sequences of the proteins were retrieved from the National Centre for Biotechnology Information database and aligned using ClustalW. Bioinformatic tools were employed for secondary and tertiary structure prediction. The predicted structure was validated, and final refinement was carried out. Results: The protein structures predicted by i-TASSER were found to be more accurate than those predicted by Phyre2, based on validation using the SAVES server. The prediction suggests that SREHP is an extracellular protein and peroxiredoxin a peripheral membrane protein, while Gal/GalNAc lectin was found to be a cell-wall protein. Signal peptides were found in the amino acid sequences of SREHP and Gal/GalNAc lectin, whereas they were not present in the peroxiredoxin sequence. Gal/GalNAc lectin showed better antigenicity than the other two proteins studied. All three proteins exhibited similarity in their structures and were mostly composed of loops. Discussion: The structures of SREHP and peroxiredoxin were predicted successfully, while the structure of Gal/GalNAc lectin could not be predicted because it is a complex protein composed of three sub-units. This protein also showed less similarity with the available structural homologs. The quaternary structures predicted in this study would provide better structural and functional insights into these proteins and may aid in the development of newer diagnostic assays or the enhancement of the available treatment modalities.


2020 ◽  
Vol 17 (2) ◽  
pp. 125-132
Author(s):  
Marjanu Hikmah Elias ◽  
Noraziah Nordin ◽  
Nazefah Abdul Hamid

Background: Chronic Myeloid Leukaemia (CML) is associated with the BCR-ABL1 gene, which plays a central role in the pathogenesis of CML. Thus, it is crucial to suppress the expression of BCR-ABL1 in the treatment of CML. MicroRNA is known to be a gene expression regulator and is thus a good candidate for molecularly targeted therapy for CML. Objective: This study aims to identify microRNAs from edible plants that target the 3’ Untranslated Region (3’UTR) of BCR-ABL1. Methods: In this in silico analysis, the sequence of the 3’UTR of BCR-ABL1 was obtained from the Ensembl Genome Browser. The PsRNATarget Analysis Server and the MicroRNA Target Prediction (miRTar) Server were used to identify miRNAs that have binding conformity with the 3’UTR of BCR-ABL1. The MiRBase database was used to validate the plant species expressing the miRNAs. The RNAfold web server and RNA COMPOSER were used for secondary and tertiary structure prediction, respectively. Results: In silico analyses revealed that cpa-miR8154, csi-miR3952, gma-miR4414-5p, mdm-miR482c, osa-miR1858a and osa-miR1858b show binding conformity with strong molecular interaction towards the 3’UTR of BCR-ABL1. However, only cpa-miR8154, osa-miR1858a and osa-miR1858b showed good target site accessibility. Conclusion: It is predicted that these microRNAs post-transcriptionally inhibit the BCR-ABL1 gene and thus could be a potential molecular targeted therapy for CML. However, further studies involving in vitro, in vivo and functional analyses need to be carried out to determine the ability of these miRNAs to form the basis for targeted therapy for CML.
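The idea of "binding conformity" between a miRNA and a 3’UTR can be illustrated with a deliberately simplified sketch: slide the reverse complement of the miRNA along the target and count Watson-Crick mismatches in each window. This is not the scoring scheme of PsRNATarget or miRTar (which also weigh G:U wobble pairs, gaps and site accessibility), and the sequences and function names below are hypothetical.

```python
from typing import Optional, Tuple

# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def best_site(mirna: str, target: str) -> Optional[Tuple[int, int]]:
    """Return (start, mismatches) for the target window closest to perfect
    complementarity with the miRNA, or None if the target is too short."""
    site = reverse_complement(mirna)
    best: Optional[Tuple[int, int]] = None
    for start in range(len(target) - len(site) + 1):
        window = target[start:start + len(site)]
        mismatches = sum(x != y for x, y in zip(window, site))
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


# Hypothetical sequences for illustration only (not real miRNA or BCR-ABL1 data).
mirna = "UGAGGUAGUAGG"
utr = "AAAA" + reverse_complement(mirna) + "GGGG"
print(best_site(mirna, utr))
```

Fewer mismatches indicate stronger predicted complementarity; a real analysis would combine such a measure with thermodynamic and accessibility terms, as the servers named in the abstract do.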


PeerJ ◽  
2017 ◽  
Vol 5 ◽  
pp. e3160 ◽  
Author(s):  
Kumar Manochitra ◽  
Subhash Chandra Parija

Background: Amoebiasis is the third most common parasitic cause of morbidity and mortality, particularly in countries with poor hygienic settings. There exists an ambiguity in the diagnosis of amoebiasis, and hence there arises a necessity for a better diagnostic approach. Serine-rich Entamoeba histolytica protein (SREHP), peroxiredoxin and Gal/GalNAc lectin are pivotal in E. histolytica virulence and are extensively studied as diagnostic and vaccine targets. For elucidating the cellular function of these proteins, details regarding their respective quaternary structures are essential. However, studies in this aspect are scant. Hence, this study was carried out to predict the structure of these target proteins and characterize them structurally as well as functionally using appropriate in-silico methods. Methods: The amino acid sequences of the proteins were retrieved from National Centre for Biotechnology Information database and aligned using ClustalW. Bioinformatic tools were employed in the secondary structure and tertiary structure prediction. The predicted structure was validated, and final refinement was carried out. Results: The protein structures predicted by i-TASSER were found to be more accurate than Phyre2 based on the validation using SAVES server. The prediction suggests SREHP to be an extracellular protein, peroxiredoxin a peripheral membrane protein while Gal/GalNAc lectin was found to be a cell-wall protein. Signal peptides were found in the amino-acid sequences of SREHP and Gal/GalNAc lectin, whereas they were not present in the peroxiredoxin sequence. Gal/GalNAc lectin showed better antigenicity than the other two proteins studied. All the three proteins exhibited similarity in their structures and were mostly composed of loops. Discussion: The structures of SREHP and peroxiredoxin were predicted successfully, while the structure of Gal/GalNAc lectin could not be predicted as it was a complex protein composed of sub-units. Also, this protein showed less similarity with the available structural homologs. The quaternary structures of SREHP and peroxiredoxin predicted from this study would provide better structural and functional insights into these proteins and may aid in development of newer diagnostic assays or enhancement of the available treatment modalities.


Author(s):  
Arun G. Ingale

Predicting the structure of a protein from its primary amino acid sequence is computationally difficult. An investigation of the methods and algorithms used to predict protein structure, together with a thorough knowledge of the function and structure of proteins, is critical for the advancement of biology and the life sciences as well as for the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this chapter sheds light on the methods used for protein structure prediction. It covers the applications of modeled protein structures and examines the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. It presents a broad examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, giving insight into the future applications of modeled protein structures. Current protein structure prediction methods are reviewed to provide background on structure prediction, the prediction of structural fundamentals, tertiary structure prediction, and functional insight. The basic ideas and advances in these directions are discussed in detail.


2019 ◽  
Vol 36 (1) ◽  
pp. 104-111
Author(s):  
Shuichiro Makigaki ◽  
Takashi Ishida

Motivation: Template-based modeling, the process of predicting the tertiary structure of a protein by using homologous protein structures, is useful if good templates can be found. Although modern homology detection methods can find remote homologs with high sensitivity, the accuracy of template-based models generated from homology-detection-based alignments is often lower than that from ideal alignments. Results: In this study, we propose a new method that generates pairwise sequence alignments for more accurate template-based modeling. The proposed method trains a machine learning model using the structural alignment of known homologs. It is difficult to directly predict sequence alignments using machine learning. Thus, when calculating sequence alignments, instead of a fixed substitution matrix, this method dynamically predicts a substitution score from the trained model. We evaluate our method by carefully splitting the training and test datasets and comparing the predicted structure’s accuracy with that of state-of-the-art methods. Our method generates more accurate tertiary structure models than those produced from alignments obtained by other methods. Availability and implementation: https://github.com/shuichiro-makigaki/exmachina. Supplementary information: Supplementary data are available at Bioinformatics online.
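The core idea, replacing a fixed substitution matrix with a score supplied on demand for each residue pair, can be sketched with a standard Needleman-Wunsch global alignment that takes the scoring function as a parameter. This is only an illustration under stated assumptions: the alignment algorithm, the linear gap penalty, and the `window_identity_scorer` stand-in (a hypothetical toy function, not a trained model) are not taken from the paper or its repository.

```python
from typing import Callable, List, Tuple


def window_identity_scorer(a: str, b: str, half_window: int = 1) -> Callable[[int, int], float]:
    """Hypothetical stand-in for a learned model: scores positions (i, j) by the
    fraction of identical residues in a small window around them, in [-1, 1]."""
    def score(i: int, j: int) -> float:
        matches, total = 0, 0
        for k in range(-half_window, half_window + 1):
            ia, jb = i + k, j + k
            if 0 <= ia < len(a) and 0 <= jb < len(b):
                total += 1
                matches += a[ia] == b[jb]
        return 2.0 * matches / total - 1.0
    return score


def global_align(a: str, b: str, score: Callable[[int, int], float],
                 gap: float = -2.0) -> Tuple[float, str, str]:
    """Needleman-Wunsch with a linear gap penalty; score(i, j) is queried for
    each residue pair instead of looking up a fixed substitution matrix."""
    n, m = len(a), len(b)
    dp: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + score(i - 1, j - 1),
                           dp[i - 1][j] + gap,
                           dp[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(i - 1, j - 1):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


query, template = "HEAGAWGHEE", "PAWHEAE"
total, aligned_q, aligned_t = global_align(query, template,
                                           window_identity_scorer(query, template))
print(aligned_q)
print(aligned_t)
```

Because the alignment routine only ever calls `score(i, j)`, a trained predictor could in principle be substituted for the toy scorer without changing the dynamic programming itself.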

