In Silico Proteome Analysis of Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) (Preprint)
BACKGROUND Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense, single-stranded RNA coronavirus. It is the causative agent of coronavirus disease 2019 (COVID-19) and spreads through human-to-human transmission. The RNA genome of SARS-CoV-2 encodes 29 proteins, though one may not be expressed. Fifteen proteins do not yet have experimental structures available for investigating possible drug targets.
OBJECTIVE This study reports sequence analysis, complete-coordinate tertiary structure prediction, and in silico sequence-based and structure-based functional characterization of the full SARS-CoV-2 proteome, based on the NCBI reference sequence NC_045512 (29903 bp ss-RNA).
METHODS A total of 25 polypeptides were analyzed, of which 15 do not yet have experimental structures and 10 have experimental structures with known PDB IDs. Of the 15 newly predicted structures, six were predicted using comparative modeling, and nine proteins with no significant similarity to currently available PDB structures were modeled ab initio. QMEANDisCo 4.0.0 and ProQ3 were used for structure verification, providing global and local (per-residue) quality estimates.
RESULTS The all-atom tertiary structure models are of high quality and may be useful for structure-based drug design. Along with nine major targets, the study identified sixteen nonstructural proteins (NSPs), which may be equally important from a drug design perspective. Tunnel analysis revealed a large number of tunnels in NSP3, the ORF 6 protein, and the membrane glycoprotein, indicating a large number of transport pathways for small ligands, influencing their reactivity.
CONCLUSIONS The 15 theoretical structures may be useful to the scientific community for advanced computational analysis of each protein's interactions, for detailed functional analysis of active sites toward structure-based drug design, or for studying potential vaccines aimed at preventing epidemics and pandemics in the absence of complete experimental structures.
CLINICALTRIAL The protein structures have been deposited in ModelArchive.