Genomic Mutations and Changes in Protein Secondary Structure and Solvent Accessibility of SARS-CoV-2 (COVID-19 Virus)

2020 ◽  
Author(s):  
Thanh Thi Nguyen ◽  
Pubudu N. Pathirana ◽  
Thin Nguyen ◽  
Henry Nguyen ◽  
Asim Bhatti ◽  
...  

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial for responding to and controlling the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and their probable protein secondary structure and solvent accessibility changes, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. Based on 6,324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are the most stable and are potentially suitable targets for vaccine and drug development.
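As an illustration of what a point-mutation listing involves, the following is a minimal sketch that compares a sample sequence against a reference of the same length and reports single-base substitutions. It is not the authors' pipeline: the function names `point_mutations` and `format_mutation`, the toy sequences, and the assumption that the two sequences are already aligned are all hypothetical choices made for this example.

```python
from typing import List, Tuple

VALID_BASES = set("ACGT")


def point_mutations(reference: str, sample: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) per substitution.

    Assumes both sequences are already aligned to the same length.
    Positions where either base is not A, C, G or T (for example N or a
    gap character) are skipped rather than reported as mutations.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    for index, (ref_base, alt_base) in enumerate(
        zip(reference.upper(), sample.upper())
    ):
        if ref_base == alt_base:
            continue
        if ref_base in VALID_BASES and alt_base in VALID_BASES:
            mutations.append((index + 1, ref_base, alt_base))
    return mutations


def format_mutation(position: int, ref_base: str, alt_base: str) -> str:
    """Render a substitution in the common <ref><position><alt> notation."""
    return f"{ref_base}{position}{alt_base}"


if __name__ == "__main__":
    # Toy sequences for illustration only, not real genome data.
    reference = "ATGGATCTN"
    sample = "ATGGGTCTA"
    for mutation in point_mutations(reference, sample):
        print(format_mutation(*mutation))
```

Rows produced this way (one per substitution, per genome) could then be written to a spreadsheet, which is the kind of dataset the abstract describes.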

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Thanh Thi Nguyen ◽  
Pubudu N. Pathirana ◽  
Thin Nguyen ◽  
Quoc Viet Hung Nguyen ◽  
Asim Bhatti ◽  
...  

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial for responding to and controlling the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and their probable protein secondary structure and solvent accessibility changes, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. Based on 6324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are the most stable and are potentially suitable targets for vaccine and drug development.


2012 ◽  
Vol 1 (1) ◽  
pp. 79-87 ◽  
Author(s):  
Shangping Wang ◽  
Harriëtte Oldenhof ◽  
Andres Hilfiker ◽  
Michael Harder ◽  
Willem F. Wolkers

2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Xue Zhao ◽  
Ran Zhang ◽  
Shunying Yu

Abstract. Background: The 15q11–13 region is one of the most complex chromosomal regions in the human genome. UBE3A, an important candidate gene for autism spectrum disorder (ASD), is located in the 15q11–13 region and encodes ubiquitin-protein ligase E3A. Previous studies of the UBE3A gene and ASD have shown inconsistent results, and few studies have been performed in the Chinese population. This study aimed to detect genetic mutations of the UBE3A gene in the Chinese Han population with ASD and to analyze the genetic association between these variants and ASD. Methods: The samples consisted of 192 patients with autism according to the DSM-IV diagnostic criteria and 192 healthy controls. We searched for mutations in the coding sequence (CDS) regions of the UBE3A gene and their adjacent non-coding regions using high resolution melting (HRM) and Sanger sequencing. We then increased the sample size to validate the detected variants using HRM and conducted an association analysis between the case and control groups. Results: A known single nucleotide polymorphism (T > C, rs150331504) located in CDS4 and a known 5 bp insertion/deletion variation (AACTC+/−, rs71127053) located in the intron region 288 bp upstream of CDS2 of the UBE3A gene were detected by Sanger sequencing. The association analysis used 391 ASD cases for rs71127053, 384 ASD cases for rs150331504, and 384 healthy controls. After the sample size was extended, the analysis showed no significant difference in the allele and genotype frequencies of rs71127053 and rs150331504 between the case and control groups. In addition, rs150331504 is a synonymous mutation, and we compared the secondary structure and minimum free energy (MFE) of the mRNA harboring allele T or C of rs150331504 using RNAfold software. The centroid secondary structure differed noticeably between the T and C alleles of rs150331504, suggesting that this variant might change the secondary structure of UBE3A mRNA. We did not detect mutations in other coding regions of the UBE3A gene. Conclusions: These findings indicate that UBE3A might not be a major disease gene in Chinese ASD cases.
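To make the term "synonymous mutation" concrete, here is a small sketch that checks whether a single-codon change alters the encoded amino acid under the standard genetic code. The codons used are hypothetical examples, not the actual codon containing rs150331504, and the helper names are assumptions for this illustration. The sketch does not reproduce the RNAfold structure comparison, which needs the full mRNA sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # codons starting with T
    "LLLLPPPPHHQQRRRR"  # codons starting with C
    "IIIMTTTTNNKKSSRR"  # codons starting with A
    "VVVVAAAADDEEGGGG"  # codons starting with G
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_codon(codon: str) -> str:
    """Return the one-letter amino acid for a DNA codon ('*' means stop)."""
    return CODON_TABLE[codon.upper()]


def is_synonymous(ref_codon: str, alt_codon: str) -> bool:
    """True when both codons encode the same amino acid."""
    return translate_codon(ref_codon) == translate_codon(alt_codon)


if __name__ == "__main__":
    # Hypothetical codons: a third-position T > C change that keeps Asp (D).
    print(is_synonymous("GAT", "GAC"))
    # A second-position change that turns Asp (D) into Gly (G).
    print(is_synonymous("GAT", "GGT"))
```

A synonymous change leaves the protein sequence intact, which is why the study looks at mRNA secondary structure and MFE rather than at the protein for any effect of this variant.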


2013 ◽  
Author(s):  
Xin Deng

Protein sequence and profile alignment is used in most bioinformatics tasks, such as protein structure modeling, function prediction, and phylogenetic analysis. We designed a new algorithm, MSACompro, to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into multiple protein sequence alignment. Our experiments showed that it improved multiple sequence alignment accuracy over most existing methods that do not use structural information, and that it performed comparably to, with slightly lower scores than, the method that uses structural features and additional homologous sequences. We also developed HHpacom, a new profile-profile pairwise alignment method that integrates secondary structure, solvent accessibility, torsion angle, and inferred residue pair coupling information. The evaluation showed that the secondary structure, relative solvent accessibility, and torsion angle information significantly improved alignment accuracy in comparison with the state-of-the-art methods HHsearch and HHsuite. The evolutionary constraint information helped in some cases, especially for alignments of short proteins, typically 100 to 500 residues. Protein model selection is also a key step in protein tertiary structure prediction. We developed two SVM model quality assessment methods that take the query-template alignment as input. The assessment results indicated that these methods could help improve model selection, protein structure prediction, and many other bioinformatics problems. Moreover, we developed a protein tertiary structure prediction pipeline, many components of which were built into our group's MULTICOM system. MULTICOM performed well in the CASP10 (Critical Assessment of Techniques for Protein Structure Prediction) competition.
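The general idea of adding structural features to an alignment score can be sketched as follows. This is not the MSACompro or HHpacom scoring function, which the abstract does not specify: the additive form, the weights `w_ss` and `w_sa`, the match/mismatch values, and the linear gap penalty are all assumptions chosen to keep the example short. The sketch computes a Needleman-Wunsch global alignment score in which each residue pair is rewarded for matching predicted secondary structure and for similar relative solvent accessibility.

```python
from typing import Sequence


def pair_score(
    res_a: str, res_b: str,
    ss_a: str, ss_b: str,
    rsa_a: float, rsa_b: float,
    w_ss: float = 0.5, w_sa: float = 0.5,
) -> float:
    """Score one aligned residue pair (hypothetical additive scheme)."""
    sequence_term = 1.0 if res_a == res_b else -1.0
    # Secondary structure states, e.g. H (helix), E (strand), C (coil).
    structure_term = 1.0 if ss_a == ss_b else 0.0
    # Relative solvent accessibility is assumed to lie in [0, 1].
    accessibility_term = 1.0 - abs(rsa_a - rsa_b)
    return sequence_term + w_ss * structure_term + w_sa * accessibility_term


def global_alignment_score(
    seq1: str, ss1: str, rsa1: Sequence[float],
    seq2: str, ss2: str, rsa2: Sequence[float],
    gap: float = -1.0,
) -> float:
    """Needleman-Wunsch score using the structure-aware pair score."""
    n, m = len(seq1), len(seq2)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = dp[i - 1][j - 1] + pair_score(
                seq1[i - 1], seq2[j - 1],
                ss1[i - 1], ss2[j - 1],
                rsa1[i - 1], rsa2[j - 1],
            )
            dp[i][j] = max(diagonal, dp[i - 1][j] + gap, dp[i][j - 1] + gap)
    return dp[n][m]


if __name__ == "__main__":
    # Toy inputs: two residues with predicted states and accessibilities.
    print(global_alignment_score("AC", "HH", [0.2, 0.8], "AC", "HH", [0.2, 0.8]))
```

With such a scheme, two residues that differ in sequence can still be favoured for alignment when their predicted structural environments agree, which is the intuition behind using these features for distantly related proteins.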

