SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity

C. N. Magnan; P. Baldi

doi:10.1093/bioinformatics/btu352

Porter, PaleAle 4.0: high-accuracy prediction of protein secondary structure and relative solvent accessibility

Bioinformatics ◽

10.1093/bioinformatics/btt344 ◽

2013 ◽

Vol 29 (16) ◽

pp. 2056-2058 ◽

Cited By ~ 61

Author(s):

C. Mirabello ◽

G. Pollastri

Keyword(s):

Secondary Structure ◽

Solvent Accessibility ◽

Protein Secondary Structure ◽

High Accuracy ◽

Relative Solvent Accessibility

Genomic mutations and changes in protein secondary structure and solvent accessibility of SARS-CoV-2 (COVID-19 virus)

Scientific Reports ◽

10.1038/s41598-021-83105-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Thanh Thi Nguyen ◽

Pubudu N. Pathirana ◽

Thin Nguyen ◽

Quoc Viet Hung Nguyen ◽

Asim Bhatti ◽

...

Keyword(s):

Secondary Structure ◽

Solvent Accessibility ◽

Point Mutations ◽

Protein Secondary Structure ◽

Intervention Strategies ◽

Relative Solvent Accessibility ◽

Highly Pathogenic ◽

Coding Regions ◽

Pathogenic Virus ◽

And Control

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial to respond to and control the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and their probable protein secondary structure and solvent accessibility changes, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to make changes in protein secondary structure and relative solvent accessibility. Based on 6324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 in many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are most stable, potentially suitable to be targeted for vaccine and drug development.
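The abstract above names substitutions in the usual reference-residue, position, variant-residue form (for example D614G). As a minimal sketch of how such labels could be derived, the Python below compares two protein sequences that are assumed to be pre-aligned to equal length. The function name `point_mutations` and the toy sequences are hypothetical and are not taken from the paper or its dataset.

```python
def point_mutations(reference: str, variant: str) -> list[str]:
    """List amino-acid substitutions between two pre-aligned sequences.

    Each substitution is written as <reference residue><1-based position>
    <variant residue>, the same convention as "D614G". Positions where
    either sequence has an alignment gap ("-") are skipped, because a gap
    is an insertion or deletion rather than a point substitution.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, variant), start=1)
        if ref != alt and "-" not in (ref, alt)
    ]


# Toy sequences for illustration only (not real SARS-CoV-2 data).
reference = "MKTDLV"
variant = "MKTGLV"
print(point_mutations(reference, variant))  # ['D4G']
```

Positions are counted along the aligned reference, so if the reference itself contains gaps, a real pipeline would need to map alignment columns back to ungapped reference coordinates.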

Genomic Mutations and Changes in Protein Secondary Structure and Solvent Accessibility of SARS-CoV-2 (COVID-19 Virus)

10.1101/2020.07.10.171769 ◽

2020 ◽

Author(s):

Thanh Thi Nguyen ◽

Pubudu N. Pathirana ◽

Thin Nguyen ◽

Henry Nguyen ◽

Asim Bhatti ◽

...

Keyword(s):

Secondary Structure ◽

Solvent Accessibility ◽

Point Mutations ◽

Protein Secondary Structure ◽

Intervention Strategies ◽

Relative Solvent Accessibility ◽

Highly Pathogenic ◽

Coding Regions ◽

Pathogenic Virus ◽

And Control

Abstract: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial to respond to and control the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and their probable protein secondary structure and solvent accessibility changes, which are predicted using deep learning models. Prediction results suggest that mutation D614G in the virus spike protein, which has attracted much attention from researchers, is unlikely to make changes in protein secondary structure and relative solvent accessibility. Based on 6,324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 in many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis results also show that coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are most stable, potentially suitable to be targeted for vaccine and drug development.

Fast learning optimized prediction methodology for protein secondary structure prediction, relative solvent accessibility prediction and phosphorylation prediction

10.31274/etd-180810-4315 ◽

2011 ◽

Author(s):

Saraswathi Sundararajan

Keyword(s):

Secondary Structure ◽

Structure Prediction ◽

Secondary Structure Prediction ◽

Solvent Accessibility ◽

Protein Secondary Structure ◽

Relative Solvent Accessibility ◽

Protein Secondary Structure Prediction ◽

Fast Learning ◽

Solvent Accessibility Prediction

Protein secondary structure and solvent accessibility of proteins in decellularized heart valve scaffolds

Biomedical Spectroscopy and Imaging ◽

10.3233/bsi-2012-0007 ◽

2012 ◽

Vol 1 (1) ◽

pp. 79-87 ◽

Cited By ~ 3

Author(s):

Shangping Wang ◽

Harriëtte Oldenhof ◽

Andres Hilfiker ◽

Michael Harder ◽

Willem F. Wolkers

Keyword(s):

Secondary Structure ◽

Heart Valve ◽

Solvent Accessibility ◽

Protein Secondary Structure

Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information

BMC Bioinformatics ◽

10.1186/1471-2105-8-201 ◽

2007 ◽

Vol 8 (1) ◽

Cited By ~ 74

Author(s):

Gianluca Pollastri ◽

Alberto JM Martin ◽

Catherine Mooney ◽

Alessandro Vullo

Keyword(s):

Secondary Structure ◽

Solvent Accessibility ◽

Protein Secondary Structure ◽

Accurate Prediction ◽

Structure Information

Determination of Protein Secondary Structure and Solvent Accessibility Using Site-Directed Fluorescence Labeling. Studies of T4 Lysozyme Using the Fluorescent Probe Monobromobimane

Biochemistry ◽

10.1021/bi991331v ◽

1999 ◽

Vol 38 (49) ◽

pp. 16383-16393 ◽

Cited By ~ 43

Author(s):

Steven E. Mansoor ◽

Hassane S. Mchaourab ◽

David L. Farrens

Keyword(s):

Secondary Structure ◽

Fluorescent Probe ◽

Solvent Accessibility ◽

Protein Secondary Structure ◽

Fluorescence Labeling ◽

T4 Lysozyme

Hermes: an ensemble machine learning architecture for protein secondary structure prediction

10.1101/640656 ◽

2019 ◽

Author(s):

Larry Bliss ◽

Ben Pascoe ◽

Samuel K Sheppard

Keyword(s):

Machine Learning ◽

Protein Structure ◽

Secondary Structure ◽

Structure Prediction ◽

Cross Validation ◽

Secondary Structure Prediction ◽

Protein Structures ◽

Lower Boundary ◽

Protein Secondary Structure ◽

Homologous Proteins

Abstract. Motivation: Protein structure predictions, that combine theoretical chemistry and bioinformatics, are an increasingly important technique in biotechnology and biomedical research, for example in the design of novel enzymes and drugs. Here, we present a new ensemble bi-layered machine learning architecture, that directly builds on ten existing pipelines providing rapid, high accuracy, 3-State secondary structure prediction of proteins. Results: After training on 1348 solved protein structures, we evaluated the model with four independent datasets: JPRED4 - compiled by the authors of the successful predictor with the same name, and CASP11, CASP12 & CASP13 - assembled by the Critical Assessment of protein Structure Prediction consortium who run biannual experiments focused on objective testing of predictors. These rigorous, pre-established protocols included 7-fold cross-validation and blind testing. This led to a mean Hermes accuracy of 95.5%, significantly (p<0.05) better than the ten previously published models analysed in this paper. Furthermore, Hermes yielded a reduction in standard deviation, lower boundary outliers, and reduced dependency on solved structures of homologous proteins, as measured by NEFF score. This architecture provides advantages over other pipelines, while remaining accessible to users at any level of bioinformatics experience. Availability and Implementation: The source code for Hermes is freely available at: https://github.com/HermesPrediction/Hermes. This page also includes the cross-validation with corresponding models, and all training/testing data presented in this study with predictions and accuracy.
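To make the idea of combining several three-state predictors concrete, the Python sketch below applies a plain per-residue majority vote and then scores the result as the fraction of correctly assigned residues. Majority voting is a deliberately simple stand-in and is not the bi-layered Hermes architecture. The function names and the toy H/E/C strings are hypothetical.

```python
from collections import Counter


def majority_vote(predictions: list[str]) -> str:
    """Combine equal-length three-state predictions by per-residue vote.

    Ties go to the state that appears first among the predictors,
    because Counter.most_common keeps first-seen order for equal counts.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*predictions)
    )


def three_state_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Toy data: H = helix, E = strand, C = coil (illustrative only).
predictions = ["HHHCCEEE", "HHCCCEEE", "HHHCCCEE"]
observed = "HHHCCEEE"

consensus = majority_vote(predictions)
print(consensus)                                          # HHHCCEEE
print(three_state_accuracy(predictions[1], observed))     # 0.875
print(three_state_accuracy(consensus, observed))          # 1.0
```

In this toy case the two single-residue errors fall at different positions, so the vote corrects both. Errors shared by most predictors would survive a simple vote.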

Improved computational methods of protein sequence alignment, model selection and tertiary structure prediction

10.32469/10355/46126 ◽

2013 ◽

Author(s):

Xin Deng

Keyword(s):

Protein Structure ◽

Secondary Structure ◽

Model Selection ◽

Sequence Alignment ◽

Protein Sequence ◽

Structure Prediction ◽

Tertiary Structure ◽

Solvent Accessibility ◽

Relative Solvent Accessibility ◽

Tertiary Structure Prediction

Protein sequence and profile alignment is essential to most bioinformatics tasks, such as protein structure modeling, function prediction, and phylogenetic analysis. We designed a new algorithm, MSACompro, to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into multiple protein sequence alignment. Our experiments showed that it improved multiple sequence alignment accuracy over most existing methods that do not use structural information, and that it performed comparably to the method using structural features and additional homologous sequences, with only slightly lower scores. We also developed HHpacom, a new profile-profile pairwise alignment method that integrates secondary structure, solvent accessibility, torsion angle and inferred residue pair coupling information. The evaluation showed that the secondary structure, relative solvent accessibility and torsion angle information significantly improved the alignment accuracy in comparison with the state-of-the-art methods HHsearch and HHsuite. The evolutionary constraint information did help in some cases, especially for alignments of short proteins, typically 100 to 500 residues. Protein model selection is also a key step in protein tertiary structure prediction. We developed two SVM model quality assessment methods that take the query-template alignment as input. The assessment results illustrated that this could help improve model selection, protein structure prediction and many other bioinformatics problems. Moreover, we also developed a protein tertiary structure prediction pipeline, many components of which were built into our group's MULTICOM system. MULTICOM performed well in the CASP10 (Critical Assessment of Techniques for Protein Structure Prediction) competition.
