CONCORD: a consensus method for protein secondary structure prediction via mixed integer linear optimization

Most protein structure prediction methods use a multi-step process that often includes secondary structure prediction, contact prediction, fragment generation and clustering. For many years, secondary structure prediction has been the workhorse for numerous methods aimed at predicting protein structure and function. This paper presents a new mixed integer linear optimization (MILP)-based consensus method: a Consensus scheme based On a mixed integer liNear optimization method for seCOndary stRucture preDiction (CONCORD). The method builds on seven secondary structure prediction methods (SSpro, DSC, PROF, PROFphd, PSIPRED, Predator and GorIV) and combines their strengths by maximizing the number of correctly predicted amino acids, which yields a better prediction accuracy. It performs well compared with the seven individual methods when tested on the PDBselect25 training protein set using sixfold cross validation, and compared with another set of 10 online secondary structure prediction servers (including several recent ones) when tested on the CASP9 targets (http://predictioncenter.org/casp9/). The average Q3 prediction accuracy is 83.04 per cent for the sixfold cross validation of the PDBselect25 set and 82.3 per cent for the CASP9 targets. A web server, CONCORD, is available to the scientific community at http://helios.princeton.edu/CONCORD.
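
To make the Q3 figures concrete: Q3 is the fraction of residues whose predicted three-state label (helix H, strand E, coil C) matches the reference assignment. The sketch below computes Q3 and, purely for illustration, builds a consensus by per-residue majority vote. The majority vote is a deliberately simple stand-in and is not the MILP formulation used by CONCORD; the function names and the one-letter label alphabet are assumptions made for this example.

```python
from collections import Counter


def q3_accuracy(predicted: str, reference: str) -> float:
    """Fraction of positions where the predicted 3-state label equals the reference."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def majority_vote(predictions: list[str]) -> str:
    """Per-residue majority vote over several predictors (illustrative only).

    Ties are resolved in favour of the label that appears first in the
    column, i.e. the label from the earliest-listed predictor.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        # max() returns the first maximal key in insertion order.
        consensus.append(max(counts, key=counts.get))
    return "".join(consensus)
```

A consensus built this way can score higher than every input when the individual predictors make errors at different positions, which is the intuition behind consensus methods in general.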

Hermes: an ensemble machine learning architecture for protein secondary structure prediction

doi:10.1101/640656, 2019

Author(s): Larry Bliss, Ben Pascoe, Samuel K Sheppard

Keyword(s): Machine Learning, Protein Structure, Secondary Structure, Structure Prediction, Cross Validation, Secondary Structure Prediction, Protein Structures, Lower Boundary, Protein Secondary Structure, Homologous Proteins

Motivation: Protein structure prediction, which combines theoretical chemistry and bioinformatics, is an increasingly important technique in biotechnology and biomedical research, for example in the design of novel enzymes and drugs. Here, we present a new ensemble bi-layered machine learning architecture that builds directly on ten existing pipelines and provides rapid, high-accuracy, 3-state secondary structure prediction of proteins. Results: After training on 1348 solved protein structures, we evaluated the model with four independent datasets: JPRED4, compiled by the authors of the successful predictor of the same name, and CASP11, CASP12 and CASP13, assembled by the Critical Assessment of protein Structure Prediction consortium, which runs biennial experiments focused on objective testing of predictors. These rigorous, pre-established protocols included 7-fold cross-validation and blind testing. This led to a mean Hermes accuracy of 95.5%, significantly (p<0.05) better than the ten previously published models analysed in this paper. Furthermore, Hermes yielded a reduction in standard deviation, fewer lower-boundary outliers, and reduced dependency on solved structures of homologous proteins, as measured by NEFF score. This architecture provides advantages over other pipelines while remaining accessible to users at any level of bioinformatics experience. Availability and Implementation: The source code for Hermes is freely available at https://github.com/HermesPrediction/Hermes. This page also includes the cross-validation with corresponding models, and all training/testing data presented in this study with predictions and accuracy.

COMTOP: Protein Residue–Residue Contact Prediction through Mixed Integer Linear Optimization

Membranes, doi:10.3390/membranes11070503, 2021, Vol 11 (7), pp. 503

Author(s): Md. Selim Reza, Huiling Zhang, Md. Tofazzal Hossain, Langxi Jin, Shengzhong Feng, ...

Keyword(s): Prediction Accuracy, Linear Optimization, Mixed Integer, Consensus Method, Contact Prediction, Residue Contact, Mixed Integer Linear Optimization, Test Sets, Contact Predictions, Integer Linear Optimization

Protein contact prediction helps reconstruct the tertiary structure that greatly determines a protein’s function; therefore, contact prediction from the sequence is an important problem. Recently there has been exciting progress on this problem, but many of the existing methods still have low prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods (CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV) by maximizing the number of correctly predicted contacts, thereby achieving a better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested on four independent test sets. For 239 highly non-redundant proteins, the method showed a prediction accuracy of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the number of predicted contacts. For example, the gain was much larger for large numbers of predicted contacts (such as top-5L and top-3L) than for small numbers (such as top-L/2 and top-L/5). The results and analysis demonstrate that COMTOP can significantly improve on the performance of the individual methods and is therefore more robust across different types of test sets. COMTOP also showed better or comparable predictions when compared with state-of-the-art predictors.
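
The top-L/k figures quoted above are precisions: rank all predicted residue pairs by score, keep the highest-scoring pairs up to a multiple of the sequence length L, and report the fraction of those that are true contacts. A minimal sketch of that evaluation follows. The function name, the dictionary input format and the minimum sequence separation of 6 are illustrative assumptions, not details taken from the COMTOP paper.

```python
def top_k_precision(scores, true_contacts, seq_len, factor, min_sep=6):
    """Precision of the top (factor * seq_len) predicted contacts.

    scores: dict mapping (i, j) residue index pairs (i < j) to a contact score.
    true_contacts: set of (i, j) pairs (i < j) that are contacts in the
        reference structure.
    factor: 5 for top-5L, 1 for top-L, 0.5 for top-L/2, 0.2 for top-L/5.
    min_sep: pairs closer than this along the sequence are ignored
        (assumed value for this sketch).
    """
    n_keep = max(1, int(seq_len * factor))
    # Keep only pairs that are far enough apart along the chain.
    candidates = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_sep
    ]
    # Highest score first.
    candidates.sort(key=lambda item: item[1], reverse=True)
    selected = [pair for pair, _ in candidates[:n_keep]]
    if not selected:
        return 0.0
    hits = sum(pair in true_contacts for pair in selected)
    # If fewer than n_keep pairs are available, divide by the number selected.
    return hits / len(selected)
```

Short lists such as top-L/5 contain only the most confident predictions, which is consistent with the higher accuracies reported for them than for long lists such as top-5L.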

Faculty Opinions recommendation of Protein secondary structure prediction using deep convolutional neural fields.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature, doi:10.3410/f.726067888.793548469, 2018

Author(s): Patrice Koehl

Keyword(s): Secondary Structure, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Neural Fields, Protein Secondary Structure Prediction

A Systematic Review on Popularity, Application and Characteristics of Protein Secondary Structure Prediction Tools

Current Drug Discovery Technologies, doi:10.2174/1570163815666180227162157, 2019, Vol 16 (2), pp. 159-172, Cited By ~ 3

Author(s): Elaheh Kashani-Amin, Ozra Tabatabaei-Malazy, Amirhossein Sakhteman, Bagher Larijani, Azadeh Ebrahim-Habibi

Keyword(s): Systematic Review, Secondary Structure, Structure Prediction, Web Of Science, Secondary Structure Prediction, Structural Information, Protein Secondary Structure, Structural Features, Prediction Tools, Insight Into

Background: Prediction of protein secondary structure is one of the major steps in the generation of homology models. These models provide structural information that is used to design suitable ligands for potential medicinal targets. However, selecting a proper tool among multiple Secondary Structure Prediction (SSP) options is challenging. The current study offers an insight into currently favored methods and tools within various contexts. Objective: A systematic review was performed to obtain comprehensive access to recent (2013-2016) studies that used or recommended protein SSP tools. Methods: Three databases (Web of Science, PubMed and Scopus) were systematically searched, and 99 of the 209 studies were found eligible for data extraction. Results: The 59 retrieved SSP tools fell into four categories of application: (I) prediction of structural features of a given sequence, (II) evaluation of a method, (III) providing input for a new SSP method and (IV) integrating an SSP tool as a component of a program. PSIPRED was found to be the most popular tool in all four categories. JPred and tools utilizing the PHD (Profile network from HeiDelberg) method occupied second and third places in popularity in categories I and II. JPred was found only in the first two categories, while PHD was present in three. Conclusion: This study provides a comprehensive insight into the recent usage of SSP tools, which could be helpful for selecting a proper tool.

Identification and application of the concepts important for accurate and reliable protein secondary structure prediction

Protein Science, doi:10.1002/pro.5560051116, 1996, Vol 5 (11), pp. 2298-2310, Cited By ~ 299

Author(s): Ross D. King, Michael J.E. Sternberg

Keyword(s): Secondary Structure, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Protein Secondary Structure Prediction

Mixed-integer linear optimization for full truckload pickup and delivery

Optimization Letters, doi:10.1007/s11590-021-01736-x, 2021

Author(s): Akang Wang, Nicholas Ferro, Rita Majewski, Chrysanthos E. Gounaris

Keyword(s): Linear Optimization, Mixed Integer, Pickup And Delivery, Mixed Integer Linear Optimization, Integer Linear Optimization

Protein Secondary Structure Prediction Using Support Vector Machines (SVMs)

2013 International Conference on Machine Intelligence and Research Advancement, doi:10.1109/icmira.2013.124, 2013

Author(s): Hitesh Shah

Keyword(s): Support Vector Machines, Secondary Structure, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Support Vector, Protein Secondary Structure Prediction, Vector Machines

Protein Secondary Structure Prediction using Bayesian Inference method on Decision fusion algorithms

2007 IEEE International Parallel and Distributed Processing Symposium, doi:10.1109/ipdps.2007.370430, 2007, Cited By ~ 1

Author(s): Somasheker Akkaladevi, Ajay K Katangur

Keyword(s): Bayesian Inference, Secondary Structure, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Decision Fusion, Inference Method, Protein Secondary Structure Prediction, Bayesian Inference Method

Capturing hydrophobic moment using spectral coherence for protein secondary structure prediction

2013 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), doi:10.1109/cibcb.2013.6595403, 2013

Author(s): Pradeep Chowriappa, Sumeet Dua

Keyword(s): Secondary Structure, Structure Prediction, Secondary Structure Prediction, Protein Secondary Structure, Spectral Coherence, Protein Secondary Structure Prediction, Hydrophobic Moment

In Silico Study of Secondary Structure of Hemoglobin Protein

Research Journal of Pharmacy and Technology, doi:10.52711/0974-360x.2021.01080, 2021, pp. 6245-6249

Author(s): Roma Chandra

Keyword(s): Secondary Structure, Protein Sequence, Structure Prediction, Tertiary Structure, Secondary Structure Prediction, Three Dimensional, Protein Secondary Structure, Alpha Helix, Prediction Methods, Protein Secondary Structures

Protein structure prediction is one of the important goals in bioinformatics and biotechnology. Prediction methods cover both the secondary and the tertiary structure of proteins. Protein secondary structure prediction infers the presence of helices, sheets and coils in a polypeptide chain, whereas protein tertiary structure prediction infers the three-dimensional structure of a protein. Protein secondary structures are the recurring motifs, predicted as patterns from the primary protein sequence, that take the form of alpha helices, beta strands and coils. Secondary structure prediction is useful because it yields information about the structure and function of an uncharacterized protein sequence. Various secondary structure prediction methods exist for predicting helices, sheets and coils, and various prediction tools are based on these methods. This study predicts the secondary structure of hemoglobin using several such tools. The results give the percentage of amino acids that form helices, sheets and coils. Of all the tools used, PHD and DSC produced the best results.
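
The percentage of residues in each state, as reported in this kind of study, can be read directly off a predicted secondary structure string. The sketch below assumes a one-letter three-state alphabet (H for helix, E for strand or sheet, C for coil); the function name and the alphabet are assumptions for illustration and do not reflect the output format of any particular tool.

```python
from collections import Counter


def ss_composition(ss: str) -> dict[str, float]:
    """Percentage of residues predicted as helix (H), strand (E) and coil (C)."""
    if not ss:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss)
    # States absent from the string are reported as 0 percent.
    return {state: 100.0 * counts.get(state, 0) / len(ss) for state in "HEC"}
```

Comparing these percentages across tools gives a quick, coarse view of how much the predictions disagree, although it does not show where along the sequence they differ.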
