PSICA: a fast and accurate web service for protein model quality analysis

2019 ◽  
Vol 47 (W1) ◽  
pp. W443-W450 ◽  
Author(s):  
Wenbo Wang ◽  
Zhaoyu Li ◽  
Junlin Wang ◽  
Dong Xu ◽  
Yi Shang

Abstract This paper presents a new fast and accurate web service for protein model quality analysis, called PSICA (Protein Structural Information Conformity Analysis). It is designed to evaluate how well a tertiary model of a given protein primary sequence conforms to the known protein structures of similar protein sequences, and to evaluate the quality of predicted protein models. PSICA implements the MUfoldQA_S method, an efficient state-of-the-art protein model quality assessment (QA) method. In CASP12, MUfoldQA_S ranked No. 1 in the protein model QA select-20 category in terms of the difference between the predicted and true GDT-TS value of each model. For a given predicted 3D model, PSICA generates (i) a predicted global GDT-TS value; (ii) an interactive comparison between the model and other known protein structures; (iii) a visualization of the predicted local quality of the model; and (iv) a JSmol rendering of the model. Additionally, PSICA implements MUfoldQA_C, a new consensus method based on MUfoldQA_S. In CASP12, MUfoldQA_C ranked No. 1 in top-1 model GDT-TS loss in the select-20 QA category and No. 2 in the average difference between the predicted and true GDT-TS value of each model for both the select-20 and best-150 QA categories. The PSICA server is freely available at http://qas.wangwb.com/~wwr34/mufoldqa/index.html.
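
The ranking criterion mentioned above, the average difference between predicted and true GDT-TS, can be illustrated with a minimal sketch. It assumes the difference is taken as an absolute value per model and that scores are on a 0 to 1 scale; the function name and sample values are hypothetical and are not taken from PSICA or CASP.

```python
from statistics import mean


def mean_abs_gdt_ts_error(predicted, true):
    """Average |predicted - true| GDT-TS over a set of models.

    Assumption: the 'difference' is the absolute per-model difference.
    """
    if len(predicted) != len(true) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return mean(abs(p - t) for p, t in zip(predicted, true))


# Hypothetical scores for three models of one target (0-1 scale).
predicted_scores = [0.62, 0.48, 0.71]
true_scores = [0.60, 0.55, 0.70]
print(mean_abs_gdt_ts_error(predicted_scores, true_scores))
```

A lower value means the predicted global scores track the true GDT-TS values more closely.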

2018 ◽  
Author(s):  
Karolis Uziela ◽  
David Menéndez Hurtado ◽  
Nanjiang Shu ◽  
Björn Wallner ◽  
Arne Elofsson

Abstract Protein modeling quality is an important part of protein structure prediction. We have for more than a decade developed a set of methods for this problem. We have used various types of description of the protein and different machine learning methodologies. However, common to all these methods has been the target function used for training. The target function in ProQ describes the local quality of a residue in a protein model. In all versions of ProQ the target function has been the S-score. However, other quality estimation functions also exist, which can be divided into superposition- and contact-based methods. The superposition-based methods, such as S-score, are based on a rigid body superposition of a protein model and the native structure, while the contact-based methods compare the local environment of each residue. Here, we examine the effects of retraining our latest predictor, ProQ3D, using identical inputs but different target functions. We find that the contact-based methods are easier to predict and that predictors trained on these measures provide some advantages when it comes to identifying the best model. One possible reason for this is that contact-based methods are better at estimating the quality of multi-domain targets. However, training on the S-score gives the best correlation with the GDT_TS score, which is commonly used in CASP to score the global model quality. To take advantage of both of these features, we provide an updated version of ProQ3D that predicts local and global model quality estimates based on different quality estimates.
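
As an illustration of a superposition-based target function, the sketch below uses the commonly cited form of the S-score, S_i = 1 / (1 + (d_i / d0)^2), where d_i is the distance between a residue in the superposed model and the same residue in the native structure. The abstract does not give the formula or the threshold, so the default d0 = 3.0 Å and the averaging into a global score are assumptions.

```python
def s_score(distance, d0=3.0):
    """Local S-score for one residue.

    distance: deviation (in angstroms) after rigid-body superposition.
    d0: distance threshold; 3.0 is an assumed default, not from the abstract.
    """
    return 1.0 / (1.0 + (distance / d0) ** 2)


def global_s_score(distances, d0=3.0):
    """Global score, assumed here to be the mean of the local S-scores."""
    if not distances:
        raise ValueError("need at least one residue distance")
    return sum(s_score(d, d0) for d in distances) / len(distances)


# Hypothetical per-residue deviations for a four-residue fragment.
print(global_s_score([0.0, 1.5, 3.0, 6.0]))
```

A residue that superposes perfectly scores 1, a residue at distance d0 scores 0.5, and the score decays toward 0 for large deviations.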


Author(s):  
Enes Sari ◽  
Levent FAZLI Umur

BACKGROUND: The aim of this study was to evaluate the information quality of YouTube videos on hallux valgus. METHODS: A YouTube search was performed using the keyword 'hallux valgus' to determine the first 300 videos related to hallux valgus. A total of 54 videos met our inclusion criteria and were evaluated for information quality by using DISCERN, Journal of the American Medical Association (JAMA) and hallux valgus information assessment (HAVIA) scores. Number of views, time since the upload date, view rate, number of comments, number of likes, number of dislikes, and video power index (VPI) values were calculated to determine video popularity. Video length (sec), video source and video content were also noted. The relation between information quality and these factors was statistically evaluated. RESULTS: The mean DISCERN score was 30.35 ± 11.56 (poor quality) (14-64), the mean JAMA score was 2.28 ± 0.96 (1-4), and the mean HAVIA score was 3.63 ± 2.42 (moderate quality) (0.5-8.5). Although videos uploaded by physicians had higher mean DISCERN, JAMA, and HAVIA scores than videos uploaded by non-physicians, the difference was not statistically significant. Additionally, view rates and VPI values were higher for videos uploaded by health channels, but the difference did not reach statistical significance. A statistically significant positive correlation was found between video length and DISCERN (r = 0.294, p = 0.028) and HAVIA scores (r = 0.326, p = 0.015). CONCLUSIONS: The present study demonstrated that the quality of information available in YouTube videos about hallux valgus was low and insufficient. Videos containing accurate information from reliable sources are needed to educate patients on hallux valgus, especially on less frequently mentioned topics such as postoperative complications and the healing period.


2020 ◽  
Vol 76 (3) ◽  
pp. 285-290
Author(s):  
Björn Wallner

Model quality assessment programs estimate the quality of protein models and can be used to estimate local error in protein models. ProQ3D is the most recent and most accurate version of our software. Here, it is demonstrated that it is possible to use local error estimates to substantially increase the quality of the models for molecular replacement (MR). Adjusting the B factors using ProQ3D improved the log-likelihood gain (LLG) score by over 50% on average, resulting in significantly more successful models in MR compared with not using error estimates. On a data set of 431 homology models to address difficult MR targets, models with error estimates from ProQ3D received an LLG of >50 for almost half of the models 209/431 (48.5%), compared with 175/431 (40.6%) for the previous version, ProQ2, and only 74/431 (17.2%) for models with no error estimates, clearly demonstrating the added value of using error estimates to enable MR for more targets. ProQ3D is available from http://proq3.bioinfo.se/ both as a server and as a standalone download.
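
One way to turn a per-residue error estimate into a B factor is the standard isotropic relation B = (8π²/3)·d², where d is the estimated positional error in Å. The abstract does not state which conversion was used, so the sketch below is an assumption for illustration; the optional cap `b_max` is hypothetical.

```python
import math


def error_to_b_factor(error_angstrom, b_max=None):
    """Convert an estimated positional error (angstroms) to an isotropic B factor.

    Uses B = (8 * pi^2 / 3) * d^2, an assumed conversion for illustration.
    b_max is a hypothetical optional cap on the resulting value.
    """
    if error_angstrom < 0:
        raise ValueError("error must be non-negative")
    b = 8.0 * math.pi ** 2 / 3.0 * error_angstrom ** 2
    if b_max is not None:
        b = min(b, b_max)
    return b


# Hypothetical per-residue error estimates in angstroms.
for err in (0.5, 1.0, 2.0):
    print(err, round(error_to_b_factor(err), 2))
```

Residues predicted to be unreliable receive large B factors and therefore contribute less to the molecular replacement search.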


F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 243 ◽  
Author(s):  
Sandeep Chakraborty ◽  
Ravindra Venkatramani ◽  
Basuthkar J. Rao ◽  
Bjarni Asgeirsson ◽  
Abhaya M. Dandekar

The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPDs of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on the hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.


F1000Research ◽  
2014 ◽  
Vol 2 ◽  
pp. 243
Author(s):  
Sandeep Chakraborty ◽  
Ravindra Venkatramani ◽  
Basuthkar J. Rao ◽  
Bjarni Asgeirsson ◽  
Abhaya M. Dandekar

The structure of a protein provides insight into its physiological interactions with other components of the cellular soup. Methods that predict putative structures from sequences typically yield multiple, closely-ranked possibilities. A critical component in the process is the model quality assessing program (MQAP), which selects the best candidate from this pool of structures. Here, we present a novel MQAP based on the physical properties of sidechain atoms. We propose a method for assessing the quality of protein structures based on the electrostatic potential difference (EPD) of Cβ atoms in consecutive residues. We demonstrate that the EPDs of Cβ atoms on consecutive residues provide unique signatures of the amino acid types. The EPDs of Cβ atoms are learnt from a set of 1000 non-homologous protein structures with a resolution cutoff of 1.6 Å obtained from the PISCES database. Based on the Boltzmann hypothesis that lower energy conformations are proportionately sampled more, and on Anfinsen's thermodynamic hypothesis that the native structure of a protein is the minimum free energy state, we hypothesize that the deviation of observed EPD values from the mean values obtained in the learning phase is minimized in the native structure. We achieved an average specificity of 0.91, 0.94 and 0.93 on the hg_structal, 4state_reduced and ig_structal decoy sets, respectively, taken from the Decoys 'R' Us database. The source code and manual are made available at https://github.com/sanchak/mqap and permanently available on 10.5281/zenodo.7134.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Haiteng Zhang ◽  
Zhiqing Shao ◽  
Hong Zheng ◽  
Jie Zhai

In early service transactions, quality of service (QoS) information was published by the service provider and was not always true or credible. To better verify the trustworthiness of the QoS information provided by a Web service, this paper collects factual QoS running data with our WS-QoS measurement tool. Based on these objective data, an algorithm compares the offered and measured quality data of the service and computes their similarity; a reputation evaluation method then computes the reputation level of the Web service from that similarity. An initial implementation and an experiment with three example Web services show that this approach is feasible and that these values can serve as references for subsequent consumers selecting a service.


2009 ◽  
Vol 07 (05) ◽  
pp. 789-810 ◽  
Author(s):  
Xin Gao ◽  
Jinbo Xu ◽  
Shuai Cheng Li ◽  
Ming Li

Although protein structure prediction has made great progress in recent years, a protein model derived from automated prediction methods is subject to various errors. As methods for structure prediction develop, a continuing problem is how to evaluate the quality of a protein model, especially to identify some well-predicted regions of the model, so that the structural biology community can benefit from the automated structure prediction. It is also important to identify badly-predicted regions in a model so that some refinement measurements can be applied to it. We present two complementary techniques, FragQA and PosQA, to accurately predict local quality of a sequence–structure (i.e. sequence–template) alignment generated by comparative modeling (i.e. homology modeling and threading). FragQA and PosQA predict local quality from two different perspectives. Different from existing methods, FragQA directly predicts cRMSD between a continuously aligned fragment determined by an alignment and the corresponding fragment in the native structure, while PosQA predicts the quality of an individual aligned position. Both FragQA and PosQA use an SVM (Support Vector Machine) regression method to perform prediction using similar information extracted from a single given alignment. Experimental results demonstrate that FragQA performs well on predicting local fragment quality, and PosQA outperforms two top-notch methods, ProQres and ProQprof. Our results indicate that (1) local quality can be predicted well; (2) local sequence evolutionary information (i.e. sequence similarity) is the major factor in predicting local quality; and (3) structural information such as solvent accessibility and secondary structure helps to improve the prediction performance.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xiao Chen ◽  
Jian Liu ◽  
Zhiye Guo ◽  
Tianqi Wu ◽  
Jie Hou ◽  
...  

Abstract Inter-residue contact prediction and deep learning showed promise for improving the estimation of protein model accuracy (EMA) in the 13th Critical Assessment of Protein Structure Prediction (CASP13). To further leverage improved inter-residue distance predictions to enhance EMA, during the 2020 CASP14 experiment we integrated several new inter-residue distance features with the existing model quality assessment features in several deep learning methods to predict the quality of protein structural models. According to the evaluation of performance in selecting the best model from the models of CASP14 targets, our three multi-model EMA predictors (MULTICOM-CONSTRUCT, MULTICOM-AI, and MULTICOM-CLUSTER) achieve average losses of 0.073, 0.079, and 0.081, respectively, in terms of the global distance test score (GDT-TS). The three methods are ranked first, second, and third out of all 68 CASP14 predictors. MULTICOM-DEEP, the single-model EMA predictor, is ranked within the top 10 among all the single-model EMA methods according to GDT-TS score loss. The results demonstrate that inter-residue distance features are valuable inputs for deep learning to predict the quality of protein structural models. However, larger training datasets and better ways of leveraging inter-residue distance information are needed to fully explore their potential.
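
The loss figures quoted above measure model selection: how much true GDT-TS is given up by picking the model a predictor ranks first instead of the truly best one. A minimal sketch follows, assuming the loss per target is the best true GDT-TS minus the true GDT-TS of the top-ranked model, averaged over targets; the function names and sample values are hypothetical.

```python
def gdt_ts_loss(predicted_scores, true_gdt_ts):
    """Loss for one target: best true GDT-TS minus true GDT-TS of the top-ranked model."""
    if len(predicted_scores) != len(true_gdt_ts) or not predicted_scores:
        raise ValueError("need two non-empty lists of equal length")
    # Index of the model the predictor considers best.
    top = max(range(len(predicted_scores)), key=predicted_scores.__getitem__)
    return max(true_gdt_ts) - true_gdt_ts[top]


def average_loss(targets):
    """Average the per-target loss over (predicted_scores, true_gdt_ts) pairs."""
    losses = [gdt_ts_loss(pred, true) for pred, true in targets]
    return sum(losses) / len(losses)


# Two hypothetical targets with three models each (0-1 scale).
targets = [
    ([0.50, 0.90, 0.70], [0.60, 0.65, 0.72]),  # picks model 2, best is model 3
    ([0.80, 0.30, 0.40], [0.75, 0.40, 0.55]),  # picks the truly best model
]
print(average_loss(targets))
```

A loss of 0 means the predictor always selects the best available model.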


2021 ◽  
Author(s):  
Xiao Chen ◽  
Jianlin Cheng

Abstract Background: Estimation of the accuracy (quality) of protein structural models is important for both prediction and use of protein structural models. Deep learning methods have been used to integrate protein structure features to predict the quality of protein models. Inter-residue distances are key information for predicting protein tertiary structures and therefore have good potential for predicting the quality of protein structural models. However, few methods have been developed to take full advantage of predicted inter-residue distance maps to estimate the accuracy of a single protein structural model. Result: We developed an attentive 2D convolutional neural network (CNN) with channel-wise attention that takes as its only input a raw difference map between the inter-residue distance map calculated from a single protein model and the distance map predicted from the protein sequence, and predicts the quality of the model. The network comprises multiple convolutional layers, batch normalization layers, dense layers, and Squeeze-and-Excitation blocks with attention to automatically extract features relevant to protein model quality from the raw input without using any expert-curated features. We evaluated the capability of the method, DISTEMA, to select the best models for CASP13 targets in terms of ranking loss of GDT-TS score. The ranking loss of DISTEMA is 0.079, lower than that of several state-of-the-art single-model quality assessment methods. The work demonstrates that using raw inter-residue distance information alone with deep learning can predict the quality of protein structural models reasonably well.
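
The network input described above, a raw difference map, can be sketched in a few lines. This is an illustration only: it assumes one representative atom per residue, Euclidean distances in Å, and the sign convention model minus predicted, none of which the abstract specifies. The function names are hypothetical and are not DISTEMA's code.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distance matrix from one 3D coordinate per residue."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def difference_map(model_coords, predicted_map):
    """Element-wise difference: model distance map minus predicted distance map.

    Assumption: the sign convention and the one-atom-per-residue
    representation are illustrative choices.
    """
    model_map = distance_map(model_coords)
    n = len(model_map)
    if len(predicted_map) != n or any(len(row) != n for row in predicted_map):
        raise ValueError("predicted map must be an n x n matrix matching the model")
    return [
        [model_map[i][j] - predicted_map[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical two-residue model and a predicted distance map.
model = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
predicted = [[0.0, 4.0], [4.0, 0.0]]
print(difference_map(model, predicted))
```

Entries near zero indicate residue pairs whose spacing in the model agrees with the sequence-based prediction.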

