Inter-protein residue covariation information unravels physically interacting protein dimers

Abstract

Background: Predicting physical interactions between proteins is one of the greatest challenges in computational biology. Protein interactions are highly varied, and a huge number of protein sequences and synthetic peptides have unknown interacting counterparts. Most co-evolutionary methods detect a mixture of physical interactions and functional associations, and only a handful of approaches specifically infer physical interactions. Hybrid co-evolutionary methods exploit inter-protein residue coevolution to identify proteins that physically interact. In this study, we introduce a hybrid co-evolution-based approach that predicts physical interactions between pairs of protein families, starting from protein sequences only.

Results: For each dimer, pairs of multiple sequence alignments are constructed, and the covariation between residues in those pairs is calculated by CCMpred (Contacts from Correlated Mutations predicted) and three mutual-information-based approaches for ten accessible surface area threshold groups. All residue couplings between the two proteins of each dimer are then summarized as a single Frobenius norm value. The norms of the residue contact matrices of all dimers at different accessible surface area thresholds are fed into a support vector machine as single- or multiple-feature models. Classifiers trained on single features show no apparent difference in accuracy between methods across accessible surface area thresholds. Nevertheless, the mutual information product and context likelihood of relatedness procedures tend to perform somewhat better and somewhat worse, respectively, than the other two methods across accessible surface area cut-offs. Training the support vector machine with multiple norm features from several accessible surface area thresholds considerably improves prediction performance. In this setting, CCMpred achieves a roughly better overall performance than the mutual-information-based approaches. The best accuracy, sensitivity, specificity, precision and negative predictive value for that method are 0.98, 1, 0.962, 0.96, and 0.962, respectively.

Conclusions: By feeding the norm values of protein dimers at different accessible surface area thresholds into support vector machines, we demonstrate that even a small number of proteins in pairs of multiple alignments can allow one to accurately discriminate between positive and negative dimers.
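The norm-based summary described in the Results can be illustrated with a minimal sketch. It assumes plain mutual information as a stand-in for the covariation score (the study itself uses CCMpred and three mutual-information-based variants), skips the accessible surface area filtering, and uses hypothetical function names and a toy paired alignment rather than the authors' data or code.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    return mi


def inter_protein_couplings(msa_a, msa_b):
    """Coupling matrix: rows are columns of protein A, columns are columns of protein B.

    Assumes row i of msa_a and row i of msa_b come from the same organism.
    """
    cols_a = list(zip(*msa_a))
    cols_b = list(zip(*msa_b))
    return [[mutual_information(ca, cb) for cb in cols_b] for ca in cols_a]


def frobenius_norm(matrix):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in matrix for v in row))


# Toy paired alignment (hypothetical): 4 organisms, 2 residues in A, 1 in B.
msa_a = ["AC", "AC", "GC", "GC"]
msa_b = ["T", "T", "S", "S"]

couplings = inter_protein_couplings(msa_a, msa_b)
norm = frobenius_norm(couplings)
print(couplings, norm)
```

In the toy alignment, the first column of A covaries perfectly with B while the second is fully conserved, so only the first coupling contributes to the norm. One such norm per accessible surface area threshold would then form the feature vector passed to a classifier.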

Application of the ESMACS Binding Free Energy Protocol to a Highly Varied Ligand Dataset: Lactate Dehydogenase A

10.26434/chemrxiv.8398055 ◽

2019 ◽

Author(s):

David Wright ◽

Fouad Husseini ◽

Shunzhou Wan ◽

Christophe Meyer ◽

Herman Van Vlijmen ◽

...

Keyword(s):

Free Energy ◽

Surface Area ◽

Binding Free Energy ◽

Normal Mode Analysis ◽

Binding Mode ◽

Accessible Surface Area ◽

Solvent Accessible Surface Area ◽

Mode Analysis ◽

Energy Calculation ◽

Accessible Surface

Here, we evaluate the performance of our range of ensemble-simulation-based binding free energy calculation protocols, called ESMACS (enhanced sampling of molecular dynamics with approximation of continuum solvent), for use in fragment-based drug design scenarios. ESMACS is designed to generate reproducible binding affinity predictions from the widely used molecular mechanics Poisson-Boltzmann surface area (MMPBSA) approach. We study ligands designed to target two binding pockets in the lactate dehydrogenase A target protein; the ligands vary in size, charge and binding mode. Compared with experimental results, we obtain excellent statistical rankings across this highly diverse set of ligands. In addition, we investigate three approaches to account for entropic contributions not captured by standard MMPBSA calculations: (1) normal mode analysis, (2) weighted solvent accessible surface area (WSAS) and (3) variational entropy.
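The idea of drawing a reproducible estimate from an ensemble of simulations can be sketched as follows. This is only an illustration under stated assumptions, not the ESMACS protocol itself: it assumes each replica yields one binding free energy, averages them, and attaches a bootstrap standard error. The function name, the replica values and the kcal/mol unit are hypothetical.

```python
import random
import statistics


def ensemble_estimate(replica_dgs, n_boot=2000, seed=0):
    """Return (ensemble mean, bootstrap standard error) of per-replica free energies."""
    rng = random.Random(seed)  # fixed seed so the error estimate is repeatable
    mean = statistics.fmean(replica_dgs)
    boot_means = [
        # Resample the replicas with replacement and average each resample
        statistics.fmean(rng.choices(replica_dgs, k=len(replica_dgs)))
        for _ in range(n_boot)
    ]
    return mean, statistics.stdev(boot_means)


# Hypothetical per-replica binding free energies (kcal/mol), not from the study.
replica_dgs = [-32.1, -30.4, -33.0, -31.2, -29.8]

mean, err = ensemble_estimate(replica_dgs)
print(f"dG = {mean:.2f} +/- {err:.2f} kcal/mol")
```

Reporting the spread alongside the mean is what makes a ranking of ligands comparable between runs: two ligands whose estimates overlap within error should not be treated as confidently ordered.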

Faculty Opinions recommendation of Buried and accessible surface area control intrinsic protein flexibility.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718033490.793483384 ◽

2013 ◽

Author(s):

Yaoqi Zhou

Keyword(s):

Surface Area ◽

Protein Flexibility ◽

Accessible Surface Area ◽

Intrinsic Protein ◽

Accessible Surface

Analog Circuit Fault Diagnosis Based on Support Vector Machine Classifier and Fuzzy Feature Selection

Electronics ◽

10.3390/electronics10121496 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1496

Author(s):

Hao Liang ◽

Yiman Zhu ◽

Dongyang Zhang ◽

Le Chang ◽

Yuming Lu ◽

...

Keyword(s):

Support Vector Machine ◽

Fault Diagnosis ◽

Mutual Information ◽

Analog Circuit ◽

Fault Classification ◽

Support Vector ◽

Svm Classifier ◽

Fault Parameters ◽

Diagnosis Method ◽

Circuit Fault Diagnosis

In analog circuits, component parameters have tolerances and faulty component parameters span a wide distribution, which hinders fault classification. To tackle this problem, this article proposes a soft fault diagnosis method that combines an improved barnacles mating optimizer (BMO) algorithm with a support vector machine (SVM) classifier and uses fuzzy mutual information to achieve minimum redundancy and maximum relevance in feature dimension reduction. First, the improved BMO algorithm is used to optimize the parameters for learning and classification. We adopt six test functions on three data sets from the University of California, Irvine (UCI) machine learning repository to test the performance of the SVM classifier with five different optimization algorithms. The results show that the SVM classifier combined with the improved BMO algorithm achieves high classification accuracy. Second, fuzzy mutual information and an enhanced minimum redundancy and maximum relevance principle are applied to reduce the dimension of the feature vector. Finally, a circuit experiment is carried out to verify that the proposed method classifies faults effectively when the fault parameters are either fixed or distributed. The accuracy of the proposed fault diagnosis method is 92.9% when the fault parameters are distributed, which is 1.8% higher than other classifiers on average. When the fault parameters are fixed, the accuracy is 99.07%, which is 0.7% higher than other classifiers on average.
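The minimum redundancy and maximum relevance principle mentioned above can be sketched as a greedy selection loop. This sketch swaps in ordinary mutual information on discrete features instead of the fuzzy mutual information and the enhanced criterion used in the article, and the feature names and data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (cx[a] * cy[b])) for (a, b), c in cxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k features maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the label is f1 OR f3, and f2 duplicates f1.
features = {
    "f1": [0, 0, 0, 0, 1, 1, 1, 1],
    "f2": [0, 0, 0, 0, 1, 1, 1, 1],
    "f3": [0, 0, 1, 1, 0, 0, 1, 1],
}
labels = [0, 0, 1, 1, 1, 1, 1, 1]

selected = mrmr_select(features, labels, k=2)
print(selected)
```

With two features requested, the duplicate f2 is skipped because its redundancy with f1 outweighs its relevance, which is the behaviour the criterion is meant to produce before the reduced feature vector is handed to a classifier.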

Feature Selection Method Based on Mutual Information and Support Vector Machine

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142150021x ◽

2021 ◽

pp. 2150021

Author(s):

Gang Liu ◽

Chunlei Yang ◽

Sen Liu ◽

Chunbao Xiao ◽

Bin Song

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Mutual Information ◽

Classification Accuracy ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Standard Data ◽

Feature Dimension

A feature selection method based on mutual information and the support vector machine (SVM) is proposed to eliminate redundant features and improve classification accuracy. First, the local correlation between features and the overall correlation are calculated by mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated by analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined, and the degree of influence of the input variables on the output variables of the SVM network is calculated based on MIV. The importance weights of the features described by MIV are sorted in descending order. Finally, the SVM classifier is used to perform feature selection according to the classification accuracy of feature combinations, taking the MIV order of the features as a reference. Simulation experiments are carried out with three standard UCI data sets, and the results show that this method not only effectively reduces the feature dimension and achieves high classification accuracy, but also ensures good robustness.
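The mean impact value step can be sketched as follows. The abstract does not give the perturbation size, so the sketch assumes the commonly used variant that scales each input up and down by 10% and averages the resulting change in model output. The function names are hypothetical, and a toy linear function stands in for the trained SVM.

```python
def mean_impact_values(predict, samples, delta=0.1):
    """MIV per feature: mean output change when that feature is scaled by 1 +/- delta."""
    n_features = len(samples[0])
    mivs = []
    for j in range(n_features):
        total = 0.0
        for row in samples:
            up = list(row)
            down = list(row)
            up[j] = row[j] * (1 + delta)    # feature j increased by delta
            down[j] = row[j] * (1 - delta)  # feature j decreased by delta
            total += predict(up) - predict(down)
        mivs.append(total / len(samples))
    return mivs


def toy_model(x):
    """Hypothetical stand-in for a trained model's decision output."""
    return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]


samples = [[1.0, 2.0, 3.0], [2.0, 1.0, 4.0], [3.0, 3.0, 5.0]]

mivs = mean_impact_values(toy_model, samples)
# Rank features by absolute MIV, most influential first
ranking = sorted(range(len(mivs)), key=lambda j: abs(mivs[j]), reverse=True)
print(mivs, ranking)
```

The ranking gives the order in which feature combinations would be tried; the classifier's accuracy on each combination then decides where to stop.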
