pKa prediction
Recently Published Documents


Total documents: 55 (five years: 17)
H-index: 17 (five years: 1)

2021 ◽  
Author(s):  
Ada Y. Chen ◽  
Juyong Lee ◽  
Ana Damjanovic ◽  
Bernard R. Brooks

We present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA. The overall RMSE for this model is 0.69, with surface and buried RMSE values of 0.56 and 0.78, respectively, when considering six residue types (Asp, Glu, His, Lys, Cys and Tyr), and 0.63 when considering Asp, Glu, His and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observed that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.
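The figures quoted above are RMSE values split by residue burial, plus a relative comparison against a baseline. A minimal sketch of that bookkeeping follows. It assumes that "better" means a fractional reduction in RMSE, which the abstract does not state, and the records and function names are hypothetical illustrations rather than the authors' data or code.

```python
import math

def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))

def relative_improvement(rmse_model, rmse_baseline):
    """Fractional RMSE reduction of a model relative to a baseline (assumed metric)."""
    return (rmse_baseline - rmse_model) / rmse_baseline

# Hypothetical toy records, not data from the paper:
# (experimental pKa, model pKa, baseline pKa, is_buried)
records = [
    (3.9, 4.1, 4.5, False),
    (4.3, 4.0, 3.6, False),
    (6.5, 6.9, 7.4, True),
    (10.4, 9.8, 9.2, True),
]

def subset_rmse(rows, buried, column):
    """RMSE of one prediction column (1 = model, 2 = baseline) on surface or buried rows."""
    chosen = [r for r in rows if r[3] == buried]
    return rmse([r[column] for r in chosen], [r[0] for r in chosen])

surface_model = subset_rmse(records, buried=False, column=1)
buried_model = subset_rmse(records, buried=True, column=1)
overall_model = rmse([r[1] for r in records], [r[0] for r in records])
overall_baseline = rmse([r[2] for r in records], [r[0] for r in records])
print(surface_model, buried_model, relative_improvement(overall_model, overall_baseline))
```

Reporting surface and buried subsets separately matters because buried residues tend to have larger pKa shifts, so a single overall RMSE can hide where a model struggles.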


ACS Omega ◽  
2021 ◽  
Author(s):  
Zhitao Cai ◽  
Fangfang Luo ◽  
Yongxian Wang ◽  
Enling Li ◽  
Yandong Huang

Protein pKa prediction is essential for investigating the pH-associated relationship between protein structure and function. In this work, we introduce DeepKa, a deep-learning-based protein pKa predictor that is trained and validated with pKa values derived from continuous constant pH molecular dynamics (CpHMD) simulations of 279 soluble proteins. The CpHMD implementation in the Amber molecular dynamics package was employed (Huang, Harris, and Shen, J. Chem. Inf. Model. 2018, 58, 1372-1383). Notably, to avoid discontinuities at the boundary, grid charges are proposed to represent protein electrostatics. We show that the prediction accuracy of DeepKa is close to that of the CpHMD benchmarking simulations, validating DeepKa as an efficient protein pKa predictor. In addition, the training and validation sets created in this study can be applied to the future development of machine-learning-based protein pKa predictors. Finally, the grid charge representation is general and applicable to other topics, such as protein-ligand binding affinity prediction.
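The abstract does not say how the grid charges are computed. As one possible illustration, the sketch below spreads each atomic point charge over the eight surrounding grid nodes with trilinear weights, so that grid values change continuously as an atom moves across a cell boundary instead of jumping from one cell to the next. The weighting scheme, the function name `spread_charges` and the `spacing` parameter are assumptions made for this example, not the DeepKa implementation.

```python
import math
from collections import defaultdict

def spread_charges(atoms, spacing=1.0):
    """Distribute point charges onto the 8 surrounding grid nodes (trilinear weights).

    atoms: iterable of (x, y, z, charge) tuples.
    Returns a dict mapping integer grid indices (i, j, k) to accumulated charge.
    """
    grid = defaultdict(float)
    for x, y, z, q in atoms:
        gx, gy, gz = x / spacing, y / spacing, z / spacing
        i0, j0, k0 = math.floor(gx), math.floor(gy), math.floor(gz)
        fx, fy, fz = gx - i0, gy - j0, gz - k0  # fractional position inside the cell
        for di, wx in ((0, 1.0 - fx), (1, fx)):
            for dj, wy in ((0, 1.0 - fy), (1, fy)):
                for dk, wz in ((0, 1.0 - fz), (1, fz)):
                    grid[(i0 + di, j0 + dj, k0 + dk)] += q * wx * wy * wz
    return dict(grid)

# Hypothetical input: a single charge of -1 at the centre of a unit cell.
example = spread_charges([(0.5, 0.5, 0.5, -1.0)])
print(example)
```

Because the eight weights always sum to one, the total charge on the grid equals the total charge of the atoms, whatever the grid spacing.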


2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Phuong Thuy Viet Nguyen ◽  
Truong Van Dat ◽  
Shusaku Mizukami ◽  
Duy Le Hoang Nguyen ◽  
Farhana Mosaddeque ◽  
...  

Background: Emergence of cross-resistance to current anti-malarial drugs has led to an urgent need to identify compounds with novel modes of action and anti-malarial activity against the resistant strains. One of the most promising therapeutic targets of anti-malarial agents related to the food vacuole of the malaria parasite is haemozoin, a product formed by the parasite through haemoglobin degradation.
Methods: This study developed two-dimensional quantitative structure–activity relationship (2D-QSAR) models of a series of 21 haemozoin inhibitors to explore the physicochemical parameters of the active compounds that are useful for estimating anti-malarial activity. A 2D-QSAR model with good statistical quality was generated using the partial least squares method after removing the outliers.
Results: Five two-dimensional descriptors of the training set were selected: atom count (a_ICM); an adjacency and distance matrix descriptor (GCUT_SLOGP_2, the third GCUT descriptor using atomic contribution to logP); average total charge sum in pKa prediction at pH = 7 (h_pavgQ); a very low negative partial charge, including aromatic carbons that have a heteroatom substitution in the “ortho” position (PEOE_VSA-0); and a molecular descriptor (rsynth, which estimates the synthesizability of molecules as the fraction of heavy atoms that can be traced back to starting material fragments resulting from retrosynthetic rules). The model suggests that the anti-malarial activity of haemozoin inhibitors increases for molecules with a higher average total charge sum in pKa prediction (pH = 7). The QSAR model also indicates that the logP-based adjacency and distance matrix descriptor (GCUT_SLOGP_2) and the structural components of the molecules, including topological descriptors, contribute to better anti-malarial activity.
Conclusions: The model is capable of predicting the anti-malarial activities of anti-haemozoin compounds. In addition, the selected molecular descriptors in this QSAR model are helpful in designing more efficient compounds against the P. falciparum 3D7A strain.
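The abstract does not define how the average total charge at pH 7 is computed. For intuition only, the sketch below estimates an average net charge from site pKa values using the Henderson-Hasselbalch relation, assuming independent, non-interacting ionisable sites. This is a textbook approximation and is not claimed to be the definition of h_pavgQ; the function names and the example pKa values are hypothetical.

```python
def fraction_deprotonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a site in its deprotonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

def average_charge(sites, ph=7.0):
    """Average net charge of a molecule with independent ionisable sites.

    sites: list of (pKa, kind), where kind is "acid" (neutral when protonated,
    -1 when deprotonated) or "base" (+1 when protonated, neutral when deprotonated).
    """
    total = 0.0
    for pka, kind in sites:
        f = fraction_deprotonated(pka, ph)
        if kind == "acid":
            total -= f
        elif kind == "base":
            total += 1.0 - f
        else:
            raise ValueError(f"unknown site kind: {kind!r}")
    return total

# Hypothetical molecule with one basic amine (pKa 9.5) and one carboxylic acid (pKa 4.2).
print(average_charge([(9.5, "base"), (4.2, "acid")], ph=7.0))
```

Under this approximation, a site whose pKa equals the pH contributes half a unit of charge, and sites far from pH 7 contribute nearly 0 or nearly one full unit.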


Molecules ◽  
2021 ◽  
Vol 26 (4) ◽  
pp. 1048
Author(s):  
Jeffrey Plante ◽  
Beth A. Caine ◽  
Paul L. A. Popelier

The prediction of the aqueous pKa of carbon acids by Quantitative Structure–Property Relationship (QSPR) or cheminformatics-based methods is a rather arduous problem. The main obstacle is that there are too few high-quality experimental data points, measured under homogeneous conditions, to allow a good global model to be generated. In our computationally efficient pKa prediction method, we generate an atom-type feature vector, called a distance spectrum, from the assigned ionisation atom, and learn coefficients for those atom-types that show the impact each atom-type has on the pKa of the ionisable centre. In the current work, we augment our dataset with pKa values from a series of high-performing local models derived from the Ab Initio Bond Lengths (AIBL) method. We find that, in distilling the knowledge available from multiple models into one general model, the prediction error for an external test set is reduced compared to that obtained using literature experimental data alone.
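The abstract describes the distance spectrum only in outline. The sketch below shows one way such a feature vector could be built: count atom types at each bond-count distance from the ionisation atom, then combine the counts with learned per-feature coefficients in a linear model. The use of topological (bond-count) distance, the element-level atom types, and every number in the example are assumptions for illustration, not the authors' definitions or fitted values.

```python
from collections import Counter, deque

def distance_spectrum(bonds, atom_types, center, max_distance=3):
    """Count (atom type, bond distance) pairs around `center` via breadth-first search.

    bonds: list of (atom_index, atom_index) pairs; atom_types: {atom_index: type label}.
    """
    neighbours = {atom: [] for atom in atom_types}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    distance = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_distance:
            continue  # do not expand beyond the cutoff
        for nxt in neighbours[atom]:
            if nxt not in distance:
                distance[nxt] = distance[atom] + 1
                queue.append(nxt)
    return Counter((atom_types[a], d) for a, d in distance.items() if d > 0)

def predict_pka(spectrum, coefficients, intercept):
    """Linear model: intercept plus coefficient times count for each spectrum entry."""
    return intercept + sum(coefficients.get(k, 0.0) * n for k, n in spectrum.items())

# Heavy-atom graph of acetone, CH3-C(=O)-CH3, with atom 0 as the ionisation centre.
types = {0: "C", 1: "C", 2: "O", 3: "C"}
bonds = [(0, 1), (1, 2), (1, 3)]
spectrum = distance_spectrum(bonds, types, center=0)

# Made-up coefficients and intercept, chosen only to exercise the code.
toy_coefficients = {("O", 2): -2.0, ("C", 1): 0.5}
print(spectrum, predict_pka(spectrum, toy_coefficients, intercept=10.0))
```

Features absent from the coefficient table contribute nothing, which mirrors the idea that each atom type at each distance has its own learned effect on the ionisable centre.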


ACS Omega ◽  
2020 ◽  
Vol 5 (49) ◽  
pp. 32023-32031
Author(s):  
Dinesh M. Dhumal ◽  
Pankaj D. Patil ◽  
Raghavendra V. Kulkarni ◽  
Krishnacharya G. Akamanchi

ACS Omega ◽  
2020 ◽  
Vol 5 (23) ◽  
pp. 13751-13759
Author(s):  
Sergio A. Rodriguez ◽  
Maria T. Baumgartner

2020 ◽  
Vol 124 (23) ◽  
pp. 4712-4722
Author(s):  
Laura Zanetti-Polzi ◽  
Isabella Daidone ◽  
Andrea Amadei
