Molecular recognition principles in protein-ligand interactions as a prerequisite for the design of specific and selective leads

Author(s): G. Klebe
Author(s): Lieyang Chen, Anthony Cruz, Steven Ramsey, Callum J. Dickson, José S. Duca, ...

Recently, much effort has been invested in using convolutional neural network (CNN) models trained on 3D structural images of protein-ligand complexes to distinguish binding from non-binding ligands for virtual screening. However, the dearth of reliable protein-ligand X-ray structures and binding affinity data has required the use of constructed datasets for the training and evaluation of CNN molecular recognition models. Here, we outline various sources of bias in one such widely used dataset, the Directory of Useful Decoys: Enhanced (DUD-E). We have constructed and performed tests to investigate whether CNN models developed using DUD-E are properly learning the underlying physics of molecular recognition, as intended, or are instead learning biases inherent in the dataset itself. We find that superior enrichment efficiency in CNN models can be attributed to the analogue and decoy bias hidden in the DUD-E dataset rather than successful generalization of the pattern of protein-ligand interactions. Comparing additional deep learning models trained on PDBbind datasets, we found that their enrichment performance on DUD-E is not superior to that of the docking program AutoDock Vina. Together, these results suggest that biases that could be present in constructed datasets should be thoroughly evaluated before the datasets are applied to machine-learning-based method development.
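The abstract compares models by their enrichment performance but does not define the metric. As a minimal sketch, assuming the commonly used enrichment factor (the hit rate among the top-ranked fraction of a screened library divided by the hit rate in the whole library), the calculation could look like the following. The function name `enrichment_factor`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores   -- one score per compound; higher is assumed to mean "more
                likely to bind" (negate docking energies such as AutoDock
                Vina's, where more negative is better, before calling)
    labels   -- 1 for a known active, 0 for a decoy
    fraction -- top fraction of the ranked list to inspect (0 < fraction <= 1)
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    n_total = len(scores)
    n_actives = sum(labels)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Number of compounds in the inspected top slice (at least one).
    n_top = max(1, int(round(fraction * n_total)))

    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])

    # Ratio of the hit rate in the top slice to the hit rate overall.
    return (hits_top / n_top) / (n_actives / n_total)


# Toy example (made-up numbers): 2 actives among 10 compounds.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_ef = enrichment_factor(toy_scores, toy_labels, fraction=0.2)
```

A value near 1 means the ranking is no better than random selection. Because the metric depends only on how actives and decoys are ranked, a model can score well by recognizing dataset-specific differences between the two groups, which is the kind of bias the abstract warns about.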


1996, Vol 24 (1), pp. 280-284
Author(s): A. C. Wallace, R. A. Laskowski, J. Singh, J. M. Thornton


2017, Vol 114 (25), pp. 6563-6568
Author(s): José A. Caro, Kyle W. Harpole, Vignesh Kasinath, Jackwee Lim, Jeffrey Granja, ...

Molecular recognition by proteins is fundamental to molecular biology. Dissection of the thermodynamic energy terms governing protein–ligand interactions has proven difficult, with determination of entropic contributions being particularly elusive. NMR relaxation measurements have suggested that changes in protein conformational entropy can be quantitatively obtained through a dynamical proxy, but the generality of this relationship has not been shown. Twenty-eight protein–ligand complexes are used to show a quantitative relationship between measures of fast side-chain motion and the underlying conformational entropy. We find that the contribution of conformational entropy can range from favorable to unfavorable, which demonstrates the potential of this thermodynamic variable to modulate protein–ligand interactions. For about one-quarter of these complexes, the absence of conformational entropy would render the resulting affinity biologically meaningless. The dynamical proxy for conformational entropy or “entropy meter” also allows for refinement of the contributions of solvent entropy and the loss in rotational-translational entropy accompanying formation of high-affinity complexes. Furthermore, structure-based application of the approach can also provide insight into long-lived specific water–protein interactions that escape the generic treatments of solvent entropy based simply on changes in accessible surface area. These results provide a comprehensive and unified view of the general role of entropy in high-affinity molecular recognition by proteins.
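The entropy bookkeeping described in this abstract can be sketched with standard binding thermodynamics. The block below is an illustration only: the additive three-term split of the total entropy and the symbol names are assumptions made here, not the paper's own equations, and the paper's full decomposition may contain further terms. The environment assumes the `amsmath` package.

```latex
\begin{align}
  % Standard binding free energy and its relation to the dissociation constant
  \Delta G^{\circ} &= \Delta H^{\circ} - T\,\Delta S^{\circ}_{\mathrm{total}} = RT \ln K_d \\
  % Assumed additive split into the contributions named in the abstract
  \Delta S^{\circ}_{\mathrm{total}} &\approx \Delta S_{\mathrm{conf}}
      + \Delta S_{\mathrm{solvent}} + \Delta S_{\mathrm{RT}} \\
  % Effect on affinity of removing the conformational entropy term
  \frac{K_d^{\,\text{no conf}}}{K_d} &= \exp\!\left(\frac{\Delta S_{\mathrm{conf}}}{R}\right)
\end{align}
```

At 298 K, RT ln 10 is about 1.36 kcal/mol, so under these assumptions a favorable conformational contribution of -TΔS_conf = -1.36 kcal/mol corresponds to roughly tenfold tighter binding than the same complex would show without it, and an unfavorable contribution of the same size to roughly tenfold weaker binding.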

