scholarly journals CASTELO: clustered atom subtypes aided lead optimization—a combined machine learning and molecular modeling method

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Leili Zhang ◽  
Giacomo Domeniconi ◽  
Chih-Chieh Yang ◽  
Seung-gu Kang ◽  
Ruhong Zhou ◽  
...  

Abstract. Background: Drug discovery is a multi-stage process that comprises two costly major steps: pre-clinical research and clinical trials. Among its stages, lead optimization easily consumes more than half of the pre-clinical budget. We propose a combined machine learning and molecular modeling approach that partially automates the lead optimization workflow in silico, providing suggestions for modification hot spots. Results: The initial data are collected with physics-based molecular dynamics (MD) simulation. Contact matrices are calculated as the preliminary features extracted from the simulations. To take advantage of the temporal information in the simulations, we enhance the contact matrix data with a temporal dynamism representation, which is then modeled with an unsupervised convolutional variational autoencoder (CVAE). Finally, conventional and CVAE-based clustering methods are compared using metrics to rank the submolecular structures and propose potential candidates for lead optimization. Conclusion: With no need for extensive structure-activity data, our method provides new hints for drug modification hot spots, which can be used to improve drug potency and reduce lead optimization time. It can potentially become a valuable tool for medicinal chemists.
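The contact-matrix step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' contact definition, so the distance cutoff of 4.5 (in the same units as the coordinates), the function names `contact_matrix` and `contact_frequency`, and the frame layout are all assumptions made here for illustration only.

```python
import math

def contact_matrix(ligand_atoms, protein_atoms, cutoff=4.5):
    """Binary ligand-by-protein contact matrix for one simulation frame.

    The cutoff value is an assumption, not taken from the paper.
    """
    return [
        [1 if math.dist(a, b) <= cutoff else 0 for b in protein_atoms]
        for a in ligand_atoms
    ]

def contact_frequency(frames, cutoff=4.5):
    """Average the per-frame contact matrices over a trajectory.

    `frames` is a list of (ligand_atoms, protein_atoms) pairs, one per frame
    (a hypothetical layout chosen for this sketch).
    """
    matrices = [contact_matrix(lig, prot, cutoff) for lig, prot in frames]
    n_frames = len(matrices)
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(m[i][j] for m in matrices) / n_frames for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

Averaging over frames discards the ordering of frames in time; the temporal representation and the CVAE described in the abstract are not reproduced here.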


2017 ◽  
Vol 20 (1) ◽  
pp. 82-92 ◽  
Author(s):  
Kishore Sarma ◽  
Shubhadeep Roychoudhury ◽  
Sudipta Bora ◽  
Budheswar Dehury ◽  
Pratap Parida ◽  
...  

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.
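As a rough illustration of K-means-based undersampling in general, the sketch below clusters the majority class and keeps the one sample nearest each centroid. This is not the authors' multiple K-means selective method, whose details the abstract does not give; the function names, the deterministic initialization from the first `k` points, and the fixed iteration count are assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-means with a deterministic start (first k points as centroids)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # leave an empty cluster's centroid where it was
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def kmeans_undersample(majority, n_keep):
    """Keep n_keep majority samples, one nearest to each K-means centroid."""
    centroids = kmeans(majority, n_keep)
    kept = []
    for c in centroids:
        candidates = [p for p in majority if p not in kept]
        kept.append(min(candidates, key=lambda p: math.dist(p, c)))
    return kept
```

Setting `n_keep` to the size of the minority class yields a balanced training set while still covering the main regions of the majority class.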


2021 ◽  
Vol 61 (9) ◽  
pp. 4266-4279 ◽  
Author(s):  
Kuo Hao Lee ◽  
Andrew D. Fant ◽  
Jiqing Guo ◽  
Andy Guan ◽  
Joslyn Jung ◽  
...  

Molecules ◽  
2022 ◽  
Vol 27 (2) ◽  
pp. 387
Author(s):  
Xiangcong Wang ◽  
Moxuan Zhang ◽  
Ranran Zhu ◽  
Zhongshan Wu ◽  
Fanhong Wu ◽  
...  

PI3Kα is one of the potential targets for novel anticancer drugs. In this study, a series of 2-difluoromethylbenzimidazole derivatives were studied using a combination of molecular modeling techniques: 3D-QSAR, molecular docking, and molecular dynamics. The best comparative molecular field analysis (CoMFA) model had q² = 0.797 and r² = 0.996, and the best comparative molecular similarity indices analysis (CoMSIA) model had q² = 0.567 and r² = 0.960. These values indicate that the 3D-QSAR models validate well and have excellent predictive capability. The binding mode of compound 29 with 4YKN was explored using molecular docking and a molecular dynamics simulation. Ultimately, five new PI3Kα inhibitors were designed and screened with these models. Two of them (86 and 87) were then selected for synthesis and biological evaluation, with satisfying results (22.8 nM for 86 and 33.6 nM for 87).
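For readers unfamiliar with the two statistics, the sketch below shows how a fitted r² and a leave-one-out cross-validated q² are commonly computed. It uses a one-descriptor least-squares line as a stand-in; CoMFA and CoMSIA models are normally built with partial least squares on field descriptors, which is not reproduced here. The function names and the use of the overall mean in the q² denominator are assumptions of this example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def total_sum_of_squares(ys):
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)

def r_squared(xs, ys):
    """Fitted r^2: 1 - SS_res / SS_tot using the model trained on all data."""
    slope, intercept = fit_line(xs, ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / total_sum_of_squares(ys)

def q_squared_loo(xs, ys):
    """Leave-one-out q^2: 1 - PRESS / SS_tot, refitting without each sample."""
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return 1 - press / total_sum_of_squares(ys)
```

Because each held-out sample is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is why it is the more conservative indicator of predictive ability.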

