Prediction of activity and selectivity profiles of human Carbonic Anhydrase inhibitors using machine learning classification models

2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Annachiara Tinivella ◽  
Luca Pinzi ◽  
Giulio Rastelli

Abstract: The development of selective inhibitors of the clinically relevant human Carbonic Anhydrase (hCA) isoforms IX and XII has become a major topic in drug research, due to their deregulation in several types of cancer. Indeed, the selective inhibition of these two isoforms, especially with respect to the homeostatic isoform II, holds great promise for developing anticancer drugs with limited side effects. Therefore, the development of in silico models able to predict the activity and selectivity against the desired isoform(s) is of central interest. In this work, we have developed a series of machine learning classification models, trained on high-confidence data extracted from ChEMBL, able to predict the activity and selectivity profiles of ligands for human Carbonic Anhydrase isoforms II, IX and XII. The training datasets were built with a procedure that used flexible bioactivity thresholds to obtain well-balanced active and inactive classes. We used multiple algorithms and sampling sizes to select the final activity models, which classify active and inactive molecules with excellent performance. Remarkably, the results reported here were better than those obtained by models built with the classic approach of selecting an a priori activity threshold. The sequential application of these validated models enables fast and more reliable virtual screening to predict the activity and selectivity profiles against the investigated isoforms.
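The idea of a flexible bioactivity threshold can be illustrated with a minimal sketch. The abstract does not describe the authors' actual procedure, so everything below is an assumption made for illustration: the function name `pick_balanced_threshold`, the candidate cut-offs, and the made-up potency values. The sketch simply picks, from a list of candidate thresholds, the one that splits the compounds into the most evenly sized active and inactive classes.

```python
from typing import Sequence, Tuple


def pick_balanced_threshold(
    potencies: Sequence[float],
    candidate_thresholds: Sequence[float],
) -> Tuple[float, int, int]:
    """Return (threshold, n_active, n_inactive) for the candidate cut-off
    that yields the smallest difference in class sizes.

    Hypothetical helper: a compound counts as active when its potency
    is greater than or equal to the threshold.
    """
    best = None
    for t in candidate_thresholds:
        n_active = sum(1 for v in potencies if v >= t)
        n_inactive = len(potencies) - n_active
        imbalance = abs(n_active - n_inactive)
        # Keep the first candidate with the smallest imbalance.
        if best is None or imbalance < best[0]:
            best = (imbalance, t, n_active, n_inactive)
    if best is None:
        raise ValueError("no candidate thresholds supplied")
    return best[1], best[2], best[3]


# Made-up potency values on a log scale (higher = more potent).
values = [5.1, 5.8, 6.2, 6.9, 7.4, 7.7, 8.0, 8.3, 8.9, 9.2]
threshold, n_active, n_inactive = pick_balanced_threshold(
    values, [6.0, 7.0, 7.5, 8.0]
)
print(threshold, n_active, n_inactive)
```

With a fixed a priori cut-off of 6.0 the same toy data would give 8 actives against 2 inactives, which is the kind of imbalance a flexible threshold is meant to avoid.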

2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e15649-e15649
Author(s):  
Wei Zhou ◽  
Huan Chen ◽  
Wenbo Han ◽  
Ji He ◽  
Henghui Zhang

e15649 Background: The outcome prediction of hepatocellular carcinoma (HCC) is conventionally determined by evaluating tissue samples obtained during surgical removal of the primary tumor, focusing on their clinical and pathologic features. Recently, accumulating evidence suggests that cancer development is comprehensively modulated by the host’s immune system, underscoring the importance of immunological biomarkers for the prediction of HCC prognosis. However, an integrated predictive algorithm incorporating clinical characteristics and immune features still remains to be established. Methods: We obtained resectable stage II HCC specimens, along with adjacent para-tumor tissues, from 221 patients who underwent surgical resection at Eastern Hepatobiliary Surgery Hospital (Shanghai, China) from 2015 through April 2018. Characteristics such as CD8+, CD163+, and tumor-infiltrating lymphocytes (TILs) were obtained for model construction to predict the status of 3 survival indexes: Overall Survival (OS, ≤ 24 or > 24 months), Progression Free Survival (PFS, ≤ 6 or > 6 months), and Recurrence/Death (RD). After data cleaning and standardization, mutual information and coefficient between each feature and the survival indexes were tested to remove low-scoring features. Furthermore, recursive feature selection was performed to obtain the optimal feature combination. Finally, supervised learning techniques using either a boosting or a bagging strategy were used to fit the model and make predictions, with a grid-search method optimizing the parameters. Meanwhile, a cross-validation procedure with a randomly drawn test cohort of proportion 0.2 was repeated 10 times to evaluate the model. Results: We finally confirmed 15 biomarkers from the 46 candidates as features for survival status prediction using the 221-patient cohort. Among them, the top 10 most important biomarkers included both clinical and immune attributes. The AUC of our model for the survival indexes (OS, PFS, RD) ranged from 0.76 (RD) to 0.8 (PFS), and the accuracy was above 0.85. Conclusions: We describe the integrative analysis of the clinical and immune features that collectively contribute to the survival index of HCC. Machine learning techniques, such as Gradient Boosting and the random forest classifier, hold great promise for use in HCC survival prediction.
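The evaluation scheme described in the Methods (a random test cohort of proportion 0.2, repeated 10 times, scored by AUC) can be sketched in plain Python. This is an illustrative sketch only, not the authors' code: the helper names `repeated_holdout` and `auc`, the seed, and the toy scores are assumptions, and model fitting (boosting or bagging with grid search) is deliberately left out.

```python
import random
from typing import List, Sequence, Tuple


def repeated_holdout(
    n_samples: int,
    test_fraction: float = 0.2,
    repeats: int = 10,
    seed: int = 0,
) -> List[Tuple[List[int], List[int]]]:
    """Return `repeats` random (train_indices, test_indices) splits."""
    rng = random.Random(seed)
    n_test = max(1, round(n_samples * test_fraction))
    splits = []
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # First n_test shuffled indices form the test cohort.
        splits.append((idx[n_test:], idx[:n_test]))
    return splits


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# 221 patients, as in the abstract: each split holds out 44 of them.
splits = repeated_holdout(221)
print(len(splits), len(splits[0][0]), len(splits[0][1]))

# Toy labels and scores, made up for illustration.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice each split would be used to fit a model on the training indices and score it on the held-out indices, and the 10 AUC values would then be averaged.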


Author(s):  
Brian Carnahan ◽  
Gérard Meyer ◽  
Lois-Ann Kuntz

Multivariate classification models play an increasingly important role in human factors research. In the past, these models have been based primarily on discriminant analysis and logistic regression. Models developed from machine learning research offer the human factors professional a viable alternative to these traditional statistical classification methods. To illustrate this point, two machine learning approaches - genetic programming and decision tree induction - were used to construct classification models designed to predict whether or not a student truck driver would pass his or her commercial driver license (CDL) examination. The models were developed and validated using the curriculum scores and CDL exam performances of 37 student truck drivers who had completed a 320-hr driver training course. Results indicated that the machine learning classification models were superior to discriminant analysis and logistic regression in terms of predictive accuracy. Actual or potential applications of this research include the creation of models that more accurately predict human performance outcomes.


2021 ◽  
Vol 7 ◽  
pp. e642
Author(s):  
Kongmeng Liew ◽  
Yukiko Uchida ◽  
Igor de Almeida

Background: Preferences for music can be represented through music features. The widespread prevalence of music streaming has allowed for music feature information to be consolidated by service providers like Spotify. In this paper, we demonstrate that machine learning classification on cultural market membership (Taiwanese, Japanese, American) by music features reveals variations in popular music across these markets. Methods: We present an exploratory analysis of 1.08 million songs centred on Taiwanese, Japanese and American markets. We use both multiclass classification models (Gradient Boosted Decision Trees (GBDT) and Multilayer Perceptron (MLP)) and binary classification models, and interpret their results using variable importance measures and Partial Dependence Plots. To ensure the reliability of our interpretations, we conducted a follow-up study comparing Top-50 playlists from Taiwan, Japan, and the US on identified variables of importance. Results: The multiclass models achieved moderate classification accuracy (GBDT = 0.69, MLP = 0.66). Accuracy scores for binary classification models ranged from 0.71 to 0.81. Model interpretation revealed the music features of greatest importance: overall, popular music in Taiwan was characterised by high acousticness, American music was characterised by high speechiness, and Japanese music was characterised by high energy features. A follow-up study using Top-50 charts found similarly significant differences between cultures for these three features. Conclusion: We demonstrate that machine learning can reveal both the magnitude of differences in music preference across Taiwanese, Japanese, and American markets, and where these preferences differ. While this paper is limited to Spotify data, it underscores the potential contribution of machine learning in exploratory approaches to research on cultural differences.
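A Partial Dependence Plot of the kind mentioned in the Methods can be computed with a few lines of code. The sketch below is a generic illustration, not the authors' implementation: `partial_dependence`, `toy_model`, and the two-feature rows are hypothetical stand-ins. For each grid value, the chosen feature is overwritten in every row and the model's predictions are averaged.

```python
from typing import Callable, List, Sequence


def partial_dependence(
    predict: Callable[[Sequence[float]], float],
    rows: Sequence[Sequence[float]],
    feature_index: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction over all rows with one feature fixed to each
    grid value in turn."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)            # copy so the input is untouched
            modified[feature_index] = value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row: Sequence[float]) -> float:
    """Hypothetical scorer standing in for a trained classifier."""
    return 2.0 * row[0] + row[1]


# Made-up rows: [feature_0, feature_1].
rows = [[0.1, 1.0], [0.5, 2.0], [0.9, 3.0]]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=[0.0, 0.5, 1.0])
print(curve)
```

Plotting `curve` against the grid gives the partial dependence of the prediction on that feature; for a real classifier, `predict` would return the predicted probability of one market class.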

