Curated Database and Preliminary AutoML QSAR Model for 5-HT1A Receptor

Pharmaceutics ◽  
2021 ◽  
Vol 13 (10) ◽  
pp. 1711
Author(s):  
Natalia Czub ◽  
Adam Pacławski ◽  
Jakub Szlęk ◽  
Aleksander Mendyk

Introducing a new drug to the market is a challenging and resource-consuming process. Predictive models developed with artificial intelligence could meet the growing need for an efficient tool that brings both practical and knowledge benefits, but they require a large amount of high-quality data. The aim of our project was to develop a quantitative structure–activity relationship (QSAR) model predicting serotonergic activity toward the 5-HT1A receptor on the basis of a purpose-built database. The dataset was obtained from the ZINC and ChEMBL databases. It contained 9440 unique compounds, making it the largest available database of 5-HT1A ligands with a specified pKi value to date. The predictive model was developed using automated machine learning (AutoML) methods. According to the 10-fold cross-validation (10-CV) testing procedure, the root-mean-squared error (RMSE) was 0.5437, and the coefficient of determination (R2) was 0.74. Moreover, the Shapley Additive Explanations (SHAP) method was applied to gain a more in-depth understanding of the influence of variables on the model’s predictions. In line with the problem definition, the developed model can efficiently predict the affinity of new molecules toward the 5-HT1A receptor on the basis of their structure encoded as molecular descriptors. Using this model in screening can significantly improve the discovery of new drugs for mental diseases and anticancer therapy.
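
The 10-CV metrics quoted above can be illustrated with a minimal sketch. It is not the authors' AutoML pipeline: the one-descriptor least-squares model, the synthetic data, and all function names are assumptions made for illustration, and the metrics are pooled over all out-of-fold predictions (the paper may have averaged per-fold values instead).

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (a stand-in for the real model)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def cross_validate(xs, ys, k=10, seed=0):
    """Return RMSE and R2 computed on the pooled out-of-fold predictions."""
    preds = [0.0] * len(xs)
    for fold in kfold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            preds[i] = a + b * xs[i]
    return rmse(ys, preds), r2_score(ys, preds)


# Synthetic, purely illustrative data: one descriptor and a noisy pKi-like target.
rng = random.Random(42)
descriptor = [rng.uniform(0.0, 10.0) for _ in range(200)]
target = [5.0 + 0.4 * x + rng.gauss(0.0, 0.5) for x in descriptor]
cv_rmse, cv_r2 = cross_validate(descriptor, target, k=10)
print(f"10-CV RMSE = {cv_rmse:.3f}, R2 = {cv_r2:.3f}")
```

Every compound is predicted exactly once by a model that never saw it during fitting, which is what makes the reported RMSE and R2 an estimate of performance on new molecules.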

2019 ◽  
Vol 20 (8) ◽  
pp. 1897 ◽  
Author(s):  
Shuaibing He ◽  
Tianyuan Ye ◽  
Ruiying Wang ◽  
Chenyang Zhang ◽  
Xuelian Zhang ◽  
...  

As one of the leading causes of drug failure in clinical trials, drug-induced liver injury (DILI) seriously impedes the development of new drugs. Assessing the DILI risk of drug candidates in advance is considered an effective strategy to decrease the rate of attrition in drug discovery. Recently, there have been continuous attempts to predict DILI. However, predicting DILI successfully remains a major challenge, and there is an urgent need for a quantitative structure–activity relationship (QSAR) model that predicts DILI with satisfactory performance. In this work, we report a high-quality QSAR model for predicting the DILI risk of xenobiotics that combines eight effective classifiers with molecular descriptors provided by Marvin. For model development, a large-scale and diverse dataset of 1254 compounds was built through comprehensive literature retrieval. The optimal model was obtained by an ensemble method that averages the probabilities from the eight classifiers, with accuracy (ACC) of 0.783, sensitivity (SE) of 0.818, specificity (SP) of 0.748, and area under the receiver operating characteristic curve (AUC) of 0.859. For further validation, three external test sets and a large negative dataset were used. Both the internal and external validation indicated that our model significantly outperformed prior studies. The data provided by the current study will also be a valuable resource for future modeling and data mining.
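
The probability-averaging ensemble and the ACC, SE, and SP metrics described above can be sketched as follows. The three toy probability lists stand in for the eight classifiers of the study, and the 0.5 cut-off, the data, and the function names are assumptions made for illustration only.

```python
def ensemble_probability(per_model_probs):
    """Average the positive-class probabilities from several classifiers."""
    n_models = len(per_model_probs)
    return [sum(column) / n_models for column in zip(*per_model_probs)]


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity (true-positive rate), and specificity
    (true-negative rate) at a given probability cut-off."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "ACC": (tp + tn) / len(y_true),
        "SE": tp / (tp + fn),
        "SP": tn / (tn + fp),
    }


# Hypothetical labels (1 = DILI-positive) and per-model probabilities.
labels = [1, 1, 1, 0, 0, 0]
model_outputs = [
    [0.9, 0.6, 0.4, 0.3, 0.2, 0.7],
    [0.8, 0.7, 0.3, 0.4, 0.1, 0.5],
    [0.7, 0.5, 0.5, 0.2, 0.3, 0.6],
]
averaged = ensemble_probability(model_outputs)
metrics = classification_metrics(labels, averaged)
print(metrics)
```

Averaging probabilities rather than hard votes lets a confident classifier outweigh several uncertain ones, which is one common motivation for this kind of ensemble.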


2020 ◽  
Author(s):  
Zakari Ya’u Ibrahim ◽  
Adamu Uzairu ◽  
Gideon Shallangwa ◽  
Stephen Abechi

Abstract A genetic algorithm combined with multiple linear regression (GA-MLR) was used to generate a quantitative structure–activity relationship (QSAR) model of the antimalarial activity of aryl and aralkyl amine-based triazolopyrimidine derivatives. The structures of the derivatives were optimized using density functional theory (DFT) at the B3LYP/6-31+G* level to generate their molecular descriptors, and two (2) predictive models were developed from these descriptors. The model with the best statistical parameters, namely a high coefficient of determination (R2 = 0.8884), cross-validated R2 (Q2cv = 0.8317), and the highest externally validated R2 (R2pred = 0.7019), was selected as the best model. The model was validated internally by leave-one-out (LOO) cross-validation, externally on a test set, and by a Y-randomization test. These parameters indicate the robustness, predictive ability, and validity of the selected model. The descriptor most relevant to antimalarial activity in the model was GATS6p (Geary autocorrelation, lag 6, weighted by polarizabilities), owing to its highest mean effect. GATS6p guided the in silico design of sixteen (16) aryl and aralkyl amine-based triazolopyrimidine derivatives, using compound DSM191, which had the highest activity (pEC50 = 7.1805), as the design template. The designed compound D8 was predicted to be the most active, with a hypothetical activity of pEC50 = 8.9545.
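
The leave-one-out cross-validated R2 (Q2) mentioned above can be sketched as below. This is a simplified illustration, not the GA-MLR procedure of the study: it uses a single descriptor instead of a multi-descriptor model, the data are made up, and Q2 is computed as 1 - PRESS / SS_tot with the overall mean of the observed values, which is one common convention among several.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single descriptor."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def q2_loo(xs, ys):
    """Leave-one-out Q2: refit without each compound, predict it, and
    compare the summed squared errors (PRESS) with the total variance."""
    n = len(xs)
    mean_y = sum(ys) / n
    press = 0.0
    for i in range(n):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and pEC50-like activities.
descriptor = [0.8, 1.1, 1.5, 1.9, 2.4, 2.8, 3.3]
activity = [5.1, 5.6, 5.9, 6.6, 6.9, 7.4, 7.9]
print(f"Q2(LOO) = {q2_loo(descriptor, activity):.4f}")
```

Because each prediction comes from a model fitted without that compound, Q2 is typically lower than the fitted R2, and a large gap between the two is usually read as a sign of overfitting.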


Molecules ◽  
2018 ◽  
Vol 23 (11) ◽  
pp. 3027 ◽  
Author(s):  
Hui Wang ◽  
Mingyue Jiang ◽  
Fangli Sun ◽  
Shujun Li ◽  
Chung-Yun Hse ◽  
...  

Development of new drugs is one of the ways to combat the existing threat of antimicrobial resistance. Cinnamaldehyde-amino acid Schiff base compounds are newly discovered compounds that exhibit good antibacterial activity against gram-positive and gram-negative bacteria. Quantitative structure–activity relationship (QSAR) methodology was applied to explore the correlation between antibacterial activity and compound structures. The two best QSAR models showed R2 = 0.9354, F = 57.96, and s2 = 0.0020 against Escherichia coli, and R2 = 0.8946, F = 33.94, and s2 = 0.0043 against Staphylococcus aureus. The model analysis showed that the antibacterial activity of cinnamaldehyde compounds was significantly affected by the polarity parameter/square distance and the minimum atomic state energy for an H atom. Guided by the best QSAR model, the screening, synthesis, and antibacterial activity of three cinnamaldehyde-amino acid Schiff base compounds are reported. The experimental antibacterial activity values demonstrated that the new compounds possessed excellent antibacterial activity, comparable to that of ciprofloxacin.


Author(s):  
Apilak Worachartcheewan ◽  
Alla P. Toropova ◽  
Andrey A. Toropov ◽  
Reny Pratiwi ◽  
Virapong Prachayasittikul ◽  
...  

Background: Sirtuin 1 (Sirt1) and sirtuin 2 (Sirt2) are NAD+-dependent histone deacetylases that play important functional roles in removing the acetyl group from acetyl-lysine substrates. Because dysregulation of Sirt1 and Sirt2 is an etiological cause of diseases, both are attractive target proteins for treatment, and there has been great interest in the development of Sirt1 and Sirt2 inhibitors. Objective: This study compiled the bioactivity data of Sirt1 and Sirt2 for the construction of quantitative structure-activity relationship (QSAR) models in accordance with the OECD principles. Method: Simplified molecular input line entry system (SMILES)-based molecular descriptors were used to characterize the molecular features of inhibitors, while the Monte Carlo method of the CORAL software was employed for multivariate analysis. The data set was subjected to 3 random splits, each of which separated the data into 4 subsets: training, invisible training, calibration, and external sets. Results: Statistical indices for the evaluation of QSAR models suggested good statistical quality for the models of Sirt1 and Sirt2 inhibitors. Furthermore, a mechanistic interpretation of the molecular substructures responsible for modulating the bioactivity (i.e. promoters of increase or decrease of bioactivity) was extracted via analysis of correlation weights. This analysis revealed the molecular features involved in Sirt1 and Sirt2 inhibition. Conclusion: It is anticipated that the QSAR models presented herein can be useful as guidelines in the rational design of potential Sirt1 and Sirt2 inhibitors for the treatment of sirtuin-related diseases.
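
The splitting scheme described in the Method section (3 random splits, each into training, invisible training, calibration, and external sets) can be sketched as below. This does not reproduce the CORAL software: the equal 25% fractions, the seeds, the compound identifiers, and the function name are all hypothetical, since the abstract does not state them.

```python
import random

SUBSETS = ("training", "invisible_training", "calibration", "external")


def split_four_ways(items, fractions=(0.25, 0.25, 0.25, 0.25), seed=0):
    """Shuffle the items and cut them into the four named subsets.

    The first three subsets get floor(n * fraction) items each; the
    external set receives whatever remains, so no item is lost.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    result = {}
    start = 0
    for name, fraction in zip(SUBSETS[:-1], fractions[:-1]):
        size = int(n * fraction)
        result[name] = shuffled[start:start + size]
        start += size
    result[SUBSETS[-1]] = shuffled[start:]
    return result


# Hypothetical compound identifiers; three independent random splits.
compound_ids = [f"cmpd_{i:03d}" for i in range(100)]
splits = [split_four_ways(compound_ids, seed=s) for s in (1, 2, 3)]
for number, split in enumerate(splits, start=1):
    print(number, {name: len(members) for name, members in split.items()})
```

Repeating the split with different seeds shows whether the statistical quality of a model depends on one fortunate partition of the data.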


2018 ◽  
Vol 18 (3) ◽  
pp. 219-232 ◽  
Author(s):  
Riccardo Concu ◽  
M. Natalia D.S. Cordeiro

Epidermal Growth Factor Receptor (EGFR) is still the main target in Head and Neck Squamous Cell Cancer (HNSCC) because its overexpression has been detected in more than 90% of cancers of this type. This overexpression is usually linked with more aggressive disease, increased resistance to chemotherapy and radiotherapy, increased metastasis, inhibition of apoptosis, promotion of neoplastic angiogenesis, and, finally, poor prognosis and decreased survival. For this reason, the main goal in the search for new drugs and inhibitor candidates is to reduce this overexpression. Quantitative Structure-Activity Relationship (QSAR) is one of the most widely used approaches in the search for new and more active inhibitor drugs. In this context, many authors have used this technique, combined with others, to find new drugs or enhance the activity of well-known inhibitors. In this paper, on the one hand, we review the most important QSAR approaches developed in the last fifteen years, ranging from classical 1D approaches to more sophisticated 3D ones; the first paper dates from 2003 and the last from 2017. On the other hand, we present a completely new QSAR approach aimed at predicting new EGFR inhibitor drugs. The model presented here was developed on a dataset of more than 1000 compounds using various molecular descriptors calculated with the DRAGON 7.0© software.


Author(s):  
Shu Cheng ◽  
Yanrui Ding

Background: Quantitative Structure Activity Relationship (QSAR) methods based on machine learning play a vital role in predicting biological effects. Objective: Considering the characteristics of the binding interface between ligands and the inhibitory neurotransmitter Gamma Aminobutyric Acid A (GABAA) receptor, we built a QSAR model of ligands that bind to the human GABAA receptor. Method: After feature selection with Mean Decrease Impurity, we selected 53 of 1,286 docked ligand molecular descriptors. Three QSAR models were built using the gradient boosting regression tree algorithm, based on different combinations of docked ligand molecular descriptors and ligand-receptor interaction characteristics. Results: The features of the optimal QSAR model contain both the docked ligand molecular descriptors and ligand-receptor interaction characteristics. For the optimal QSAR model, the leave-one-out cross-validation coefficient (Q2 LOO) is 0.8974, the Coefficient of Determination (R2) for the testing set is 0.9261, and the Mean Square Error (MSE) is 0.1862. We also used this model to predict the pIC50 of two new ligands; the differences between the predicted and experimental pIC50 are -0.02 and 0.03, respectively. Conclusion: We found that BELm2, BELe2, MATS1m, X5v, Mor08v, and Mor29m are crucial features that can help to build the QSAR model more accurately.


2021 ◽  
Vol 149 ◽  
Author(s):  
Junwen Tao ◽  
Yue Ma ◽  
Xuefei Zhuang ◽  
Qiang Lv ◽  
Yaqiong Liu ◽  
...  

Abstract This study proposed a novel ensemble analysis strategy to improve hand, foot and mouth disease (HFMD) prediction by integrating environmental data. The approach began by establishing a vector autoregressive (VAR) model. Then, a dynamic Bayesian network (DBN) model was used for variable selection of environmental factors. Finally, a VAR model with constraints (CVAR) was established for predicting the incidence of HFMD in Chengdu city from 2011 to 2017. The DBN showed that temperature was related to HFMD at lags 1 and 2. Humidity, wind speed, sunshine, PM10, SO2 and NO2 were related to HFMD at lag 2. Compared with the autoregressive integrated moving average model with external variables (ARIMAX), the CVAR model had a higher coefficient of determination (R2, average difference: +2.11%; t = 6.2051, P = 0.0003 < 0.05), a lower root mean-squared error (−24.88%; t = −5.2898, P = 0.0007 < 0.05) and a lower mean absolute percentage error (−16.69%; t = −4.3647, P = 0.0024 < 0.05). The accuracy of predicting the time-series shape was 88.16% for the CVAR model and 86.41% for ARIMAX. The CVAR model performed better in terms of variable selection, model interpretation and prediction. Therefore, it could be used by health authorities to identify potential HFMD outbreaks and develop disease control measures.
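
The mean absolute percentage error and the percentage differences between models quoted above can be sketched as follows. The case counts and forecasts are made up, the function names are hypothetical, and the abstract does not say exactly how its average differences were computed, so the relative-change formula here is only one plausible reading.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def relative_change(new_value, reference_value):
    """Percentage change of a metric relative to a reference model."""
    return 100.0 * (new_value - reference_value) / reference_value


# Hypothetical monthly case counts and forecasts from two models.
observed = [120.0, 150.0, 200.0, 180.0]
model_a = [110.0, 160.0, 190.0, 185.0]  # stand-in for the constrained model
model_b = [100.0, 170.0, 230.0, 160.0]  # stand-in for the comparison model

mape_a = mape(observed, model_a)
mape_b = mape(observed, model_b)
print(f"MAPE A = {mape_a:.2f}%, MAPE B = {mape_b:.2f}%")
print(f"Relative change of A vs B = {relative_change(mape_a, mape_b):.2f}%")
```

A negative relative change means the first model has the lower error, matching the sign convention of the figures reported in the abstract.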


2021 ◽  
Vol 13 (3) ◽  
pp. 438
Author(s):  
Subrina Tahsin ◽  
Stephen C. Medeiros ◽  
Arvind Singh

Long-term monthly coastal wetland vegetation monitoring is the key to quantifying the effects of natural and anthropogenic events, such as severe storms, as well as assessing restoration efforts. Remote sensing data products such as the Normalized Difference Vegetation Index (NDVI), alongside emerging data analysis techniques, have enabled broader investigations into vegetation dynamics at monthly to decadal time scales. However, NDVI data suffer from cloud contamination, making periods within the time series sparse and often unusable during meteorologically active seasons. This paper proposes a virtual constellation for NDVI consisting of the red and near-infrared bands of the Landsat 8 Operational Land Imager, the Sentinel-2A Multi-Spectral Instrument, and the Advanced Spaceborne Thermal Emission and Reflection Radiometer. The virtual constellation uses time-space-spectrum relationships from 2014 to 2018 and a random forest to produce synthetic NDVI imagery rectified to Landsat 8 format. Over the sample coverage area near Apalachicola, Florida, USA, the synthetic NDVI showed good visual coherence with observed Landsat 8 NDVI. Comparisons between the synthetic and observed NDVI showed Root Mean Squared Error and Coefficient of Determination (R2) values of 0.0020 sr−1 and 0.88, respectively. The results suggest that the virtual constellation was able to mitigate NDVI data loss due to clouds and may have the potential to do the same for other data. The ability to participate in a virtual constellation for a useful end product such as NDVI adds value to existing satellite missions and provides economic justification for future projects.
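
For readers unfamiliar with the index, NDVI is computed per pixel from the red and near-infrared bands as (NIR - Red) / (NIR + Red). The sketch below shows only this standard formula, not the virtual-constellation method of the paper; the reflectance values and the choice to return None for a zero denominator are assumptions made for illustration.

```python
def ndvi(red, nir):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when both bands are zero, where the ratio is undefined
    (for example, a masked or empty pixel).
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


# Hypothetical surface reflectances for three pixels as (red, nir) pairs.
pixels = [(0.10, 0.50), (0.20, 0.20), (0.0, 0.0)]
for red_value, nir_value in pixels:
    print(red_value, nir_value, ndvi(red_value, nir_value))
```

Dense green vegetation reflects strongly in the near-infrared and absorbs red light, so higher NDVI values indicate more vigorous vegetation.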


2021 ◽  
Vol 13 (7) ◽  
pp. 3727
Author(s):  
Fatema Rahimi ◽  
Abolghasem Sadeghi-Niaraki ◽  
Mostafa Ghodousi ◽  
Soo-Mi Choi

In dangerous circumstances, knowledge of population distribution at the best spatial-temporal resolution is essential for urban infrastructure architecture, policy-making, and urban planning. The present study investigated spatial-temporal modeling of the population distribution in the case study area. The numbers of generated and absorbed trips were first calculated from taxi pick-up and drop-off location data, and the census population was then allocated to each neighborhood. Finally, the spatial-temporal distribution of the population was calculated using the developed model. To evaluate the model, a regression analysis between the census population and the predicted population for the period between 21:00 and 23:00 was used. The numbers of generated and absorbed trips showed a different spatial distribution at different hours of the day. The spatial pattern of the population distribution during the day differed from that during the night. The coefficient of determination (R2) of the regression analysis for the model was 0.9998, and the mean squared error was 10.78. The regression analysis showed that the model works well for the nighttime population at the neighborhood level, so the proposed model will be suitable for the daytime population.

