Interpretation of Ligand-Based Activity Cliff Prediction Models Using the Matched Molecular Pair Kernel

Activity cliffs (ACs) are formed by two structurally similar compounds with a large difference in potency. Accurate AC prediction is expected to help researchers’ decisions in the early stages of drug discovery. Previously, predictive models based on matched molecular pair (MMP) cliffs have been proposed. However, the proposed methods face a challenge of interpretability due to the black-box character of the predictive models. In this study, we developed interpretable MMP fingerprints and modified a model-specific interpretation approach for models based on a support vector machine (SVM) and MMP kernel. We compared important features highlighted by this SVM-based interpretation approach and the SHapley Additive exPlanations (SHAP) as a major model-independent approach. The model-specific approach could capture the difference between AC and non-AC, while SHAP assigned high weights to the features not present in the test instances. For specific MMPs, the feature weights mapped by the SVM-based interpretation method were in agreement with the previously confirmed binding knowledge from X-ray co-crystal structures, indicating that this method is able to interpret the AC prediction model in a chemically intuitive manner.
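As a rough illustration of the kind of kernel this abstract refers to, the sketch below assumes an MMP kernel defined as the product of two Tanimoto similarities, one computed on the shared core of each pair and one on its chemical transformation. The product form, the representation of fingerprints as sets of on-bits, and all function and variable names are assumptions made for illustration; the abstract does not specify them.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty fingerprints are identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def mmp_kernel(pair_x, pair_y) -> float:
    """Assumed MMP kernel: core similarity times transformation similarity.

    Each pair is a (core_bits, transformation_bits) tuple of frozensets.
    """
    core_x, trans_x = pair_x
    core_y, trans_y = pair_y
    return tanimoto(core_x, core_y) * tanimoto(trans_x, trans_y)


# Hypothetical fingerprints: the bit indices are made up for this example.
mmp_a = (frozenset({1, 2, 3, 4}), frozenset({10, 11}))
mmp_b = (frozenset({1, 2, 3, 5}), frozenset({10, 12}))

print(mmp_kernel(mmp_a, mmp_a))  # a pair compared with itself
print(mmp_kernel(mmp_a, mmp_b))  # two pairs with similar cores, different changes
```

Because such a kernel is built from per-feature overlaps, each shared bit's contribution to the kernel value can in principle be traced back to a fingerprint feature, which is the sort of property a model-specific interpretation can exploit.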

Ripeness Prediction of Postharvest Kiwifruit Using a MOS E-Nose Combined with Chemometrics

Sensors ◽

10.3390/s19020419 ◽

2019 ◽

Vol 19 (2) ◽

pp. 419 ◽

Cited By ~ 11

Author(s):

Dongdong Du ◽

Jun Wang ◽

Bo Wang ◽

Luyi Zhu ◽

Xuezhen Hong

Keyword(s):

Prediction Models ◽

Extraction Methods ◽

Oxide Semiconductor ◽

Soluble Solids ◽

Support Vector ◽

Least Squares Regression ◽

Accuracy Rate ◽

Linear Discriminant ◽

The Difference ◽

Ripe Stage

Postharvest kiwifruit continues to ripen for a period until it reaches the optimal “eating ripe” stage. Without damaging the fruit, it is very difficult to identify the ripeness of postharvest kiwifruit by conventional means. In this study, an electronic nose (E-nose) with 10 metal oxide semiconductor (MOS) gas sensors was used to predict the ripeness of postharvest kiwifruit. Three different feature extraction methods (the max/min values, the difference values and the 70th s values) were employed to discriminate kiwifruit at different ripening times by linear discriminant analysis (LDA), and results showed that the 70th s values method had the best performance in discriminating kiwifruit at different ripening stages, obtaining a 100% original accuracy rate and a 99.4% cross-validation accuracy rate. Partial least squares regression (PLSR), support vector machine (SVM) and random forest (RF) were employed to build prediction models for overall ripeness, soluble solids content (SSC) and firmness. The regression results showed that the RF algorithm had the best performance in predicting the ripeness indexes of postharvest kiwifruit compared with PLSR and SVM, which illustrated that the E-nose data had high correlations with overall ripeness (training: R² = 0.9928; testing: R² = 0.9928), SSC (training: R² = 0.9749; testing: R² = 0.9143) and firmness (training: R² = 0.9814; testing: R² = 0.9290). This study demonstrated that the E-nose could be a comprehensive approach to predict the ripeness of postharvest kiwifruit through aroma volatiles.
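The three feature extraction methods named above can be sketched for a single sensor's response curve as follows. The abstract does not define them, so the definitions used here are assumptions: the "max/min value" is taken as the reading that deviates most from the first reading, the "difference value" as the spread between the largest and smallest reading, and the "70th s value" as the reading 70 s after sampling starts. The function name, the sampling rate and the example curve are hypothetical.

```python
def extract_features(response, sample_rate_hz=1.0):
    """Return three candidate features for one sensor's response curve.

    `response` is a list of readings; the feature definitions are assumptions.
    """
    baseline = response[0]
    # "max/min value": the reading that deviates most from the baseline (assumed).
    extreme = max(response, key=lambda r: abs(r - baseline))
    # "difference value": spread between largest and smallest reading (assumed).
    difference = max(response) - min(response)
    # "70th s value": the reading taken 70 s after the start of sampling.
    index_70s = int(round(70 * sample_rate_hz))
    value_70s = response[index_70s]
    return {"extreme": extreme, "difference": difference, "value_70s": value_70s}


# Hypothetical response: a slowly rising signal sampled once per second for 80 s.
curve = [1.0 + 0.01 * t for t in range(80)]
features = extract_features(curve)
print(features)
```

With 10 sensors, applying one of these definitions to every sensor would give a 10-element feature vector per sample, which could then be passed to a classifier such as LDA.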

Addressing bias in prediction models by improving subpopulation calibration

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocaa283 ◽

2020 ◽

Author(s):

Noam Barda ◽

Gal Yona ◽

Guy N Rothblum ◽

Philip Greenland ◽

Morton Leibowitz ◽

...

Keyword(s):

Risk Assessment ◽

Cohort Study ◽

Predictive Models ◽

Model Calibration ◽

Retrospective Cohort ◽

Prediction Models ◽

Assessment Tool ◽

Substantial Portion ◽

Fracture Risk Assessment Tool ◽

Model Independent

Objective: To illustrate the problem of subpopulation miscalibration, to adapt an algorithm for recalibration of the predictions, and to validate its performance. Materials and Methods: In this retrospective cohort study, we evaluated the calibration of predictions based on the Pooled Cohort Equations (PCE) and the fracture risk assessment tool (FRAX) in the overall population and in subpopulations defined by the intersection of age, sex, ethnicity, socioeconomic status, and immigration history. We next applied the recalibration algorithm and assessed the change in calibration metrics, including calibration-in-the-large. Results: A total of 1 021 041 patients were included in the PCE population, and 1 116 324 patients were included in the FRAX population. Baseline overall model calibration of the 2 tested models was good, but calibration in a substantial portion of the subpopulations was poor. After applying the algorithm, subpopulation calibration statistics were greatly improved, with the variance of the calibration-in-the-large values across all subpopulations reduced by 98.8% and 94.3% in the PCE and FRAX models, respectively. Discussion: Prediction models in medicine are increasingly common. Calibration, the agreement between predicted and observed risks, is commonly poor for subpopulations that were underrepresented in the development set of the models, resulting in bias and reduced performance for these subpopulations. In this work, we empirically evaluated an adapted version of the fairness algorithm designed by Hebert-Johnson et al. (2017) and demonstrated its use in reducing subpopulation miscalibration. Conclusion: A postprocessing and model-independent fairness algorithm for recalibration of predictive models greatly decreases the bias of subpopulation miscalibration and thus increases fairness and equality.
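To make the reported metric concrete, the sketch below computes calibration-in-the-large per subpopulation and the variance of those values across subpopulations. It assumes calibration-in-the-large is measured as mean predicted risk minus observed event rate, which is one common convention and may differ from the one used in the study. It does not implement the recalibration algorithm itself, and the records are invented.

```python
from collections import defaultdict
from statistics import mean, pvariance


def calibration_in_the_large(records):
    """Map each subgroup to (mean predicted risk - observed event rate).

    `records` is an iterable of (subgroup, predicted_risk, outcome) tuples,
    where outcome is 1 if the event occurred and 0 otherwise.
    """
    grouped = defaultdict(list)
    for group, predicted, outcome in records:
        grouped[group].append((predicted, outcome))
    return {
        group: mean(p for p, _ in rows) - mean(o for _, o in rows)
        for group, rows in grouped.items()
    }


# Hypothetical records for two subpopulations.
records = [
    ("A", 0.2, 0), ("A", 0.4, 1),  # risk under-predicted in group A
    ("B", 0.1, 0), ("B", 0.3, 0),  # risk over-predicted in group B
]
cil = calibration_in_the_large(records)
spread = pvariance(list(cil.values()))
print(cil, spread)
```

A model can look well calibrated overall while the per-group values differ in sign, as in this toy data; the variance across groups is the kind of quantity the abstract reports as being reduced.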

Predictive models for personalized asthma attacks based on patient’s biosignals and environmental factors: a systematic review

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01704-6 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Eman T. Alharbi ◽

Farrukh Nadeem ◽

Asma Cherif

Keyword(s):

Risk Factors ◽

Systematic Review ◽

Environmental Factors ◽

Predictive Models ◽

Critical Appraisal ◽

Prediction Models ◽

Research Articles ◽

Support Vector ◽

World Population ◽

Asthma Attack

Background: Asthma is a chronic disease that is exacerbated by various risk factors, including the patient’s biosignals and environmental conditions. It affects on average 7% of the world population. Preventing an asthma attack is the main challenge for asthma patients, which requires keeping track of any risk factor that can trigger an attack. Many researchers have developed asthma attack prediction models that use various asthma biosignals and environmental factors. These predictive models can help asthmatic patients predict asthma attacks in advance, so that preventive measures can be taken. This paper reviews these models to evaluate the methods used and the models’ performance, and to determine the need to improve research in this field. Method: A systematic review was conducted of research articles introducing asthma attack prediction models for children and adults. We searched the PubMed, ScienceDirect, Springer, and IEEE databases from January 2000 to December 2020. The search includes prediction models that used biosignal factors, environmental factors, or both. The quality of each research article was assessed and scored based on two checklists, the Checklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) and the Critical Appraisal Skills Programme clinical prediction rule checklist (CASP). The highest-scoring articles were selected for review. Result: From the 1068 research articles we reviewed, we found that most of the studies used only asthma biosignal factors for prediction, few used environmental factors, and a limited number used both. Fifteen different asthma attack predictive models were selected for this review. We found that most of the studies used traditional prediction methods, like Support Vector Machine and regression. We have identified the pros and cons of the reviewed asthma attack prediction models and propose solutions to advance the studies in this field. Conclusion: Asthma attack predictive models become more significant when using both the patient’s biosignal and environmental factors. Advanced machine learning methods, like deep learning techniques, are underused. In addition, there is a need to build smart healthcare systems that provide patients with decision-making systems to identify risk and visualize high-risk regions.

Predictive Models for Personalized Asthma Attacks Based on Patient’s Biosignals and Environmental Factors: A Systematic Review

10.21203/rs.3.rs-770597/v1 ◽

2021 ◽

Author(s):

Eman T. Alharbi ◽

Farrukh Nadeem ◽

Asma Cherif

Keyword(s):

Risk Factors ◽

Systematic Review ◽

Environmental Factors ◽

Predictive Models ◽

Critical Appraisal ◽

Prediction Models ◽

Research Articles ◽

Support Vector ◽

World Population ◽

Asthma Attack

Background: Asthma is a chronic disease that is exacerbated by various risk factors, including the patient's biosignals and environmental conditions. It affects on average 7% of the world population. Preventing an asthma attack is the main challenge for asthma patients, which requires keeping track of any risk factor that can trigger an attack. Many researchers have developed asthma attack prediction models that use various asthma biosignals and environmental factors. These predictive models can help asthmatic patients predict asthma attacks in advance, so that preventive measures can be taken. This paper reviews these models to evaluate the methods used and the models' performance, and to determine the need to improve research in this field. Method: A systematic review was conducted of research articles introducing asthma attack prediction models for children and adults. We searched the PubMed, ScienceDirect, Springer, and IEEE databases from January 2000 to December 2020. The search includes prediction models that used biosignal factors, environmental factors, or both. The quality of each research article was assessed and scored based on two checklists, the Checklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) and the Critical Appraisal Skills Programme clinical prediction rule checklist (CASP). The highest-scoring articles were selected for review. Result: From the 1068 research articles we reviewed, we found that most of the studies used only asthma biosignal factors for prediction, few used environmental factors, and a limited number used both. Fifteen different asthma attack predictive models were selected for this review. We found that most of the studies used traditional prediction methods, like Support Vector Machine and regression. We have identified the pros and cons of the reviewed asthma attack prediction models and propose solutions to advance the studies in this field. Conclusion: Asthma attack predictive models become more significant when using both the patient's biosignal and environmental factors. Advanced machine learning methods, like deep learning techniques, are underused. In addition, there is a need to build smart healthcare systems that provide patients with decision-making systems to identify risk and visualize high-risk regions.

Boruta-grid-search least square support vector machine for NO2 pollution prediction using big data analytics and IoT emission sensors

Applied Computing and Informatics ◽

10.1108/aci-04-2021-0092 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Habeeb Balogun ◽

Hafiz Alaka ◽

Christian Nnaemeka Egwim

Keyword(s):

Machine Learning ◽

Big Data ◽

Predictive Models ◽

Data Analytics ◽

Prediction Models ◽

Big Data Analytics ◽

Support Vector ◽

Traffic Data ◽

Content Type ◽

Pollution Concentration

Purpose: This paper seeks to assess the performance of BA-GS-LSSVM compared to popular standalone algorithms used to build NO2 prediction models. The purpose of this paper is to pre-process a relatively large NO2 data set from Internet of Things (IoT) sensors with time-corresponding weather and traffic data and to use the data to develop NO2 prediction models using BA-GS-LSSVM and popular standalone algorithms to allow for a fair comparison. Design/methodology/approach: This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration. The authors used big data analytics infrastructure to retrieve the large volume of data collected at intervals of tens of seconds for over 5 months. Weather data from the UK meteorology department and traffic data from the department for transport were collected and merged for the corresponding time and location where the pollution sensors exist. Findings: The results show that the hybrid BA-GS-LSSVM outperforms all the standalone machine learning predictive models for NO2 pollution. Practical implications: This paper's hybrid model provides a basis for informed decisions on an NO2 pollutant avoidance system. Originality/value: This research installed and used data from 14 IoT emission sensors to develop machine learning predictive models for NO2 pollution concentration.

POLIGAMI DALAM PERSPEKTIF AL-QUR’AN [Polygamy in the Perspective of the Qur’an]

MAGHZA: Jurnal Ilmu Al-Qur’an dan Tafsir ◽

10.24090/maghza.v1i2.739 ◽

2016 ◽

Vol 1 (2) ◽

pp. 35-50

Author(s):

Makrum Makrum

Keyword(s):

Qualitative Research ◽

Islamic Law ◽

Qualitative Approach ◽

Comprehensive Understanding ◽

Research Results ◽

Library Research ◽

Difference Of Opinion ◽

Controversial Problem ◽

The Difference ◽

Interpretation Method

This paper discusses polygamy, which remains a controversial problem even though it has been much discussed and examined. Differences of opinion among scholars mean that the problem continues to provoke both agreement and disagreement. Although polygamy is regulated in Act Number 1 of 1974 concerning marriage and in the Compilation of Islamic Law (KHI), this does not necessarily settle the problem. Not a few people who practice polygamy choose to marry unofficially ("under the hand") or by sirri. This research uses a qualitative approach, implementing the thematic interpretation method (maudhu'i) to obtain a comprehensive understanding of polygamy in the Qur'an. The data were obtained through library research, drawing on various verses of the Qur'an, hadith, fiqh books, research results, books and news from various media outlets in order to complete the interpretation of the verses on polygamy. The results show that the verses of the Qur'an place very tight restrictions on those who want to practice polygamy. The justice required as a condition of polygamy is not only quantitative but also qualitative. In the socio-historical context, the command of polygamy is intended as a solution to avoid injustice to orphaned women. Even if a husband still wants to practice polygamy, he should marry widows who are caring for orphans.

ЕКАТЕРИНА ВЕЛИКАЯ И ДИАЛЕКТИКА ПРОСВЕЩЕННОЙ МОНАРХИИ [Catherine the Great and the Dialectic of Enlightened Monarchy]

Konfliktologia ◽

10.31312/2310-6085-2020-15-1-39-51 ◽

2020 ◽

Vol 15 (1) ◽

pp. 39

Author(s):

С. И. Дудник ◽

И. Д. Осипов

Keyword(s):

Cultural Policy ◽

Separation Of Powers ◽

Point Of View ◽

Human Sciences ◽

The Rule Of Law ◽

Political Ideas ◽

The People ◽

Specific Interpretation ◽

The Difference ◽

Conservative Values

The article discusses the evolution and formation of the ideology of enlightened monarchy in Russia. In this regard, the philosophical and political ideas of Catherine the Great, as well as their theoretical and ideological premises, are analyzed. It is noted that the philosophy of the Enlightenment in Russia was closely connected with the concepts of Voltaire, Diderot, Montesquieu, Beccaria and Bentham, and with their views on natural law and human freedom, humanism and the rule of law. In Catherine's philosophy these concepts received a specific interpretation, shaped by the sociocultural conditions of Russia. This was manifested in Catherine the Great's famous work “The Nakaz”, which accepted Montesquieu's argument in favor of autocracy but rejected his point of view on the separation of powers. The specificity of the doctrine of enlightened monarchy lies in its combination of liberal and conservative values, which results in eclectic forms. This was the dialectic of the supreme power, and the difference between the enlightened monarchy and the ideology of absolutism. The article also notes that the Enlightenment in Russia is associated with fundamental socio-political reforms and with the secularization of culture. During this period the natural and human sciences developed, and the changes positively influenced the development of medicine, the beautification of towns and public education. The views on autocracy of the oppositional noble intelligentsia, represented by A. N. Radishchev, are also considered; it is noted that his criticism of the autocracy was determined by an alternative cultural policy that proceeded from the protection of the interests of the people. The doctrine of enlightened monarchy is characterized by inconsistency in both its worldview and its politics, which prevented it from solving the pressing social problems of establishing a legal state, democratizing society and abolishing serfdom.

Bioactivity Prediction Based on Matched Molecular Pair and Matched Molecular Series Methods

Current Pharmaceutical Design ◽

10.2174/1381612826666200427111309 ◽

2020 ◽

Vol 26 (33) ◽

pp. 4195-4205

Author(s):

Xiaoyu Ding ◽

Chen Cui ◽

Dingyan Wang ◽

Jihui Zhao ◽

Mingyue Zheng ◽

...

Keyword(s):

Prediction Model ◽

Large Scale ◽

Prediction Models ◽

Predictive Accuracy ◽

Lead Optimization ◽

Consensus Method ◽

Molecular Pair ◽

Bioactivity Prediction ◽

Compound Synthesis ◽

Consensus Modeling

Background: Enhancing a compound’s biological activity is the central task of lead optimization in small-molecule drug discovery. However, it is laborious to perform many iterative rounds of compound synthesis and bioactivity tests. To address this issue, there is high demand for high-quality in silico bioactivity prediction approaches that prioritize the more active compound derivatives and reduce trial and error. Methods: Two kinds of bioactivity prediction models based on a large-scale structure-activity relationship (SAR) database were constructed. The first is based on the similarity of substituents and realized by matched molecular pair analysis, including SA, SA_BR, SR, and SR_BR. The second is based on SAR transferability and realized by matched molecular series analysis, including Single MMS pair, Full MMS series, and Multi single MMS pairs. Moreover, we defined the application domain of the models by using a distance-based threshold. Results: Among the seven individual models, the Multi single MMS pairs bioactivity prediction model showed the best performance (R² = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) produced the lowest prediction accuracy (R² = 0.798, MAE = 0.446, RMSE = 0.637). The predictive accuracy could be further improved by consensus modeling (R² = 0.842, MAE = 0.397 and RMSE = 0.563). Conclusion: An accurate prediction model for bioactivity was built with a consensus method, which was superior to all individual models. Our model should be a valuable tool for lead optimization.
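The three reported metrics and the idea of consensus modeling can be sketched as below. Two assumptions are made that the abstract does not confirm: R² is computed as the coefficient of determination, and the consensus is a plain average of the individual models' predictions. The activity values and model outputs are invented, and the function names are hypothetical.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return R^2 (coefficient of determination), MAE and RMSE."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return {"r2": 1 - ss_res / ss_tot, "mae": mae, "rmse": sqrt(ss_res / n)}


def consensus(predictions_per_model):
    """Average the predictions of several models, compound by compound."""
    return [sum(values) / len(values) for values in zip(*predictions_per_model)]


# Hypothetical activities and predictions from two individual models.
y_true = [6.0, 7.0, 8.0]
model_a = [6.5, 7.0, 7.5]
model_b = [5.5, 7.0, 8.5]

single = regression_metrics(y_true, model_a)
combined = regression_metrics(y_true, consensus([model_a, model_b]))
print(single, combined)
```

In this contrived data the two models err in opposite directions, so averaging cancels their errors completely; real models' errors are only partly uncorrelated, which is why consensus gains are usually modest.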

In silico Prediction of Inhibitory Constant of Thrombin Inhibitors Using Machine Learning

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207322666181220130232 ◽

2019 ◽

Vol 21 (9) ◽

pp. 662-669 ◽

Cited By ~ 1

Author(s):

Junnan Zhao ◽

Lu Zhu ◽

Weineng Zhou ◽

Lingfeng Yin ◽

Yuchen Wang ◽

...

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Regression Tree ◽

Large Data ◽

Thrombin Inhibitors ◽

Coagulation Cascade ◽

Gradient Boosting ◽

Support Vector ◽

Data Set ◽

Descriptor Selection

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade, which is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors based on a large data set by using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional data sets, it can be used to build effective predictive models. A total of 6554 descriptors for each compound were collected and an efficient descriptor selection method was chosen to find the appropriate descriptors. Four different methods, including multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were implemented to build prediction models with these selected descriptors. Results: The SVM model was the best among these methods, with R² = 0.84, MSE = 0.55 for the training set and R² = 0.83, MSE = 0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
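The y-randomization test mentioned above checks that a model's fit is not a chance correlation: the activity values are shuffled, the model is refitted, and the resulting scores should fall well below the score on the real data. The sketch below illustrates the idea with a one-descriptor linear fit rather than the SVM used in the study; the data, the number of rounds, the seed and all names are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. in-sample R^2 of a one-descriptor linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def y_randomization(x, y, n_rounds=20, seed=0):
    """Refit on shuffled activities and return the R^2 of each round."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = y[:]  # copy so the real activities stay untouched
        rng.shuffle(shuffled)
        scores.append(r_squared(x, shuffled))
    return scores


# Hypothetical descriptor and activity values with a strong linear relationship.
descriptor = [float(i) for i in range(20)]
activity = [2.0 * i + 1.0 + ((i * 7) % 5 - 2) * 0.1 for i in range(20)]

real_score = r_squared(descriptor, activity)
random_scores = y_randomization(descriptor, activity)
print(real_score, max(random_scores))
```

If the shuffled scores were close to the real one, the apparent performance would more likely reflect chance or overfitting than a genuine structure-activity relationship.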

Possibility of Human Gender Recognition Using Raman Spectra of Teeth

Molecules ◽

10.3390/molecules26133983 ◽

2021 ◽

Vol 26 (13) ◽

pp. 3983

Author(s):

Ozren Gamulin ◽

Marko Škrabić ◽

Kristina Serec ◽

Matej Par ◽

Marija Baković ◽

...

Keyword(s):

Raman Spectra ◽

Principal Component ◽

Support Vector ◽

Gender Recognition ◽

Proof Of Concept ◽

Male And Female ◽

Tooth Type ◽

Tooth Apex ◽

The Difference

Gender determination of human remains can be very challenging, especially when the remains are incomplete. Herein, we report a proof-of-concept experiment in which the possibility of gender recognition using Raman spectroscopy of teeth is investigated. Raman spectra were recorded from male and female molars and premolars at two distinct sites, the tooth apex and the anatomical neck. Recorded spectra were sorted into suitable datasets and initially analyzed with principal component analysis, which showed a distinction between spectra of male and female teeth. Then, reduced datasets with scores of the first 20 principal components were formed and two classification algorithms, support vector machine and artificial neural networks, were applied to form classification models for gender recognition. The obtained results showed that gender recognition with Raman spectra of teeth is possible but strongly depends both on the tooth type and the spectrum recording site. The differences in classification accuracy between tooth types and recording sites are discussed in terms of the molecular structure differences caused by the influence of masticatory loading or gender-dependent life events.
