Machine Learning Methods in Drug Discovery

Iterative machine learning applied to annotation of text datasets

DOI: 10.5753/eniac.2021.18268, 2021

Author(s): Thiago Abdo, Fabiano Silva

Keyword(s): Machine Learning, Deep Learning, Learning Algorithms, Computational Cost, Machine Learning Techniques, Learning Approaches, Learning Techniques, The Creation, The Impact, High Computational Cost

This paper analyzes different machine learning approaches and algorithms for integration as automated assistance in a tool that aids the creation of new annotated datasets. We evaluate how they scale in an environment without dedicated machine learning hardware. In particular, we study their impact on a dataset with few examples and on one that is still being constructed. We experiment with deep learning algorithms (BERT) and with classical learning algorithms that have a lower computational cost (W2V and GloVe combined with RF and SVM). Our experiments show that deep learning algorithms have a performance advantage over classical techniques. However, their high computational cost makes them inadequate for an environment with reduced hardware resources. We then run simulations that use active and iterative machine learning techniques to assist the creation of new datasets. For these simulations, we use the classical learning algorithms because of their lower computational cost. The knowledge gathered from our experimental evaluation is intended to support the creation of a tool for building new text datasets.
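
The active learning loop mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which query strategy the authors used, so the least-confidence rule below is an assumption, and the function name and probability values are hypothetical. The idea is that a cheap classifier scores the unlabeled pool and the annotator is shown the examples the model is least sure about.

```python
def least_confident(probabilities, batch_size):
    """Return indices of the pool items whose top class probability is lowest.

    `probabilities` holds one list of class probabilities per unlabeled item.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return ranked[:batch_size]


# Hypothetical outputs of a low-cost classifier (for example RF or SVM over
# averaged word vectors) for four unlabeled texts and two classes.
pool_probs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.70, 0.30],
    [0.51, 0.49],
]

to_annotate = least_confident(pool_probs, batch_size=2)
print(to_annotate)  # [3, 1]
```

After the selected items are labeled, the classifier would be retrained and the pool rescored, which is why a low training cost matters in this setting.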

Application of Deep Learning for Credit Card Approval: A Comparison with Two Machine Learning Techniques

International Journal of Machine Learning and Computing, DOI: 10.18178/ijmlc.2021.11.4.1049, 2021, Vol 11 (4), pp. 286-290

Author(s): Md. Golam Kibria, Mehmet Sevkli

Keyword(s): Machine Learning, Deep Learning, Credit Card, Learning Algorithms, Learning Model, Machine Learning Algorithms, The Other, Machine Learning Techniques, Support Vector, Deep Learning Model

The growing number of credit card defaulters has forced companies to think carefully before approving credit applications. Credit card companies usually rely on their own judgment to determine whether a card should be issued to a customer who satisfies certain criteria. Some machine learning algorithms have also been used to support the decision. The main objective of this paper is to build a deep learning model, based on the UCI (University of California, Irvine) data sets, that can support the credit card approval decision. The performance of this model is then compared with two traditional machine learning algorithms: logistic regression (LR) and support vector machine (SVM). Our results show that the overall performance of the deep learning model is slightly better than that of the other two models.

Compound2Drug – a Machine/deep Learning Tool for Predicting the Bioactivity of PubChem Compounds

DOI: 10.26434/chemrxiv.13052951.v1, 2020

Author(s): Ben Geoffrey A S, Pavan Preetham Valluri, Akhil Sanker, Rafal Madaj, Host Antony Davidd, ...

Keyword(s): Machine Learning, Deep Learning, Molecular Docking, Drug Target, Drug Targets, Learning Algorithms, Network Data, Ligand Interaction, Pubchem Compound, Protein Ligand Interaction

Network data is composed of nodes and edges. Machine learning/deep learning algorithms have been applied successfully to network data for node classification and link prediction in social networks, where they drive highly customized suggestions for users. Similarly, one can apply machine learning/deep learning algorithms to biological network data to generate scientifically useful predictions. In the present work, a compound-drug target interaction data set from BindingDB has been used to train machine learning/deep learning algorithms, which then predict the drug targets for any PubChem compound queried by the user. The user inputs the PubChem Compound ID (CID) of the compound whose predicted biological activity is of interest, and the tool outputs the RCSB PDB IDs of the predicted drug targets. The tool also incorporates a feature to perform automated in silico modelling of the compound and its predicted drug targets to uncover their protein-ligand interaction profiles. The program fetches the structures of the compound and the predicted drug targets, prepares them for molecular docking using the standard AutoDock scripts that are part of MGLTools, performs molecular docking and protein-ligand interaction profiling, and stores the visualized results in the user's working folder. The program is hosted, supported and maintained at the following GitHub repository: https://github.com/bengeof/Compound2Drug
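
The structure-fetching step described in this abstract can be sketched as follows. This is a minimal illustration and not the tool's own code: it only builds download URLs, using the PubChem PUG REST endpoint for 3D SDF records and the RCSB file server. Whether Compound2Drug uses these exact endpoints is an assumption, and the function names and example identifiers (CID 2244, PDB ID 1HSG) are hypothetical inputs chosen for illustration.

```python
# URL templates for the public PubChem and RCSB download services.
PUBCHEM_SDF = (
    "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/{cid}/SDF"
    "?record_type=3d"
)
RCSB_PDB = "https://files.rcsb.org/download/{pdb_id}.pdb"


def ligand_url(cid):
    """Build the download URL for the 3D structure of a PubChem compound."""
    cid = int(cid)
    if cid <= 0:
        raise ValueError("a PubChem CID must be a positive integer")
    return PUBCHEM_SDF.format(cid=cid)


def target_url(pdb_id):
    """Build the download URL for a protein structure by 4-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a 4-character alphanumeric PDB ID")
    return RCSB_PDB.format(pdb_id=pdb_id)


# Hypothetical query: one compound and one predicted target.
print(ligand_url(2244))
print(target_url("1hsg"))
```

The downloaded files would then be passed to the docking preparation step; that part depends on a local MGLTools installation and is not shown here.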

A Very Large-Scale Bioactivity Comparison of Deep Learning and Multiple Machine Learning Algorithms for Drug Discovery

DOI: 10.26434/chemrxiv.12781241, 2020

Author(s): Thomas R. Lane, Daniel H. Foil, Eni Minerali, Fabio Urbina, Kimberley M. Zorn, ...

Keyword(s): Machine Learning, Neural Networks, Deep Learning, Drug Discovery, Deep Neural Networks, Learning Algorithms, Machine Learning Algorithms, Support Vector, Learning Methods, Machine Learning Methods

Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and beyond. In recent studies we have applied multiple machine learning algorithms and modeling metrics, and in some cases compared molecular descriptors, to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL to evaluate machine learning methods of interest to them. The largest of these studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and compared our proprietary software Assay Central™ with random forest, k-nearest neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance was assessed using an array of five-fold cross-validation metrics including area under the curve, F1 score, Cohen's kappa and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets, all methods appeared comparable, while the distance from the top indicated that Assay Central™ and support vector classification were comparable. Unlike prior studies, which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case, where minimal tuning was performed for any of the methods. If anything, Assay Central™ may have been at a slight advantage, as the activity cutoff for each of the over 5000 datasets (representing over 570,000 unique compounds) was based on Assay Central™ performance, but support vector classification seems to be a strong competitor. We also apply Assay Central™ to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, and further refine methods for evaluating and comparing models.
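
Three of the metrics named in this abstract (F1 score, Cohen's kappa and Matthews correlation coefficient) can all be computed from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute F1, Cohen's kappa and MCC from binary confusion-matrix counts.

    Assumes every marginal total is non-zero; degenerate matrices would need
    explicit handling to avoid division by zero.
    """
    n = tp + fp + fn + tn

    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Kappa compares observed agreement with agreement expected by chance.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    # MCC is the correlation between predicted and actual labels.
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical counts for one dataset and one model.
scores = binary_metrics(tp=40, fp=10, fn=20, tn=30)
print({name: round(value, 3) for name, value in scores.items()})
```

For these counts the sketch gives F1 of about 0.727, kappa of 0.4 and MCC of about 0.408, which shows how the chance-corrected metrics can be much lower than F1 on the same predictions.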

Leveraging Deep Learning Techniques for Malaria Parasite Detection Using Mobile Application

Wireless Communications and Mobile Computing, DOI: 10.1155/2020/8895429, 2020, Vol 2020, pp. 1-15

Author(s): Mehedi Masud, Hesham Alhumyani, Sultan S. Alshamrani, Omar Cheikhrouhou, Saleh Ibrahim, ...

Keyword(s): Machine Learning, Deep Learning, Mobile Application, Learning Algorithms, Machine Learning Algorithms, Machine Learning Techniques, Stochastic Gradient Descent, Cognitive Computing, Human Errors, Learning Techniques

Malaria is a contagious disease that affects millions of lives every year. Traditional laboratory diagnosis of malaria requires an experienced person and careful inspection to discriminate healthy from infected red blood cells (RBCs). It is also very time-consuming and may produce inaccurate reports due to human error. Cognitive computing and deep learning algorithms simulate human intelligence to support better decisions in applications such as sentiment analysis, speech recognition, face detection, and disease detection and prediction. With the advancement of cognitive computing and machine learning techniques, these methods are now widely used in healthcare to detect and predict early disease symptoms. With early prediction results, healthcare professionals can make better decisions on patient diagnosis and treatment. Machine learning algorithms also help humans process huge, complex medical datasets and turn them into clinical insights. This paper leverages deep learning algorithms to detect a deadly disease, malaria, and builds an effective mobile healthcare system for patients. Its objective is to show how a deep learning architecture such as a convolutional neural network (CNN) can detect malaria from input images in real time, effectively and accurately, and reduce manual labor through a mobile application. To this end, we evaluate the performance of a custom CNN model using a cyclical stochastic gradient descent (SGD) optimizer with an automatic learning rate finder and obtain an accuracy of 97.30% in classifying healthy and infected cell images, with a high degree of precision and sensitivity. This outcome will help bring microscopy diagnosis of malaria to a mobile application, improving the reliability of treatment and addressing the lack of medical expertise.
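
To make the idea of a cyclical learning rate concrete, the sketch below implements the common triangular schedule, in which the rate rises linearly from a lower bound to an upper bound and back again. The abstract does not state which cyclical policy or bounds the authors used, so the schedule shape, the function name and all numeric values here are assumptions for illustration only.

```python
import math


def triangular_lr(iteration, base_lr, max_lr, step_size):
    """Learning rate at a given iteration under a triangular cyclical policy.

    One full cycle lasts 2 * step_size iterations: step_size iterations
    rising from base_lr to max_lr, then step_size iterations falling back.
    """
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Position within the cycle: 1 at the boundaries, 0 at the peak.
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# Hypothetical bounds, such as a learning rate range test might suggest.
schedule = [triangular_lr(i, base_lr=0.001, max_lr=0.01, step_size=4)
            for i in range(9)]
print([round(lr, 5) for lr in schedule])
```

With these values the rate starts at 0.001, peaks at 0.01 on iteration 4 and returns to 0.001 on iteration 8, after which the cycle repeats.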

QPowered Compound2DeNovoDrugPropMax – A Novel Programmatic Tool Incorporating Deep Learning and In Silico Methods for Automated In Silico Bio-Activity Discovery for any Compound of Interest

DOI: 10.26434/chemrxiv.13052951.v3, 2021

Author(s): Ben Geoffrey A S, Rafal Madaj, Akhil Sanker, Pavan Preetham Valluri

Keyword(s): Machine Learning, Deep Learning, In Silico, Drug Target, Drug Targets, Learning Algorithms, Interaction Network, Network Data, Data Set, Target Interaction

Network data is composed of nodes and edges. Machine learning/deep learning algorithms have been applied successfully to network data for node classification and link prediction in social networks, where they drive highly customized suggestions for users. Similarly, one can apply machine learning/deep learning algorithms to biological network data to generate scientifically useful predictions. In the presented work, a compound-drug target interaction network data set from BindingDB has been used to train a deep learning neural network, and a multi-class classification has been implemented to classify the PubChem compound queried by the user into class labels of PDB IDs. In this way, target interaction prediction for PubChem compounds is carried out using deep learning. The user inputs the PubChem Compound ID (CID) of the compound whose predicted biological activity is of interest, and the tool outputs the RCSB PDB IDs of the predicted drug target interactions for the input CID. The tool further optimizes the user's compound of interest toward drug-likeness properties through a deep learning based structure optimization with a deep learning based drug-likeness optimization protocol. The tool also incorporates a feature to perform automated in silico modelling of the compound and the predicted drug targets to uncover their protein-ligand interaction profiles. The program is hosted, supported and maintained at the following GitHub repository: https://github.com/bengeof/Compound2DeNovoDrugPropMax

Anticipating the rise of quantum computing and quantum machine learning in drug discovery, we use the PennyLane interface to quantum hardware to turn the classical Keras layers used in our machine/deep learning models into quantum layers, and we introduce quantum layers into the classical models to produce a quantum-classical hybrid machine/deep learning model of our tool. The corresponding code is provided at https://github.com/bengeof/QPoweredCompound2DeNovoDrugPropMax
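
The multi-class formulation in this abstract, where each class label is a PDB ID, can be illustrated by the final step of such a classifier: turning the network's raw output scores into ranked target predictions. The sketch below is not the authors' model. The scores are made up, the three PDB IDs merely stand in for class labels, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, labels, k):
    """Return the k (label, probability) pairs with the highest probability."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Placeholder class labels and made-up output scores for one query compound.
class_labels = ["1HSG", "3ERT", "4HHB"]
output_scores = [0.2, 2.0, -1.0]

predictions = top_k_labels(output_scores, class_labels, k=2)
for label, prob in predictions:
    print(label, round(prob, 3))
```

Reporting the top few classes instead of only the single best one is a design choice that suits this use case, since a compound may plausibly interact with more than one target.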

Music Signal Analysis: Regression Analysis

DOI: 10.5121/csit.2021.111205, 2021

Author(s): V. N. Aditya Datta Chivukula, Sri Keshava Reddy Adupala

Keyword(s): Machine Learning, Regression Analysis, Deep Learning, Learning Algorithms, Machine Learning Algorithms, Machine Learning Techniques, Statistical Machine Learning, Ongoing Research, Music Signal, Learning Techniques

Machine learning techniques have become a vital part of ongoing research in technical areas. In recent times the world has witnessed many impressive practical applications of machine learning. This paper asks whether we should always rely on deep learning techniques, or whether simple statistical machine learning algorithms can outperform simple deep learning algorithms when the application is understood and the data is processed in a way that notably increases performance. The paper argues that data pre-processing matters more than the selection of the algorithm. It discusses functions involving trigonometric, logarithmic, and exponential terms, as well as purely trigonometric functions. Finally, we discuss regression analysis on music signals.
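
The claim that pre-processing can matter more than the choice of algorithm can be illustrated with a toy case. The sketch below fits the same ordinary least-squares line twice, once on the raw input and once after a sine transform, to data generated from a purely trigonometric function. The data, the function names and the choice of transform are assumptions made for this illustration and are not taken from the paper.

```python
import math


def fit_line(features, targets):
    """Closed-form least-squares fit of targets = slope * feature + intercept."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, targets))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_squared_error(features, targets, slope, intercept):
    """Average squared residual of the fitted line."""
    residuals = [(slope * x + intercept - y) ** 2
                 for x, y in zip(features, targets)]
    return sum(residuals) / len(residuals)


# Synthetic signal: y = 2 * sin(x) + 1 sampled on [0, 5.9].
xs = [0.1 * i for i in range(60)]
ys = [2.0 * math.sin(x) + 1.0 for x in xs]

# Same algorithm, raw input.
raw_slope, raw_intercept = fit_line(xs, ys)
raw_mse = mean_squared_error(xs, ys, raw_slope, raw_intercept)

# Same algorithm, input pre-processed with a sine transform.
sin_xs = [math.sin(x) for x in xs]
sin_slope, sin_intercept = fit_line(sin_xs, ys)
sin_mse = mean_squared_error(sin_xs, ys, sin_slope, sin_intercept)

print(round(raw_mse, 4), round(sin_mse, 4))
```

With the transform, the linear model recovers the generating coefficients (slope 2, intercept 1) and its error drops to essentially zero, while the fit on the raw input cannot follow the oscillation.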

SNF-NN: computational method to predict drug-disease interactions using similarity network fusion and neural networks

BMC Bioinformatics, DOI: 10.1186/s12859-020-03950-3, 2021, Vol 22 (1)

Author(s): Tamer N. Jarada, Jon G. Rokne, Reda Alhajj

Keyword(s): Machine Learning, Deep Learning, Cross Validation, Drug Repositioning, Computational Method, Machine Learning Techniques, Similarity Network, Novel Drug, Similarity Information, Fold Cross Validation

Background: Drug repositioning is an emerging approach in pharmaceutical research for identifying novel therapeutic potential for approved drugs and discovering therapies for untreated diseases. Due to its time and cost efficiency, drug repositioning plays an instrumental role in optimizing the drug development process compared to the traditional de novo drug discovery process. Advances in genomics, together with the enormous growth of large-scale publicly available data and the availability of high-performance computing capabilities, have further motivated the development of computational drug repositioning approaches. More recently, the rise of machine learning techniques, together with the availability of powerful computers, has made computational drug repositioning an area of intense activity. Results: In this study, a novel framework based on deep learning, SNF-NN, is presented, in which novel drug-disease interactions are predicted using drug-related similarity information, disease-related similarity information, and known drug-disease interactions. Heterogeneous similarity information related to drugs and diseases is fed to the proposed framework in order to predict novel drug-disease interactions. SNF-NN uses similarity selection, similarity network fusion, and a highly tuned novel neural network model to predict new drug-disease interactions. The robustness of SNF-NN is evaluated by comparing its performance with nine baseline machine learning methods. The proposed framework outperforms all baseline methods (AUC-ROC = 0.867 and AUC-PR = 0.876) using stratified 10-fold cross-validation. To further demonstrate the reliability and robustness of SNF-NN, two datasets are used to fairly validate the proposed framework's performance against seven recent state-of-the-art methods for drug-disease interaction prediction. SNF-NN achieves remarkable performance in stratified 10-fold cross-validation, with AUC-ROC ranging from 0.879 to 0.931 and AUC-PR from 0.856 to 0.903. Moreover, the efficiency of SNF-NN is verified by validating predicted unknown drug-disease interactions against clinical trials and published studies. Conclusion: Computational drug repositioning research can significantly benefit from integrating similarity measures in heterogeneous networks and deep learning models for predicting novel drug-disease interactions. The data and implementation of SNF-NN are available at http://pages.cpsc.ucalgary.ca/ tnjarada/snf-nn.php.
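
The AUC-ROC figures reported in this abstract have a simple interpretation: the probability that a randomly chosen true interaction receives a higher score than a randomly chosen non-interaction. The sketch below computes the metric directly from that definition by counting pairs. It is a generic illustration with made-up labels and scores, not the evaluation code of SNF-NN, and the function name is hypothetical.

```python
def auc_roc(labels, scores):
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predictions for six drug-disease pairs (1 = known interaction).
true_labels = [1, 1, 0, 1, 0, 0]
predicted_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = auc_roc(true_labels, predicted_scores)
print(round(auc, 3))
```

The pairwise count is quadratic in the number of examples, so it suits small illustrations; rank-based formulas give the same value more efficiently on large test folds.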

Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases

Briefings in Bioinformatics, DOI: 10.1093/bib/bby061, 2018, Vol 20 (5), pp. 1878-1912, Cited By ~ 45

Author(s): Ahmet Sureyya Rifaioglu, Heval Atas, Maria Jesus Martin, Rengul Cetin-Atalay, Volkan Atalay, ...

Keyword(s): Machine Learning, Deep Learning, Drug Discovery, Machine Intelligence, New Drugs, Machine Learning Techniques, The Past, Screening Experiments, Learning Techniques, Computational Drug Discovery

The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches are insufficient to explore novel drug-target interactions, mainly because of feasibility problems, as they are labour intensive, costly and time consuming. A computational field known as ‘virtual screening’ (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. These methods use the physico-chemical and structural properties of compounds and/or target proteins, along with experimentally verified bio-interaction information, to generate predictive models. Lately, sophisticated machine learning techniques have been applied in VS to improve predictive performance. The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented number of research studies on the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges, and suggest future directions. We believe that this survey will provide insight to researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
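
As a concrete example of how a compound representation is used in virtual screening, the sketch below compares binary fingerprints with the Tanimoto coefficient, a similarity measure commonly applied to such descriptors. The abstract does not single out this measure, so its use here is an assumption, and the fingerprints are made-up sets of "on" bit positions with hypothetical compound names.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Defined as |A intersect B| / |A union B|; two empty fingerprints score 0.0.
    """
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0
    return len(bits_a & bits_b) / union


# Made-up fingerprints: each set lists the bit positions that are switched on.
query = {1, 2, 3, 4}
library = {
    "compound_a": {3, 4, 5},
    "compound_b": {1, 2, 3, 4},
    "compound_c": {7, 8},
}

# Rank library compounds by similarity to the query, most similar first.
ranking = sorted(library, key=lambda name: tanimoto(query, library[name]),
                 reverse=True)
print(ranking)  # ['compound_b', 'compound_a', 'compound_c']
```

In a similarity-based screen, compounds near the top of such a ranking would be prioritized on the assumption that structurally similar compounds tend to share activity.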
