Application of network link prediction in drug discovery

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Khushnood Abbas ◽  
Alireza Abbasi ◽  
Shi Dong ◽  
Ling Niu ◽  
Laihang Yu ◽  
...  

Abstract Background Technological and research advances have produced large volumes of biomedical data. When represented as a network (graph), these data become useful for modeling entities and interactions in biological and similar complex systems. In network biology and network medicine, there is particular interest in predicting drug–drug, drug–disease, and protein–protein interactions to speed up drug discovery. Existing data and modern computational methods make it possible to identify potentially beneficial and harmful interactions and therefore to narrow the scope of drug trials ahead of actual clinical trials. Such automated, data-driven investigation relies on machine learning techniques. However, traditional machine learning approaches require extensive preprocessing of the data, which makes them impractical for large datasets. This study presents a wide range of machine learning methods for predicting outcomes from biomedical interactions and compares the performance of traditional methods with that of more recent network-based approaches. Results We applied 32 different network-based machine learning models to five commonly available biomedical datasets and evaluated their performance using three evaluation metrics: AUROC, AUPR, and F1-score. To do so, we cast link prediction as a binary classification problem, treating existing links as positive examples and randomly sampling negative examples from the set of non-existent links. In our experiments, Prone, ACT, and $LRW_5$ were the top three performers on all five datasets. Conclusions This work presents a comparative evaluation of well-known network-based machine learning algorithms for predicting network links, with applications to the prediction of drug–target and drug–drug interactions. Our work can guide researchers in selecting appropriate machine learning methods for pharmaceutical tasks.
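
The framing described in the Results (existing links as positives, randomly sampled non-existent links as negatives, scored by AUROC) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the toy network, the held-out split, and the common-neighbors scorer are all assumptions chosen for brevity, standing in for the 32 models the study actually evaluated.

```python
import random
from itertools import combinations


def sample_negative_edges(nodes, edges, k, seed=0):
    """Randomly sample k node pairs that are absent from the observed edge set."""
    observed = {frozenset(e) for e in edges}
    candidates = [p for p in combinations(sorted(nodes), 2)
                  if frozenset(p) not in observed]
    return random.Random(seed).sample(candidates, k)


def build_adjacency(nodes, edges):
    """Undirected adjacency sets built from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def common_neighbors(adj, u, v):
    """Simple similarity score (an assumed stand-in for a real model)."""
    return len(adj[u] & adj[v])


def auroc(pos_scores, neg_scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy network; node names carry no biological meaning.
nodes = ["a", "b", "c", "d", "e", "f"]
train_edges = [("a", "b"), ("b", "c"), ("c", "d"),
               ("d", "e"), ("e", "f"), ("a", "d")]
test_positives = [("a", "c"), ("d", "f")]  # held-out existing links
all_edges = train_edges + test_positives
test_negatives = sample_negative_edges(nodes, all_edges, k=len(test_positives))

# Score test pairs using only the training graph.
adj = build_adjacency(nodes, train_edges)
pos_scores = [common_neighbors(adj, u, v) for u, v in test_positives]
neg_scores = [common_neighbors(adj, u, v) for u, v in test_negatives]
print("AUROC:", auroc(pos_scores, neg_scores))
```

Holding the positive test links out of the graph before scoring avoids leaking the answer into the features; sampling as many negatives as positives keeps the toy evaluation balanced, which is a design choice of this sketch rather than something stated in the abstract.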

Big Data ◽  
2016 ◽  
pp. 1917-1933
Author(s):  
Basant Agarwal ◽  
Namita Mittal

Opinion mining, or sentiment analysis, is the study of people's opinions or sentiments, as expressed in text, towards entities such as products and services. It has always been important to know what other people think. With the rapid growth in the availability and popularity of online review sites, blogs, forums, and social networking sites, the need to analyse and understand these reviews has arisen. The main approaches to sentiment analysis can be categorized into semantic orientation-based approaches, knowledge-based approaches, and machine learning algorithms. This chapter surveys the machine learning approaches applied to sentiment analysis-based applications. Its main emphasis is the research involved in applying machine learning methods, mostly for sentiment classification at the document level. Machine learning-based approaches work in the following phases, which are discussed in detail in this chapter for sentiment classification: (1) feature extraction, (2) feature weighting schemes, (3) feature selection, and (4) machine learning methods. This chapter also discusses the standard free benchmark datasets and evaluation methods for sentiment analysis. The authors conclude the chapter with a comparative study of some state-of-the-art methods for sentiment analysis and some possible future research directions in opinion mining and sentiment analysis.
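
The first three phases listed above can be sketched in a few lines. This is only an illustrative outline under stated assumptions: unigram tokens as features, TF-IDF as the weighting scheme, and a minimum document frequency as the selection rule are choices made here, not necessarily the ones the chapter recommends, and the four reviews are invented. Phase (4), the classifier itself, is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Phase 1, feature extraction: lowercase unigram tokens."""
    return re.findall(r"[a-z']+", text.lower())


def select_features(tokenized_docs, min_df=2):
    """Phase 3, feature selection: keep terms found in at least min_df documents."""
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    vocab = sorted(term for term, count in df.items() if count >= min_df)
    return vocab, df


def tfidf_vectors(tokenized_docs, vocab, df):
    """Phase 2, feature weighting: term frequency times log inverse document frequency."""
    n_docs = len(tokenized_docs)
    vectors = []
    for doc in tokenized_docs:
        tf = Counter(doc)
        vectors.append([tf[term] * math.log(n_docs / df[term]) for term in vocab])
    return vectors


# Hypothetical reviews, invented for this sketch.
reviews = [
    "great phone, great battery",
    "terrible battery",
    "great screen",
    "terrible screen and terrible battery",
]
docs = [tokenize(r) for r in reviews]
vocab, df = select_features(docs)
vectors = tfidf_vectors(docs, vocab, df)
print(vocab)
print(vectors[0])
```

The resulting vectors would then be passed to whichever learning method is under study; selection is applied before weighting here only so that the vectors contain the retained terms.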


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0246102
Author(s):  
Daekyum Kim ◽  
Sang-Hun Kim ◽  
Taekyoung Kim ◽  
Brian Byunghyun Kang ◽  
Minhyuk Lee ◽  
...  

Soft robots have been extensively researched due to their flexible, deformable, and adaptive characteristics. However, compared to rigid robots, soft robots pose difficulties in modeling, calibration, and control, because the innate characteristics of soft materials can cause complex behaviors due to non-linearity and hysteresis. To overcome these limitations, recent studies have applied various approaches based on machine learning. This paper presents existing machine learning techniques in the soft robotics field and categorizes the implementation of machine learning approaches in different soft robotic applications, which include soft sensors, soft actuators, and applications such as soft wearable robots. It analyzes the trends of different machine learning approaches with respect to different types of soft robot applications, discusses the current limitations in the research field, and closes with a summary of the existing machine learning methods for soft robots.


2019 ◽  
Author(s):  
Saqib A Rahman ◽  
Robert C Walker ◽  
Megan A Lloyd ◽  
Ben L Grace ◽  
Gijs I van Boxel ◽  
...  

Abstract Objective To develop a predictive model for early recurrence after surgery for oesophageal adenocarcinoma using a large multi-national cohort. Summary Background Data Early cancer recurrence after oesophagectomy is a common problem with an incidence of 20-30% despite the widespread use of neoadjuvant treatment. Quantification of this risk is difficult and existing models perform poorly. Machine learning techniques potentially allow more accurate prognostication and have been applied in this study. Methods Consecutive patients who underwent oesophagectomy for adenocarcinoma and had neoadjuvant treatment in 6 UK and 1 Dutch oesophago-gastric units were analysed. Using clinical characteristics and post-operative histopathology, models were generated using elastic net regression (ELR) and the machine learning methods random forest (RF) and XGBoost (XGB). Finally, a combined (Ensemble) model of these was generated. The relative importance of factors to outcome was calculated as a percentage contribution to the model. Results In total 812 patients were included. The recurrence rate at less than 1 year was 29.1%. All of the models demonstrated good discrimination. Internally validated AUCs were similar, with the Ensemble model performing best (ELR=0.785, RF=0.789, XGB=0.794, Ensemble=0.806). Performance was similar when using internal-external validation (validation across sites, Ensemble AUC=0.804). In the final model the most important variables were number of positive lymph nodes (25.7%) and vascular invasion (16.9%). Conclusions The derived model using machine learning approaches and an international dataset provided excellent performance in quantifying the risk of early recurrence after surgery and will be useful in prognostication for clinicians and patients. Mini-Abstract Early recurrence after surgery for adenocarcinoma of the oesophagus is common. We derived a risk prediction model using modern machine learning methods that accurately predicts risk of early recurrence using post-operative pathology.


2020 ◽  
Author(s):  
Thomas R. Lane ◽  
Daniel H. Foil ◽  
Eni Minerali ◽  
Fabio Urbina ◽  
Kimberley M. Zorn ◽  
...  

Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies we have applied multiple machine learning algorithms, modeling metrics and in some cases compared molecular descriptors to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL in order to evaluate machine learning methods of interest to them. The largest of these types of studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and comparison of our proprietary software Assay Central™ with random forest, k-Nearest Neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance was assessed using an array of five-fold cross-validation metrics including area-under-the-curve, F1 score, Cohen’s kappa and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets all methods appeared comparable while the distance from the top indicated Assay Central™ and support vector classification were comparable. Unlike prior studies which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case where minimal tuning was performed of any of the methods. If anything, Assay Central™ may have been at a slight advantage as the activity cutoff for each of the over 5000 datasets representing over 570,000 unique compounds was based on Assay Central™ performance, but support vector classification seems to be a strong competitor. We also apply Assay Central™ to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, as well as further refining methods for evaluating and comparing models.
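
Three of the metrics named above (F1 score, Cohen's kappa, and Matthews correlation coefficient) can all be computed from a binary confusion matrix. The sketch below uses the standard textbook formulas; the counts are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """F1, Cohen's kappa and MCC from binary confusion-matrix counts."""
    n = tp + fp + fn + tn
    f1 = 2 * tp / (2 * tp + fp + fn)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)

    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom

    return {"f1": f1, "kappa": kappa, "mcc": mcc}


# Hypothetical counts for one active/inactive classification model.
scores = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(scores)
```

For brevity the sketch does not guard against degenerate matrices (for example a model that predicts only one class), where the kappa or MCC denominator becomes zero.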


2021 ◽  
Author(s):  
Pablo Cresta Morgado ◽  
Martín Carusso ◽  
Laura Alonso Alemany ◽  
Laura Acion

In a continuum with applied statistics, machine learning offers a wide variety of tools to explore, analyze, and understand addiction data. These tools include algorithms that can leverage useful information from data to build models capable of addressing different scientific problems. In this second part of a two-part machine learning review, we explain how to apply machine learning methods, describe the main limitations of machine learning approaches, and suggest ways to address them. Like other analytical tools, machine learning methods require careful implementation to carry out a reproducible and transparent research process with reliable results. This review describes a helpful workflow to guide the application of machine learning. This workflow has several steps: study design, data collection, data pre-processing, modeling, and communication. How to train, validate and test a model, detect and characterize overfitting, and determine an adequate sample size are some of the key issues to handle when applying machine learning. We also illustrate the process and particular nuances with examples of how researchers in addiction have applied machine learning techniques with different goals, study designs, or data sources.
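
The train, validate, and test step mentioned above can be illustrated with a simple random split. This is a sketch under assumptions: the 60/20/20 proportions, the function names, and the use of a train-minus-validation score gap as an overfitting signal are illustrative choices, not recommendations drawn from the review.

```python
import random


def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once with a fixed seed, then cut into three disjoint subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def overfitting_gap(train_score, val_score):
    """A large positive gap suggests the model fits the training data too closely."""
    return train_score - val_score


# Hypothetical dataset of 100 record identifiers.
records = list(range(100))
train, val, test = train_val_test_split(records)
print(len(train), len(val), len(test))  # 60 20 20
```

Fixing the seed makes the split reproducible, and the test subset is meant to be used once, after model selection on the validation subset is finished.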


Author(s):  
Małgorzata Grządzielewska

Abstract Accurate prediction provides a number of important benefits for research and decision-making. Occupational burnout is intertwined with individual, cultural, and social factors, the resolution of which requires methods that can deal with large amounts of data. The application of methods capable of dealing with large datasets is a relatively novel research area in social science. For this purpose, this article presents insights into machine learning methods, mainly related to prediction tasks, and briefly reviews their use in the burnout domain. It is shown that the choice of a method depends on the presence of certain dependent variables. This paper also presents a comparison between novel and traditional approaches, which shows that the appropriateness of a technique depends on the aim of the research. The theoretical and practical implications of using machine learning methods in this context are also presented. It is found that a gap exists in the study of burnout which requires the attention of social work researchers. Through machine learning techniques, new theoretical models of burnout can be created. These algorithms can also provide new approaches to creating data-driven interventions. Burnout monitoring systems supported by machine learning algorithms can also be used in recruitment processes and to supervise employees. Applying machine learning methods to reduce burnout can also provide socio-economic benefits, such as helping to reduce employee turnover and improve general working conditions.


2021 ◽  
Vol 10 (4) ◽  
pp. 199
Author(s):  
Francisco M. Bellas Aláez ◽  
Jesus M. Torres Palenzuela ◽  
Evangelos Spyrakos ◽  
Luis González Vilas

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
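
The trade-off described above between sensitivity, specificity, and false alarms on an unbalanced dataset can be made concrete with a small calculation. The confusion-matrix counts below are invented for illustration (10 bloom events in 100 samples) and do not come from the paper; the labels "model_a" and "model_b" are hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and precision from binary confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of real blooms that were detected
        "specificity": tn / (tn + fp),  # share of non-bloom samples correctly cleared
        "precision": tp / (tp + fp),    # share of alarms that were real blooms
    }


# Hypothetical results on 100 samples, of which 10 are bloom events.
model_a = screening_metrics(tp=9, fp=30, fn=1, tn=60)  # catches more blooms, many false alarms
model_b = screening_metrics(tp=7, fp=5, fn=3, tn=85)   # misses more blooms, far fewer false alarms

for name, metrics in (("model_a", model_a), ("model_b", model_b)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With so few positives, a modest drop in specificity translates into many false alarms, which is why precision falls sharply for the more sensitive model in this toy example.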


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Abstract Background Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.

