Application of network link prediction in drug discovery

Machine Learning Approaches for Sentiment Analysis

Big Data ◽

10.4018/978-1-4666-9840-6.ch088 ◽

2016 ◽

pp. 1917-1933

Author(s):

Basant Agarwal ◽

Namita Mittal

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Opinion Mining ◽

Machine Learning Algorithms ◽

Sentiment Classification ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Knowledge Based ◽

Semantic Orientation

Opinion mining, or sentiment analysis, is the study of people's opinions or sentiments, as expressed in text, towards entities such as products and services. It has always been important to know what other people think. With the rapid growth in the availability and popularity of online review sites, blogs, forums, and social networking sites, the need to analyse and understand these reviews has arisen. The main approaches to sentiment analysis can be categorized into semantic orientation-based, knowledge-based, and machine learning approaches. This chapter surveys the machine learning approaches applied to sentiment analysis-based applications. The main emphasis of this chapter is on the research involved in applying machine learning methods, mostly for sentiment classification at the document level. Machine learning-based approaches work in the following phases, which are discussed in detail in this chapter for sentiment classification: (1) feature extraction, (2) feature weighting schemes, (3) feature selection, and (4) machine learning methods. This chapter also discusses the standard free benchmark datasets and evaluation methods for sentiment analysis. The authors conclude the chapter with a comparative study of some state-of-the-art methods for sentiment analysis and some possible future research directions in opinion mining and sentiment analysis.
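The four phases listed in this abstract can be sketched end to end in a few lines of Python. This is only an illustrative toy, not the chapter's method: the reviews in `TRAIN` are invented, the class-difference ranking in `select_features` and the nearest-centroid classifier are stand-ins chosen for brevity, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy labelled reviews, invented for illustration only.
TRAIN = [
    ("great phone and great battery", "pos"),
    ("excellent screen love it", "pos"),
    ("terrible battery and poor screen", "neg"),
    ("poor service hate it", "neg"),
]

def extract_features(text):
    """Phase 1: feature extraction as unigram counts."""
    return Counter(text.lower().split())

def idf_weights(docs):
    """Phase 2: smoothed inverse document frequency for TF-IDF weighting."""
    n = len(docs)
    # Iterating a Counter yields each distinct term once per document.
    df = Counter(term for doc in docs for term in doc)
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def select_features(docs, labels, k):
    """Phase 3: keep the k terms whose document frequency differs most by class."""
    pos = Counter(t for doc, y in zip(docs, labels) if y == "pos" for t in doc)
    neg = Counter(t for doc, y in zip(docs, labels) if y == "neg" for t in doc)
    vocab = set(pos) | set(neg)
    ranked = sorted(vocab, key=lambda t: (-abs(pos[t] - neg[t]), t))
    return set(ranked[:k])

def train_centroids(docs, labels, idf, selected):
    """Phase 4: sum the TF-IDF vectors of each class into one centroid."""
    centroids = {}
    for doc, y in zip(docs, labels):
        centroid = centroids.setdefault(y, Counter())
        for t, tf in doc.items():
            if t in selected:
                centroid[t] += tf * idf[t]
    return centroids

def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(text, centroids, idf, selected):
    """Assign the label whose centroid is most similar to the document vector."""
    vec = {t: tf * idf[t] for t, tf in extract_features(text).items()
           if t in selected and t in idf}
    return max(sorted(centroids), key=lambda y: cosine(vec, centroids[y]))

docs = [extract_features(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = idf_weights(docs)
selected = select_features(docs, labels, k=8)
centroids = train_centroids(docs, labels, idf, selected)
print(classify("great phone", centroids, idf, selected))
```

With `k=8` the selection step drops terms such as "battery" and "screen" that occur equally often in both classes, which is the point of phase 3: the classifier only sees features that discriminate between labels.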

Review of machine learning methods in soft robotics

PLoS ONE ◽

10.1371/journal.pone.0246102 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0246102

Author(s):

Daekyum Kim ◽

Sang-Hun Kim ◽

Taekyoung Kim ◽

Brian Byunghyun Kang ◽

Minhyuk Lee ◽

...

Keyword(s):

Machine Learning ◽

Soft Materials ◽

Research Field ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Learning Methods ◽

Soft Robots ◽

Machine Learning Methods ◽

Wearable Robots ◽

Soft Actuators

Soft robots have been extensively researched due to their flexible, deformable, and adaptive characteristics. However, compared to rigid robots, soft robots pose difficulties in modeling, calibration, and control, because the innate characteristics of soft materials can cause complex behaviors due to non-linearity and hysteresis. To overcome these limitations, recent studies have applied various approaches based on machine learning. This paper presents existing machine learning techniques in the soft robotics field and categorizes the implementation of machine learning approaches in different soft robotic applications, which include soft sensors, soft actuators, and applications such as soft wearable robots. An analysis of the trends of different machine learning approaches with respect to different types of soft robot applications is presented, along with the current limitations in the research field, followed by a summary of the existing machine learning methods for soft robots.

Machine learning to predict early recurrence after oesophageal cancer surgery

10.1101/19001073 ◽

2019 ◽

Author(s):

Saqib A Rahman ◽

Robert C Walker ◽

Megan A Lloyd ◽

Ben L Grace ◽

Gijs I van Boxel ◽

...

Keyword(s):

Machine Learning ◽

External Validation ◽

Neoadjuvant Treatment ◽

Early Recurrence ◽

Machine Learning Techniques ◽

Oesophageal Adenocarcinoma ◽

Ensemble Model ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods

Objective: To develop a predictive model for early recurrence after surgery for oesophageal adenocarcinoma using a large multi-national cohort. Summary Background Data: Early cancer recurrence after oesophagectomy is a common problem with an incidence of 20-30% despite the widespread use of neoadjuvant treatment. Quantification of this risk is difficult and existing models perform poorly. Machine learning techniques potentially allow more accurate prognostication and have been applied in this study. Methods: Consecutive patients who underwent oesophagectomy for adenocarcinoma and had neoadjuvant treatment in 6 UK and 1 Dutch oesophago-gastric units were analysed. Using clinical characteristics and post-operative histopathology, models were generated using elastic net regression (ELR) and the machine learning methods random forest (RF) and XGBoost (XGB). Finally, a combined (Ensemble) model of these was generated. The relative importance of factors to outcome was calculated as a percentage contribution to the model. Results: In total 812 patients were included. The recurrence rate at less than 1 year was 29.1%. All of the models demonstrated good discrimination. Internally validated AUCs were similar, with the Ensemble model performing best (ELR=0.785, RF=0.789, XGB=0.794, Ensemble=0.806). Performance was similar when using internal-external validation (validation across sites, Ensemble AUC=0.804). In the final model the most important variables were number of positive lymph nodes (25.7%) and vascular invasion (16.9%). Conclusions: The derived model using machine learning approaches and an international dataset provided excellent performance in quantifying the risk of early recurrence after surgery and will be useful in prognostication for clinicians and patients. Mini-abstract: Early recurrence after surgery for adenocarcinoma of the oesophagus is common. We derived a risk prediction model using modern machine learning methods that accurately predicts risk of early recurrence using post-operative pathology.
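To make the AUC comparison concrete, the sketch below computes AUC as a pairwise ranking statistic and combines three models by averaging their predicted probabilities. The abstract does not say how the Ensemble model was built, so simple averaging is an assumption here; the labels and scores are invented and the function names are hypothetical.

```python
def auc(labels, scores):
    """Chance that a random positive case outranks a random negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_ensemble(*model_scores):
    """Assumed combination rule: unweighted mean of each model's probability."""
    return [sum(per_case) / len(per_case) for per_case in zip(*model_scores)]

# Invented example: 1 = early recurrence, 0 = no early recurrence.
labels = [1, 0, 1, 0, 1, 0]
elr = [0.9, 0.2, 0.4, 0.5, 0.7, 0.1]
rf = [0.8, 0.3, 0.6, 0.2, 0.3, 0.4]
xgb = [0.7, 0.6, 0.8, 0.3, 0.6, 0.2]
ensemble = average_ensemble(elr, rf, xgb)

for name, scores in [("ELR", elr), ("RF", rf), ("XGB", xgb), ("Ensemble", ensemble)]:
    print(name, round(auc(labels, scores), 3))
```

In this toy data each model misranks a different pair of patients, so averaging cancels the individual errors; real ensembles usually show far smaller gains, as the reported AUCs suggest.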

A Very Large-Scale Bioactivity Comparison of Deep Learning and Multiple Machine Learning Algorithms for Drug Discovery

10.26434/chemrxiv.12781241 ◽

2020 ◽

Author(s):

Thomas R. Lane ◽

Daniel H. Foil ◽

Eni Minerali ◽

Fabio Urbina ◽

Kimberley M. Zorn ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Drug Discovery ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies we have applied multiple machine learning algorithms, modeling metrics and in some cases compared molecular descriptors to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL in order to evaluate machine learning methods of interest to them. The largest of these types of studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and comparison of our proprietary software Assay Central™ with random forest, k-Nearest Neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance was assessed using an array of five-fold cross-validation metrics including area-under-the-curve, F1 score, Cohen’s kappa and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets all methods appeared comparable while the distance from the top indicated Assay Central™ and support vector classification were comparable. Unlike prior studies which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case where minimal tuning was performed of any of the methods. If anything, Assay Central™ may have been at a slight advantage as the activity cutoff for each of the over 5000 datasets representing over 570,000 unique compounds was based on Assay Central™ performance, but support vector classification seems to be a strong competitor. We also apply Assay Central™ to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, as well as further refining methods for evaluating and comparing models.
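Three of the metrics named in this abstract (F1 score, Cohen's kappa, and Matthews correlation coefficient) can all be derived from the four cells of a binary confusion matrix. The sketch below uses the standard textbook formulas on invented active/inactive predictions; it is not the authors' evaluation code, and the function names are hypothetical.

```python
import math

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means active."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def f1_score(tp, fp, fn, tn):
    """Harmonic mean of precision and recall; ignores true negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def cohen_kappa(tp, fp, fn, tn):
    """Observed agreement corrected for the agreement expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0

def matthews_corrcoef(tp, fp, fn, tn):
    """Correlation between predicted and true labels, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Invented predictions for ten compounds.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, f1_score(*counts), cohen_kappa(*counts), matthews_corrcoef(*counts))
```

In a five-fold cross-validation setting these functions would be applied to the held-out predictions of each fold and the results averaged; that outer loop is omitted here.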

Practical Foundations of Machine Learning for Addiction Research. Part II. Workflow and use cases.

10.31234/osf.io/6239v ◽

2021 ◽

Author(s):

Pablo Cresta Morgado ◽

Martín Carusso ◽

Laura Alonso Alemany ◽

Laura Acion

Keyword(s):

Machine Learning ◽

Research Process ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Learning Methods ◽

Design Data ◽

Machine Learning Methods ◽

Applied Machine Learning ◽

Collection Data ◽

Analytical Tools

In a continuum with applied statistics, machine learning offers a wide variety of tools to explore, analyze, and understand addiction data. These tools include algorithms that can leverage useful information from data to build models. These models are capable of addressing different scientific problems. In this second part of this two-part machine learning review, we describe how to apply machine learning methods. We explain the main limitations of machine learning approaches and ways to address them. Like other analytical tools, machine learning methods require careful implementation to carry out a reproducible and transparent research process with reliable results. This review describes a helpful workflow to guide the application of machine learning. This workflow has several steps: study design, data collection, data pre-processing, modeling, and communication. How to train, validate and test a model, detect and characterize overfitting, and determine an adequate sample size are some of the key issues to handle when applying machine learning. We also illustrate the process and particular nuances with examples of how researchers in addiction have applied machine learning techniques with different goals, study designs, or data sources.
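The train, validate, and test step mentioned in this abstract is commonly implemented as a reproducible three-way split, with overfitting read off as the gap between training and validation performance. The following is a minimal sketch under those assumptions; the 60/20/20 proportions, the fixed seed, and the function names are illustrative choices, not taken from the review.

```python
import random

def three_way_split(items, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle once with a fixed seed, then cut into train, validation, and test."""
    rng = random.Random(seed)  # a fixed seed keeps the split reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def overfitting_gap(train_score, val_score):
    """Positive values mean the model does better on data it has already seen."""
    return train_score - val_score

participants = list(range(100))  # stand-in for participant records
train, val, test = three_way_split(participants)
print(len(train), len(val), len(test))
print(overfitting_gap(train_score=0.95, val_score=0.70))
```

The test portion is set aside until model selection on the validation set is finished, so that the final estimate is not influenced by tuning decisions.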

A Very Large-Scale Bioactivity Comparison of Deep Learning and Multiple Machine Learning Algorithms for Drug Discovery

10.26434/chemrxiv.12781241.v1 ◽

2020 ◽

Author(s):

Thomas R. Lane ◽

Daniel H. Foil ◽

Eni Minerali ◽

Fabio Urbina ◽

Kimberley M. Zorn ◽

...

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Drug Discovery ◽

Deep Neural Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods

Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies we have applied multiple machine learning algorithms, modeling metrics and in some cases compared molecular descriptors to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL in order to evaluate machine learning methods of interest to them. The largest of these types of studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from ChEMBL for use with the ECFP6 fingerprint and comparison of our proprietary software Assay Central™ with random forest, k-Nearest Neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance was assessed using an array of five-fold cross-validation metrics including area-under-the-curve, F1 score, Cohen’s kappa and Matthews correlation coefficient. Based on ranked normalized scores for the metrics or datasets all methods appeared comparable while the distance from the top indicated Assay Central™ and support vector classification were comparable. Unlike prior studies which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case where minimal tuning was performed of any of the methods. If anything, Assay Central™ may have been at a slight advantage as the activity cutoff for each of the over 5000 datasets representing over 570,000 unique compounds was based on Assay Central™ performance, but support vector classification seems to be a strong competitor. We also apply Assay Central™ to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, as well as further refining methods for evaluating and comparing models.

Using Machine Learning in Burnout Prediction: A Survey

Child and Adolescent Social Work Journal ◽

10.1007/s10560-020-00733-w ◽

2021 ◽

Author(s):

Małgorzata Grządzielewska

Keyword(s):

Machine Learning ◽

Theoretical Models ◽

Research Area ◽

Economic Benefits ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Occupational Burnout ◽

Learning Methods ◽

Machine Learning Methods ◽

Traditional Approaches

Accurate prediction provides a number of important benefits for research and decision-making. Occupational burnout is intertwined with individual, cultural, and social factors, the resolution of which requires methods that can deal with large amounts of data. The application of such methods capable of dealing with large datasets is a relatively novel research area in social science. For this purpose, this article presents insights into machine learning methods, mainly related to prediction tasks. A brief review of these techniques in the burnout domain is provided. It is shown that the choice of a method depends on the presence of certain dependent variables. This paper also presents a comparison between novel and traditional approaches, which shows that the appropriateness of a technique depends on the aim of the research. The theoretical and practical implications of using machine learning methods in this context are also presented in the paper. It is found that a gap exists in the study of burnout that requires the attention of social work researchers. Through machine learning techniques, new theoretical models of burnout can be created. These algorithms can also provide new approaches to create data-driven interventions. Burnout monitoring systems supported by machine learning algorithms can also be used in recruitment processes and to supervise employees. Applying machine learning methods to reduce burnout can also provide socio-economic benefits, such as helping to reduce employee turnover and improve general working conditions.

Machine Learning Methods Applied to the Prediction of Pseudo-nitzschia spp. Blooms in the Galician Rias Baixas (NW Spain)

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10040199 ◽

2021 ◽

Vol 10 (4) ◽

pp. 199

Author(s):

Francisco M. Bellas Aláez ◽

Jesus M. Torres Palenzuela ◽

Evangelos Spyrakos ◽

Luis González Vilas

Keyword(s):

Machine Learning ◽

Performance Metrics ◽

Prediction Models ◽

Support Vector ◽

False Alarms ◽

Learning Approaches ◽

Learning Methods ◽

Machine Learning Methods ◽

Rías Baixas ◽

New Algorithms

This work presents new prediction models based on recent developments in machine learning methods, such as Random Forest (RF) and AdaBoost, and compares them with more classical approaches, i.e., support vector machines (SVMs) and neural networks (NNs). The models predict Pseudo-nitzschia spp. blooms in the Galician Rias Baixas. This work builds on a previous study by the authors (doi.org/10.1016/j.pocean.2014.03.003) but uses an extended database (from 2002 to 2012) and new algorithms. Our results show that RF and AdaBoost provide better prediction results compared to SVMs and NNs, as they show improved performance metrics and a better balance between sensitivity and specificity. Classical machine learning approaches show higher sensitivities, but at a cost of lower specificity and higher percentages of false alarms (lower precision). These results seem to indicate a greater adaptation of new algorithms (RF and AdaBoost) to unbalanced datasets. Our models could be operationally implemented to establish a short-term prediction system.
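The trade-off described in this abstract is easy to see with a small worked example. The counts below are invented for an unbalanced dataset of 10 bloom and 90 non-bloom samples and do not come from the paper; the false alarm ratio is taken here as the share of predicted blooms that were wrong (one minus precision), which is an assumption based on the abstract's wording, and the function name is hypothetical.

```python
def bloom_metrics(tp, fp, fn, tn):
    """Summarise a binary bloom / no-bloom confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),        # share of real blooms detected
        "specificity": tn / (tn + fp),        # share of non-blooms correctly cleared
        "precision": tp / (tp + fp),          # share of bloom alerts that were right
        "false_alarm_ratio": fp / (tp + fp),  # share of bloom alerts that were wrong
    }

# Invented counts: 10 bloom samples and 90 non-bloom samples in both cases.
high_sensitivity_model = bloom_metrics(tp=9, fp=27, fn=1, tn=63)
balanced_model = bloom_metrics(tp=7, fp=7, fn=3, tn=83)

for name, metrics in [("high sensitivity", high_sensitivity_model),
                      ("balanced", balanced_model)]:
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

The first model catches 9 of 10 blooms but three of every four alerts are false alarms; the second misses more blooms yet only half of its alerts are wrong, which mirrors the sensitivity versus specificity balance the abstract reports.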

Systematic literature review of machine learning methods used in the analysis of real-world data for patient-provider decision making

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01403-2 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Alan Brnabic ◽

Lisa M. Hess

Keyword(s):

Machine Learning ◽

Decision Making ◽

Literature Review ◽

Systematic Literature Review ◽

Real World ◽

Learning Algorithms ◽

External Validation ◽

Machine Learning Algorithms ◽

Learning Methods ◽

Machine Learning Methods

Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.
