A Machine Learning Approach Predicts Tissue-Specific Drug Adverse Events

2018 ◽  
Author(s):  
Neel S. Madhukar ◽  
Kaitlyn Gayvert ◽  
Coryandar Gilvary ◽  
Olivier Elemento

ABSTRACT One of the main causes of failure in the drug development pipeline, or of withdrawal after approval, is the unexpected occurrence of severe drug adverse events. Even though such events should be detected by in vitro, in vivo, and human trials, they continue to arise unexpectedly at different stages of drug development, causing costly clinical trial failures and market withdrawals. Inspired by the “moneyball” approach used in baseball to integrate diverse features to predict player success, we hypothesized that a similar approach could leverage existing adverse event and tissue-specific toxicity data to learn how to predict adverse events. We introduce MAESTER, a data-driven machine learning approach that integrates information on a compound’s structure, targets, and phenotypic effects with tissue-wide genomic profiling and our toxic target database to predict the probability of a compound presenting with different types of tissue-specific adverse events. When tested on 6 different types of adverse events, MAESTER maintains high accuracy, sensitivity, and specificity across both the training data and new test sets. Additionally, MAESTER scores could flag a number of drugs that were approved but later withdrawn due to unknown adverse events – highlighting its potential to identify events missed by traditional methods. MAESTER can also be used to identify toxic targets for each tissue type. Overall, MAESTER provides a broadly applicable framework to identify toxic targets and predict specific adverse events, and it can accelerate the drug development pipeline and drive the design of new, safer compounds.

Terminology ◽  
2021 ◽  
Author(s):  
Ayla Rigouts Terryn ◽  
Véronique Hoste ◽  
Els Lefever

Abstract Automatic term extraction (ATE) is an important task within natural language processing, both separately, and as a preprocessing step for other tasks. In recent years, research has moved far beyond the traditional hybrid approach where candidate terms are extracted based on part-of-speech patterns and filtered and sorted with statistical termhood and unithood measures. While there has been an explosion of different types of features and algorithms, including machine learning methodologies, some of the fundamental problems remain unsolved, such as the ambiguous nature of the concept “term”. This has been a hurdle in the creation of data for ATE, meaning that datasets for both training and testing are scarce, and system evaluations are often limited and rarely cover multiple languages and domains. The ACTER Annotated Corpora for Term Extraction Research contain manual term annotations in four domains and three languages and have been used to investigate a supervised machine learning approach for ATE, using a binary random forest classifier with multiple types of features. The resulting system (HAMLET Hybrid Adaptable Machine Learning approach to Extract Terminology) provides detailed insights into its strengths and weaknesses. It highlights a certain unpredictability as an important drawback of machine learning methodologies, but also shows how the system appears to have learnt a robust definition of terms, producing results that are state-of-the-art, and contain few errors that are not (part of) terms in any way. Both the amount and the relevance of the training data have a substantial effect on results, and by varying the training data, it appears to be possible to adapt the system to various desired outputs, e.g., different types of terms. While certain issues remain difficult – such as the extraction of rare terms and multiword terms – this study shows how supervised machine learning is a promising methodology for ATE.


2019 ◽  
Vol 47 (1) ◽  
pp. 216-248
Author(s):  
Annelen Brunner

Abstract This contribution presents a quantitative approach to speech, thought and writing representation (ST&WR) and steps towards its automatic detection. Automatic detection is necessary for studying ST&WR in a large number of texts and thus identifying developments in form and usage over time and in different types of texts. The contribution summarizes results of a pilot study: First, it describes the manual annotation of a corpus of short narrative texts in relation to linguistic descriptions of ST&WR. Then, two different techniques of automatic detection – a rule-based and a machine learning approach – are described and compared. Evaluation of the results shows success with automatic detection, especially for direct and indirect ST&WR.


Electronics ◽  
2021 ◽  
Vol 10 (18) ◽  
pp. 2208
Author(s):  
Maria Anna Ferlin ◽  
Michał Grochowski ◽  
Arkadiusz Kwasigroch ◽  
Agnieszka Mikołajczyk ◽  
Edyta Szurowska ◽  
...  

Machine learning-based systems are gaining interest in the field of medicine, mostly in medical imaging and diagnosis. In this paper, we address the problem of automatic cerebral microbleed (CMB) detection in magnetic resonance images. It is challenging because a true CMB is difficult to distinguish from its mimics; however, if successfully solved, it would streamline radiologists' work. To deal with this complex three-dimensional problem, we propose a machine learning approach based on a 2D Faster RCNN network. We aimed to achieve a reliable system, i.e., one with balanced sensitivity and precision. Therefore, we have researched and analysed, among others, the impact of the way the training data are provided to the system, their pre-processing, the choice of model and its structure, and the means of regularisation. Furthermore, we carefully analysed the network predictions and proposed an algorithm for their post-processing. The proposed approach achieved high precision (89.74%), sensitivity (92.62%), and F1 score (90.84%). The paper presents the main challenges connected with automatic cerebral microbleed detection, an in-depth analysis of them, and the developed system. The conducted research may significantly contribute to automatic medical diagnosis.
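
For readers unfamiliar with the three metrics named in this abstract, the following minimal sketch shows how precision, sensitivity, and F1 are conventionally computed from detection counts. The function name and the counts are hypothetical and are not taken from the paper; figures averaged over several folds or scans need not satisfy the single-run formula exactly.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, sensitivity, F1) from true-positive, false-positive
    and false-negative detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f1


# Hypothetical counts, chosen only for illustration.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.4f} sensitivity={r:.4f} F1={f1:.4f}")
```

A "balanced" detector in this sense is one where neither false positives nor false negatives dominate, which is what the harmonic mean rewards.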


2020 ◽  
Vol 6 (39) ◽  
pp. eaba9338 ◽  
Author(s):  
George W. Ashdown ◽  
Michelle Dimon ◽  
Minjie Fan ◽  
Fernando Sánchez-Román Terán ◽  
Kathrin Witmer ◽  
...  

Drug resistance threatens the effective prevention and treatment of an ever-increasing range of human infections. This highlights an urgent need for new and improved drugs with novel mechanisms of action to avoid cross-resistance. Current cell-based drug screens are, however, restricted to binary live/dead readouts with no provision for mechanism of action prediction. Machine learning methods are increasingly being used to improve information extraction from imaging data. These methods, however, work poorly with heterogeneous cellular phenotypes and generally require time-consuming human-led training. We have developed a semi-supervised machine learning approach, combining human- and machine-labeled training data from mixed human malaria parasite cultures. Designed for high-throughput and high-resolution screening, our semi-supervised approach is robust to natural parasite morphological heterogeneity and correctly orders parasite developmental stages. Our approach also reproducibly detects and clusters drug-induced morphological outliers by mechanism of action, demonstrating the potential power of machine learning for accelerating cell-based drug discovery.


2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Eunchul Yoon ◽  
Soonbum Kwon ◽  
Unil Yun ◽  
Sun-Yong Kim

In this paper, we propose a Doppler spread estimation approach based on machine learning for an OFDM system. We present a carefully designed neural network architecture to achieve good performance in a mixed-channel scenario in which channel characteristic variables such as Rician K factor, azimuth angle of arrival (AOA) width, mean direction of azimuth AOA, and channel estimation errors are randomly generated. When preprocessing the channel state information (CSI) collected under the mixed-channel scenario, we propose averaged power spectral density (PSD) sequence as high-quality training data in machine learning for Doppler spread estimation. We detail intermediate mathematical derivatives of the machine learning process, making it easy to graft the derived results into other wireless communication technologies. Through simulation, we show that the machine learning approach using the averaged PSD sequence as training data outperforms the other machine learning approach using the channel frequency response (CFR) sequence as training data and two other existing Doppler estimation approaches.
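
As a rough illustration of what an averaged PSD sequence could look like, the sketch below splits a complex-valued channel time series into segments, computes a periodogram for each, and averages them (Bartlett-style averaging). This is an assumption about the general idea only; the abstract does not specify the paper's exact preprocessing, and the function names and the toy signal are hypothetical.

```python
import cmath
import math


def periodogram(segment):
    """Periodogram of one complex-valued segment, computed with a naive DFT."""
    n = len(segment)
    psd = []
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(segment))
        psd.append(abs(acc) ** 2 / n)
    return psd


def averaged_psd(samples, segment_len):
    """Split samples into non-overlapping segments and average their periodograms."""
    segments = [samples[i:i + segment_len]
                for i in range(0, len(samples) - segment_len + 1, segment_len)]
    if not segments:
        raise ValueError("need at least one full segment")
    psds = [periodogram(s) for s in segments]
    return [sum(col) / len(psds) for col in zip(*psds)]


# Toy input: a single complex tone at frequency bin 3 of a 16-point segment,
# standing in for a channel coefficient with one dominant Doppler shift.
n_seg = 16
tone = [cmath.exp(2j * math.pi * 3 * t / n_seg) for t in range(4 * n_seg)]
psd = averaged_psd(tone, n_seg)
peak_bin = max(range(n_seg), key=lambda k: psd[k])
print(peak_bin)
```

Averaging over segments reduces the variance of the spectral estimate, which is one plausible reason such a sequence would serve as cleaner training input than raw channel samples.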


Significant progress has recently been made with deep neural networks. Sharing trained deep neural network models has been an important factor in the rapid progress of research and development on these systems. In the digital environment, many types of applications face security-related attack sequences from third parties. Most machine learning approaches were introduced to address security against such attack sequences. Digital watermarking is one approach to handling adversarial attacks in the digital environment, but it has limitations in providing efficient security for web applications in real-time environments. In this paper, we therefore propose and implement an advanced machine learning approach, Neural Network based Click Prediction (NNBCP), to handle web-related attack sequences in real-time environments. It uses an integrated CAPTCHA procedure that provides machine learning based CAPTCHA generation for user login and registration, in order to handle different types of attacks in digital systems.


2020 ◽  
Vol 21 (10) ◽  
pp. 3585 ◽  
Author(s):  
Neann Mathai ◽  
Johannes Kirchmair

Computational methods for predicting the macromolecular targets of drugs and drug-like compounds have evolved as a key technology in drug discovery. However, the established validation protocols leave several key questions regarding the performance and scope of methods unaddressed. For example, prediction success rates are commonly reported as averages over all compounds of a test set and do not consider the structural relationship between the individual test compounds and the training instances. In order to obtain a better understanding of the value of ligand-based methods for target prediction, we benchmarked a similarity-based method and a random forest based machine learning approach (both employing 2D molecular fingerprints) under three testing scenarios: a standard testing scenario with external data, a standard time-split scenario, and a scenario that is designed to most closely resemble real-world conditions. In addition, we deconvoluted the results based on the distances of the individual test molecules from the training data. We found that, surprisingly, the similarity-based approach generally outperformed the machine learning approach in all testing scenarios, even in cases where queries were structurally clearly distinct from the instances in the training (or reference) data, and despite a much higher coverage of the known target space.
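
To make the similarity-based idea concrete, here is a minimal sketch of nearest-neighbour target ranking using Tanimoto similarity on 2D fingerprints. It is not the benchmarked method from the paper: fingerprints are modelled as sets of "on" bit indices, and the reference data, target names, and function names are all hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    inter = len(a & b)
    return inter / (len(a) + len(b) - inter)


def rank_targets(query: frozenset, reference):
    """Rank targets by the highest similarity between the query and any
    reference compound annotated with that target.

    reference: iterable of (fingerprint, target_name) pairs.
    """
    best = {}
    for fingerprint, target in reference:
        score = tanimoto(query, fingerprint)
        if score > best.get(target, -1.0):
            best[target] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical reference set: (fingerprint bits, known target).
reference = [
    (frozenset({1, 2, 3, 4}), "target_A"),
    (frozenset({3, 4, 5, 6}), "target_B"),
    (frozenset({7, 8, 9}), "target_C"),
]
query = frozenset({1, 2, 3, 5})
ranking = rank_targets(query, reference)
print(ranking)
```

The top similarity score also serves as the query's distance from the reference data, which is the quantity the abstract describes using to break down the results.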

