AITL: Adversarial Inductive Transfer Learning with input and output space adaptation for pharmacogenomics

2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i380-i388
Author(s):  
Hossein Sharifi-Noghabi ◽  
Shuman Peng ◽  
Olga Zolotareva ◽  
Colin C Collins ◽  
Martin Ester

Abstract. Motivation: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patients) with drug response outcomes are very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets, as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: (i) in the input space, the gene expression data differ because of differences in the basic biology, and (ii) in the output space, drug response is measured differently. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that train and test data are from the same distribution. Results: We propose Adversarial Inductive Transfer Learning (AITL), a deep neural network method for addressing discrepancies in input and output space between the pre-clinical and clinical datasets. AITL takes gene expression of patients and cell lines as the input, employs adversarial domain adaptation and multi-task learning to address these discrepancies, and predicts the drug response as the output. To the best of our knowledge, AITL is the first adversarial inductive transfer learning method to address both input and output discrepancies. Experimental results indicate that AITL outperforms state-of-the-art pharmacogenomics and transfer learning baselines and may guide precision oncology more accurately. Availability and implementation: https://github.com/hosseinshn/AITL. Supplementary information: Supplementary data are available at Bioinformatics online.
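
The interplay of multi-task learning and adversarial domain adaptation mentioned in this abstract can be sketched with plain Python. This is a hedged illustration only, not the authors' implementation (see the repository above for that). The function names, the weight `lam`, the choice of regression for cell lines versus classification for patients, and the 0/1 domain labels are all assumptions made for this sketch.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def multitask_loss(src_pred, src_true, tgt_prob, tgt_label):
    """Assumed setup: regression on source cell lines, classification on target patients."""
    regression = mean((p - t) ** 2 for p, t in zip(src_pred, src_true))
    classification = mean(bce(p, y) for p, y in zip(tgt_prob, tgt_label))
    return regression + classification

def discriminator_loss(dom_prob_src, dom_prob_tgt):
    """Domain discriminator loss; assumed labels: 0 = cell line, 1 = patient."""
    losses = [bce(p, 0) for p in dom_prob_src] + [bce(p, 1) for p in dom_prob_tgt]
    return mean(losses)

def extractor_objective(task_loss, disc_loss, lam=1.0):
    """The shared feature extractor lowers the task loss while raising the
    discriminator loss, which is the adversarial part of the sketch."""
    return task_loss - lam * disc_loss

# Hypothetical toy values: two cell lines, two patients.
task = multitask_loss([0.2, 0.4], [0.2, 0.6], [0.9, 0.2], [1, 0])
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
separated = discriminator_loss([0.1, 0.1], [0.9, 0.9])
```

In this toy setting a discriminator that cannot tell the domains apart (`confused`) incurs a higher loss than one that separates them (`separated`), so subtracting the discriminator loss rewards features on which cell lines and patients look alike.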

Author(s):  
Hossein Sharifi-Noghabi ◽  
Shuman Peng ◽  
Olga Zolotareva ◽  
Colin C. Collins ◽  
Martin Ester

Abstract. Motivation: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patients) with drug response outcomes are very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets, as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: 1) in the input space, the gene expression data differ because of differences in the basic biology, and 2) in the output space, drug response is measured differently. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that train and test data are from the same distribution. Results: We propose Adversarial Inductive Transfer Learning (AITL), a deep neural network method for addressing discrepancies in input and output space between the pre-clinical and clinical datasets. AITL takes gene expression of patients and cell lines as the input, employs adversarial domain adaptation and multi-task learning to address these discrepancies, and predicts the drug response as the output. To the best of our knowledge, AITL is the first adversarial inductive transfer learning method to address both input and output discrepancies. Experimental results indicate that AITL outperforms state-of-the-art pharmacogenomics and transfer learning baselines and may guide precision oncology more accurately. Availability of codes and supplementary material: https://github.com/hosseinshn/


2019 ◽  
Vol 35 (14) ◽  
pp. i501-i509 ◽  
Author(s):  
Hossein Sharifi-Noghabi ◽  
Olga Zolotareva ◽  
Colin C Collins ◽  
Martin Ester

Abstract. Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy, which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned that a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. Results: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI’s performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target, compared to training only on drug-specific inputs. MOLI’s high predictive power suggests it may have utility in precision oncology. Availability and implementation: https://github.com/hosseinshn/MOLI. Supplementary information: Supplementary data are available at Bioinformatics online.
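
The combined cost function described in this abstract (a triplet loss plus a binary cross-entropy loss) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The function names, the squared Euclidean distance, the margin of 1.0 and the weight `gamma` are hypothetical choices made for this example.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def sq_dist(a, b):
    """Squared Euclidean distance between two representations (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet(anchor, positive, negative, margin=1.0):
    """Hinge loss that pulls anchor and positive together and pushes the negative away."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)

def combined_cost(anchor, positive, negative, prob, label, gamma=0.5, margin=1.0):
    """Classification term plus a weighted triplet term (weighting is an assumption)."""
    return bce(prob, label) + gamma * triplet(anchor, positive, negative, margin)

# Hypothetical concatenated representations: two responders, one non-responder.
responder_a = [1.0, 0.0]
responder_b = [0.9, 0.1]
non_responder = [-1.0, 0.5]

cost = combined_cost(responder_a, responder_b, non_responder, prob=0.8, label=1)
```

In this toy example the two responders already lie close together and far from the non-responder, so the triplet term is zero and only the cross-entropy term contributes to `cost`. Swapping the positive and negative samples would make the triplet term positive.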


2021 ◽  
Author(s):  
Hossein Sharifi-Noghabi ◽  
Parsa Alamzadeh Harjandi ◽  
Olga Zolotareva ◽  
Colin C Collins ◽  
Martin Ester

Data discrepancy between preclinical and clinical datasets poses a major challenge for accurate drug response prediction based on gene expression data. Different methods of transfer learning have been proposed to address this data discrepancy. These methods generally use cell lines as source domains and patients, patient-derived xenografts, or other cell lines as target domains. However, they assume that they have access to the target domain during training or fine-tuning and they can only take labeled source domains as input. The former is a strong assumption that is not satisfied during deployment of these models in the clinic. The latter means these methods rely on labeled source domains which are of limited size. To avoid this assumption, we formulate drug response prediction as an out-of-distribution generalization problem which does not assume that the target domain is accessible during training. Moreover, to exploit unlabeled source domain data, which tends to be much more plentiful than labeled data, we adopt a semi-supervised approach. We propose Velodrome, a semi-supervised method of out-of-distribution generalization that takes labeled and unlabeled data from different resources as input and makes generalizable predictions. Velodrome achieves this goal by introducing an objective function that combines a supervised loss for accurate prediction, an alignment loss for generalization, and a consistency loss to incorporate unlabeled samples. Our experimental results demonstrate that Velodrome outperforms state-of-the-art pharmacogenomics and transfer learning baselines on cell lines, patient-derived xenografts, and patients and therefore, may guide precision oncology more accurately.
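
The three-part objective described in this abstract (a supervised loss, an alignment loss, and a consistency loss) can be sketched as a weighted sum. The abstract does not give the form of each term, so the choices below are illustrative assumptions rather than the Velodrome implementation: squared error for the supervised term, the distance between mean feature vectors of two datasets for alignment, and the disagreement of two predictors on the same unlabeled samples for consistency. All names and the weights `lam_align` and `lam_cons` are hypothetical.

```python
def mse(preds, targets):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)

def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def alignment(feats_a, feats_b):
    """Assumed alignment term: squared distance between the mean features of two datasets."""
    mean_a, mean_b = mean_vector(feats_a), mean_vector(feats_b)
    return sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))

def consistency(preds_1, preds_2):
    """Assumed consistency term: disagreement of two predictors on unlabeled samples."""
    return mse(preds_1, preds_2)

def objective(preds, targets, feats_a, feats_b, unl_1, unl_2,
              lam_align=1.0, lam_cons=1.0):
    """Supervised loss plus weighted alignment and consistency losses."""
    return (mse(preds, targets)
            + lam_align * alignment(feats_a, feats_b)
            + lam_cons * consistency(unl_1, unl_2))

# Hypothetical toy inputs: two labeled samples, two datasets, two unlabeled samples.
total = objective(
    preds=[0.5, 0.25], targets=[0.5, 0.75],
    feats_a=[[0.0, 1.0], [2.0, 1.0]], feats_b=[[1.0, 1.0], [1.0, 3.0]],
    unl_1=[0.5, 0.5], unl_2=[0.5, 1.0],
)
```

Only the supervised term needs labels. The alignment and consistency terms are computed from features and predictions alone, which is how unlabeled samples can contribute to such an objective.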


2008 ◽  
Vol 73 (3) ◽  
pp. 215-220 ◽  
Author(s):  
Daniel L. Silver ◽  
Kristin P. Bennett

2021 ◽  
pp. 502-517
Author(s):  
Michael Wilbur ◽  
Ayan Mukhopadhyay ◽  
Sayyed Vazirizade ◽  
Philip Pugliese ◽  
Aron Laszka ◽  
...  

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Yufeng Yao ◽  
Zhiming Cui

Epilepsy is a chronic disease caused by sudden abnormal discharge of brain neurons, which leads to transient brain dysfunction. Epileptic seizures are sudden and repetitive, and they seriously endanger patients’ health and cognition. EEG plays a vital role in the diagnosis, assessment, and qualitative localization of epilepsy in the clinical diagnosis of various epileptic seizures and is an indispensable means of detection. The study of the EEG signals of patients with epilepsy can provide a strong basis and useful information for an in-depth understanding of its pathogenesis. Intelligent classification technologies based on machine learning have been widely applied to the classification of epilepsy EEG signals and have shown their effectiveness. In practice, however, it is difficult to ensure that enough EEG data are always available for training the model, which affects the performance of the algorithms. To reduce the impact of insufficient data on detection performance, a novel inductive transfer learning method based on discriminative least squares regression (DLSR) is introduced and applied to improve the adaptability and accuracy of epilepsy EEG signal recognition. The proposed method inherits the advantages of DLSR: it is better suited to classification scenarios because it enlarges the margin between different classes. It can also use the data of the target domain and the knowledge of the source domain simultaneously, which helps it achieve better performance. The results show that the proposed method has advantages in EEG signal recognition compared with several other representative methods.

