PRECISE: a domain adaptation approach to transfer predictors of drug response from pre-clinical models to tumors

2019 ◽  
Vol 35 (14) ◽  
pp. i510-i519 ◽  
Author(s):  
Soufiane Mourragui ◽  
Marco Loog ◽  
Mark A van de Wiel ◽  
Marcel J T Reinders ◽  
Lodewyk F A Wessels

Abstract
Motivation: Cell lines and patient-derived xenografts (PDXs) have been used extensively to understand the molecular underpinnings of cancer. While core biological processes are typically conserved, these models also show important differences compared to human tumors, hampering the translation of findings from pre-clinical models to the human setting. In particular, employing drug response predictors generated on data derived from pre-clinical models to predict patient response remains a challenging task. As very large drug response datasets have been collected for pre-clinical models, and patient drug response data are often lacking, there is an urgent need for methods that efficiently transfer drug response predictors from pre-clinical models to the human setting.
Results: We show that cell lines and PDXs share common characteristics and processes with human tumors. We quantify this similarity and show that a regression model cannot simply be trained on cell lines or PDXs and then applied on tumors. We developed PRECISE, a novel methodology based on domain adaptation that captures the common information shared amongst pre-clinical models and human tumors in a consensus representation. Employing this representation, we train predictors of drug response on pre-clinical data and apply these predictors to stratify human tumors. We show that the resulting domain-invariant predictors show a small reduction in predictive performance in the pre-clinical domain but, importantly, reliably recover known associations between independent biomarkers and their companion drugs on human tumors.
Availability and implementation: PRECISE and the scripts for running our experiments are available on our GitHub page (https://github.com/NKI-CCB/PRECISE).
Supplementary information: Supplementary data are available at Bioinformatics online.
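The abstract above says the similarity between pre-clinical models and tumors is quantified, without stating how. As a minimal sketch, assuming each domain is summarized by its top principal components, one common way to compare two such subspaces is through the cosines of their principal angles. The data are synthetic and the function names `pca_basis` and `principal_angle_cosines` are hypothetical; this is not taken from the PRECISE code base.

```python
import numpy as np

def pca_basis(X, n_components):
    """Return an orthonormal basis (features x n_components) of the top principal directions."""
    Xc = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are principal directions, ordered by explained variance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components].T

def principal_angle_cosines(basis_a, basis_b):
    """Cosines of the principal angles between two subspaces given orthonormal bases."""
    s = np.linalg.svd(basis_a.T @ basis_b, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

rng = np.random.default_rng(0)
# Synthetic stand-ins (assumption): 60 "cell lines" and 80 "tumors" over the same 50 genes,
# both driven by the same 3 latent processes plus a little noise
shared = rng.normal(size=(3, 50))
cell_lines = rng.normal(size=(60, 3)) @ shared + 0.1 * rng.normal(size=(60, 50))
tumors = rng.normal(size=(80, 3)) @ shared + 0.1 * rng.normal(size=(80, 50))

cosines = principal_angle_cosines(pca_basis(cell_lines, 3), pca_basis(tumors, 3))
print(cosines)  # values close to 1 indicate well-aligned directions
```

Cosines near 1 suggest directions shared by both domains, while cosines near 0 suggest domain-specific variation that a predictor trained on one domain could not rely on in the other.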

2019 ◽  
Author(s):  
Soufiane Mourragui ◽  
Marco Loog ◽  
Marcel JT Reinders ◽  
Lodewyk FA Wessels

Abstract
Motivation: Cell lines and patient-derived xenografts (PDXs) have been used extensively to understand the molecular underpinnings of cancer. While core biological processes are typically conserved, these models also show important differences compared to human tumors, hampering the translation of findings from pre-clinical models to the human setting. In particular, employing drug response predictors generated on data derived from pre-clinical models to predict patient response remains a challenging task. As very large drug response datasets have been collected for pre-clinical models, and patient drug response data are often lacking, there is an urgent need for methods that efficiently transfer drug response predictors from pre-clinical models to the human setting.
Results: We show that cell lines and PDXs share common characteristics and processes with human tumors. We quantify this similarity and show that a regression model cannot simply be trained on cell lines or PDXs and then applied on tumors. We developed PRECISE, a novel methodology based on domain adaptation that captures the common information shared amongst pre-clinical models and human tumors in a consensus representation. Employing this representation, we train predictors of drug response on pre-clinical data and apply these predictors to stratify human tumors. We show that the resulting domain-invariant predictors show a small reduction in predictive performance in the pre-clinical domain but, importantly, reliably recover known associations between independent biomarkers and their companion drugs on human tumors.
Availability: PRECISE and the scripts for running our experiments are available on our GitHub page (https://github.com/NKI-CCB/PRECISE).
Contact: [email protected]
Supplementary information: Supplementary data are available online.


2021 ◽  
Vol 118 (49) ◽  
pp. e2106682118
Author(s):  
Soufiane M. C. Mourragui ◽  
Marco Loog ◽  
Daniel J. Vis ◽  
Kat Moore ◽  
Anna G. Manjon ◽  
...  

Preclinical models have been the workhorse of cancer research, producing massive amounts of drug response data. Unfortunately, translating response biomarkers derived from these datasets to human tumors has proven to be particularly challenging. To address this challenge, we developed TRANSACT, a computational framework that builds a consensus space to capture biological processes common to preclinical models and human tumors and exploits this space to construct drug response predictors that robustly transfer from preclinical models to human tumors. TRANSACT performs favorably compared to four competing approaches, including two deep learning approaches, on a set of 23 drug prediction challenges on The Cancer Genome Atlas and 226 metastatic tumors from the Hartwig Medical Foundation. We demonstrate that response predictions deliver a robust performance for a number of therapies of high clinical importance: platinum-based chemotherapies, gemcitabine, and paclitaxel. In contrast to other approaches, we demonstrate the interpretability of the TRANSACT predictors by correctly identifying known biomarkers of targeted therapies, and we propose potential mechanisms that mediate the resistance to two chemotherapeutic agents.


2019 ◽  
Vol 35 (17) ◽  
pp. 3055-3062 ◽  
Author(s):  
Amrit Singh ◽  
Casey P Shannon ◽  
Benoît Gautier ◽  
Florian Rohart ◽  
Michaël Vacher ◽  
...  

Abstract
Motivation: In the continuously expanding omics era, novel computational and statistical strategies are needed for data integration and identification of biomarkers and molecular signatures. We present Data Integration Analysis for Biomarker discovery using Latent cOmponents (DIABLO), a multi-omics integrative method that seeks common information across different data types through the selection of a subset of molecular features, while discriminating between multiple phenotypic groups.
Results: Using simulations and benchmark multi-omics studies, we show that DIABLO identifies features with superior biological relevance compared with existing unsupervised integrative methods, while achieving predictive performance comparable to state-of-the-art supervised approaches. DIABLO is versatile, allowing for modular-based analyses and cross-over study designs. In two case studies, DIABLO identified both known and novel multi-omics biomarkers consisting of mRNAs, miRNAs, CpGs, proteins and metabolites.
Availability and implementation: DIABLO is implemented in the mixOmics R Bioconductor package with functions for parameter choice and visualization to assist in the interpretation of the integrative analyses, along with tutorials on http://mixomics.org and in our Bioconductor vignette.
Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Author(s):  
Soufiane Mourragui ◽  
Marco Loog ◽  
Daniel J. Vis ◽  
Kat Moore ◽  
Anna G. Manjon ◽  
...  

Abstract
Pre-clinical models have been the workhorse of cancer research for decades. While powerful, these models do not fully recapitulate the complexity of human tumors. Consequently, translating biomarkers of drug response from pre-clinical models to human tumors has been particularly challenging. To explicitly take these differences into account and enable an efficient exploitation of the vast pre-clinical drug response resources, we developed TRANSACT, a novel computational framework for clinical drug response prediction. First, TRANSACT employs non-linear manifold learning to capture biological processes active in pre-clinical models and human tumors. Then, TRANSACT builds predictors on cell line response only and transfers these to Patient-Derived Xenografts (PDXs) and human tumors. TRANSACT outperforms four competing approaches, including Deep Learning approaches, for a set of 15 drugs on PDXs, TCGA cohorts and 226 metastatic tumors from the Hartwig Medical Foundation data. Deep Learning outperforms TRANSACT for only four drugs. We further derived an algorithmic approach to interpret TRANSACT and used it to validate the approach by identifying known biomarkers of targeted therapies, and we propose novel putative biomarkers of resistance to Paclitaxel and Gemcitabine.


2017 ◽  
Author(s):  
Chayaporn Suphavilai ◽  
Denis Bertrand ◽  
Niranjan Nagarajan

Abstract
Motivation: As we move towards an era of precision medicine, the ability to predict patient-specific drug responses in cancer based on molecular information such as gene expression data represents both an opportunity and a challenge. In particular, methods are needed that can accommodate the high dimensionality of data to learn interpretable models capturing drug response mechanisms, as well as providing robust predictions across datasets.
Results: We propose a method based on ideas from “recommender systems” (CaDRReS) that predicts cancer drug responses for unseen cell-lines/patients based on learning projections for drugs and cell-lines into a latent “pharmacogenomic” space. Comparisons with other proposed approaches for this problem based on large public datasets (CCLE, GDSC) show that CaDRReS provides consistently good models and robust predictions even across unseen patient-derived cell-line datasets. Analysis of the pharmacogenomic spaces inferred by CaDRReS also suggests that they can be used to understand drug mechanisms, identify cellular subtypes, and further characterize drug-pathway associations.
Availability: Source code and datasets are available at https://github.com/CSB5/[email protected]
Supplementary information: Supplementary data are available online.
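The idea of projecting drugs and cell lines into a shared latent space can be illustrated with a small sketch. It assumes a plain dot-product scoring rule with a per-drug offset, as in standard matrix-factorization recommenders; the abstract does not give the exact CaDRReS model, and all parameter names (`W_cell`, `Q_drug`, `drug_bias`) and values below are hypothetical and randomly generated rather than learned.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n_drugs, k = 20, 4, 3

# Hypothetical parameters (random here, purely for illustration; normally fitted to data)
W_cell = rng.normal(size=(n_genes, k))   # maps expression features into the latent space
Q_drug = rng.normal(size=(n_drugs, k))   # one latent vector per drug
drug_bias = rng.normal(size=n_drugs)     # per-drug offset

def predict_responses(expression, W_cell, Q_drug, drug_bias):
    """Score each drug for each sample as bias + <sample latent vector, drug latent vector>."""
    P = expression @ W_cell              # (samples, k) latent coordinates
    return P @ Q_drug.T + drug_bias      # (samples, drugs) predicted scores

# Two samples not used for fitting: only their expression profile is needed
unseen = rng.normal(size=(2, n_genes))
scores = predict_responses(unseen, W_cell, Q_drug, drug_bias)
print(scores.shape)
```

Because a sample enters only through its expression profile, a model of this shape can score samples that were never part of training, which is the property the abstract highlights.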


2019 ◽  
Vol 35 (18) ◽  
pp. 3263-3272
Author(s):  
Sahand Khakabimamaghani ◽  
Yogeshwar D Kelkar ◽  
Bruno M Grande ◽  
Ryan D Morin ◽  
Martin Ester ◽  
...  

Abstract
Motivation: Patient stratification methods are key to the vision of precision medicine. Here, we consider transcriptional data to segment the patient population into subsets relevant to a given phenotype. Whereas most existing patient stratification methods focus either on predictive performance or interpretable features, we developed a method striking a balance between these two important goals.
Results: We introduce a Bayesian method called SUBSTRA that uses regularized biclustering to identify patient subtypes and interpretable subtype-specific transcript clusters. The method iteratively re-weights feature importance to optimize phenotype prediction performance by producing more phenotype-relevant patient subtypes. We investigate the performance of SUBSTRA in finding relevant features using simulated data and successfully benchmark it against state-of-the-art unsupervised stratification methods and supervised alternatives. Moreover, SUBSTRA achieves predictive performance competitive with the supervised benchmark methods and provides interpretable transcriptional features in diverse biological settings, such as drug response prediction, cancer diagnosis, or kidney transplant rejection.
Availability and implementation: The R code of SUBSTRA is available at https://github.com/sahandk/SUBSTRA.
Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s):  
Calvin Chi ◽  
Yuting Ye ◽  
Bin Chen ◽  
Haiyan Huang

Abstract
Motivation: In pharmacogenomic studies, the biological context of cell lines influences the predictive ability of drug-response models and the discovery of biomarkers. Thus, similar cell lines are often studied together based on prior knowledge of biological annotations. However, this selection approach is not scalable with the number of annotations, and the relationship between gene-drug association patterns and biological context may not be obvious.
Results: We present a procedure to compare cell lines based on their gene-drug association patterns. Starting with a grouping of cell lines from biological annotation, we model gene-drug association patterns for each group as a bipartite graph between genes and drugs. This is accomplished by applying sparse canonical correlation analysis (SCCA) to extract the gene-drug associations, and using the canonical vectors to construct the edge weights. Then, we introduce a nuclear norm-based dissimilarity measure to compare the bipartite graphs. Accompanying our procedure is a permutation test to evaluate the significance of similarity of cell line groups in terms of gene-drug associations. In the pharmacogenomics datasets CTRP2, GDSC2, and CCLE, hierarchical clustering of carcinoma groups based on this dissimilarity measure uniquely reveals clustering patterns driven by carcinoma subtype rather than primary site. Next, we show that the top associated drugs or genes from SCCA can be used to characterize the clustering patterns of haematopoietic and lymphoid malignancies. Finally, we confirm by simulation that when drug responses are linearly dependent on expression, our approach is the only one that can effectively infer the true hierarchy compared to existing approaches.
Availability: Bipartite graph-based hierarchical clustering is implemented in R and can be obtained from CRAN: https://CRAN.R-project.org/package=hierBipartite. The source code is available at https://github.com/CalvinTChi/hierBipartite.
Supplementary information: Supplementary data are available at Bioinformatics online.
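To make the comparison of bipartite graphs concrete, here is a minimal sketch under stated assumptions: edge weights are taken as the outer product of one pair of canonical vectors, and the dissimilarity is the nuclear norm of the weight difference scaled by the sum of the individual nuclear norms. Both choices are illustrative guesses, since the abstract does not give the exact construction or normalization, and the function names are hypothetical rather than part of the hierBipartite package.

```python
import numpy as np

def edge_weights(gene_vec, drug_vec):
    """Assumed edge-weight construction: outer product of a gene and a drug canonical vector."""
    return np.outer(gene_vec, drug_vec)

def nuclear_norm_dissimilarity(W_a, W_b):
    """Nuclear norm of the difference, scaled to [0, 1] by the sum of the two nuclear norms."""
    num = np.linalg.norm(W_a - W_b, ord="nuc")
    den = np.linalg.norm(W_a, ord="nuc") + np.linalg.norm(W_b, ord="nuc")
    return num / den if den > 0 else 0.0

rng = np.random.default_rng(0)
# Random stand-ins for canonical vectors of two cell line groups (6 genes, 4 drugs)
W_a = edge_weights(rng.normal(size=6), rng.normal(size=4))
W_b = edge_weights(rng.normal(size=6), rng.normal(size=4))

print(nuclear_norm_dissimilarity(W_a, W_b))
```

By the triangle inequality the scaled value stays between 0 and 1, so a matrix of pairwise values over all groups could be passed directly to a standard hierarchical clustering routine.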


2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i380-i388
Author(s):  
Hossein Sharifi-Noghabi ◽  
Shuman Peng ◽  
Olga Zolotareva ◽  
Colin C Collins ◽  
Martin Ester

Abstract
Motivation: The goal of pharmacogenomics is to predict drug response in patients using their single- or multi-omics data. A major challenge is that clinical data (i.e. patients) with drug response outcome is very limited, creating a need for transfer learning to bridge the gap between large pre-clinical pharmacogenomics datasets (e.g. cancer cell lines), as a source domain, and clinical datasets as a target domain. Two major discrepancies exist between pre-clinical and clinical datasets: (i) in the input space, the gene expression data due to difference in the basic biology, and (ii) in the output space, the different measures of the drug response. Therefore, training a computational model on cell lines and testing it on patients violates the i.i.d. assumption that train and test data are from the same distribution.
Results: We propose Adversarial Inductive Transfer Learning (AITL), a deep neural network method for addressing discrepancies in input and output space between the pre-clinical and clinical datasets. AITL takes gene expression of patients and cell lines as the input, employs adversarial domain adaptation and multi-task learning to address these discrepancies, and predicts the drug response as the output. To the best of our knowledge, AITL is the first adversarial inductive transfer learning method to address both input and output discrepancies. Experimental results indicate that AITL outperforms state-of-the-art pharmacogenomics and transfer learning baselines and may guide precision oncology more accurately.
Availability and implementation: https://github.com/hosseinshn/AITL.
Supplementary information: Supplementary data are available at Bioinformatics online.


2016 ◽  
Vol 11 (2) ◽  
pp. 203-210 ◽  
Author(s):  
Jiguang Wang ◽  
Judith Kribelbauer ◽  
Raul Rabadan
