MOLI: multi-omics late integration with deep neural networks for drug response prediction

Hossein Sharifi-Noghabi; Olga Zolotareva; Colin C Collins; Martin Ester

doi:10.1093/bioinformatics/btz318

MOLI: multi-omics late integration with deep neural networks for drug response prediction

Bioinformatics, 2019, Vol 35 (14), pp. i501-i509
doi:10.1093/bioinformatics/btz318
Cited By ~ 28

Author(s): Hossein Sharifi-Noghabi; Olga Zolotareva; Colin C Collins; Martin Ester

Keyword(s): Gene Expression; Neural Networks; Prediction Accuracy; Drug Response; Deep Neural Networks; Response Prediction; Supplementary Information; Precision Oncology

Abstract. Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy, which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. Results: We propose MOLI, a multi-omics late integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding sub-networks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI’s performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target compared to training only on drug-specific inputs. MOLI’s high predictive power suggests it may have utility in precision oncology. Availability and implementation: https://github.com/hosseinshn/MOLI. Supplementary information: Supplementary data are available at Bioinformatics online.
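
The late-integration idea in the abstract above can be sketched in a few lines of NumPy. This is only an illustrative forward pass, not the authors' implementation: the single-layer ReLU encoders, the layer sizes, the margin, the weighting factor `gamma`, the exhaustive triplet enumeration and the random toy inputs are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """One hypothetical encoder layer: linear map followed by ReLU."""
    return np.maximum(x @ w, 0.0)

def late_integrate(omics, weights):
    """Encode each omics matrix separately, then concatenate the features."""
    return np.concatenate([encode(x, w) for x, w in zip(omics, weights)], axis=1)

def binary_cross_entropy(logits, y):
    """Mean binary cross-entropy from raw logits (numerically stable form)."""
    return float(np.mean(np.maximum(logits, 0) - logits * y
                         + np.log1p(np.exp(-np.abs(logits)))))

def triplet_loss(z, y, margin=1.0):
    """Mean hinge loss over all (anchor, positive, negative) triplets.

    Anchor and positive share a label; the negative has the other label.
    """
    d = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)  # squared distances
    losses = []
    n = len(y)
    for a in range(n):
        for p in range(n):
            if p == a or y[p] != y[a]:
                continue
            for q in range(n):
                if y[q] != y[a]:
                    losses.append(max(d[a, p] - d[a, q] + margin, 0.0))
    return float(np.mean(losses)) if losses else 0.0

def combined_loss(omics, weights, w_clf, y, gamma=0.5, margin=1.0):
    """Cross-entropy on the classifier output plus a weighted triplet term."""
    z = late_integrate(omics, weights)
    logits = z @ w_clf
    return binary_cross_entropy(logits, y) + gamma * triplet_loss(z, y, margin)

# Toy data: 6 samples, three omics types with different feature counts.
n = 6
dims_in, dims_out = (20, 15, 30), (4, 3, 5)
omics = [rng.normal(size=(n, d)) for d in dims_in]
weights = [rng.normal(scale=0.1, size=(di, do)) for di, do in zip(dims_in, dims_out)]
w_clf = rng.normal(scale=0.1, size=sum(dims_out))
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = responder, 0 = non-responder

z = late_integrate(omics, weights)
loss = combined_loss(omics, weights, w_clf, y)
```

In a trained model the encoder and classifier weights would be learned by minimizing `combined_loss`; here they are random, so the value of `loss` only demonstrates that the two terms combine on the concatenated representation.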

MOLI: Multi-Omics Late Integration with deep neural networks for drug response prediction

2019
doi:10.1101/531327

Author(s): Hossein Sharifi-Noghabi; Olga Zolotareva; Colin C. Collins; Martin Ester

Keyword(s): Gene Expression; Neural Networks; Prediction Accuracy; Drug Response; Deep Neural Networks; Response Prediction; Precision Oncology; Early Integration

Abstract. Motivation: Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy, which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance. Results: We propose MOLI, a Multi-Omics Late Integration method based on deep neural networks. MOLI takes somatic mutation, copy number aberration, and gene expression data as input, and integrates them for drug response prediction. MOLI uses type-specific encoding subnetworks to learn features for each omics type, concatenates them into one representation and optimizes this representation via a combined cost function consisting of a triplet loss and a binary cross-entropy loss. The former makes the representations of responder samples more similar to each other and different from the non-responders, and the latter makes this representation predictive of the response values. We validate MOLI on in vitro and in vivo datasets for five chemotherapy agents and two targeted therapeutics. Compared to state-of-the-art single-omics and early integration multi-omics methods, MOLI achieves higher prediction accuracy in external validations. Moreover, a significant improvement in MOLI’s performance is observed for targeted drugs when training on a pan-drug input, i.e. using all the drugs with the same target compared to training only on drug-specific inputs. MOLI’s high predictive power suggests it may have utility in precision oncology. Availability of the implemented codes: https://github.com/hosseinshn/[email protected] and [email protected]

Clinical Drug Response Prediction by Using a Lq Penalized Network-Constrained Logistic Regression Method

Cellular Physiology and Biochemistry, 2018, Vol 51 (5), pp. 2073-2084
doi:10.1159/000495826
Cited By ~ 5

Author(s): Hai-Hui Huang; Jing-Guo Dai; Yong Liang

Keyword(s): Gene Expression; Logistic Regression; Personalized Medicine; Gene Expression Data; Drug Response; Prediction Models; Response Prediction; Expression Data

Background/Aims: One of the most important impacts of personalized medicine is the connection between patients’ genotypes and their drug responses. Despite a series of studies exploring this relationship, the predictive ability of such analyses still needs to be strengthened. Methods: Here we present the Lq penalized network-constrained logistic regression (Lq-NLR) method to meet this need, in which the predictors are integrated into the gene expression data and biological network knowledge and are combined with a more aggressive penalty function. Response prediction models for two cancer targeting drugs (erlotinib and sorafenib) were developed from gene expression data and IC50 values from a large panel of cancer cell lines by utilizing the proposed approach. Then the drug responders were tested with the baseline tumor gene expression data, yielding an in vivo drug sensitivity prediction. Results: These results demonstrated the high effectiveness of this approach. One of the best results achieved by our method was a correlation of 0.841 between the cell line in vitro drug response and patient’s in vivo drug response. We then applied these two drug prediction models to develop a personalized medicine approach in which the subsequent treatment depends on each patient’s gene-expression profile. Conclusion: The proposed method is much better than the existing approach and can capture a more accurate reflection of the relationship between genotypes and phenotypes.
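
The abstract above does not state the Lq-NLR objective itself. As a rough sketch only, one common way to combine these ingredients is a logistic loss plus an Lq penalty (0 < q < 1) and a graph-Laplacian smoothness term over a gene network. The function name, the weights `lam` and `eta`, the choice q = 0.5 and the toy three-gene network below are all assumptions for illustration, not the authors' formulation.

```python
import numpy as np

def lq_network_objective(beta, x, y, laplacian, lam=0.1, eta=0.1, q=0.5):
    """Logistic loss + lam * sum(|beta|^q) + eta * beta^T L beta (hypothetical form)."""
    logits = x @ beta
    # Mean negative log-likelihood of the logistic model, in a stable form.
    nll = np.mean(np.maximum(logits, 0) - logits * y
                  + np.log1p(np.exp(-np.abs(logits))))
    lq_penalty = np.sum(np.abs(beta) ** q)      # sparsity-inducing, non-convex for q < 1
    smoothness = beta @ laplacian @ beta        # penalizes differences across network edges
    return float(nll + lam * lq_penalty + eta * smoothness)

# Toy network: genes 0 and 1 are connected, gene 2 is isolated.
laplacian = np.array([[1.0, -1.0, 0.0],
                      [-1.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0]])
x = np.zeros((4, 3))            # uninformative features keep the arithmetic easy to follow
y = np.array([1, 0, 1, 0])

agree = lq_network_objective(np.array([1.0, 1.0, 0.0]), x, y, laplacian)
disagree = lq_network_objective(np.array([1.0, -1.0, 0.0]), x, y, laplacian)
```

With these toy inputs the two coefficient vectors have the same logistic loss and the same Lq penalty, so `disagree` exceeds `agree` only through the network term: connected genes with opposite-signed coefficients are penalized.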

FORESEE: a tool for the systematic comparison of translational drug response modeling pipelines

2018
doi:10.7287/peerj.preprints.27256v1

Author(s): Lisa-Katrin Turnhoff; Ali Hadizadeh Esfahani; Maryam Montazeri; Nina Kusch; Andreas Schuppert

Keyword(s): Drug Response; Drug Efficacy; Response Prediction; R Package; Supplementary Information; Supplementary File; Data Sets; Training Algorithms; Model Training

Translational models that utilize omics data generated in in vitro studies to predict the drug efficacy of anti-cancer compounds in patients are highly distinct, which complicates the benchmarking process for new computational approaches. In reaction to this, we introduce the uniFied translatiOnal dRug rESponsE prEdiction platform FORESEE, an open-source R-package. FORESEE not only provides a uniform data format for public cell line and patient data sets, but also establishes a standardized environment for drug response prediction pipelines, incorporating various state-of-the-art preprocessing methods, model training algorithms and validation techniques. The modular implementation of individual elements of the pipeline facilitates a straightforward development of combinatorial models, which can be used to re-evaluate and improve already existing pipelines as well as to develop new ones. Availability and Implementation: FORESEE is licensed under GNU General Public License v3.0 and available at https://github.com/JRC-COMBINE/FORESEE . Supplementary Information: Supplementary Files 1 and 2 provide detailed descriptions of the pipeline and the data preparation process, while Supplementary File 3 presents basic use cases of the package. Contact: [email protected]

Deciphering the rules of mRNA structure differentiation in Saccharomyces cerevisiae in vivo and in vitro with deep neural networks

RNA Biology, 2019, Vol 16 (8), pp. 1044-1054
doi:10.1080/15476286.2019.1612692
Cited By ~ 4

Author(s): Haopeng Yu; Wenjing Meng; Yuanhui Mao; Yi Zhang; Qing Sun; ...

Keyword(s): Saccharomyces Cerevisiae; Neural Networks; Deep Neural Networks; mRNA Structure

In vivo drug-response in patients with leukemic non-Hodgkin's lymphomas is associated with in vitro chemosensitivity and gene expression profiling

Pharmacological Research, 2006, Vol 53 (1), pp. 49-61
doi:10.1016/j.phrs.2005.09.001
Cited By ~ 14

Author(s): K CHOW; D NOWAK; S KIM; B SCHNEIDER; M KOMOR; ...

Keyword(s): Gene Expression; Gene Expression Profiling; Expression Profiling; Drug Response; In Vitro Chemosensitivity; Hodgkin's Lymphomas

NPalmitoylDeep-PseAAC: A Predictor for N-Palmitoylation sites in Proteins using Deep Representations of Proteins and PseAAC via modified 5-steps rule

Current Bioinformatics, 2020, Vol 15
doi:10.2174/1574893615999200605142828

Author(s): Sheraz Naseer; Waqar Hussain; Yaser Daanial Khan; Nouman Rasool

Keyword(s): Neural Networks; Prediction Model; Deep Neural Networks; Ex Vivo; Feature Representation; Post Translational Modification; Lipid Modifications; Gated Recurrent Unit

Background: Among the major post-translational modifications, lipid modifications are especially significant because of their widespread functional importance in eukaryotic cells. There are multiple types of lipid modification; Palmitoylation is one of the broader types and itself has three different types. N-Palmitoylation is carried out by attachment of palmitic acid to an N-terminal cysteine. Because N-Palmitoylation is associated with various biological functions and with diseases such as Alzheimer’s and other neurodegenerative diseases, and carries out important processes in the life cycle of various pathogens, its identification is very important. Objective: The in vitro, ex vivo and in vivo identification of Palmitoylation is laborious, time-consuming and costly. There is a pressing need for an efficient and accurate computational model to help researchers and biologists identify these sites easily. Herein, we propose a novel prediction model for identification of N-Palmitoylation sites in proteins. Method: The proposed prediction model is developed by combining Chou’s Pseudo Amino Acid Composition (PseAAC) with deep neural networks. We used well-known deep neural networks (DNNs) both for learning a feature representation of peptide sequences and for developing a prediction model to perform classification. Results: Among the different DNNs, the Gated Recurrent Unit (GRU) based RNN model showed the highest scores in terms of accuracy and all other computed measures, and outperformed all previously reported predictors. Conclusion: The proposed GRU based RNN model can help identify N-Palmitoylation efficiently and accurately, which can help scientists understand the mechanism of this modification in proteins.

FORESEE: a tool for the systematic comparison of translational drug response modeling pipelines

Bioinformatics, 2019, Vol 35 (19), pp. 3846-3848
doi:10.1093/bioinformatics/btz145
Cited By ~ 6

Author(s): Lisa-Katrin Turnhoff; Ali Hadizadeh Esfahani; Maryam Montazeri; Nina Kusch; Andreas Schuppert

Keyword(s): Drug Response; Drug Efficacy; Response Prediction; R Package; Supplementary Information; Training Algorithms; Anti Cancer; Model Training; Response Modeling

Abstract. Summary: Translational models that utilize omics data generated in in vitro studies to predict the drug efficacy of anti-cancer compounds in patients are highly distinct, which complicates the benchmarking process for new computational approaches. In reaction to this, we introduce the uniFied translatiOnal dRug rESponsE prEdiction platform FORESEE, an open-source R-package. FORESEE not only provides a uniform data format for public cell line and patient datasets, but also establishes a standardized environment for drug response prediction pipelines, incorporating various state-of-the-art pre-processing methods, model training algorithms and validation techniques. The modular implementation of individual elements of the pipeline facilitates a straightforward development of combinatorial models, which can be used to re-evaluate and improve already existing pipelines as well as to develop new ones. Availability and implementation: FORESEE is licensed under GNU General Public License v3.0 and available at https://github.com/JRC-COMBINE/FORESEE and https://doi.org/10.17605/OSF.IO/RF6QK, and provides vignettes for documentation and application both online and in the Supplementary Files 2 and 3. Supplementary information: Supplementary data are available at Bioinformatics online.

iTOP: Inferring the Topology of Omics Data

2018
doi:10.1101/293993

Author(s): Nanne Aben; Johan A. Westerhuis; Yipeng Song; Henk A.L. Kiers; Magali Michaut; ...

Keyword(s): Gene Expression; Binary Data; Drug Response; Response Prediction; R Package; Supplementary Information; Omics Data; Reconstruction Algorithms; Phenotypic Data; RV Coefficient

Abstract. Motivation: In biology, we are often faced with multiple datasets recorded on the same set of objects, such as multi-omics and phenotypic data of the same tumors. These datasets are typically not independent from each other. For example, methylation may influence gene expression, which may, in turn, influence drug response. Such relationships can strongly affect analyses performed on the data, as we have previously shown for the identification of biomarkers of drug response. Therefore, it is important to be able to chart the relationships between datasets. Results: We present iTOP, a methodology to infer a topology of relationships between datasets. We base this methodology on the RV coefficient, a measure of matrix correlation, which can be used to determine how much information is shared between two datasets. We extended the RV coefficient for partial matrix correlations, which allows the use of graph reconstruction algorithms, such as the PC algorithm, to infer the topologies. In addition, since multi-omics data often contain binary data (e.g. mutations), we also extended the RV coefficient for binary data. Applying iTOP to pharmacogenomics data, we found that gene expression acts as a mediator between most other datasets and drug response: only proteomics clearly shares information with drug response that is not present in gene expression. Based on this result, we used TANDEM, a method for drug response prediction, to identify which variables predictive of drug response were distinct to either gene expression or proteomics. Availability: An implementation of our methodology is available in the R package iTOP on CRAN. Additionally, an R Markdown document with code to reproduce all figures is provided as Supplementary [email protected] and [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
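
For readers unfamiliar with the RV coefficient mentioned above, the following minimal sketch computes the standard (unextended) version for two data matrices measured on the same samples. It does not reproduce the partial or binary extensions that iTOP introduces; the function name and the random toy matrices are assumptions for illustration.

```python
import numpy as np

def rv_coefficient(x, y):
    """Standard RV coefficient between two matrices with matching rows (samples)."""
    xc = x - x.mean(axis=0)          # column-center each dataset
    yc = y - y.mean(axis=0)
    sx = xc @ xc.T                   # sample-by-sample configuration matrices
    sy = yc @ yc.T
    # trace(sx @ sy) equals the elementwise sum because both matrices are symmetric.
    return float(np.sum(sx * sy) / np.sqrt(np.sum(sx * sx) * np.sum(sy * sy)))

rng = np.random.default_rng(0)
expr = rng.normal(size=(10, 5))                   # 10 samples, 5 features
q, _ = np.linalg.qr(rng.normal(size=(5, 5)))      # random orthogonal matrix
rotated = 2.0 * expr @ q                          # same information, rotated and rescaled
unrelated = rng.normal(size=(10, 4))              # independent dataset, different width

rv_same = rv_coefficient(expr, rotated)
rv_other = rv_coefficient(expr, unrelated)
```

Because the coefficient compares sample-by-sample configurations, it is unchanged by rotating or rescaling the features, so `rv_same` is 1 up to rounding, while `rv_other` falls between 0 and 1.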

Modeling positional effects of regulatory sequences with spline transformations increases prediction accuracy of deep neural networks

2017
doi:10.1101/165183
Cited By ~ 1

Author(s): Žiga Avsec; Mohammadamin Barekatain; Jun Cheng; Julien Gagneur

Keyword(s): Neural Network; Neural Networks; Deep Learning; Prediction Accuracy; Deep Neural Networks; RNA Binding; Piecewise Linear; Regulatory Sequences; Link Type

Abstract. Motivation: Regulatory sequences are not solely defined by their nucleic acid sequence but also by their relative distances to genomic landmarks such as transcription start site, exon boundaries, or polyadenylation site. Deep learning has become the approach of choice for modeling regulatory sequences because of its strength to learn complex sequence features. However, modeling relative distances to genomic landmarks in deep neural networks has not been addressed. Results: Here we developed spline transformation, a neural network module based on splines to flexibly and robustly model distances. Modeling distances to various genomic landmarks with spline transformations significantly increased state-of-the-art prediction accuracy of in vivo RNA-binding protein binding sites for 114 out of 123 proteins. We also developed a deep neural network for human splice branchpoint based on spline transformations that outperformed the current best, already distance-based, machine learning model. Compared to piecewise linear transformation, as obtained by composition of rectified linear units, spline transformation yields higher prediction accuracy as well as faster and more robust training. As spline transformation can be applied to further quantities beyond distances, such as methylation or conservation, we foresee it as a versatile component in the genomics deep learning toolbox. Availability: Spline transformation is implemented as a Keras layer in the CONCISE python package: https://github.com/gagneurlab/concise. Analysis code is available at goo.gl/[email protected]; [email protected]
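
The abstract above says only that the module is "based on splines". As a hedged illustration of the general idea, the sketch below expands a scalar position into a B-spline basis (via the Cox-de Boor recursion) and takes a weighted sum of the basis functions. The choice of B-splines, the knot layout, the function names and the constant weights are assumptions for this example and are not taken from the CONCISE package; in a network the weights would be learned.

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at 1-D points x (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    knots = np.asarray(knots, dtype=float)
    # Degree 0: indicator of each half-open knot interval.
    basis = np.array(
        [(knots[i] <= x) & (x < knots[i + 1]) for i in range(len(knots) - 1)],
        dtype=float,
    ).T
    for k in range(1, degree + 1):
        raised = np.zeros((len(x), len(knots) - 1 - k))
        for i in range(len(knots) - 1 - k):
            left = knots[i + k] - knots[i]
            right = knots[i + k + 1] - knots[i + 1]
            if left > 0:   # skip terms over zero-width (repeated) knot spans
                raised[:, i] += (x - knots[i]) / left * basis[:, i]
            if right > 0:
                raised[:, i] += (knots[i + k + 1] - x) / right * basis[:, i + 1]
        basis = raised
    return basis

def spline_transform(x, knots, degree, weights):
    """Smooth scalar function of x: weighted sum of B-spline basis functions."""
    return bspline_basis(x, knots, degree) @ weights

degree = 3
# Clamped knot vector on [0, 1]: boundary knots repeated degree + 1 times in total.
knots = np.concatenate([np.zeros(degree), np.linspace(0.0, 1.0, 5), np.ones(degree)])
positions = np.linspace(0.0, 0.99, 50)      # e.g. rescaled distances to a landmark
basis = bspline_basis(positions, knots, degree)
weights = np.full(basis.shape[1], 2.0)      # stand-in for learned weights
flat = spline_transform(positions, knots, degree, weights)
```

Inside the knot range the basis functions are non-negative and sum to one, so constant weights give a constant output; varying the weights bends the curve smoothly, which is what makes such a transformation a flexible way to model distance effects.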

Identification of Novel Therapeutic Strategies for NUP98-NSD1-Positive AML By Drug Sensitivity Profiling

Blood, 2014, Vol 124 (21), pp. 2160-2160
doi:10.1182/blood.v124.21.2160.2160

Author(s): Jarno Kivioja; Mika Kontro; Angeliki Thanasopoulou; Muntasir Mamun Majumder; Bhagwan Yadav; ...

Keyword(s): Gene Expression; Research Funding; Drug Response; Drug Sensitivity; Lentiviral Vector; Potential Candidate; FLT3 Inhibitors; Cells Cultured

Abstract. Background: The t(5;11)(q35;p15.5) translocation resulting in fusion of the nucleoporin NUP98 and methyltransferase NSD1 (NUP98-NSD1) genes is a recurrent aberration observed in pediatric and adult AML. The NUP98-NSD1 fusion often co-occurs with the FLT3-ITD mutation and characterizes a group of cytogenetically normal AML patients with very poor prognosis. Despite advances in the understanding of the biology of NUP98-NSD1-positive AML, its therapeutic success rate has remained low. We aimed to identify novel candidate drugs for NUP98-NSD1-positive AML by testing primary patient cells and in vitro cell models with a high-throughput drug sensitivity platform. Methods: Leukemic blasts were Ficoll separated from bone marrow (BM) aspirates of an AML patient positive for t(5;11)(q35;p15.5) and FLT3-ITD. RNA extracted from primary cells was used for RNA sequencing and gene expression analysis. NUP98-NSD1 cDNA was amplified from primary cell RNA and expressed from a lentiviral vector (LeGO-iCer2) also encoding the cerulean fluorescent marker. The NUP98-NSD1/LeGo-iCer2 and empty LeGo-iCer2 viruses were used to establish stably expressing Ba/F3 cell lines. Primary murine (BALB/c) BM cells were transduced with NUP98-NSD1 and FLT3-ITD retroviruses alone or in combination (NNF) in vitro (“preleukemic”) or passaged in vivo (“leukemic”) as previously described (Thanasopoulou et al, 2014). For screening, 309 small molecule inhibitors including FDA/EMA-approved and investigational oncology drugs were plated on 384-well plates in a 10,000-fold concentration range. Cells were dispensed on the pre-drugged plates and incubated at 37°C for 72h, and cell viability was then measured using the CellTiter-Glo® luminescent assay. Drug response curves were generated and a drug sensitivity score determined (Yadav et al, 2014).
Select drug sensitivity was calculated for each drug by comparing results between primary leukemic and healthy donor BM cells or between the cell constructs and empty vector transduced control cells. Results: Primary patient cells and murine BM cells expressing FLT3-ITD alone or in combination with NUP98-NSD1 were selectively sensitive to specific FLT3 inhibitors (e.g. quizartinib, sorafenib and lestaurtinib), and broad-spectrum receptor tyrosine kinase inhibitors targeting FLT3-ITD (e.g. cabozantinib, crenolanib, foretinib, midostaurin, MGCD-265 and ponatinib). Furthermore, these cells were highly sensitive to the checkpoint kinase 1/2 inhibitor AZD7762. The primary murine cells expressing both NUP98-NSD1 and FLT3-ITD showed higher sensitivity to all of the above-mentioned drugs compared to cells expressing either of the events alone, indicating functional synergy. A very distinct drug response pattern was observed in the leukemic NNF cells cultured in vivo compared to the same cells cultured in vitro, suggesting that the microenvironment may also affect the observed drug responses. Interestingly, the preleukemic murine cells expressing NUP98-NSD1 with or without FLT3-ITD as well as the primary patient cells showed extreme vulnerability to the BCL2/BCL-xL inhibitor navitoclax. Furthermore, primary murine cells expressing NUP98-NSD1 alone showed high select sensitivity to the JAK inhibitors ruxolitinib, BMS-911543, AZD1480 and tofacitinib, indicating the fusion may stimulate JAK/STAT signaling. Similar sensitivity was also observed in the Ba/F3 cells expressing NUP98-NSD1. In support of these findings, gene expression analyses showed high expression of anti-apoptotic factors BCL2, BCL-xL and MCL1 in the patient cells. MCL1 is regulated by STAT3 while BCL-xL is regulated by STAT5, which were also highly expressed.
Conclusions: In summary, we have observed an enhanced response to specific and non-specific FLT3 inhibitors in cells expressing NUP98-NSD1 and FLT3-ITD together compared to cells expressing either of the two alone. This coincides with previous findings that functional co-operation between NUP98-NSD1 and FLT3-ITD is important in AML (Thanasopoulou et al, 2014). We have seen high in-vitro-in-vivo correlation between primary patient cells and murine cells expressing NUP98-NSD1 and FLT3-ITD. Moreover, we have identified potential candidate compounds targeting oncogenic signaling activated by these two events. These data form a basis for clinical evaluation of candidate compounds for NUP98-NSD1-positive AML. Disclosures: Porkka: Bristol-Myers Squibb: Honoraria, Research Funding; Novartis: Honoraria, Research Funding. Heckman: Celgene: Research Funding.
