Clinical drug response prediction from preclinical cancer cell lines by logistic matrix factorization approach

Author(s):  
Akram Emdadi ◽  
Changiz Eslahchi

Predicting tumor drug response from cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. Predicting patient response to drugs from data obtained from preclinical models is made easier by the availability of diverse knowledge on cell lines and drugs. This paper proposes TCLMF, a model for predicting drug response in tumor samples that is trained on preclinical samples and based on the logistic matrix factorization approach. The TCLMF model uses gene expression profiles, tissue type information, the chemical structure of drugs and drug sensitivity (IC50) data from cancer cell lines. We train the proposed drug response model on preclinical data from the Genomics of Drug Sensitivity in Cancer (GDSC) dataset and then use it to predict the drug sensitivity of samples from The Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying effective latent features of cell lines and drugs in order to calculate the probability that a tumor sample is sensitive to a drug. The closest cell line neighbours of each tumor sample are determined using a similarity measure between tumor samples and cell lines. The drug response for a new tumor is then calculated by averaging the low-rank features obtained from its neighboring cell lines. We compare the results of the TCLMF model with those of previously proposed methods using two databases and two approaches to test the model’s performance. In the first approach, 12 drugs with enough known clinical drug response, considered in previous methods, are studied. For 7 of the 12 drugs, TCLMF can significantly distinguish between patients who are resistant to these drugs and patients who are sensitive to them. In the second approach, the methods are converted to classification models using a threshold, and the results are compared. The results demonstrate that the TCLMF method provides accurate predictions compared with the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF’s logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available at https://github.com/emdadi/TCLMF.
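The neighbour-averaging step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cosine similarity, the neighbourhood size `k`, the absence of bias terms and the function names `tumor_latent` and `sensitivity_probability` are all assumptions made for illustration. The only parts taken from the abstract are that a tumor's latent vector is the average of its nearest cell lines' latent vectors and that a logistic model turns latent vectors into a sensitivity probability.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def tumor_latent(tumor_expr, cell_exprs, cell_latents, k=2):
    """Average the latent vectors of the k cell lines most similar to the tumor."""
    ranked = sorted(
        range(len(cell_exprs)),
        key=lambda i: cosine(tumor_expr, cell_exprs[i]),
        reverse=True,
    )[:k]
    dim = len(cell_latents[0])
    return [sum(cell_latents[i][d] for i in ranked) / len(ranked) for d in range(dim)]

def sensitivity_probability(sample_latent, drug_latent):
    """Logistic link applied to the inner product of the two latent vectors."""
    score = sum(x * y for x, y in zip(sample_latent, drug_latent))
    return 1.0 / (1.0 + math.exp(-score))

# Toy data: three cell lines with 2-D expression profiles and 2-D latent vectors.
cell_exprs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cell_latents = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
u = tumor_latent([1.0, 0.1], cell_exprs, cell_latents, k=2)
p = sensitivity_probability(u, [1.0, -1.0])
```

In practice the cell line and drug latent vectors would come from a fitted logistic matrix factorization model; here they are hard-coded toy values.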

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Akram Emdadi ◽  
Changiz Eslahchi

Background: Predicting the response of cancer cell lines to specific drugs is an essential problem in personalized medicine. Since drug response is closely associated with genomic information in cancer cells, several large panels of hundreds of human cancer cell lines have been assembled with genomic and pharmacogenomic data. Although several methods have been developed to predict drug response, achieving accurate predictions remains challenging. This study proposes a novel feature selection-based method, named Auto-HMM-LMF, to predict cell line-drug associations accurately. Because the feature space for predicting drug response is very high-dimensional, Auto-HMM-LMF focuses on feature selection to identify a subset of inputs with a significant contribution. Results: This research introduces a novel method for feature selection of mutation data based on signature assignments and hidden Markov models. We also use autoencoder models for feature selection of gene expression and copy number variation data. After selecting features, the logistic matrix factorization model is applied to predict drug response values. In addition, by comparison with one of the most powerful feature selection methods, the ensemble feature selection method (EFS), we show that the predictive model based on the features selected in this paper performs much better for drug response prediction. Two datasets, the Genomics of Drug Sensitivity in Cancer (GDSC) and the Cancer Cell Line Encyclopedia (CCLE), are used to demonstrate the efficiency of the proposed method on unseen patient cell lines. Evaluation of the proposed model showed that Auto-HMM-LMF can improve on the accuracy of state-of-the-art algorithms and can find useful features for the logistic matrix factorization method. Conclusions: We demonstrated an application of Auto-HMM-LMF in exploring new candidate drugs for head and neck cancer, which showed that the proposed method is useful in drug repositioning and personalized medicine. The source code of the Auto-HMM-LMF method is available at https://github.com/emdadi/Auto-HMM-LMF.


2021 ◽  
Author(s):  
Krzysztof Koras ◽  
Ewa Kizling ◽  
Dilafruz Juraeva ◽  
Eike Staub ◽  
Ewa Szczurek

Computational models for drug sensitivity prediction have the potential to revolutionise personalized cancer medicine. Drug sensitivity assays, as well as profiling of cancer cell lines and drugs, are becoming increasingly available for training such models. Machine learning methods for drug sensitivity prediction must be optimized for: (i) leveraging the wealth of information about both cancer cell lines and drugs, (ii) predictive performance and (iii) interpretability. Multiple methods have been proposed for predicting drug sensitivity from cancer cell line features, some in a multi-task fashion. So far, no such model has leveraged drug inhibition profiles. Recent neural network-based recommender systems have emerged as models capable of predicting cancer cell line response to drugs from their biological features with high prediction accuracy. These models, however, require a tailored approach to model interpretability. In this work, we develop a neural network recommender system for kinase inhibitor sensitivity prediction called DEERS. The model utilizes molecular features of the cancer cell lines and kinase inhibition profiles of the drugs. DEERS incorporates two autoencoders to project cell line and drug features into 10-dimensional hidden representations and a feed-forward neural network to combine them into a response prediction. We propose a novel model interpretability approach offering the widest possible assessment of the specific genes and biological processes that underlie the action of the drugs on the cell lines. The approach also considers genes and processes that were not included in the set of modeled features. Our approach outperforms simpler matrix factorization models, achieving R=0.82 correlation between true and predicted response for the unseen cell lines. Using the interpretability analysis, we evaluate the correlation of all human genes with each of the hidden cell line dimensions. Subsequently, we identify 67 biological processes associated with these dimensions. Combined with drug response data, these associations point to the processes that drive the cell line sensitivity to particular compounds. Detailed case studies are shown for PHA-793887, XMD14-99 and Dabrafenib. Our framework provides an expressive, multitask neural network model with a custom interpretability approach for inferring underlying biological factors and explaining cancer cell response to drugs.
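The architecture described above (two encoders producing 10-dimensional hidden representations, followed by a feed-forward network) can be sketched as a forward pass in plain Python. This is a shape-level illustration and not the DEERS code: the single-layer encoders, the `tanh` activations, the combination by concatenation, the 16-unit hidden layer and all function names are assumptions, the autoencoder decoders and reconstruction losses are omitted, and the weights are random and untrained.

```python
import math
import random

HIDDEN_DIM = 10  # hidden representation size stated in the abstract

def make_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, activation=None):
    """Apply a dense layer, optionally followed by an element-wise activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out

def init_params(n_cell_features, n_drug_features, seed=0):
    """Build the encoder halves of both autoencoders plus the feed-forward head."""
    rng = random.Random(seed)
    return {
        "cell_encoder": make_layer(n_cell_features, HIDDEN_DIM, rng),
        "drug_encoder": make_layer(n_drug_features, HIDDEN_DIM, rng),
        "ff_hidden": make_layer(2 * HIDDEN_DIM, 16, rng),  # 16 units: assumed size
        "ff_out": make_layer(16, 1, rng),
    }

def predict_response(cell_features, drug_features, params):
    """Encode both inputs, concatenate the hidden vectors, and regress a response."""
    cell_hidden = dense(cell_features, params["cell_encoder"], math.tanh)
    drug_hidden = dense(drug_features, params["drug_encoder"], math.tanh)
    joint = cell_hidden + drug_hidden  # list concatenation (assumed combination)
    hidden = dense(joint, params["ff_hidden"], math.tanh)
    return dense(hidden, params["ff_out"])[0]

params = init_params(n_cell_features=5, n_drug_features=3)
response = predict_response([0.1] * 5, [0.2] * 3, params)
```

A trained model would learn these weights jointly from reconstruction and response-prediction objectives; the sketch only shows how the two hidden representations feed a single prediction.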


2020 ◽  
Author(s):  
Banabithi Bose ◽  
Serdar Bozdag

Abstract: In cancer research and drug development, human tumor-derived cell lines are used as a popular model for cancer patients to evaluate the biological functions of genes, drug efficacy, side effects, and drug metabolism. Using these cell lines, the functional relationship between genes and drug response, and the prediction of drug response based on genomic and chemical features, have been studied. Knowing the drug response in real patients, however, is a more important and challenging task. To tackle this challenge, some studies integrate data from primary tumors and cancer cell lines to find associations between cell lines and tumors. These studies, however, do not integrate multi-omics datasets to their full extent. Also, several studies rely on a genome-wide correlation-based approach between cell lines and bulk tumor samples without considering the heterogeneous cell population in bulk tumors. To address these gaps, we developed a computational pipeline, CTDPathSim, a pathway activity-based approach to compute similarity between primary tumor samples and cell lines at genetic, genomic, and epigenetic levels by integrating multi-omics datasets. We utilized a deconvolution method to get cell type-specific DNA methylation and gene expression profiles and computed deconvoluted methylation and expression profiles of tumor samples. We assessed CTDPathSim by applying it to breast and ovarian cancer data in The Cancer Genome Atlas (TCGA) and cancer cell line data in the Cancer Cell Line Encyclopedia (CCLE) databases. Our results showed that highly similar sample-cell line pairs have more similar drug responses than less similar pairs for several FDA-approved cancer drugs, such as Paclitaxel, Vinorelbine and Mitomycin-c. CTDPathSim outperformed state-of-the-art methods in recapitulating the known drug responses between samples and cell lines. Also, CTDPathSim selected a higher number of significant cell lines belonging to the same cancer types than other methods. Furthermore, the cell lines aligned to samples were found to be clinical biomarkers for patients’ survival, whereas unaligned cell lines were not. Our method could guide the selection of appropriate cell lines that serve more closely as proxies of patient tumors and could direct the pre-clinical translation of drug testing into clinical platforms towards personalized therapies. Furthermore, this study could guide new uses for old drugs and benefit the development of new drugs in cancer treatments. CCS Concepts: Computational biology; Genomics; Systems biology; Bioinformatics; Genetics. ACM Reference format: Banabithi Bose, Serdar Bozdag. 2020. CTDPathSim: Cell line-tumor deconvoluted pathway-based similarity in the context of precision medicine in cancer.


2021 ◽  
Author(s):  
Hossein Sharifi-Noghabi ◽  
Soheil Jahangiri-Tazehkand ◽  
Casey Hon ◽  
Petr Smirnov ◽  
Anthony Mammoliti ◽  
...  

Abstract: The goal of precision oncology is to tailor treatment for patients individually using the genomic profile of their tumors. Pharmacogenomics datasets such as cancer cell lines are among the most valuable resources for drug sensitivity prediction, a crucial task of precision oncology. Machine learning methods have been employed to predict drug sensitivity based on the multiple omics data available for large panels of cancer cell lines. However, there are no comprehensive guidelines on how to properly train and validate such machine learning models for drug sensitivity prediction. In this paper, we introduce a set of guidelines for different aspects of training a predictor using cell line datasets. These guidelines provide extensive analysis of the generalization of drug sensitivity predictors, and challenge many current practices in the community including the choice of training dataset and measure of drug sensitivity. Application of these guidelines in future studies will enable the development of more robust preclinical biomarkers.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Krzysztof Koras ◽  
Ewa Kizling ◽  
Dilafruz Juraeva ◽  
Eike Staub ◽  
Ewa Szczurek

Abstract: Computational models for drug sensitivity prediction have the potential to significantly improve personalized cancer medicine. Drug sensitivity assays, combined with profiling of cancer cell lines and drugs, are becoming increasingly available for training such models. Multiple methods have been proposed for predicting drug sensitivity from cancer cell line features, some in a multi-task fashion. So far, no such model has leveraged drug inhibition profiles. Importantly, multi-task models require a tailored approach to model interpretability. In this work, we develop DEERS, a neural network recommender system for kinase inhibitor sensitivity prediction. The model utilizes molecular features of the cancer cell lines and kinase inhibition profiles of the drugs. DEERS incorporates two autoencoders to project cell line and drug features into 10-dimensional hidden representations and a feed-forward neural network to combine them into a response prediction. We propose a novel interpretability approach, which in addition to the set of modeled features also considers the genes and processes outside of this set. Our approach outperforms simpler matrix factorization models, achieving R = 0.82 correlation between true and predicted response for the unseen cell lines. The interpretability analysis identifies 67 biological processes that drive the cell line sensitivity to particular compounds. Detailed case studies are shown for PHA-793887, XMD14-99 and Dabrafenib.


2021 ◽  
Author(s):  
David Earl Hostallero ◽  
Lixuan Wei ◽  
Liewei Wang ◽  
Junmei Cairns ◽  
Amin Emad

Background: Prediction of the response of cancer patients to different treatments and identification of biomarkers of drug sensitivity are two major goals of individualized medicine. In this study, we developed a deep learning framework called TINDL, trained entirely on preclinical cancer cell lines, to predict the response of cancer patients to different treatments. TINDL utilizes a tissue-informed normalization to account for the tissue and cancer type of the tumours and to reduce the statistical discrepancies between cell lines and patient tumours. In addition, this model identifies a small set of genes whose mRNA expression is predictive of drug response in the trained model, enabling identification of biomarkers of drug sensitivity. Results: Using data from two large databases of cancer cell lines and cancer tumours, we showed that this model can distinguish between sensitive and resistant tumours for 10 (out of 14) drugs, outperforming various other machine learning models. In addition, our siRNA knockdown experiments on 10 genes identified by this model for one of the drugs (tamoxifen) confirmed that all of these genes significantly influence the sensitivity of the MCF7 cell line to this drug. Furthermore, genes implicated for multiple drugs pointed to shared mechanisms of action among drugs and suggested several important signaling pathways. Conclusions: In summary, this study provides a powerful deep learning framework for prediction of drug response and for identification of biomarkers of drug sensitivity in cancer.


Genes ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 844
Author(s):  
Abhishek Majumdar ◽  
Yueze Liu ◽  
Yaoqin Lu ◽  
Shaofeng Wu ◽  
Lijun Cheng

Background: Cancer cell lines are frequently used in research as in-vitro tumor models. Genomic data and large-scale drug screening have accelerated the selection of the right drug for cancer patients. Accuracy in drug response prediction is crucial for success. Due to data-type diversity and big data volume, few methods can integratively and efficiently find the principal low-dimensional manifold of high-dimensional cancer multi-omics data to predict drug response in precision medicine. Method: A novel k-means Ensemble Support Vector Regression (kESVR) is developed to predict each drug response value for a single patient based on cell-line gene expression data. The kESVR is a blend of supervised and unsupervised learning methods and is entirely data driven. It utilizes embedded clustering (Principal Component Analysis and k-means clustering) and local regression (Support Vector Regression) to predict drug response and obtain the global pattern while overcoming missing data and outlier noise. Results: We compared the efficiency and accuracy of kESVR to 4 standard machine learning regression models: (1) simple linear regression, (2) support vector regression, (3) random forest (quantile regression forest) and (4) back propagation neural network. In our results, which are based on drug response across 610 cancer cells from the Cancer Cell Line Encyclopedia (CCLE) and the Cancer Therapeutics Response Portal (CTRP v2), kESVR had the highest accuracy (smallest mean squared error (MSE)). We next compared kESVR with 17 existing drug response prediction models based on a varied range of methods such as regression, Bayesian inference, matrix factorization and deep learning. After ranking the 18 models by prediction accuracy, kESVR ranks first (best performing) in the majority (74%) of cases. In the remaining cases (26%), kESVR still ranked among the top five performing models. Conclusion: In this paper we introduce a novel model (kESVR) for drug response prediction using high-dimensional cell-line gene expression data. This model outperforms existing prediction models in terms of prediction accuracy and speed and overcomes overfitting. It can be used in the future to develop a robust drug response prediction system for cancer patients guided by cancer cell lines and multi-omics data.
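The general cluster-then-fit-local-models pattern behind this description can be sketched in dependency-free Python. This is explicitly not kESVR: to keep the example self-contained, a closed-form one-dimensional least-squares line stands in for Support Vector Regression, the PCA step is omitted (each sample is assumed to be already reduced to a single score), the ensemble aspect is not modeled, and k-means is seeded with the first k points for determinism. All function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids, allowed=None):
    """Index of the closest centroid, optionally restricted to `allowed` indices."""
    candidates = range(len(centroids)) if allowed is None else allowed
    return min(candidates, key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def fit_line(xs, ys):
    """Closed-form 1-D least squares; a stand-in for the per-cluster SVR."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def fit_local_models(points, responses, k):
    """Cluster the samples, then fit one local regressor per non-empty cluster."""
    centroids = kmeans(points, k)
    models = {}
    for c in range(k):
        idx = [i for i, p in enumerate(points) if nearest(p, centroids) == c]
        if idx:
            models[c] = fit_line([points[i][0] for i in idx], [responses[i] for i in idx])
    return centroids, models

def predict(point, centroids, models):
    """Route the sample to the nearest cluster that has a model and apply it."""
    slope, intercept = models[nearest(point, centroids, allowed=list(models))]
    return slope * point[0] + intercept

# Toy data: two well-separated groups with different local trends.
points = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
responses = [0.0, 2.0, 4.0, 90.0, 89.0, 88.0]
centroids, models = fit_local_models(points, responses, k=2)
```

The point of the pattern is that a single global regressor would fit neither trend in the toy data well, while one simple model per cluster captures both.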


2019 ◽  
Author(s):  
Maryam Pouryahya ◽  
Jung Hun Oh ◽  
James C. Mathews ◽  
Zehor Belkhatir ◽  
Caroline Moosmüller ◽  
...  

Abstract: The study of large-scale pharmacogenomics provides an unprecedented opportunity to develop computational models that can accurately predict drug response across large cohorts of cell lines and drugs. In this work, we present a novel method for predicting drug sensitivity in cancer cell lines which considers both cell line genomic features and drug chemical features. Our network-based approach combines the theory of optimal mass transport (OMT) with machine learning techniques. It starts with unsupervised clustering of both cell line and drug data, followed by the prediction of drug sensitivity in the paired cluster of cell lines and drugs. We show that prior clustering of the heterogeneous cell lines and structurally diverse drugs significantly improves the accuracy of the prediction. In addition, it facilitates the interpretability of the results and the identification of molecular biomarkers which are significant for both clustering of the cell lines and predicting the drug response.


2017 ◽  
Vol 63 (1) ◽  
pp. 141-145
Author(s):  
Yuliya Khochenkova ◽  
Eliso Solomko ◽  
Oksana Ryabaya ◽  
Yevgeniya Stepanova ◽  
Dmitriy Khochenkov

The discovery of effective combinations of anticancer drugs for the treatment of breast cancer is a pressing problem in experimental chemotherapy. In this paper we studied the antitumor effect of the combination of sunitinib and bortezomib against the MDA-MB-231 and SKBR-3 breast cancer cell lines in vitro. We found that bortezomib in non-toxic concentrations can potentiate the antitumor activity of sunitinib. The MDA-MB-231 cell line showed high sensitivity to the combination of bortezomib and sunitinib in vitro. Bortezomib and sunitinib reduced the expression of the receptor tyrosine kinases VEGFR1, VEGFR2, PDGFRα, PDGFRβ and c-Kit on HER2- and HER2+ breast cancer cell lines.


2020 ◽  
Vol 20 (23) ◽  
pp. 2070-2079
Author(s):  
Srimadhavi Ravi ◽  
Sugata Barui ◽  
Sivapriya Kirubakaran ◽  
Parul Duhan ◽  
Kaushik Bhowmik

Background: The importance of inhibiting the kinases of the DDR pathway for radiosensitizing cancer cells is well established. Cancer cells exploit these kinases for their survival, which leads to the development of resistance towards DNA damaging therapeutics. Objective: In this article, the focus is on targeting the key mediator of the DDR pathway, the ATM kinase. A new set of quinoline-3-carboxamides, as potential inhibitors of ATM, is reported. Methods: Quinoline-3-carboxamide derivatives were synthesized and a cytotoxicity assay was performed to analyze the effect of the molecules on different cancer cell lines such as HCT116, MDA-MB-468, and MDA-MB-231. Results: Three of the synthesized compounds showed promising cytotoxicity towards a selected set of cancer cell lines. Western blot analysis was also performed after pre-treating the cells with quercetin, a known ATM upregulator that acts by causing DNA double-strand breaks. SAR studies suggested the importance of the electron-donating nature of the R group for the molecule to be toxic. Finally, Western blot analysis confirmed the down-regulation of ATM in the cells. Additionally, the PTEN-negative cell line, MDA-MB-468, was more sensitive towards the compounds than the PTEN-positive cell line, MDA-MB-231. Cytotoxicity studies against 293T cells showed that the compounds were at least three times less toxic than against HCT116. Conclusion: In conclusion, these experiments will lay the groundwork for the evolution of potent and selective ATM inhibitors for the radio- and chemo-sensitization of cancer cells.

