Predicting drug sensitivity of cancer cells based on DNA methylation levels

PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0238757
Author(s):  
Sofia P. Miranda ◽  
Fernanda A. Baião ◽  
Julia L. Fleck ◽  
Stephen R. Piccolo

Cancer cell lines, which are cell cultures derived from tumor samples, represent one of the least expensive and most studied preclinical models for drug development. Accurately predicting drug responses for a given cell line based on molecular features may help to optimize drug-development pipelines and explain mechanisms behind treatment responses. In this study, we focus on DNA methylation profiles as one type of molecular feature that is known to drive tumorigenesis and modulate treatment responses. Using genome-wide, DNA methylation profiles from 987 cell lines in the Genomics of Drug Sensitivity in Cancer database, we used machine-learning algorithms to evaluate the potential to predict cytotoxic responses for eight anti-cancer drugs. We compared the performance of five classification algorithms and four regression algorithms representing diverse methodologies, including tree-, probability-, kernel-, ensemble-, and distance-based approaches. We artificially subsampled the data to varying degrees, aiming to understand whether training based on relatively extreme outcomes would yield improved performance. When using classification or regression algorithms to predict discrete or continuous responses, respectively, we consistently observed excellent predictive performance when the training and test sets consisted of cell-line data. Classification algorithms performed best when we trained the models using cell lines with relatively extreme drug-response values, attaining area-under-the-receiver-operating-characteristic-curve values as high as 0.97. The regression algorithms performed best when we trained the models using the full range of drug-response values, although this depended on the performance metrics we used. Finally, we used patient data from The Cancer Genome Atlas to evaluate the feasibility of classifying clinical responses for human tumors based on models derived from cell lines. 
Generally, the algorithms were unable to identify patterns that predicted patient responses reliably; however, predictions by the Random Forests algorithm were significantly correlated with Temozolomide responses for low-grade gliomas.
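As an illustration of the subsampling idea described above, here is a minimal Python sketch. It is not the authors' pipeline: the function names `extreme_subset` and `auroc`, the `fraction` parameter, and the example values are all hypothetical, and the study's actual thresholds and algorithms are not given in this abstract. The sketch keeps only the cell lines with the lowest and highest drug-response values, and scores a classifier with a rank-based area under the ROC curve.

```python
def extreme_subset(responses, fraction):
    """Return (low, high) index lists for the most extreme responses.

    `fraction` is the share of samples kept at each end (assumed 0 < fraction <= 0.5).
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    order = sorted(range(len(responses)), key=lambda i: responses[i])
    k = max(1, int(len(responses) * fraction))
    return order[:k], order[-k:]


def auroc(labels, scores):
    """Rank-based AUROC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical drug-response values for ten cell lines (lower = more sensitive).
ic50 = [0.1, 0.4, 0.9, 1.5, 2.2, 3.0, 3.8, 4.5, 5.1, 6.0]
sensitive, resistant = extreme_subset(ic50, 0.2)
print(sensitive, resistant)  # indices of the two lowest and two highest values
```

Training on `sensitive + resistant` only, rather than on all ten lines, corresponds to the "relatively extreme outcomes" setting; evaluating on held-out data with `auroc` mirrors the metric reported in the abstract.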

2020 ◽  
Author(s):  
Sofia P. Miranda ◽  
Fernanda A. Baião ◽  
Paula M. Maçaira ◽  
Julia L. Fleck ◽  
Stephen R. Piccolo

Abstract Cancer cell lines, which are cell cultures developed from tumor samples, represent one of the least expensive and most studied preclinical models for drug development. Accurately predicting drug response for a given cell line based on molecular features may help to optimize drug-development pipelines and explain mechanisms behind treatment responses. In this study, we focus on DNA methylation profiles as one type of molecular feature that is known to drive tumorigenesis and modulate treatment responses. Using genome-wide, DNA methylation profiles from 987 cell lines from the Genomics of Drug Sensitivity in Cancer database, we applied machine-learning algorithms to evaluate the potential to predict cytotoxic responses for eight anti-cancer drugs. We compared the performance of five classification algorithms and four regression algorithms that use diverse methodologies, including tree-, probability-, kernel-, ensemble-, and distance-based approaches. For both types of algorithm, we artificially subsampled the data to varying degrees, aiming to understand whether training models based on relatively extreme outcomes would yield improved performance. We also performed an information-gain analysis to examine which genes were most predictive of drug responses. Finally, we used tumor data from The Cancer Genome Atlas to evaluate the feasibility of predicting clinical responses in humans based on models derived from cell lines. When using classification or regression algorithms to predict discrete or continuous responses, respectively, we consistently observed excellent predictive performance when the training and test sets both consisted of cell-line data. However, classification models derived from cell-line data failed to generalize effectively for tumors.


2013 ◽  
Vol 31 (15_suppl) ◽  
pp. e14544-e14544
Author(s):  
Eva Budinska ◽  
Jenny Wilding ◽  
Vlad Calin Popovici ◽  
Edoardo Missiaglia ◽  
Arnaud Roth ◽  
...  

e14544 Background: We identified CRC gene expression subtypes (ASCO 2012, #3511), which associate with established parameters of outcome as well as relevant biological motifs. We now substantiate their biological and potentially clinical significance by linking them with cell line data and drug sensitivity, primarily attempting to identify models for the poor prognosis subtypes Mesenchymal and CIMP-H like (characterized by EMT/stroma and immune-associated gene modules, respectively). Methods: We analyzed gene expression profiles of 35 publicly available cell lines with sensitivity data for 82 drug compounds, and our 94 cell lines with data on sensitivity for 7 compounds and colony morphology. Because stromal and immune-associated genes lose their relevance in vitro, we trained a new classifier based on genes expressed in both systems, which identifies the subtypes in both tissue and cell cultures. Cell line subtypes were validated by comparing their enrichment for molecular markers with that of our CRC subtypes. Drug sensitivity was assessed by linking original subtypes with 92 drug response signatures (MSigDB) via gene set enrichment analysis, and by screening drug sensitivity of cell line panels against our subtypes (Kruskal-Wallis test). Results: Of the cell lines, 70% could be assigned to a subtype with a probability as high as 0.95. The cell line subtypes were significantly associated with their KRAS, BRAF and MSI status and corresponded to our CRC subtypes. Interestingly, the cell lines that created a network of undifferentiated cells in Matrigel were assigned to the Mesenchymal subtype. Drug response studies revealed potential sensitivity of subtypes to multiple compounds, in addition to what could be predicted based on their mutational profile (e.g. sensitivity of the CIMP-H subtype to Dasatinib, p<0.01).
Conclusions: Our data support the biological and potentially clinical significance of the CRC subtypes in their association with cell line models, including results of drug sensitivity analysis. Our subtypes might not only have prognostic value but might also be predictive for response to drugs. Subtyping cell lines further substantiates their significance as relevant models for functional studies.


2021 ◽  
Author(s):  
Sara Pidò ◽  
Carolina Testa ◽  
Pietro Pinoli

Abstract Large annotated cell line collections have been proven to enable the prediction of drug response in the preclinical setting. We present an enhancement of the Non-Negative Matrix Tri-Factorization method, which allows the integration of different data types for the prediction of missing associations. To test our method, we retrieved a dataset from CCLE containing the connections among cell lines and drugs by means of their IC50 values. We performed two different kinds of experiments: a) prediction of missing values in the matrix, and b) prediction of the complete drug profile of a new cell line, demonstrating the validity of the method in both scenarios.


Author(s):  
Akram Emdadi ◽  
Changiz Eslahchi

Predicting tumor drug response using cancer cell line drug response values for a large number of anti-cancer drugs is a significant challenge in personalized medicine. Predicting patient response to drugs from data obtained from preclinical models is made easier by the availability of different knowledge on cell lines and drugs. This paper proposes the TCLMF method, a predictive model for predicting drug response in tumor samples that was trained on preclinical samples and is based on the logistic matrix factorization approach. The TCLMF model is designed based on gene expression profiles, tissue type information, the chemical structure of drugs and drug sensitivity (IC50) data from cancer cell lines. We use preclinical data from the Genomics of Drug Sensitivity in Cancer dataset (GDSC) to train the proposed drug response model, which we then use to predict drug sensitivity of samples from the Cancer Genome Atlas (TCGA) dataset. The TCLMF approach focuses on identifying successful features of cell lines and drugs in order to calculate the probability of the tumor samples being sensitive to drugs. The closest cell line neighbours for each tumor sample are calculated using a description of similarity between tumor samples and cell lines in this study. The drug response for a new tumor is then calculated by averaging the low-rank features obtained from its neighboring cell lines. We compare the results of the TCLMF model with the results of the previously proposed methods using two databases and two approaches to test the model’s performance. In the first approach, 12 drugs with enough known clinical drug response, considered in previous methods, are studied. For 7 drugs out of 12, the TCLMF can significantly distinguish between patients that are resistant to these drugs and the patients that are sensitive to them. In the second approach, these methods are converted to classification models using a threshold, and the results are compared.
The results demonstrate that the TCLMF method provides accurate predictions compared with the results of the other algorithms. Finally, we accurately classify tumor tissue type using the latent vectors obtained from TCLMF’s logistic matrix factorization process. These findings demonstrate that the TCLMF approach produces effective latent vectors for tumor samples. The source code of the TCLMF method is available at https://github.com/emdadi/TCLMF.
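The neighbour-averaging step described in this abstract can be sketched as follows. This is a hedged Python illustration and not the code from the TCLMF repository: the function name `predict_sensitivity`, the parameter `k`, the plain logistic link without bias terms, and the toy vectors are all assumptions made for the example.

```python
import math


def predict_sensitivity(tumor_sims, cell_latents, drug_latent, k=2):
    """Estimate P(sensitive) for one tumor and one drug.

    tumor_sims:   similarity of the tumor to each cell line (higher = closer)
    cell_latents: low-rank feature vector of each cell line
    drug_latent:  low-rank feature vector of the drug
    k:            number of nearest cell lines to average (hypothetical default)
    """
    # Pick the k cell lines most similar to the tumor.
    nearest = sorted(range(len(tumor_sims)),
                     key=lambda i: tumor_sims[i], reverse=True)[:k]
    # The tumor's latent vector is the mean of its neighbours' vectors.
    dim = len(drug_latent)
    tumor_latent = [sum(cell_latents[i][d] for i in nearest) / len(nearest)
                    for d in range(dim)]
    # Logistic link on the inner product (assumed form, no bias terms).
    score = sum(t * v for t, v in zip(tumor_latent, drug_latent))
    return 1.0 / (1.0 + math.exp(-score))


# Toy example: three cell lines with two latent dimensions.
p = predict_sensitivity(
    tumor_sims=[0.9, 0.1, 0.8],
    cell_latents=[[1.0, 0.0], [5.0, 5.0], [0.0, 1.0]],
    drug_latent=[1.0, 1.0],
    k=2,
)
print(round(p, 3))
```

Thresholding `p` (for example at 0.5, an arbitrary choice here) would turn the probability into a sensitive/resistant call, in the spirit of the second evaluation approach.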


2020 ◽  
Author(s):  
Evanthia Koukouli ◽  
Dennis Wang ◽  
Frank Dondelinger ◽  
Juhyun Park

Abstract Cancer treatments can be highly toxic and frequently only a subset of the patient population will benefit from a given treatment. Tumour genetic makeup plays an important role in cancer drug sensitivity. We suspect that gene expression markers could be used as a decision aid for treatment selection or dosage tuning. Using in vitro cancer cell line dose-response and gene expression data from the Genomics of Drug Sensitivity in Cancer (GDSC) project, we build a dose-varying regression model. Unlike existing approaches, this allows us to estimate dosage-dependent associations with gene expression. We include the transcriptomic profiles as dose-invariant covariates into the regression model and assume that their effect varies smoothly over the dosage levels. A two-stage variable selection algorithm (variable screening followed by penalised regression) is used to identify genetic factors that are associated with drug response over the varying dosages. We evaluate the effectiveness of our method using simulation studies focusing on the choice of tuning parameters and cross-validation for predictive accuracy assessment. We further apply the model to data from five BRAF targeted compounds applied to different cancer cell lines under different dosage levels. We highlight the dosage-dependent dynamics of the associations between the selected genes and drug response, and we perform pathway enrichment analysis to show that the selected genes play an important role in pathways related to tumourigenesis and DNA damage response.
Author Summary Tumour cell lines allow scientists to test anticancer drugs in a laboratory environment. Cells are exposed to the drug in increasing concentrations, and the drug response, or amount of surviving cells, is measured. Generally, drug response is summarized via a single number such as the concentration at which 50% of the cells have died (IC50).
To avoid relying on such summary measures, we adopted a functional regression approach that takes the dose-response curves as inputs, and uses them to find biomarkers of drug response. One major advantage of our approach is that it describes how the effect of a biomarker on the drug response changes with the drug dosage. This is useful for determining optimal treatment dosages and predicting drug response curves for unseen drug-cell line combinations. Our method scales to large numbers of biomarkers by using regularisation and, in contrast with existing literature, selects the most informative genes by accounting for drug response at untested dosages. We demonstrate its value using data from the Genomics of Drug Sensitivity in Cancer project to identify genes whose expression is associated with drug response. We show that the selected genes recapitulate prior biological knowledge, and belong to known cancer pathways.
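To make concrete the kind of single-number summary that this work moves away from, the sketch below estimates an IC50 from a measured dose-response curve by log-linear interpolation. It is an illustrative assumption rather than the GDSC procedure or the authors' method: the function name `ic50_by_interpolation` and the example doses are hypothetical, and real pipelines typically fit a sigmoidal curve instead.

```python
import math


def ic50_by_interpolation(doses, viability):
    """Estimate the dose at which viability crosses 0.5.

    doses:     increasing positive concentrations
    viability: fraction of surviving cells at each dose (1.0 = all alive)
    Returns None when the curve never crosses 50% in the tested range.
    """
    for i in range(len(doses) - 1):
        v0, v1 = viability[i], viability[i + 1]
        if v0 >= 0.5 >= v1 and v0 != v1:
            # Interpolate linearly in viability and in log10(dose).
            frac = (v0 - 0.5) / (v0 - v1)
            log_d0, log_d1 = math.log10(doses[i]), math.log10(doses[i + 1])
            return 10 ** (log_d0 + frac * (log_d1 - log_d0))
    return None


doses = [0.01, 0.1, 1.0, 10.0]       # hypothetical concentrations
viability = [0.95, 0.8, 0.2, 0.05]   # hypothetical surviving fractions
print(ic50_by_interpolation(doses, viability))
```

Note that the whole curve (four points here) collapses to one value, and a curve that never reaches 50% yields no estimate at all; a functional regression on the full curve avoids both losses.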


2019 ◽  
Author(s):  
Rene Quevedo ◽  
Nehme El-Hachem ◽  
Petr Smirnov ◽  
Zhaleh Safikhani ◽  
Trevor J. Pugh ◽  
...  

Abstract Background: Somatic copy-number alterations that affect large genomic regions are a major source of genomic diversity in cancer and can impact cellular phenotypes. Clonal heterogeneity within cancer cell lines can affect phenotypic presentation, including drug response. Methods: We aggregated and analyzed SNP and copy number profiles from six pharmacogenomic datasets encompassing 1,691 cell lines screened for 13 molecules. To look for sources of genotype and karyotype discordances, we compared SNP genotypes and segmental copy-ratios across 5 kb genomic bins. To assess the impact of genomic discordances on pharmacogenomic studies, we assessed gene expression and drug sensitivity data for discordant and concordant lines. Results: We found 6/1,378 (0.4%) cell lines profiled in two studies to be discordant in both genotypic and karyotypic identity, 51 (3.7%) discordant in genotype, 97 (7.0%) discordant in karyotype, and 125 (9.1%) potential misidentifications. We highlight cell lines REH, NCI-H23 and PSN1 as having drug response discordances that may hinge on divergent copy-number q Conclusions: Our study highlights the low level of misidentification as evidence of effective cell line authentication standards in recent pharmacogenomic studies. However, the proclivity of cell lines to acquire somatic copy-number variants can alter the cellular phenotype, resulting in biological and predictable effects on drug sensitivity. These findings highlight the need for verification of cell line copy number profiles to inform interpretation of drug sensitivity data in biomedical studies.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Suleyman Vural ◽  
Alida Palmisano ◽  
William C. Reinhold ◽  
Yves Pommier ◽  
Beverly A. Teicher ◽  
...  

Abstract Background Altered DNA methylation patterns play important roles in cancer development and progression. We examined whether expression levels of genes directly or indirectly involved in DNA methylation and demethylation may be associated with response of cancer cell lines to chemotherapy treatment with a variety of antitumor agents. Results We analyzed 72 genes encoding epigenetic factors directly or indirectly involved in DNA methylation and demethylation processes. We examined association of their pretreatment expression levels with methylation beta-values of individual DNA methylation probes, DNA methylation averaged within gene regions, and average epigenome-wide methylation levels. We analyzed data from 645 cancer cell lines and 23 cancer types from the Cancer Cell Line Encyclopedia and Genomics of Drug Sensitivity in Cancer datasets. We observed numerous correlations between expression of genes encoding epigenetic factors and response to chemotherapeutic agents. Expression of genes encoding a variety of epigenetic factors, including KDM2B, DNMT1, EHMT2, SETDB1, EZH2, APOBEC3G, and other genes, was correlated with response to multiple agents. DNA methylation of numerous target probes and gene regions was associated with expression of multiple genes encoding epigenetic factors, underscoring complex regulation of epigenome methylation by multiple intersecting molecular pathways. The genes whose expression was associated with methylation of multiple epigenome targets encode DNA methyltransferases, TET DNA methylcytosine dioxygenases, the methylated DNA-binding protein ZBTB38, KDM2B, SETDB1, and other molecular factors which are involved in diverse epigenetic processes affecting DNA methylation. 
While baseline DNA methylation of numerous epigenome targets was correlated with cell line response to antitumor agents, the complex relationships between the overlapping effects of each epigenetic factor on methylation of specific targets and the importance of such influences in tumor response to individual agents require further investigation. Conclusions Expression of multiple genes encoding epigenetic factors is associated with drug response and with DNA methylation of numerous epigenome targets that may affect response to therapeutic agents. Our findings suggest complex and interconnected pathways regulating DNA methylation in the epigenome, which may both directly and indirectly affect response to chemotherapy.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 1623-1623 ◽  
Author(s):  
Karen Dybkær ◽  
Hanne Due ◽  
Rasmus Froberg Brøndum ◽  
Ken H. Young ◽  
Martin Bøgsted

Background: Approximately 40% of patients with diffuse large B-cell lymphoma (DLBCL) suffer from primary refractory disease and treatment-induced immuno-chemotherapy resistance, demonstrating that standard treatment regimens are not sufficient to cure all patients. Early detection of resistance is of great importance, and defining microRNA (miRNA) involvement in resistance could be useful to guide treatment selection and help monitor treatment administration while sparing patients from ineffective but still toxic therapy. Concept and Aims: With information on drug-response specific miRNAs, we hypothesized that multi-miRNA panels can improve robustness of individual clinical markers and serve as a prognostic classifier predicting disease progression in DLBCL patients. Methods: Fifteen DLBCL cell lines were tested for sensitivity towards rituximab (R), cyclophosphamide (C), doxorubicin (H), and vincristine (O). Cell line specific seeding concentrations were used to ensure exponential growth; each cell line was subjected to 16 concentrations in serial 2-fold dilutions, and the number of metabolically active cells was evaluated after 48 hours of drug exposure using an MTS assay. For each drug, we ranked the cell lines according to their sensitivity and categorized them as sensitive, intermediate responsive, or resistant. Baseline miRNA expression data were obtained for each cell line in untreated condition. Differential miRNA expression analysis between sensitive and resistant cell lines, selecting probes with a log fold change larger than 2, identified 43 miRNAs associated with response to compounds of the R-CHOP regimen. Using the Affymetrix HG-U133+2 platform, expression levels of the miRNA precursors were assessed in 701 diagnostic DLBCL biopsies, and miRNA-panel classifiers were built using multiple Cox regression or random survival forest.
Results: Generated prognostic miRNA-panel classifiers were tested for predictive accuracies and were subsequently evaluated by Brier scores and time varying area under the ROC curves (tAUC). Progression-free survival (PFS) was chosen as the outcome, since it is a treatment evaluation parameter as close as possible to the time of drug exposure and the tested miRNAs were all associated directly with drug specific response. Furthermore, overall survival (OS) was used for verification of findings. Comparison of analyses conducted for the respective cohorts (All DLBCL, ABC, and GCB patients) showed the lowest prediction errors for all models within the GCB subclass, where a multivariate Cox miRNA-panel model including miR-146a, miR-155, miR-21, miR-34a, and the miR-23a~miR-27a~miR-24-2 cluster performed the best and successfully stratified GCB-DLBCL patients into high- and low-risk of disease progression. In addition, combination of the miRNA-panel and international prognostic index (IPI) substantially increased prognostic performance in GCB classified patients, indicating a prognostic signal from the response-specific miRNAs independent of IPI. In conclusion: We found as proof of concept that adding gene expression data detecting drug-response specific miRNAs to the clinically established IPI improved the prognostic stratification of GCB-DLBCL patients treated with R-CHOP. Disclosures No relevant conflicts of interest to declare.


2021 ◽  
Author(s):  
Ali Reza Ebadi ◽  
Ali Soleimani ◽  
Abdulbaghi Ghaderzadeh

Abstract Selecting anti-cancer medicine for a particular patient is a goal of personalized medicine. Many computational models have been proposed by researchers to predict drug response, but predictive accuracy remains a challenge. Based on the concept that “similar cells have similar responses to drugs”, we extended the basic matrix factorization method by adding similarity penalties, so that the distance between the latent factors of two cell lines (or drugs) is inversely related to their similarity. This means that two similar drugs or similar cell lines should have a short distance, whereas two dissimilar cell lines or dissimilar drugs should have a large gap between their latent factors. We proposed a dual similarity-regularized matrix factorization (DSRMF) model, then generated new data for drug similarity from the two-dimensional and three-dimensional chemical structures, which were obtained from the CCLE and GDSC databases. Using the proposed model and the newly generated drug similarity data, we achieved an average Pearson correlation coefficient (PCC) of about 0.96 and an average root mean square error (RMSE) of about 0.30 between the observed and predicted values of the cell line response to the drug. Our analysis showed that using heterogeneous data gives better results, and that the proposed model can use cancer cell lines from other panels to calculate similarity between cells. Also, by imposing more restrictions on the similarity between cells, we were able to achieve more accurate prediction of the response of the cell line to the anticancer drug.
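The two evaluation metrics quoted in this abstract can be computed as in the short Python sketch below. It only illustrates the standard definitions of RMSE and the Pearson correlation coefficient; the function names and the example values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted responses."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def pearson(observed, predicted):
    """Pearson correlation coefficient between observed and predicted responses."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical responses: predictions are offset by a constant 0.5.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
print(rmse(observed, predicted), pearson(observed, predicted))
```

The toy data show why both numbers are reported: a constant offset leaves the correlation perfect while the RMSE is non-zero, so PCC measures ranking agreement and RMSE measures absolute error.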


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 4249-4249
Author(s):  
Amit Kumar Mitra ◽  
Ujjal Mukherjee ◽  
Taylor Harding ◽  
Holly Stessman ◽  
Ying Li ◽  
...  

Abstract Multiple myeloma (MM) is characterized by significant genetic diversity at subclonal levels that likely plays a defining role in the heterogeneity of tumor progression, clinical aggressiveness and drug sensitivity. Such heterogeneity is a driving factor in the evolution of MM, from founder clones through outgrowth of subclonal fractions. DNA sequencing studies on MM samples have indeed demonstrated such heterogeneity in subclonal architecture at diagnosis based on recurrent mutations in pathologically relevant genes that may ultimately lead to relapse. However, no study so far has reported a predictive gene expression signature that can identify, distinguish and quantify drug-sensitive and drug-resistant subpopulations within a bulk population of myeloma cells. In recent years, our laboratory has successfully developed a gene expression profile (GEP)-based signature that could not only distinguish drug response of MM cell lines, but also was effective in stratifying patient outcomes when applied to GEP profiles from MM clinical trials using proteasome inhibitors (PI) as chemotherapeutic agents. Further, we noted that myeloma cell lines that responded to the drug often contained a residual sub-population of cells that did not respond, and likely were selectively propagated during drug treatment in vitro and in patients. In this study, we performed targeted qRT-PCR analysis of single cells using a gene panel that included PI sensitivity genes and gene signatures that could discriminate between low and high-risk myeloma, followed by intensive bioinformatics and statistical analysis for the classification and prediction of PI response in individual cells within bulk multiple myeloma tumors.
Fluidigm's C1 Single-Cell Auto Prep System was used to perform automated single-cell capture, processing and cDNA synthesis on 576 pre-treatment cells from 12 cell lines representing a wide range of PI-sensitivity and 370 cells from 7 patient samples undergoing PI treatment, followed by targeted gene expression profiling of single cells using automated, high-throughput on-chip qRT-PCR analysis using 96.96 Dynamic Array IFCs on the BioMark HD System. Probability of resistance for each individual cell was predicted using a pipeline that employed the machine learning methods Random Forest, Support Vector Machine (radial and sigmoidal), LASSO and kNN (k Nearest Neighbor) for making single-cell GEP data-driven predictions/decisions. The weighted probabilities from each of the algorithms were used to quantify resistance of each individual cell and plotted using an ensemble forecasting algorithm. Using our drug response GEP signature at the single cell level, we could successfully identify distinct subpopulations of tumor cells that were predicted to be sensitive or resistant to PIs. Subsequently, we developed an R statistical analysis package (http://cran.r-project.org), SCATTome (Single Cell Analysis of Targeted Transcriptome), that can restructure data obtained from a Fluidigm qPCR analysis run, filter missing data, perform scaling of filtered data, build classification models and successfully predict drug response of individual cells and classify each cell's probability of response based on the targeted transcriptome. We will present the program output as graphical displays of single cell response probabilities. This package provides a novel classification method that has the potential to predict subclonal response to a variety of therapeutic agents.
Disclosures Kumar: Skyline: Consultancy, Honoraria; BMS: Consultancy; Onyx: Consultancy, Research Funding; Sanofi: Consultancy, Research Funding; Janssen: Consultancy, Research Funding; Novartis: Research Funding; Takeda: Consultancy, Research Funding; Celgene: Consultancy, Research Funding.
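The step in the abstract above where weighted probabilities from several algorithms are combined into one per-cell resistance score can be sketched as a weighted average. This Python illustration is an assumption, not the SCATTome implementation (which is an R package): the function name `ensemble_resistance`, the model keys, and the weights are hypothetical, and the abstract does not say how the weights were chosen.

```python
def ensemble_resistance(probabilities, weights):
    """Weighted average of per-model resistance probabilities for one cell.

    probabilities: dict mapping model name -> P(resistant) from that model
    weights:       dict mapping model name -> non-negative weight
    """
    total = sum(weights[name] for name in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * p for name, p in probabilities.items()) / total


# Hypothetical outputs of four classifiers for a single cell.
cell_probs = {"random_forest": 0.9, "svm_radial": 0.7, "lasso": 0.5, "knn": 0.5}
model_weights = {"random_forest": 2.0, "svm_radial": 1.0, "lasso": 1.0, "knn": 0.0}
print(ensemble_resistance(cell_probs, model_weights))
```

Computing this score for every captured cell gives a distribution of resistance probabilities within a sample, which is the kind of per-cell output the abstract describes plotting.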

