A random walk down personalized single-cell networks: predicting the response of any gene to any drug for any patient

2019 ◽  
Author(s):  
Haripriya Harikumar ◽  
Thomas P. Quinn ◽  
Santu Rana ◽  
Sunil Gupta ◽  
Svetha Venkatesh

Abstract Background: The last decade has seen a major increase in the availability of genomic data. This includes expert-curated databases that describe the biological activity of genes, as well as high-throughput assays that measure the gene expression of bulk tissue and single cells. Integrating these heterogeneous data sources can generate new hypotheses about biological systems. Our primary objective is to combine population-level drug-response data with patient-level single-cell expression data to predict how any gene will respond to any drug for any patient. Methods: We use a “dual-channel” random walk with restart (RWR) algorithm to perform 3 analyses. First, we use glioblastoma single cells from 5 individual patients to discover genes whose functions differ between cancers. Second, we use drug screening data from the Library of Integrated Network-Based Cellular Signatures (LINCS) to show how a cell-specific drug-response signature can be accurately predicted from a baseline (drug-free) gene co-expression network. Finally, we combine both data streams to show how the RWR algorithm can predict how any gene will respond to any drug for each of the 5 glioblastoma patients. Conclusions: Our manuscript introduces two innovations to the integration of heterogeneous biological data. First, we use a “dual-channel” RWR method to predict up-regulation and down-regulation separately. Second, we use individualized single-cell gene co-expression networks to make personalized predictions. These innovations let us predict gene function and drug response for individual patients. When applied to real data, we identify a number of genes that exhibit a patient-specific drug response, including the pan-cancer oncogene EGFR.

2020 ◽  
Author(s):  
Haripriya Harikumar ◽  
Thomas P Quinn ◽  
Santu Rana ◽  
Sunil Gupta ◽  
Svetha Venkatesh

Abstract Background: The last decade has seen a major increase in the availability of genomic data. This includes expert-curated databases that describe the biological activity of genes, as well as high-throughput assays that measure the gene expression of bulk tissue and single cells. Integrating these heterogeneous data sources can generate new hypotheses about biological systems. Our primary objective is to combine population-level drug-response data with patient-level single-cell expression data to predict how any gene will respond to any drug for any patient. Methods: We use a “dual-channel” random walk with restart algorithm to perform 3 analyses. First, we use glioblastoma single cells from 5 individual patients to discover genes whose functions differ between cancers. Second, we use drug screening data from the Library of Integrated Network-Based Cellular Signatures (LINCS) to show how a cell-specific drug-response signature can be accurately predicted from a baseline (drug-free) gene co-expression network. Finally, we combine both data streams to show how we can predict how any gene will respond to any drug for each of the 5 glioblastoma patients. Conclusions: Our manuscript introduces two innovations to the integration of heterogeneous biological data. First, we use a “dual-channel” method to predict up-regulation and down-regulation separately. Second, we use individualized single-cell gene co-expression networks to make personalized predictions. These innovations let us predict gene function and drug response for individual patients. When applied to real data, we identify a number of genes that exhibit a patient-specific drug response, including the pan-cancer oncogene EGFR.


2020 ◽  
Author(s):  
Haripriya Harikumar ◽  
Thomas P Quinn ◽  
Santu Rana ◽  
Sunil Gupta ◽  
Svetha Venkatesh

Abstract Background: The last decade has seen a major increase in the availability of genomic data. This includes expert-curated databases that describe the biological activity of genes, as well as high-throughput assays that measure gene expression in bulk tissue and single cells. Integrating these heterogeneous data sources can generate new hypotheses about biological systems. Our primary objective is to combine population-level drug-response data with patient-level single-cell expression data to predict how any gene will respond to any drug for any patient. Methods: We take 2 approaches to benchmarking a “dual-channel” random walk with restart (RWR) for data integration. First, we evaluate how well RWR can predict known gene functions from single-cell gene co-expression networks. Second, we evaluate how well RWR can predict known drug responses from individual cell networks. We then present two exploratory applications. In the first application, we combine the Gene Ontology database with glioblastoma single cells from 5 individual patients to identify genes whose functions differ between cancers. In the second application, we combine the LINCS drug-response database with the same glioblastoma data to identify genes that may exhibit patient-specific drug responses. Conclusions: Our manuscript introduces two innovations to the integration of heterogeneous biological data. First, we use a “dual-channel” method to predict up-regulation and down-regulation separately. Second, we use individualized single-cell gene co-expression networks to make personalized predictions. These innovations let us predict gene function and drug response for individual patients. Taken together, our work shows promise that single-cell co-expression data could be combined in heterogeneous information networks to facilitate precision medicine.


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Haripriya Harikumar ◽  
Thomas P. Quinn ◽  
Santu Rana ◽  
Sunil Gupta ◽  
Svetha Venkatesh

Abstract Background The last decade has seen a major increase in the availability of genomic data. This includes expert-curated databases that describe the biological activity of genes, as well as high-throughput assays that measure gene expression in bulk tissue and single cells. Integrating these heterogeneous data sources can generate new hypotheses about biological systems. Our primary objective is to combine population-level drug-response data with patient-level single-cell expression data to predict how any gene will respond to any drug for any patient. Methods We take 2 approaches to benchmarking a “dual-channel” random walk with restart (RWR) for data integration. First, we evaluate how well RWR can predict known gene functions from single-cell gene co-expression networks. Second, we evaluate how well RWR can predict known drug responses from individual cell networks. We then present two exploratory applications. In the first application, we combine the Gene Ontology database with glioblastoma single cells from 5 individual patients to identify genes whose functions differ between cancers. In the second application, we combine the LINCS drug-response database with the same glioblastoma data to identify genes that may exhibit patient-specific drug responses. Conclusions Our manuscript introduces two innovations to the integration of heterogeneous biological data. First, we use a “dual-channel” method to predict up-regulation and down-regulation separately. Second, we use individualized single-cell gene co-expression networks to make personalized predictions. These innovations let us predict gene function and drug response for individual patients. Taken together, our work shows promise that single-cell co-expression data could be combined in heterogeneous information networks to facilitate precision medicine.
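
For readers unfamiliar with random walk with restart, the following is a minimal sketch of the generic algorithm on a toy network. It is not the authors' implementation: the four-gene path network, the seed choices, the restart probability of 0.3, and the function name `random_walk_with_restart` are all illustrative assumptions. The "dual-channel" idea is approximated here simply by running the walk twice, once seeded from a hypothetical up-regulated gene and once from a hypothetical down-regulated gene.

```python
import numpy as np

def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Generic RWR: iterate p <- (1 - r) * W p + r * p0 until convergence."""
    A = np.asarray(adjacency, dtype=float)
    col_sums = A.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    W = A / col_sums                       # column-normalised transition matrix
    p0 = np.zeros(A.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # restart distribution over the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy co-expression network (assumption): a path of four genes, 0 - 1 - 2 - 3.
network = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Two separate "channels" (assumption): one walk per direction of regulation.
up_scores = random_walk_with_restart(network, seeds=[0])    # seeded at an "up" gene
down_scores = random_walk_with_restart(network, seeds=[3])  # seeded at a "down" gene
```

Each score vector sums to 1 and ranks genes by their proximity to the seeds, so comparing `up_scores` with `down_scores` for a given gene gives one rough way to read the two channels side by side.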


2020 ◽  
Vol 117 (46) ◽  
pp. 28784-28794
Author(s):  
Sisi Chen ◽  
Paul Rivaud ◽  
Jong H. Park ◽  
Tiffany Tsou ◽  
Emeric Charles ◽  
...  

Single-cell measurement techniques can now probe gene expression in heterogeneous cell populations from the human body across a range of environmental and physiological conditions. However, new mathematical and computational methods are required to represent and analyze gene-expression changes that occur in complex mixtures of single cells as they respond to signals, drugs, or disease states. Here, we introduce a mathematical modeling platform, PopAlign, that automatically identifies subpopulations of cells within a heterogeneous mixture and tracks gene-expression and cell-abundance changes across subpopulations by constructing and comparing probabilistic models. Probabilistic models provide a low-error, compressed representation of single-cell data that enables efficient large-scale computations. We apply PopAlign to analyze the impact of 40 different immunomodulatory compounds on a heterogeneous population of donor-derived human immune cells as well as patient-specific disease signatures in multiple myeloma. PopAlign scales to comparisons involving tens to hundreds of samples, enabling large-scale studies of natural and engineered cell populations as they respond to drugs, signals, or physiological change.


2018 ◽  
Author(s):  
Jingtian Zhou ◽  
Jianzhu Ma ◽  
Yusi Chen ◽  
Chuankai Cheng ◽  
Bokan Bao ◽  
...  

3D genome structure plays a pivotal role in gene regulation and cellular function. Single-cell analysis of genome architecture has been achieved using imaging and chromatin conformation capture methods such as Hi-C. To study variation in chromosome structure between different cell types, computational approaches are needed that can utilize sparse and heterogeneous single-cell Hi-C data. However, few methods exist that are able to accurately and efficiently cluster such data into constituent cell types. Here, we describe HiCluster, a single-cell clustering algorithm for Hi-C contact matrices that is based on imputations using linear convolution and random walk. Using both simulated and real data as benchmarks, HiCluster significantly improves clustering accuracy when applied to low coverage Hi-C datasets compared to existing methods. After imputation by HiCluster, structures similar to topologically associating domains (TADs) could be identified within single cells, and their consensus boundaries among cells were enriched at the TAD boundaries observed in bulk samples. In summary, HiCluster facilitates visualization and comparison of single-cell 3D genomes.


2021 ◽  
Vol 3 (2) ◽  
Author(s):  
Xin Wang ◽  
Jane Frederick ◽  
Hongbin Wang ◽  
Sheng Hui ◽  
Vadim Backman ◽  
...  

Abstract The transcriptional plasticity of cancer cells promotes intercellular heterogeneity in response to anticancer drugs and facilitates the emergence of surviving cell subpopulations. Characterizing single-cell transcriptional heterogeneity after drug treatments can provide mechanistic insights into drug efficacy. Here, we used single-cell RNA-seq to examine transcriptomic profiles of cancer cells treated with paclitaxel, celecoxib and the combination of the two drugs. By normalizing the expression of endogenous genes to spike-in molecules, we found that cellular mRNA abundance shows dynamic regulation after drug treatment. Using a random forest model, we identified gene signatures classifying single cells into three states: transcriptional repression, amplification and control-like. Treatment with paclitaxel or celecoxib alone generally repressed gene transcription across single cells. Interestingly, the drug combination resulted in transcriptional amplification and hyperactivation of the mitochondrial oxidative phosphorylation pathway, which was linked to enhanced cell-killing efficiency. Finally, we identified a regulatory module enriched with metabolism and inflammation-related genes activated in a subpopulation of paclitaxel-treated cells, the expression of which predicted paclitaxel efficacy across cancer cell lines and in vivo patient samples. Our study highlights the dynamic global transcriptional activity driving single-cell heterogeneity during drug response and emphasizes the importance of adding spike-in molecules to study gene expression regulation using single-cell RNA-seq.


Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. 4249-4249
Author(s):  
Amit Kumar Mitra ◽  
Ujjal Mukherjee ◽  
Taylor Harding ◽  
Holly Stessman ◽  
Ying Li ◽  
...  

Abstract Multiple myeloma (MM) is characterized by significant genetic diversity at subclonal levels that likely plays a defining role in the heterogeneity of tumor progression, clinical aggressiveness and drug sensitivity. Such heterogeneity is a driving factor in the evolution of MM, from founder clones through outgrowth of subclonal fractions. DNA sequencing studies on MM samples have indeed demonstrated such heterogeneity in subclonal architecture at diagnosis based on recurrent mutations in pathologically relevant genes that may ultimately lead to relapse. However, no study so far has reported a predictive gene expression signature that can identify, distinguish and quantify drug-sensitive and drug-resistant subpopulations within a bulk population of myeloma cells. In recent years, our laboratory has successfully developed a gene expression profile (GEP)-based signature that could not only distinguish drug response of MM cell lines, but also was effective in stratifying patient outcomes when applied to GEP profiles from MM clinical trials using proteasome inhibitors (PI) as chemotherapeutic agents. Further, we noted that myeloma cell lines that responded to the drug often contained a residual subpopulation of cells that did not respond, and that these cells were likely selectively propagated during drug treatment in vitro and in patients. In this study, we performed targeted qRT-PCR analysis of single cells using a gene panel that included PI sensitivity genes and gene signatures that could discriminate between low- and high-risk myeloma, followed by intensive bioinformatics and statistical analysis for the classification and prediction of PI response in individual cells within bulk multiple myeloma tumors.
Fluidigm's C1 Single-Cell Auto Prep System was used to perform automated single-cell capture, processing and cDNA synthesis on 576 pre-treatment cells from 12 cell lines representing a wide range of PI sensitivity and 370 cells from 7 patient samples undergoing PI treatment, followed by targeted gene expression profiling of single cells using automated, high-throughput on-chip qRT-PCR analysis with 96.96 Dynamic Array IFCs on the BioMark HD System. The probability of resistance for each individual cell was predicted using a pipeline that employed the machine learning methods Random Forest, Support Vector Machine (radial and sigmoidal), LASSO and kNN (k Nearest Neighbor) for making single-cell GEP data-driven predictions/decisions. The weighted probabilities from each of the algorithms were used to quantify the resistance of each individual cell and were plotted using an ensemble forecasting algorithm. Using our drug response GEP signature at the single-cell level, we could successfully identify distinct subpopulations of tumor cells that were predicted to be sensitive or resistant to PIs. Subsequently, we developed an R statistical analysis package (http://cran.r-project.org), SCATTome (Single Cell Analysis of Targeted Transcriptome), that can restructure data obtained from a Fluidigm qPCR analysis run, filter missing data, perform scaling of filtered data, build classification models, predict the drug response of individual cells, and classify each cell's probability of response based on the targeted transcriptome. We will present the program output as graphical displays of single-cell response probabilities. This package provides a novel classification method that has the potential to predict subclonal response to a variety of therapeutic agents.
Disclosures Kumar: Skyline: Consultancy, Honoraria; BMS: Consultancy; Onyx: Consultancy, Research Funding; Sanofi: Consultancy, Research Funding; Janssen: Consultancy, Research Funding; Novartis: Research Funding; Takeda: Consultancy, Research Funding; Celgene: Consultancy, Research Funding.


Author(s):  
Tania Velletri ◽  
Carlo Emanuele Villa ◽  
Domenica Cilli ◽  
Bianca Barzaghi ◽  
Pietro Lo Riso ◽  
...  

Abstract High Grade Serous Ovarian cancer (HGSOC) is a major unmet need in oncology, due to its precocious dissemination and the lack of meaningful human models for the investigation of disease pathogenesis in a patient-specific manner. To overcome this roadblock, we present a new method to isolate and grow single cells directly from patients’ metastatic ascites, establishing the conditions for propagating them as 3D cultures that we refer to as single cell-derived metastatic ovarian cancer spheroids (sMOCS). By single cell RNA sequencing (scRNAseq) we define the cellular composition of metastatic ascites and trace its propagation in 2D and 3D culture paradigms, finding that sMOCS retain and amplify key subpopulations from the original patients’ samples and recapitulate features of the original metastasis that do not emerge from classical 2D culture, including retention of individual patients’ specificities. By enabling the enrichment of uniquely informative cell subpopulations from HGSOC metastasis and the clonal interrogation of their diversity at the functional and molecular level, this method provides a powerful instrument for precision oncology in ovarian cancer.


2019 ◽  
Author(s):  
Kazumitsu Maehara ◽  
Yasuyuki Ohkawa

Abstract Single-cell analysis is a powerful technique used to identify a specific cell population of interest during differentiation, aging, or oncogenesis. Individual cells occupy a particular transient state in the cell cycle, circadian rhythm, or during cell death. An appealing concept of pseudo-time trajectory analysis of single-cell RNA sequencing data was proposed in the software Monocle, and several methods of trajectory analysis have since been published. These aim to infer the ordering of cells and enable the tracing of gene expression profile trajectories in cell differentiation and reprogramming. However, the methods are restricted in terms of time structure because of the pre-specified structure of trajectories (linear, branched, tree or cyclic), which contrasts with the mixed state of single cells. Here, we propose a technique to extract underlying flows in single-cell data based on the Hodge decomposition (HD). HD is a theorem of vector fields on a manifold which guarantees that any given flow can be decomposed into three types of orthogonal components: gradient-flow (acyclic), curl-, and harmonic-flow (cyclic). HD is generalized on a simplicial complex (graph) and the discretized HD has only a weak assumption that the graph is directed. Therefore, in principle, HD can extract flows from any mixture of tree and cyclic time flows of observed cells. The decomposed flows provide intuitive interpretations of complex flow because of their linearity and orthogonality. Thus, each extracted flow can be focused on separately with no need to consider crosstalk. We developed ddhodge software, which aims to model the underlying flow structure that implies unobserved time or causal relations in the hodge-podge collection of data points.
We demonstrated that the mathematical framework of HD is suitable to reconstruct a sparse graph representation of the diffusion process as a candidate model of differentiation while preserving the divergence of the original fully-connected graph. The preserved divergence can be used as an indicator of the source and sink cells in the observed population. A sparse graph representation of the diffusion process transforms data analysis of the non-linear structure embedded in the high-dimensional space of single-cell data into inspection of the visible flow using graph algorithms. Hence, ddhodge is a suitable toolkit to visualize, inspect, and subsequently interpret large data sets including, but not limited to, high-throughput measurements of biological data. The beta version of the ddhodge R package is available at: https://github.com/kazumits/ddhodge
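
To make the decomposition concrete, here is a minimal sketch of a discrete Hodge-style split of an edge flow into a gradient (acyclic) part and a cyclic remainder, using ordinary least squares. It does not use or reproduce the ddhodge package: the function name `split_gradient_flow`, the three-node graph, and the flow values are illustrative assumptions, and the sketch does not separate the cyclic remainder further into its curl and harmonic parts.

```python
import numpy as np

def split_gradient_flow(n_nodes, edges, flow):
    """Split an edge flow into a gradient part and a cyclic remainder."""
    # Incidence matrix B: one row per directed edge (src -> dst).
    B = np.zeros((len(edges), n_nodes))
    for row, (src, dst) in enumerate(edges):
        B[row, src] = -1.0
        B[row, dst] = 1.0
    flow = np.asarray(flow, dtype=float)
    # Node potentials whose differences best explain the flow (least squares).
    potential, *_ = np.linalg.lstsq(B, flow, rcond=None)
    gradient = B @ potential      # acyclic component
    cyclic = flow - gradient      # divergence-free remainder (curl + harmonic)
    return potential, gradient, cyclic

# Toy directed graph (assumption): a triangle 0 -> 1 -> 2 -> 0.
edges = [(0, 1), (1, 2), (2, 0)]
# Flow built as a gradient part [1, 2, -3] plus a circulation [1, 1, 1].
flow = [2.0, 3.0, -2.0]

potential, gradient, cyclic = split_gradient_flow(3, edges, flow)
```

On this toy input the two recovered components are orthogonal, and the potentials are determined only up to an additive constant, so only their differences along edges are meaningful.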


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii412-iii413
Author(s):  
Bradley Gampel ◽  
Luca Szalontay ◽  
Wenting Zhao ◽  
James Garvin ◽  
Chankrit Sethi ◽  
...  

Abstract Children with relapsed brain tumors are less responsive to treatment. These children often receive therapies without having any robust predictive method of potential benefit. Acute slice culturing (ASC) is a methodology permitting freshly operated tumor to undergo a culturing process preserving the tumor’s micro-environment. With the current study, we investigated the feasibility of obtaining therapeutically meaningful data in a timely manner (3–5 days), performing direct drug testing and single cell sequencing using ASC. Previously, we have combined ex vivo slices of intact, patient-derived Glioblastoma tissue with single-cell RNA-seq for small-scale drug screening and assessment of patient and cell type-specific drug responses. We generated slices from preclinical mouse glioma models and surgical specimens from adult Glioblastoma patients, as well as from children with relapsed Ependymomas, Medulloblastomas, and Gliomas. We demonstrated that these acute slices preserved both the tumor heterogeneity and tumor microenvironment observed in single-cell RNA-seq of cells directly isolated from tumor tissue. Testing drug responses, we then treated tissue slices from the Glioblastoma mouse models and different patients with multiple drugs and combinations. This technique allowed us to identify drug-induced transcriptional responses in specific subpopulations of tumor cells, patient-specific drug sensitivities, and drug effects conserved in both mouse and human tumors. Preliminary data suggest that we can apply this procedure within 5–7 days and provide real-time drug screening/single cell sequencing ASC results to Recurrent/Progressive pediatric Low-Grade Gliomas, High Grade Gliomas, Ependymomas and Medulloblastomas.

