Single Cell RNA Sequencing Reveals Heterogeneity of Human MSC Chondrogenesis: Lasso Regularized Logistic Regression to Identify Gene and Regulatory Signatures

2019 ◽  
Author(s):  
Nguyen P.T. Huynh ◽  
Natalie H. Kelly ◽  
Dakota B. Katz ◽  
Minh Pham ◽  
Farshid Guilak

Abstract: Bone marrow-derived mesenchymal stem cells (MSCs) exhibit the potential to undergo chondrogenesis in vitro, forming de novo tissues with a cartilage-like extracellular matrix that is rich in glycosaminoglycan and collagen type II. However, it is now apparent that MSCs comprise an inhomogeneous population of cells, and the fate of individual subpopulations during this differentiation process is not well understood. We analyzed the trajectory of MSC differentiation during chondrogenesis using single-cell RNA sequencing (scRNA-seq). Using a machine learning technique, lasso regularized logistic regression, we showed that multiple subpopulations of cells existed at all stages during MSC chondrogenesis and were better defined by transcription factor activity than by gene expression. Trajectory analysis indicated that subpopulations of MSCs were not intrinsically specified or restricted, but instead remained multipotent and could differentiate into three main cell types: cartilage, hypertrophic cartilage, and bone. Lasso regularized logistic regression offered several advantages for scRNA-seq analysis, namely identification of a small number of highly influential genes or transcription factors for downstream validation, and cell type classification with high accuracy. Additionally, we showed that the MSC differentiation trajectory may exhibit donor-to-donor variation, although key influential pathways were comparable between donors. Our data provide an important resource to study gene expression and to deconstruct gene regulatory networks in MSC differentiation.
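
As an illustration of why lasso regularized logistic regression yields a small set of influential genes, the following is a minimal sketch, not the authors' implementation. It fits an L1-penalized logistic model by proximal gradient descent on a toy matrix. The function names, the toy data, the learning rate, and the penalty values are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def fit_lasso_logistic(X, y, lam, lr=0.5, n_iter=500):
    """L1-penalized logistic regression via proximal gradient descent.

    X: list of cells, each a list of (centered) gene values.
    y: list of 0/1 cell-type labels.
    lam: L1 penalty strength (hypothetical value, normally tuned).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * row[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        # Gradient step on the loss, then shrinkage from the L1 penalty.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

# Toy data (assumed): gene 0 separates the two classes, gene 1 is
# unrelated to the label, gene 2 is never expressed.
X = [[1, 1, 0], [1, -1, 0], [1, 1, 0], [1, -1, 0],
     [-1, 1, 0], [-1, -1, 0], [-1, 1, 0], [-1, -1, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_lasso_logistic(X, y, lam=0.1)
selected_genes = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy case only gene 0 keeps a nonzero coefficient, which mirrors the abstract's point that the penalty singles out a few features for downstream validation. With a sufficiently large penalty every coefficient is shrunk to zero, so the penalty strength would normally be chosen by cross-validation.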


2019 ◽  
Author(s):  
Katelyn Donahue ◽  
Yaqing Zhang ◽  
Veerin Sirihorachai ◽  
Stephanie The ◽  
Arvind Rao ◽  
...  


2019 ◽  
Author(s):  
Daniel Osorio ◽  
Xue Yu ◽  
Peng Yu ◽  
Erchin Serpedin ◽  
James J. Cai

Abstract: In biomedical research, lymphoblastoid cell lines (LCLs), often established by in vitro infection of resting B cells with Epstein-Barr virus, are commonly used as surrogates for peripheral blood lymphocytes. Genomic and transcriptomic information on LCLs has been used to study the impact of genetic variation on gene expression in humans. Here we present single-cell RNA sequencing (scRNA-seq) data on GM12878 and GM18502, two LCLs derived from the blood of female donors of European and African ancestry, respectively. Cells from three samples (the two LCLs and a 1:1 mixture of the two) were prepared separately using a 10X Genomics Chromium Controller and deeply sequenced. The final dataset contained 7,045 cells from GM12878, 5,189 from GM18502, and 5,820 from the mixture, offering valuable information on single-cell gene expression in highly homogeneous cell populations. This dataset is a suitable reference for population differentiation in gene expression at the single-cell level. Data from the mixture provide additional valuable information facilitating the development of statistical methods for data normalization and batch effect correction.



2020 ◽  
Author(s):  
Weimiao Wu ◽  
Qile Dai ◽  
Yunqing Liu ◽  
Xiting Yan ◽  
Zuoheng Wang

Abstract: Single-cell RNA sequencing provides an opportunity to study gene expression at single-cell resolution. However, prevalent dropout events result in high data sparsity and noise that may obscure downstream analyses. We propose a novel method, G2S3, that imputes dropouts by borrowing information from adjacent genes in a sparse gene graph learned from gene expression profiles across cells. We applied G2S3 and other existing methods to seven single-cell datasets to compare their performance. Our results demonstrated that G2S3 is superior in recovering true expression levels, identifying cell subtypes, improving differential expression analyses, and recovering gene regulatory relationships, especially for mildly expressed genes.
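
The idea of borrowing information from adjacent genes in a gene graph can be sketched as follows. This is not the G2S3 algorithm: the graph is assumed to be given rather than learned, and the blending weight `alpha`, the function name, and the toy values are hypothetical. Each gene's expression is mixed with the weighted mean of its graph neighbors in the same cell.

```python
def impute_with_gene_graph(expr, adjacency, alpha=0.5):
    """One smoothing step over a gene graph.

    expr[g][c]: expression of gene g in cell c.
    adjacency[g][h]: nonnegative edge weight between genes g and h
                     (assumed given here, not learned from data).
    alpha: share of the value taken from neighboring genes (assumed).
    """
    n_genes, n_cells = len(expr), len(expr[0])
    imputed = []
    for g in range(n_genes):
        weight_sum = sum(adjacency[g])
        row = []
        for c in range(n_cells):
            if weight_sum == 0:
                # Isolated gene: nothing to borrow from, keep the value.
                row.append(expr[g][c])
                continue
            neighbor_mean = sum(
                adjacency[g][h] * expr[h][c] for h in range(n_genes)
            ) / weight_sum
            row.append((1 - alpha) * expr[g][c] + alpha * neighbor_mean)
        imputed.append(row)
    return imputed

# Toy example (assumed): genes 0 and 1 are linked, gene 2 is isolated.
# Gene 0 has a possible dropout (0.0) in the second cell.
expr = [[4.0, 0.0],
        [4.0, 4.0],
        [0.0, 1.0]]
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0]]

imputed = impute_with_gene_graph(expr, adjacency)
```

The zero for gene 0 in the second cell is raised toward the value of its neighbor, while the isolated gene is left unchanged. The same step also pulls the neighbor's value down, which is one reason practical methods depend on a carefully learned, sparse graph.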



eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Christopher A Jackson ◽  
Dayanne M Castro ◽  
Giuseppe-Antonio Saldi ◽  
Richard Bonneau ◽  
David Gresham

Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and accessing the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse transcriptionally barcoded gene deletion mutants in 11 different environmental conditions and determined their expression state by sequencing 38,285 individual cells. We benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning and constructed a global gene regulatory network comprising 12,228 interactions.



Author(s):  
Zilong Zhang ◽  
Feifei Cui ◽  
Chunyu Wang ◽  
Lingling Zhao ◽  
Quan Zou

Abstract: Single-cell RNA sequencing (scRNA-seq) has enabled researchers to study gene expression at the cellular level. However, due to the extremely low levels of transcripts in a single cell and technical losses during reverse transcription, gene expression at single-cell resolution is usually noisy and high-dimensional; thus, statistical analyses of single-cell data are a challenge. Although many scRNA-seq data analysis tools are currently available, no gold-standard pipeline suits all datasets. Therefore, a general understanding of bioinformatics and associated computational issues would facilitate the selection of appropriate tools for a given set of data. In this review, we provide an overview of the goals and most popular computational analysis tools for the quality control, normalization, imputation, feature selection and dimension reduction of scRNA-seq data.
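
Two of the steps listed in the abstract above, quality control and normalization, can be illustrated with a minimal sketch. It is not taken from the review or from any specific tool: the thresholds (`min_genes`, `target_sum`), the function names, and the toy count matrix are assumptions. It drops cells that express too few genes, scales each remaining cell to a common total count, and applies a log transform.

```python
import math

def filter_cells(counts, min_genes=2):
    """Simple QC rule: keep cells with at least min_genes detected genes."""
    return [cell for cell in counts if sum(1 for c in cell if c > 0) >= min_genes]

def normalize_counts(counts, target_sum=10_000):
    """Scale each cell (row) to target_sum total counts, then apply log1p."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # An empty cell stays all zeros instead of dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c * target_sum / total) for c in cell])
    return normalized

# Toy count matrix (assumed): rows are cells, columns are genes.
raw_counts = [[1, 1, 2],     # shallowly sequenced cell
              [10, 10, 20],  # same proportions, ten times the depth
              [0, 0, 5]]     # only one gene detected, removed by QC

kept = filter_cells(raw_counts)
normalized = normalize_counts(kept)
```

After normalization the first two cells have identical profiles, because they differ only in sequencing depth. Removing that depth difference is the purpose of the scaling step.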



2021 ◽  
Vol 17 (5) ◽  
pp. e1009029
Author(s):  
Weimiao Wu ◽  
Yunqing Liu ◽  
Qile Dai ◽  
Xiting Yan ◽  
Zuoheng Wang

Single-cell RNA sequencing technology provides an opportunity to study gene expression at single-cell resolution. However, prevalent dropout events result in high data sparsity and noise that may obscure downstream analyses in single-cell transcriptomic studies. We propose a new method, G2S3, that imputes dropouts by borrowing information from adjacent genes in a sparse gene graph learned from gene expression profiles across cells. We applied G2S3 and ten existing imputation methods to eight single-cell transcriptomic datasets and compared their performance. Our results demonstrated that G2S3 has superior overall performance in recovering gene expression, identifying cell subtypes, reconstructing cell trajectories, identifying differentially expressed genes, and recovering gene regulatory and correlation relationships. Moreover, G2S3 is computationally efficient for imputation in large-scale single-cell transcriptomic datasets.



2019 ◽  
Author(s):  
Christopher A Jackson ◽  
Dayanne M Castro ◽  
Giuseppe-Antonio Saldi ◽  
Richard Bonneau ◽  
David Gresham

Abstract: Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and accessing the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for transcriptionally barcoding gene deletion mutants and performing scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse genotypes in 11 different environmental conditions and determined their expression state by sequencing 38,285 individual cells. We developed and benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning and constructed a global gene regulatory network comprising 12,018 interactions. Our study establishes a general approach to gene regulatory network reconstruction from scRNAseq data that can be employed in any organism.



Author(s):  
Sarah A. Dugger ◽  
Ryan S. Dhindsa ◽  
Gabriela De Almeida Sampaio ◽  
Elizabeth E. Rafikian ◽  
Sabrina Petri ◽  
...  

Abstract: Heterozygous de novo loss-of-function mutations in the gene expression regulator HNRNPU cause an early-onset developmental and epileptic encephalopathy. To gain insight into pathological mechanisms and lay the groundwork for developing targeted therapies, we characterized the neurophysiologic and cell-type-specific transcriptomic consequences of a mouse model of HNRNPU haploinsufficiency. Heterozygous mutants demonstrated neuroanatomical abnormalities, global developmental delay, impaired ultrasonic vocalizations, and increased seizure susceptibility, thus modeling aspects of the human disease. Single-cell RNA sequencing of hippocampal and neocortical cells revealed widespread, yet modest, dysregulation of gene expression across mutant neuronal subtypes. We observed an increased burden of differentially expressed genes in mutant excitatory neurons of the subiculum, a region of the hippocampus implicated in temporal lobe epilepsy. Evaluation of transcriptomic signature reversal as a therapeutic strategy highlighted the potential importance of generating cell-type-specific signatures. Overall, this work provides insight into HNRNPU-mediated disease mechanisms and provides a framework for using single-cell RNA sequencing to study transcriptional regulators implicated in disease.



2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David S. Fischer ◽  
Meshal Ansari ◽  
Karolin I. Wagner ◽  
Sebastian Jarosch ◽  
Yiqi Huang ◽  
...  

Abstract: The in vivo phenotypic profile of T cells reactive to severe acute respiratory syndrome (SARS)-CoV-2 antigens remains poorly understood. Conventional methods to detect antigen-reactive T cells require in vitro antigenic re-stimulation or highly individualized peptide-human leukocyte antigen (pHLA) multimers. Here, we use single-cell RNA sequencing to identify and profile SARS-CoV-2-reactive T cells from Coronavirus Disease 2019 (COVID-19) patients. To do so, we induce transcriptional shifts by antigenic stimulation in vitro and take advantage of natural T cell receptor (TCR) sequences of clonally expanded T cells as barcodes for ‘reverse phenotyping’. This allows identification of SARS-CoV-2-reactive TCRs and reveals phenotypic effects introduced by antigen-specific stimulation. We characterize transcriptional signatures of currently and previously activated SARS-CoV-2-reactive T cells, and show correspondence with phenotypes of T cells from the respiratory tract of patients with severe disease in the presence or absence of virus in independent cohorts. Reverse phenotyping is a powerful tool to provide an integrated insight into cellular states of SARS-CoV-2-reactive T cells across tissues and activation states.




