Integrating RNA-seq and assay for transposase-accessible chromatin by sequencing (ATAC-seq) predicts functionally-relevant chromatin regions

2021 ◽  
Author(s):  
Collin B Merrill ◽  
Austin B Montgomery ◽  
Miguel A Pabon ◽  
Aylin R Rodan ◽  
Adrian Rothenfluh

Gene regulation is critical for proper cellular function. Next-generation sequencing technology has revealed the presence of regulatory networks that control gene expression and essential cellular functions. Studies investigating the epigenome have begun to uncover the complex mechanisms regulating transcription. Assay for transposase-accessible chromatin by sequencing (ATAC-seq) is quickly becoming the assay of choice for epigenomic investigations. Integrating epigenomic and transcriptomic data has the potential to reveal the chromatin-mediated mechanisms regulating transcription. However, integrating these two data types remains challenging. We used the insulin signaling pathway as a model to investigate chromatin regions and gene expression changes using ATAC- and RNA-seq in insulin-treated Drosophila S2 cells. We show that insulin causes widespread changes in chromatin accessibility and gene expression. Then, we attempted to integrate ATAC- and RNA-seq data to predict functionally-relevant chromatin regions that control the transcriptional response to insulin. We show that using differential chromatin accessibility can predict functionally-relevant genome regions, but that stratifying differentially-accessible chromatin regions by annotated feature type provides a better prediction of whether a chromatin region regulates gene expression. In particular, our data demonstrate a strong correlation between chromatin regions annotated to distal promoters (1-2 kb from the transcription start site) and gene expression. To test this prediction, we cloned candidate distal promoter regions upstream of luciferase and validated the functional relevance of these chromatin regions. Our data show that distal promoter regions selected by correlations with RNA-seq are more likely to control gene expression. Thus, correlating ATAC- and RNA-seq data can home in on functionally-relevant chromatin regions.
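The stratification step described in this abstract can be illustrated with a minimal sketch. It assumes each differentially accessible peak has already been annotated with a feature type and paired with the expression change of its assigned gene. The field names (`feature`, `atac_log2fc`, `rna_log2fc`), the `min_peaks` cutoff, the choice of Pearson correlation, and all numbers are hypothetical; the abstract does not specify the authors' exact procedure.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def correlation_by_feature(peaks, min_peaks=3):
    """Correlate accessibility and expression changes separately per feature type."""
    groups = defaultdict(list)
    for peak in peaks:
        groups[peak["feature"]].append((peak["atac_log2fc"], peak["rna_log2fc"]))
    result = {}
    for feature, pairs in groups.items():
        if len(pairs) < min_peaks:  # too few peaks for a meaningful estimate
            continue
        xs, ys = zip(*pairs)
        result[feature] = pearson(xs, ys)
    return result


# Made-up values for illustration only (not data from the study).
toy_peaks = [
    {"feature": "distal_promoter", "atac_log2fc": 1.0, "rna_log2fc": 0.9},
    {"feature": "distal_promoter", "atac_log2fc": -0.5, "rna_log2fc": -0.4},
    {"feature": "distal_promoter", "atac_log2fc": 2.0, "rna_log2fc": 1.8},
    {"feature": "distal_promoter", "atac_log2fc": 0.2, "rna_log2fc": 0.3},
    {"feature": "intron", "atac_log2fc": 1.0, "rna_log2fc": -0.2},
    {"feature": "intron", "atac_log2fc": -0.5, "rna_log2fc": 0.1},
    {"feature": "intron", "atac_log2fc": 2.0, "rna_log2fc": 0.0},
    {"feature": "intron", "atac_log2fc": 0.2, "rna_log2fc": 0.3},
]

by_feature = correlation_by_feature(toy_peaks)
for feature, r in sorted(by_feature.items()):
    print(f"{feature}: r = {r:.2f}")
```

Feature types whose peaks show a high correlation would then be the first candidates for functional testing, for example with a reporter assay.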

2019 ◽  
Vol 97 (Supplement_2) ◽  
pp. 15-16
Author(s):  
Sylvain Foissac ◽  
Sarah Djebali ◽  
Kylie Munyard ◽  
Nathalie Vialaneix ◽  
Andrea Rau ◽  
...  

Abstract Improving the functional annotation of animal genomes is a key challenge in bridging the gap between genotype and phenotype, thus enabling predictive biology. Regarding livestock production, major outcomes are expected from a better understanding of the genetic architecture underlying quantitative traits. As part of the Functional Annotation of ANimal Genomes action (FAANG: www.faang.org), the FR-AgENCODE project generated omics data to improve the reference annotation of the cattle, pig, goat and chicken genomes. High-throughput molecular assays have been performed on tissues/cells relevant to immune and metabolic traits. From two males and two females per species (pig, cattle, goat, chicken), strand-oriented RNA-seq gene expression and ATAC-seq chromatin accessibility assays were performed on liver and two PBMC-sorted T-cell types (CD4+ and CD8+). Chromosome Conformation Capture (in situ Hi-C) was also carried out on liver samples. About 4,000 samples have been collected at the INRA biorepository and registered at the EBI BioSamples registry. More than 80% of the planned experiments could be completed, generating ~11.5 billion sequencing reads across the 3 assays. While most (50–80%) RNA-seq reads mapped to annotated exons, thousands of novel transcripts were found, with ~60K mRNAs and ~22K lncRNAs in cattle. Differentially expressed genes between cell types were enriched for immunity- or metabolism-related terms, and differentially accessible chromatin regions were identified as potential regulatory sites. Interestingly, correlations between gene expression and promoter accessibility across samples were skewed towards both positive and negative values, suggesting distinct regulatory mechanisms of gene expression. These patterns have been further investigated using human data from the Roadmap Epigenomics Mapping Consortium.
Altogether, this study illustrates the value of a coordinated effort to tackle the genome-to-phenome challenge and provides a useful resource for the community. Availability: www.fragencode.org.
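The observation that expression/accessibility correlations are skewed towards both positive and negative values can be made concrete with a small sketch. It assumes one promoter accessibility value and one expression value per gene per sample. The gene names, the 0.5 threshold, and all numbers are hypothetical, and this is not the FR-AgENCODE analysis itself.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)


def per_gene_correlations(accessibility, expression):
    """Correlate promoter accessibility with expression across samples, gene by gene."""
    result = {}
    for gene, acc in accessibility.items():
        r = pearson(acc, expression[gene])
        if r is not None:  # skip genes with no variation across samples
            result[gene] = r
    return result


def summarise(correlations, threshold=0.5):
    """Count genes at each end of the correlation distribution."""
    positive = sum(1 for r in correlations.values() if r >= threshold)
    negative = sum(1 for r in correlations.values() if r <= -threshold)
    return {"strong_positive": positive, "strong_negative": negative}


# Made-up values: four samples per gene, illustration only.
toy_accessibility = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.0, 2.0, 3.0, 4.0],
    "geneC": [1.0, 2.0, 3.0, 4.0],
}
toy_expression = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with accessibility
    "geneB": [8.0, 6.0, 4.0, 2.0],  # falls as accessibility rises
    "geneC": [5.0, 5.0, 5.0, 5.0],  # constant, so no correlation is defined
}

gene_r = per_gene_correlations(toy_accessibility, toy_expression)
counts = summarise(gene_r)
print(gene_r, counts)
```

Genes in the strongly negative tail are the ones that would suggest a repressive relationship between promoter accessibility and expression.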


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 3824-3824
Author(s):  
Gabriela Krivdova ◽  
Schoof E Erwin ◽  
Veronique Voisin ◽  
Alex Murison ◽  
Karin G. Hermans ◽  
...  

Abstract Background: Residing at the apex of the blood system hierarchy, hematopoietic stem cells (HSCs) are endowed with multi-potency and self-renewal potential. Hematopoietic homeostasis is tightly regulated by controlling the balance between quiescence, self-renewal and lineage-commitment of HSCs. Although many studies have profiled gene expression patterns and epigenomes of HSC and downstream progenitors, post-transcriptional regulation of determinants that control these regulatory networks is largely unknown. MicroRNAs (miRNAs) represent a large class of post-transcriptional regulators that mediate repression of multiple target mRNAs by inhibiting their translation and/or inducing their degradation. A limited number of reports suggest that miRNAs are differentially expressed across the hematopoietic hierarchy and control lineage commitment and cell fate decisions by orchestrating gene regulatory networks; however, the mechanisms remain unexplored. Methods: To identify miRNA(s) that play a functional role in human hematopoiesis, we performed an in vivo competitive repopulation screen in which candidate miRNAs were over-expressed (OE) in human CD34+CD38- umbilical cord blood (CB) cells and subsequently transplanted into immune-deficient mice for 24 weeks. miR-130a was shown to enhance long-term hematopoietic reconstitution and chosen for further investigation. Results: As miRNAs are negative regulators of gene expression, we studied the functional impact of miR-130a on long-term hematopoietic reconstitution by enforcing its expression in CB cells using a lentiviral vector containing an orange fluorescent protein (OFP) reporter. At 12 and 24 weeks after transplantation, increased miR-130a expression conferred a statistically significant competitive advantage to transduced CB cells, as demonstrated by increased human chimerism and the proportion of OFP+/hCD45+ cells in the injected femur (IF), bone marrow (BM) and spleen of recipient mice.
Xenografts produced by miR-130a O/E showed multi-lineage engraftment with myeloid skewing at the expense of B-lymphoid development and significantly enhanced erythroid output in IF, BM and spleen. In addition, ectopic expression of miR-130a caused splenomegaly in recipient mice. Flow cytometry analysis using several markers expressed during erythroid development revealed accumulation of immature GlyA+/CD71+/CD36+ erythroid progenitors, suggesting an erythroid differentiation block. Enforced expression of miR-130a also perturbed myeloid differentiation, as shown by the presence of abnormal CD14+/CD66b+ myeloid cells in the BM. At the primitive and progenitor cell stages, miR-130a O/E caused significant expansion of primitive CD34+/CD38- cells and increased the proportion of immuno-phenotypic HSC. Secondary transplantation involving limiting dilution analysis revealed a 10-fold increase in HSC frequency, consistent with a role of miR-130a in HSC self-renewal. Analysis of chromatin accessibility surrounding the miR-130a locus across the human hematopoietic hierarchy revealed peaks of accessible chromatin in HSC and downstream progenitors that were absent in mature cells. To ascertain the molecular mechanism of miR-130a function, label-free semi-quantitative proteomics was performed to determine differentially expressed proteins between miR-130a O/E and control-transduced CD34+ CB cells. Gene set enrichment analysis (GSEA) identified top miR-130a downregulated gene sets centered on chromatin remodelling. Components of the SMRT/N-CoR co-repressor complex and polycomb repressive complex (PRC2) were identified to be among the top downregulated miR-130a targets. We assessed the impact of miR-130a O/E on the global chromatin accessibility landscape by performing ATAC-seq on CD34+ CB cells transduced with miR-130a or control lentivirus. Enforced expression of miR-130a resulted in a gain of approximately 450 accessible chromatin peaks.
Transcription factor DNA recognition motif analysis revealed significant enrichment of the GATA3 motif in accessible sites specific to miR-130a O/E cells. Conclusion: Together, our data suggest that miR-130a regulates HSC self-renewal and lineage specification. miR-130a mediates repression of several gene networks centered on chromatin remodelling and focally reshapes the accessible chromatin landscape of HSPC. Disclosures No relevant conflicts of interest to declare.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Li Tong ◽  
Po-Yen Wu ◽  
John H. Phan ◽  
Hamid R. Hassazadeh ◽  
...  

Abstract To use next-generation sequencing technology such as RNA-seq for medical and health applications, choosing proper analysis methods for biomarker identification remains a critical challenge for most users. The US Food and Drug Administration (FDA) has led the Sequencing Quality Control (SEQC) project to conduct a comprehensive investigation of 278 representative RNA-seq data analysis pipelines consisting of 13 sequence mapping, three quantification, and seven normalization methods. In this article, we focused on the impact of the joint effects of RNA-seq pipelines on gene expression estimation as well as the downstream prediction of disease outcomes. First, we developed and applied three metrics (i.e., accuracy, precision, and reliability) to quantitatively evaluate each pipeline’s performance on gene expression estimation. We then investigated the correlation between the proposed metrics and the downstream prediction performance using two real-world cancer datasets (i.e., SEQC neuroblastoma dataset and the NIH/NCI TCGA lung adenocarcinoma dataset). We found that RNA-seq pipeline components jointly and significantly impacted the accuracy of gene expression estimation, and its impact was extended to the downstream prediction of these cancer outcomes. Specifically, RNA-seq pipelines that produced more accurate, precise, and reliable gene expression estimation tended to perform better in the prediction of disease outcome. In the end, we provided scenarios as guidelines for users to use these three metrics to select sensible RNA-seq pipelines for the improved accuracy, precision, and reliability of gene expression estimation, which lead to the improved downstream gene expression-based prediction of disease outcome.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 812-812
Author(s):  
Mudit Chaand ◽  
Chris Fiore ◽  
Brian T Johnston ◽  
Diane H Moon ◽  
John P Carulli ◽  
...  

Human beta-like globin gene expression is developmentally regulated. Erythroblasts (EBs) derived from fetal tissues, such as umbilical cord blood (CB), primarily express gamma globin mRNA (HBG) and HbF, while EBs derived from adult tissues, such as bone marrow (BM), predominantly express beta globin mRNA (HBB) and adult hemoglobin. Human genetics has validated de-repression of HBG in adult EBs as a powerful therapeutic paradigm in diseases involving defective HBB, such as sickle cell anemia. To identify novel factors involved in the switch from HBG to HBB expression, and to better understand the global regulatory networks driving the fetal and adult cell states, we performed transcriptome profiling (RNA-seq) and chromatin accessibility profiling (ATAC-seq) on sorted EB cell populations from CB or BM. This approach improves upon previous studies that used unsorted cells (Huang J, Dev Cell 2016) or that did not measure chromatin accessibility (Yan H, Am J Hematol 2018). CD34+ cells from CB and BM were differentiated using a 3-phase in vitro culture system (Giarratana M, Blood 2011). Fluorescence-activated cell sorting and the cell surface markers CD36 and GYPA were used to isolate 7 discrete populations, with each sorting gate representing increasingly mature, stage-matched EBs from CB or BM (Fig 1A, B). RNA-seq analysis revealed expected expression patterns of the beta-like globins, with total levels increasing during erythroid maturation and primarily composed of HBB or HBG transcripts in BM or CB, respectively (Fig 1C). Erythroid maturation led to progressive increases in chromatin accessibility at the HBB promoter in BM populations. In CB-derived cells, erythroid maturation led to progressive increases in chromatin accessibility at the HBG promoters through the CD36+GYPA+ stage (Pops 1-5). 
Chromatin accessibility shifted from the HBG promoters to the HBB promoter during the final stages of differentiation (Pops 6-7), suggesting that HBG gene activation is transient in CB EBs (Fig 1D). Hierarchical clustering and principal component analysis of ATAC-seq data revealed that cell populations cluster based on differentiation stage rather than by BM or CB lineage, suggesting most molecular changes are stage-specific, not lineage-specific (Fig 2A, B). To identify transcription factors driving cell state, and potentially beta-like globin expression preference, we searched for DNA binding motifs within regions of differential chromatin accessibility and found NFI factor motifs enriched under peaks that were larger in BM relative to CB (Fig 2C). Transcription factor footprinting analysis showed that both flanking accessibility and footprint depth at NFI motifs were also increased in BM relative to CB (Fig 2D). Increased chromatin accessibility was observed at the NFIX promoter in BM relative to CB populations, and in HUDEP-2 relative to HUDEP-1 cell lines (Fig 2E). Furthermore, accessibility at the NFIX promoter correlated with elevated NFIX mRNA in BM and HUDEP-2 relative to CB and HUDEP-1, respectively. Together these data implicated NFIX in HbF repression, a finding consistent with previous genome-wide association and DNA methylation studies that suggested a possible role for NFIX in regulating beta-like globin gene expression (Fabrice D, Nat Genet 2016; Lessard S, Genome Med 2015). To directly test the hypothesis that NFIX represses HbF, short hairpin RNAs were used to knock down (KD) NFIX in primary erythroblasts derived from human CD34+ BM cells (Fig 3A). NFIX KD led to a time-dependent induction of HBG mRNA, HbF, and F-cells comparable to KD of the known HbF repressor BCL11A (Fig 3B-D). A similar effect on HbF was observed in HUDEP-2 cells following NFIX KD (Fig 3E).
Consistent with HbF induction, NFIX KD also increased chromatin accessibility and decreased DNA methylation at the HBG promoters in primary EBs (Fig 3F, G). NFIX KD led to a delay in erythroid differentiation as measured by CD36 and GYPA expression (Fig 3H). Despite this delay, by day 14 a high proportion of fully enucleated erythroblasts was observed, suggesting NFIX KD cells are capable of terminal differentiation (Fig 3H). Collectively, these data have enabled identification and validation of NFIX as a novel repressor of HbF, a finding that enhances the understanding of beta-like globin gene regulation and has potential implications in the development of therapeutics for sickle cell disease. Disclosures Chaand: Syros Pharmaceuticals: Employment, Equity Ownership. Fiore: Syros Pharmaceuticals: Employment, Equity Ownership. Johnston: Syros Pharmaceuticals: Employment, Equity Ownership. Moon: Syros Pharmaceuticals: Employment, Equity Ownership. Carulli: Syros Pharmaceuticals: Employment, Equity Ownership. Shearstone: Syros Pharmaceuticals: Employment, Equity Ownership.


2021 ◽  
Author(s):  
Dennis A Sun ◽  
Nipam H Patel

Abstract Emerging research organisms enable the study of biology that cannot be addressed using classical “model” organisms. The development of novel data resources can accelerate research in such animals. Here, we present new functional genomic resources for the amphipod crustacean Parhyale hawaiensis, facilitating the exploration of gene regulatory evolution using this emerging research organism. We use Omni-ATAC-Seq, an improved form of the Assay for Transposase-Accessible Chromatin coupled with next-generation sequencing (ATAC-Seq), to identify accessible chromatin genome-wide across a broad time course of Parhyale embryonic development. This time course encompasses many major morphological events, including segmentation, body regionalization, gut morphogenesis, and limb development. In addition, we use short- and long-read RNA-Seq to generate an improved Parhyale genome annotation, enabling deeper classification of identified regulatory elements. We leverage a variety of bioinformatic tools to discover differential accessibility, predict nucleosome positioning, infer transcription factor binding, cluster peaks based on accessibility dynamics, classify biological functions, and correlate gene expression with accessibility. Using a Minos transposase reporter system, we demonstrate the potential to identify novel regulatory elements using this approach, including distal regulatory elements.
This work provides a platform for the identification of novel developmental regulatory elements in Parhyale, and offers a framework for performing such experiments in other emerging research organisms.
Primary Findings
- Omni-ATAC-Seq identifies cis-regulatory elements genome-wide during crustacean embryogenesis
- Combined short- and long-read RNA-Seq improves the Parhyale genome annotation
- ImpulseDE2 analysis identifies dynamically regulated candidate regulatory elements
- NucleoATAC and HINT-ATAC enable inference of nucleosome occupancy and transcription factor binding
- Fuzzy clustering reveals peaks with distinct accessibility and chromatin dynamics
- Integration of accessibility and gene expression reveals possible enhancers and repressors
- Omni-ATAC can identify known and novel regulatory elements


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Giancarlo Bonora ◽  
Vijay Ramani ◽  
Ritambhara Singh ◽  
He Fang ◽  
Dana L. Jackson ◽  
...  

Abstract Background Mammalian development is associated with extensive changes in gene expression, chromatin accessibility, and nuclear structure. Here, we follow such changes associated with mouse embryonic stem cell differentiation and X inactivation by integrating, for the first time, allele-specific data from these three modalities obtained by high-throughput single-cell RNA-seq, ATAC-seq, and Hi-C. Results Allele-specific contact decay profiles obtained by single-cell Hi-C clearly show that the inactive X chromosome has a unique profile in differentiated cells that have undergone X inactivation. Loss of this inactive X-specific structure at mitosis is followed by its reappearance during the cell cycle, suggesting a “bookmark” mechanism. Differentiation of embryonic stem cells to follow the onset of X inactivation is associated with changes in contact decay profiles that occur in parallel on both the X chromosomes and autosomes. Single-cell RNA-seq and ATAC-seq show evidence of a delay in female versus male cells, due to the presence of two active X chromosomes at early stages of differentiation. The onset of the inactive X-specific structure in single cells occurs later than gene silencing, consistent with the idea that chromatin compaction is a late event of X inactivation. Single-cell Hi-C highlights evidence of discrete changes in nuclear structure characterized by the acquisition of very long-range contacts throughout the nucleus. Novel computational approaches allow for the effective alignment of single-cell gene expression, chromatin accessibility, and 3D chromosome structure. Conclusions Based on trajectory analyses, three distinct nuclear structure states are detected reflecting discrete and profound simultaneous changes not only to the structure of the X chromosomes, but also to that of autosomes during differentiation. 
Our study reveals that long-range structural changes to chromosomes appear as discrete events, unlike progressive changes in gene expression and chromatin accessibility.


2021 ◽  
Vol 15 (Supplement_1) ◽  
pp. S062-S062
Author(s):  
A Lewis ◽  
B Pan-Castillo ◽  
G Berti ◽  
C Felice ◽  
H Gordon ◽  
...  

Abstract Background Histone-deacetylase (HDAC) enzymes are a broad class of ubiquitously expressed enzymes that modulate histone acetylation, chromatin accessibility and gene expression. In models of inflammatory bowel disease (IBD), HDAC inhibitors, such as valproic acid (VPA), are proven anti-inflammatory agents and evidence suggests that they also inhibit fibrosis in non-intestinal organs. However, the role of HDAC enzymes in stricturing Crohn’s disease (SCD) has not been characterised; this is key to understanding the molecular mechanism and developing novel therapies. Methods To evaluate HDAC expression in the intestine of SCD patients, we performed unbiased single-cell RNA sequencing (sc-RNA-seq) of over 10,000 cells isolated from full-thickness surgical resection specimens of non-SCD (NSCD; n=2) and SCD intestine (n=3). Approximately 1,000 fibroblasts were identified for further analysis, including a distinct cluster of myofibroblasts. Changes in gene expression were compared between myofibroblasts and other resident intestinal fibroblasts using the sc-RNA-seq analysis pipeline in Partek. Changes in HDAC expression and markers of HDAC activity (H3K27ac) were confirmed by immunohistochemistry in FFPE tissue from patient-matched NSCD and SCD intestine (n=14 pairs). The function of HDACs in intestinal fibroblasts in the CCD-18co cell line and primary CD myofibroblast cultures (n=16 cultures) was assessed using VPA, a class I HDAC inhibitor. Cells were analysed using a variety of molecular techniques including ATAC-seq, gene expression arrays, qPCR, western blot and immunofluorescent protein analysis. Results Class I HDAC (HDAC1, p= 2.11E-11; HDAC2, p= 4.28E-11; HDAC3, p= 1.60E-07; and HDAC8, p= 2.67E-03) expression was increased in myofibroblasts compared to other intestinal fibroblast subtypes.
IHC also showed an increase in the percentage of stromal HDAC2 positive cells, coupled with a decrease in the percentage of H3K27ac positive cells, in the mucosa overlying SCD intestine relative to matched NSCD areas. In the CCD-18co cell line and primary myofibroblast cultures, VPA reduced chromatin accessibility at Collagen-I gene promoters and suppressed their transcription. VPA also inhibited TGFB-induced up-regulation of Collagen-I, in part by inhibiting TGFB1I1/SMAD4 signalling. TGFB1I1 was identified as a mesenchymal-specific target of VPA and siRNA knockdown of TGFB1I1 was sufficient to suppress TGFB-induced up-regulation of Collagen-I. Conclusion In SCD patients, class I HDAC expression is increased in myofibroblasts. Class I HDAC inhibitors impair TGFB-signalling and inhibit Collagen-I expression. Selective targeting of TGFB1I1 offers the opportunity to increase treatment specificity by selectively targeting mesenchymal cells.


2018 ◽  
Author(s):  
Koen Van Den Berge ◽  
Katharina Hembach ◽  
Charlotte Soneson ◽  
Simone Tiberi ◽  
Lieven Clement ◽  
...  

Gene expression is the fundamental level at which the results of various genetic and regulatory programs are observable. The measurement of transcriptome-wide gene expression has convincingly switched from microarrays to sequencing in a matter of years. RNA sequencing (RNA-seq) provides a quantitative and open system for profiling transcriptional outcomes on a large scale and therefore facilitates a large diversity of applications, including basic science studies, but also agricultural or clinical situations. In the past 10 years or so, much has been learned about the characteristics of the RNA-seq datasets as well as the performance of the myriad of methods developed. In this review, we give an overall view of the developments in RNA-seq data analysis, including experimental design, with an explicit focus on quantification of gene expression and statistical approaches for differential expression. We also highlight emerging data types, such as single-cell RNA-seq and gene expression profiling using long-read technologies.


2017 ◽  
Author(s):  
Mikhail Pachkov ◽  
Piotr J Balwierz ◽  
Phil Arnold ◽  
Andreas J Gruber ◽  
Mihaela Zavolan ◽  
...  

As the costs of high-throughput measurement technologies continue to fall, experimental approaches in biomedicine are increasingly data intensive and the advent of big data is justifiably seen as holding the promise to transform medicine. However, as data volumes mount, researchers increasingly realize that extracting concrete, reliable, and actionable biological predictions from high-throughput data can be very challenging. Our laboratory has pioneered a number of methods for inferring key gene regulatory interactions from high-throughput data. For example, we developed motif activity response analysis (MARA), which models genome-wide gene expression (RNA-Seq, or microarray) and chromatin state (ChIP-Seq) data in terms of comprehensive predictions of regulatory sites for hundreds of mammalian regulators (TFs and micro-RNAs). Using these models, MARA identifies the key regulators driving gene expression and chromatin state changes, the activities of these regulators across the input samples, their target genes, and the sites on the genome through which these regulators act. We recently completely automated MARA in an integrated web-server (ismara.unibas.ch) that allows researchers to analyze their own data by simply uploading RNA-Seq or ChIP-Seq datasets, and provides results in an integrated web interface as well as in downloadable flat form.
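One way to read "models expression in terms of predicted regulatory sites" is as a linear model in which each promoter's expression in a sample is approximated by the sum, over motifs, of the predicted site count times a per-sample motif activity. The sketch below fits such activities for a single sample by ridge-regularised least squares. The linear form, the ridge penalty, the function names, and the toy numbers are all assumptions made for illustration; this is not the ISMARA implementation, which the abstract does not describe in detail.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def motif_activities(site_counts, expression, ridge=1e-6):
    """Fit per-motif activities A so that expression[p] ~ sum_m site_counts[p][m] * A[m].

    site_counts: one row per promoter, one column per motif (predicted site counts).
    expression:  one value per promoter for a single sample (assumed mean-centred).
    """
    n_motifs = len(site_counts[0])
    # Normal equations with a ridge term: (N^T N + ridge * I) A = N^T e
    ntn = [
        [
            sum(row[i] * row[j] for row in site_counts) + (ridge if i == j else 0.0)
            for j in range(n_motifs)
        ]
        for i in range(n_motifs)
    ]
    nte = [
        sum(row[i] * e for row, e in zip(site_counts, expression))
        for i in range(n_motifs)
    ]
    return solve(ntn, nte)


# Toy data: four promoters, two hypothetical motifs, expression generated
# from activities (0.5, -0.3) so the fit can be checked by eye.
toy_sites = [[1, 0], [0, 1], [2, 1], [1, 2]]
toy_expression = [0.5, -0.3, 0.7, -0.1]

activities = motif_activities(toy_sites, toy_expression)
print([round(a, 3) for a in activities])
```

A motif with a large positive or negative fitted activity in a sample would, under this simplified model, be a candidate regulator driving expression changes in that sample.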


2021 ◽  
Author(s):  
Ruifang Li ◽  
Sara A Grimm ◽  
Paul A Wade

Abstract Deciphering epigenetic regulation of gene expression requires measuring the epigenome and transcriptome jointly. Single-cell multi-omics technologies have been developed for concurrent profiling of chromatin accessibility and gene expression. However, multi-omics profiling of low-input bulk samples remains challenging. Therefore, we developed low-input ATAC&mRNA-seq, a simple and robust method for studying the role of chromatin structure in gene regulation in a single experiment with thousands of cells, to maximize insights from limited input material by obtaining ATAC-seq and mRNA-seq data simultaneously from the same cells with data quality comparable to conventional mono-omics assays. Integrative data analysis revealed a similarly strong association between promoter accessibility and gene expression using the data from low-input ATAC&mRNA-seq as using single-assay data, underscoring the accuracy and reliability of our dual-omics assay to generate both data types simultaneously with just thousands of cells. We envision our method being widely applied in many biological disciplines with limited materials.

