Modeling the causal regulatory network by integrating chromatin accessibility and transcriptome data

2016 ◽  
Vol 3 (2) ◽  
pp. 240-251 ◽  
Author(s):  
Yong Wang ◽  
Rui Jiang ◽  
Wing Hung Wong

Abstract Cells pack a large amount of genetic and regulatory information into a structure known as chromatin, in which DNA is wrapped around histone proteins and tightly packed. To express a gene in a specific coding region, the chromatin opens up and a DNA loop may be formed by interacting enhancers and promoters. Furthermore, the mediator and cohesin complexes, sequence-specific transcription factors, and RNA polymerase II are recruited and work together to elaborately regulate the expression level. There is a pressing need to understand how the information about when, where, and to what degree genes should be expressed is embedded into chromatin structure and gene regulatory elements. Thanks to large consortia such as the Encyclopedia of DNA Elements (ENCODE) and Roadmap Epigenomics projects, extensive data on chromatin accessibility and transcript abundance are available across many tissues and cell types. These rich data offer an exciting opportunity to model the causal regulatory relationship. Here, we review the current experimental approaches, foundational data, computational problems, interpretive frameworks, and integrative models that will enable the accurate interpretation of the regulatory landscape. In particular, we discuss the efforts to organize, analyze, model, and integrate the DNA accessibility data, transcriptional data, and functional genomic regions together. We believe that these efforts will eventually help us understand the information flow within the cell and will influence research directions across many fields.

2021 ◽  
Author(s):  
Sneha Gopalan ◽  
Yuqing Wang ◽  
Nicholas W. Harper ◽  
Manuel Garber ◽  
Thomas G Fazzio

Methods derived from CUT&RUN and CUT&Tag enable genome-wide mapping of the localization of proteins on chromatin from as few as one cell. These and other mapping approaches focus on one protein at a time, preventing direct measurements of co-localization of different chromatin proteins in the same cells and requiring prioritization of targets where samples are limiting. Here we describe multi-CUT&Tag, an adaptation of CUT&Tag that overcomes these hurdles by using antibody-specific barcodes to simultaneously map multiple proteins in the same cells. Highly specific multi-CUT&Tag maps of histone marks and RNA Polymerase II uncovered sites of co-localization in the same cells, active and repressed genes, and candidate cis-regulatory elements. Single-cell multi-CUT&Tag profiling facilitated identification of distinct cell types from a mixed population and characterization of cell type-specific chromatin architecture. In sum, multi-CUT&Tag increases the information content per cell of epigenomic maps, facilitating direct analysis of the interplay of different proteins on chromatin.


2017 ◽  
Vol 114 (25) ◽  
pp. E4914-E4923 ◽  
Author(s):  
Zhana Duren ◽  
Xi Chen ◽  
Rui Jiang ◽  
Yong Wang ◽  
Wing Hung Wong

The rapid increase of genome-wide datasets on gene expression, chromatin states, and transcription factor (TF) binding locations offers an exciting opportunity to interpret the information encoded in genomes and epigenomes. This task can be challenging as it requires joint modeling of context-specific activation of cis-regulatory elements (REs) and the effects on transcription of associated regulatory factors. To meet this challenge, we propose a statistical approach based on paired expression and chromatin accessibility (PECA) data across diverse cellular contexts. In our approach, we model (i) the localization to REs of chromatin regulators (CRs) based on their interaction with sequence-specific TFs, (ii) the activation of REs due to CRs that are localized to them, and (iii) the effect of TFs bound to activated REs on the transcription of target genes (TGs). The transcriptional regulatory network inferred by PECA provides a detailed view of how trans- and cis-regulatory elements work together to affect gene expression in a context-specific manner. We illustrate the feasibility of this approach by analyzing paired expression and accessibility data from the mouse Encyclopedia of DNA Elements (ENCODE) and explore various applications of the resulting model.


2019 ◽  
Vol 47 (W1) ◽  
pp. W142-W150 ◽  
Author(s):  
Selim Kalayci ◽  
Myvizhi Esai Selvan ◽  
Irene Ramos ◽  
Chris Cotsapas ◽  
Eva Harris ◽  
...  

Abstract Humans vary considerably both in their baseline and activated immune phenotypes. We developed a user-friendly open-access web portal, ImmuneRegulation, that enables users to interactively explore immune regulatory elements that drive cell-type or cohort-specific gene expression levels. ImmuneRegulation currently provides the largest centrally integrated resource on human transcriptome regulation across whole blood and blood cell types, including (i) ∼43,000 genotyped individuals with associated gene expression data from ∼51,000 experiments, yielding genetic variant-gene expression associations on ∼220 million eQTLs; (ii) 14 million transcription factor (TF)-binding region hits extracted from 1945 ChIP-seq studies; and (iii) the latest GWAS catalog with 67,230 published variant-trait associations. Users can interactively explore associations between queried gene(s) and their regulators (cis-eQTLs, trans-eQTLs or TFs) across multiple cohorts and studies. These regulators may explain genotype-dependent gene expression variations and be critical in selecting the ideal cohorts or cell types for follow-up studies or in developing predictive models. Overall, ImmuneRegulation significantly lowers the barriers between complex immune regulation data and researchers who want rapid, intuitive and high-quality access to the effects of regulatory elements on gene expression in multiple studies, empowering investigators to translate these rich data into biological insights and clinical applications. It is freely available at https://immuneregulation.mssm.edu.


Author(s):  
Tiit Örd ◽  
Kadri Õunap ◽  
Lindsey Stolze ◽  
Rédouane Aherrahrou ◽  
Valtteri Nurminen ◽  
...  

Rationale: Genome-wide association studies (GWAS) have identified hundreds of loci associated with coronary artery disease (CAD). Many of these loci are enriched in cis-regulatory elements (CREs) but are linked neither to cardiometabolic risk factors nor to candidate causal genes, complicating their functional interpretation. Objective: Single nucleus chromatin accessibility profiling of human atherosclerotic lesions was used to investigate cell type-specific patterns of CREs, to understand transcription factors establishing cell identity and to interpret CAD-relevant, non-coding genetic variation. Methods and Results: We used single nucleus ATAC-seq to generate DNA accessibility maps in >7,000 cells derived from human atherosclerotic lesions. We identified five major lesional cell types including endothelial cells, smooth muscle cells, monocyte/macrophages, NK/T-cells and B-cells and further investigated subtype characteristics of macrophages and smooth muscle cells transitioning into fibromyocytes. We demonstrated that CAD-associated genetic variants are particularly enriched in endothelial and smooth muscle cell-specific open chromatin. Using single cell co-accessibility and cis-eQTL information, we prioritized putative target genes and candidate regulatory elements for ~30% of all known CAD loci. Finally, we performed genome-wide experimental fine-mapping of the CAD GWAS variants using epigenetic QTL analysis in primary human aortic endothelial cells and STARR-Seq massively parallel reporter assay in smooth muscle cells. This analysis identified potential causal SNP(s) and the associated target gene for over 30 CAD loci. We present several examples where the chromatin accessibility and gene expression could be assigned to one cell type, predicting the cell type of action for CAD loci. Conclusions: These findings highlight the potential of applying snATAC-seq to human tissues in revealing relative contributions of distinct cell types to diseases and in identifying genes likely to be influenced by non-coding GWAS variants.


2020 ◽  
Author(s):  
Yating Liu ◽  
Anthony D. Fischer ◽  
Celine L. St. Pierre ◽  
Juan F. Macias-Velasco ◽  
Heather A. Lawson ◽  
...  

Abstract The alteration of gene expression due to variations in the sequences of transcriptional regulatory elements has been a focus of substantial inquiry in humans and model organisms. However, less is known about the extent to which natural variation contributes to post-transcriptional regulation. Allelic Expression Imbalance (AEI) is a classical approach for studying the association of specific haplotypes with relative changes in transcript abundance. Here, we piloted a new TRAP-based approach to associate genetic variation with transcript occupancy on ribosomes in specific cell types, to determine if it will allow examination of Allelic Translation Imbalance (ATI) and Allelic Translation Efficiency Imbalance, using mouse astrocytes in vivo as a test case. We show that most changes of the mRNA levels on ribosomes were reflected in transcript abundance, though ∼1.5% of transcripts have variants that clearly alter loading onto ribosomes orthogonally to transcript levels. These variants were often in conserved residues and altered sequences known to regulate translation such as upstream ORFs, PolyA sites, and predicted miRNA binding sites. Such variants were also common in transcripts showing altered abundance, suggesting some genetic regulation of gene expression may function through post-transcriptional mechanisms. Overall, our work shows that naturally occurring genetic variants can impact ribosome occupancy in astrocytes in vivo and suggests that such mechanisms may also play a role in genetic contributions to disease.


2021 ◽  
Author(s):  
Vinay K Kartha ◽  
Fabiana M Duarte ◽  
Yan Hu ◽  
Sai Ma ◽  
Jennifer G Chew ◽  
...  

Cells require coordinated control over gene expression when responding to environmental stimuli. Here, we apply scATAC-seq and scRNA-seq in resting and stimulated human blood cells. Collectively, we generate ~91,000 single-cell profiles, allowing us to probe the cis-regulatory landscape of immunological response across cell types, stimuli and time. Advancing tools to integrate multi-omic data, we develop FigR, a framework to computationally pair scATAC-seq with scRNA-seq cells, connect distal cis-regulatory elements to genes, and infer gene regulatory networks (GRNs) to identify candidate TF regulators. Utilizing these paired multi-omic data, we define Domains of Regulatory Chromatin (DORCs) of immune stimulation and find that cells alter chromatin accessibility prior to production of gene expression at time scales of minutes. Further, the construction of the stimulation GRN elucidates TF activity at disease-associated DORCs. Overall, FigR enables the elucidation of regulatory interactions across single-cell data, providing new opportunities to understand the function of cells within tissues.


2019 ◽  
Author(s):  
Florian Schmidt ◽  
Alexander Marx ◽  
Marie Hebel ◽  
Martin Wegner ◽  
Nina Baumgarten ◽  
...  

Abstract Understanding the complexity of transcriptional regulation is a major goal of computational biology. Because experimental linkage of regulatory sites to genes is challenging, computational methods considering epigenomics data have been proposed to create tissue-specific regulatory maps. However, we showed that these approaches are not well suited to account for the variations of the regulatory landscape between cell-types. To overcome these drawbacks, we developed a new method called STITCHIT that identifies and links putative regulatory sites to genes. Within STITCHIT, we consider the chromatin accessibility signal of all samples jointly to identify regions exhibiting a signal variation related to the expression of a distinct gene. STITCHIT outperforms previous approaches in various validation experiments and was used with a genome-wide CRISPR-Cas9 screen to prioritize novel doxorubicin-resistance genes and their associated non-coding regulatory regions. We believe that our work paves the way for a more refined understanding of transcriptional regulation at the gene-level.


eLife ◽  
2018 ◽  
Vol 7 ◽  
Author(s):  
Gabriel N Aughey ◽  
Alicia Estacio Gomez ◽  
Jamie Thomson ◽  
Hang Yin ◽  
Tony D Southall

During development, eukaryotic gene expression is coordinated by dynamic changes in chromatin structure. Measurements of accessible chromatin are used extensively to identify genomic regulatory elements. Whilst chromatin landscapes of pluripotent stem cells are well characterised, chromatin accessibility changes in the development of somatic lineages are not well defined. Here we show that cell-specific chromatin accessibility data can be produced via ectopic expression of E. coli Dam methylase in vivo, without the requirement for cell-sorting (CATaDa). We have profiled chromatin accessibility in individual cell-types of Drosophila neural and midgut lineages. Functional cell-type-specific enhancers were identified, as well as novel motifs enriched at different stages of development. Finally, we show global changes in the accessibility of chromatin between stem-cells and their differentiated progeny. Our results demonstrate the dynamic nature of chromatin accessibility in somatic tissues during stem cell differentiation and provide a novel approach to understanding gene regulatory mechanisms underlying development.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Jingxue Xin ◽  
Hui Zhang ◽  
Yaoxi He ◽  
Zhana Duren ◽  
Caijuan Bai ◽  
...  

Abstract High-altitude adaptation of Tibetans represents a remarkable case of natural selection during recent human evolution. Previous genome-wide scans found many non-coding variants under selection, suggesting a pressing need to understand the functional role of non-coding regulatory elements (REs). Here, we generate time courses of paired ATAC-seq and RNA-seq data on cultured HUVECs under hypoxic and normoxic conditions. We further develop a variant interpretation methodology (vPECA) to identify active selected REs (ASREs) and the associated regulatory network. We discover three causal SNPs of EPAS1, the key adaptive gene for Tibetans. These SNPs decrease the accessibility of ASREs with weakened binding strength of relevant TFs, and cooperatively down-regulate EPAS1 expression. We further construct the downstream network of EPAS1, elucidating its roles in hypoxic response and angiogenesis. Collectively, we provide a systematic approach to interpret phenotype-associated noncoding variants in proper cell types and relevant dynamic conditions, and to model their impact on gene regulation.


Science ◽  
2020 ◽  
Vol 370 (6518) ◽  
pp. eaba7612 ◽  
Author(s):  
Silvia Domcke ◽  
Andrew J. Hill ◽  
Riza M. Daza ◽  
Junyue Cao ◽  
Diana R. O’Day ◽  
...  

The chromatin landscape underlying the specification of human cell types is of fundamental interest. We generated human cell atlases of chromatin accessibility and gene expression in fetal tissues. For chromatin accessibility, we devised a three-level combinatorial indexing assay and applied it to 53 samples representing 15 organs, profiling ~800,000 single cells. We leveraged cell types defined by gene expression to annotate these data and cataloged hundreds of thousands of candidate regulatory elements that exhibit cell type–specific chromatin accessibility. We investigated the properties of lineage-specific transcription factors (such as POU2F1 in neurons), organ-specific specializations of broadly distributed cell types (such as blood and endothelial), and cell type–specific enrichments of complex trait heritability. These data represent a rich resource for the exploration of in vivo human gene regulation in diverse tissues and cell types.

