CharPlant: A De Novo Open Chromatin Region (OCR) Prediction Tool for Plant Genomes

2020 ◽  
Author(s):  
Yin Shen ◽  
Ling-Ling Chen ◽  
Junxiang Gao

Abstract: Chromatin accessibility is a highly informative structural feature for understanding gene transcription regulation because it indicates the degree to which nuclear macromolecules such as proteins and RNA can access chromosomal DNA. Studies show that chromatin accessibility is highly dynamic during stress response, stimulus response, and developmental transition. Moreover, physical access to chromosomal DNA in eukaryotes is highly cell-specific. Therefore, current technologies such as DNase-seq, ATAC-seq, and FAIRE-seq reveal only a portion of the open chromatin regions (OCRs) present in a given species. Thus, the genome-wide distribution of OCRs remains unknown. In this study, we developed a bioinformatics tool called CharPlant for the de novo prediction of chromatin-accessible regions in plant genomes. To develop this tool, we constructed a three-layer convolutional neural network (CNN) and subsequently trained the CNN using DNase-seq and ATAC-seq datasets of four plant species. The model simultaneously learns the sequence motifs and the regulatory logic, which are jointly used to determine DNA accessibility. All of these steps are integrated into CharPlant, which can be run using a simple command line. The results of data analysis using CharPlant in this study demonstrate its prediction power and computational efficiency. To our knowledge, CharPlant is the first de novo prediction tool that can identify potential OCRs in the whole genome. The source code of CharPlant and supporting files are freely downloadable from https://github.com/Yin-Shen/CharPlant.
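As a rough illustration of how a CNN can learn sequence motifs from raw DNA, the sketch below one-hot encodes a sequence and slides a single convolution kernel along it. This is not CharPlant's code: the function names `one_hot` and `conv_scan` and the "TATA" kernel are hypothetical, and a trained network would learn its kernel weights from data and stack several such layers.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T).

    Bases outside ACGT (for example N) become an all-zero row.
    """
    rows = []
    for base in seq.upper():
        row = [0.0] * 4
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows

def conv_scan(encoded, kernel):
    """Slide a (width x 4) kernel along the encoded sequence.

    Returns one score per valid start position, as a single
    convolution filter without bias or activation would.
    """
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        score = 0.0
        for offset in range(width):
            for channel in range(4):
                score += encoded[start + offset][channel] * kernel[offset][channel]
        scores.append(score)
    return scores

# Hypothetical kernel: an exact-match detector for "TATA".
# A trained CNN would learn real-valued weights instead.
tata_kernel = one_hot("TATA")
scores = conv_scan(one_hot("GGTATAGG"), tata_kernel)
best_start = scores.index(max(scores))
```

The position with the highest score is where the sequence best matches the kernel; later layers in a CNN combine such per-position scores from many kernels.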

2021 ◽  
Author(s):  
Chenshen Huang ◽  
Ning Wang ◽  
Na Zhang ◽  
Zhizhan Ni ◽  
Xiaohong Liu ◽  
...  

Background: Accumulating evidence suggests that inflammation-related genes may play key roles in tumor immune evasion. Programmed cell death ligand 1 (PD-L1) is an important immune checkpoint involved in mediating antitumor immunity. We performed multi-omics analysis to explore key inflammation-related genes affecting the transcriptional regulation of PD-L1 expression. Methods: The open chromatin region of the PD-L1 promoter was mapped using assay for transposase-accessible chromatin using sequencing (ATAC-seq) profiles. Correlation analysis of epigenetic data (ATAC-seq) and transcriptome data (RNA-seq) was performed to identify inflammation-related transcription factors (TFs) whose expression levels were correlated with the chromatin accessibility of the PD-L1 promoter. Chromatin immunoprecipitation sequencing (ChIP-seq) profiles were used to confirm the physical binding of the TF STAT2 to the predicted binding regions. We also confirmed the results of the bioinformatics analysis with cell experiments. Results: We identified chr9:5449463-5449962 and chr9:5450250-5450749 as reproducible open chromatin regions in the PD-L1 promoter. Moreover, we observed a correlation between STAT2 expression and the accessibility of these regions. Furthermore, we confirmed the physical binding of STAT2 through ChIP-seq profiles and demonstrated the regulation of PD-L1 by STAT2 overexpression in vitro. Multiple databases were also used to validate the results. Conclusion: Our study identified STAT2 as a direct upstream TF regulating PD-L1 expression. The interaction of STAT2 and PD-L1 might be associated with tumor immune evasion in cancers, suggesting potential value for tumor treatment.
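The two regions reported in the Results can be handled programmatically once the `chrom:start-end` strings are parsed. The sketch below is only an illustration: the helper names are hypothetical, and it assumes 1-based, inclusive coordinates, which the abstract does not state.

```python
import re

REGION_RE = re.compile(r"^(chr[\w.]+):(\d+)-(\d+)$")

def parse_region(text):
    """Split 'chrom:start-end' into (chrom, start, end) with integer coordinates."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom = match.group(1)
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start after end in {text!r}")
    return chrom, start, end

def region_length(region):
    """Length in bases. Assumption: coordinates are 1-based and inclusive."""
    _, start, end = region
    return end - start + 1

def gap_between(first, second):
    """Bases strictly between two regions on one chromosome (0 if they touch or overlap)."""
    if first[0] != second[0]:
        raise ValueError("regions are on different chromosomes")
    left, right = sorted([first, second], key=lambda r: r[1])
    return max(0, right[1] - left[2] - 1)

# The two region strings are taken verbatim from the abstract.
region_a = parse_region("chr9:5449463-5449962")
region_b = parse_region("chr9:5450250-5450749")
print(region_length(region_a), region_length(region_b), gap_between(region_a, region_b))
```

Under the inclusive-coordinate assumption both regions have the same length; a half-open convention would shorten each by one base.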


2018 ◽  
Author(s):  
Michal Pawlak ◽  
Katarzyna Z. Kedzierska ◽  
Maciej Migdal ◽  
Karim Abu Nahia ◽  
Jordan A. Ramilowski ◽  
...  

Abstract: The development of an organ involves dynamic regulation of gene transcription and complex multipathway interactions. To better understand the transcriptional regulatory mechanisms driving heart development and the consequences of their disruption, we isolated cardiomyocytes (CMs) from wild-type zebrafish embryos at 24, 48 and 72 hours post fertilization, corresponding to heart looping, chamber formation and heart maturation, and from mutant lines carrying loss-of-function mutations in gata5, tbx5a and hand2, transcription factors (TFs) required for proper heart development. The integration of CM transcriptomics (RNA-seq) and genome-wide chromatin accessibility maps (ATAC-seq) unravelled dynamic regulatory networks driving crucial events of heart development. These networks contained key cardiac TFs including Gata5/6, Nkx2.5, Tbx5/20, and Hand2, and are associated with open chromatin regions enriched for DNA sequence motifs belonging to the family of the corresponding TFs. These networks were disrupted in cardiac TF mutants, indicating their importance in proper heart development. The most prominent gene expression changes, which correlated with chromatin accessibility modifications within their proximal promoter regions, occurred between heart looping and chamber formation, and were associated with metabolic and hematopoietic/cardiac switch during CM maturation. Furthermore, loss of function of cardiac TFs Gata5, Tbx5a, and Hand2 affected the cardiac regulatory networks and caused global changes in the chromatin accessibility profile. Among regions with differential chromatin accessibility in mutants were highly conserved non-coding elements, which represent putative cis-regulatory elements with a potential role in heart development and disease. Altogether, our results revealed the dynamic regulatory landscape at key stages of heart development and identified molecular drivers of heart morphogenesis.


2020 ◽  
Author(s):  
Minjun Park ◽  
Salvi Singh ◽  
Francisco Jose Grisanti Canozo ◽  
Md. Abul Hassan Samee

Abstract: Massively parallel reporter assays (MPRAs) have enabled the study of transcriptional regulatory mechanisms at an unprecedented scale and with high quantitative resolution. However, this realm lacks models that can discover sequence-specific signals de novo from the data and integrate them in a mechanistic way. We present MuSeAM (Multinomial CNNs for Sequence Activity Modeling), a convolutional neural network that fills this gap. MuSeAM utilizes multinomial convolutions that directly model sequence-specific motifs of protein-DNA binding. We demonstrate that MuSeAM fits MPRA data with high accuracy and generalizes over other tasks such as predicting chromatin accessibility and prioritizing potentially functional variants.


2019 ◽  
Author(s):  
Eirene Markenscoff-Papadimitriou ◽  
Sean Whalen ◽  
Pawel Przytycki ◽  
Reuben Thomas ◽  
Fadya Binyameen ◽  
...  

Abstract: Gene expression differs between cell types and regions within complex tissues such as the developing brain. To discover regulatory elements underlying this specificity, we generated genome-wide maps of chromatin accessibility in eleven anatomically-defined regions of the developing human telencephalon, including upper and deep layers of the prefrontal cortex. We predicted a subset of open chromatin regions (18%) that are most likely to be active enhancers, many of which are dynamic, with 26% differing between early and late mid-gestation and 28% present in only one brain region. These region-specific predicted regulatory elements (pREs) are enriched proximal to genes with expression differences across regions and developmental stages and harbor distinct sequence motifs that suggest potential upstream regulators of regional and temporal transcription. We leverage this atlas to identify regulators of genes associated with autism spectrum disorder (ASD) including an enhancer of BCL11A, validated in mouse, and two functional de novo mutations in individuals with ASD in an enhancer of SLC6A1, validated in neuroblastoma cells. These applications demonstrate the utility of this atlas for decoding neurodevelopmental gene regulation in health and disease.

Summary: To discover regulatory elements driving the specificity of gene expression in different cell types and regions of the developing human brain, we generated an atlas of open chromatin from eleven dissected regions of the mid-gestation human telencephalon, including upper and deep layers of the prefrontal cortex. We identified a subset of open chromatin regions (OCRs), termed predicted regulatory elements (pREs), that are likely to function as developmental brain enhancers. pREs showed regional differences in chromatin accessibility, including many specific to one brain region, and were correlated with gene expression differences across the same regions and gestational ages. pREs allowed us to map neurodevelopmental disorder risk genes to developing telencephalic regions, and we identified three functional de novo noncoding variants in pREs that alter enhancer function. In addition, transgenic experiments in mouse validated enhancer activity for a pRE proximal to BCL11A, showing how this atlas serves as a resource for decoding neurodevelopmental gene regulation in health and disease.


2017 ◽  
Author(s):  
Alicia N. Schep ◽  
Beijing Wu ◽  
Jason D. Buenrostro ◽  
William J. Greenleaf

Abstract: Single-cell ATAC-seq (scATAC-seq) yields sparse data that make conventional computational approaches to data analysis challenging or impossible to apply. We developed chromVAR, an R package for analyzing sparse chromatin accessibility data by estimating the gain or loss of accessibility within sets of peaks sharing the same motif or annotation while controlling for known technical biases. chromVAR enables accurate clustering of scATAC-seq profiles and the characterization of known, or the de novo identification of novel, sequence motifs associated with variation in chromatin accessibility across single cells or other sparse epigenomic data sets.
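The idea of "gain or loss of accessibility within sets of peaks" can be sketched as an observed-versus-expected comparison per cell. The Python below is a simplified illustration, not chromVAR's code or API (chromVAR is an R package): the function name `raw_deviations` and the toy count matrix are hypothetical, and the control for technical biases mentioned in the abstract is omitted.

```python
def raw_deviations(counts, peak_set):
    """Per-cell relative gain or loss of accessibility in a peak set.

    counts[p][c] is the read count in peak p for cell c.
    peak_set lists the indices of peaks sharing a motif or annotation.
    The expected count assumes every cell spreads its reads across
    peaks in the same proportions as the pooled population.
    """
    n_cells = len(counts[0])
    total = sum(sum(row) for row in counts)
    # Fraction of all reads (pooled over cells) that fall in each peak.
    peak_fraction = [sum(row) / total for row in counts]
    # Total reads per cell (sequencing depth).
    cell_depth = [sum(row[c] for row in counts) for c in range(n_cells)]
    deviations = []
    for c in range(n_cells):
        observed = sum(counts[p][c] for p in peak_set)
        expected = sum(peak_fraction[p] * cell_depth[c] for p in peak_set)
        deviations.append((observed - expected) / expected)
    return deviations

# Toy matrix (4 peaks x 2 cells), made up purely for illustration.
toy_counts = [
    [4, 0],
    [2, 2],
    [0, 4],
    [2, 2],
]
motif_peaks = [0, 1]  # hypothetical peaks sharing one motif
print(raw_deviations(toy_counts, motif_peaks))
```

A positive value means the cell has more reads in the peak set than its depth alone would predict, and a negative value means fewer. Pooling reads over a whole peak set is what makes the estimate usable when individual peaks are mostly zero.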


2020 ◽  
Vol 22 (Supplement_2) ◽  
pp. ii76-ii76
Author(s):  
Radhika Mathur ◽  
Sriranga Iyyanki ◽  
Stephanie Hilz ◽  
Chibo Hong ◽  
Joanna Phillips ◽  
...  

Abstract: Treatment failure in glioblastoma is often attributed to intratumoral heterogeneity (ITH), which fosters tumor evolution and generation of therapy-resistant clones. While ITH in glioblastoma has been well-characterized at the genomic and transcriptomic levels, the extent of ITH at the epigenomic level and its biological and clinical significance are not well understood. In collaboration with neurosurgeons, neuropathologists, and biomedical imaging experts, we have established a novel topographical approach towards characterizing epigenomic ITH in three-dimensional (3-D) space. We utilize pre-operative MRI scans to define tumor volume and then utilize 3-D surgical neuro-navigation to intra-operatively acquire 10+ samples representing maximal anatomical diversity. The precise spatial location of each sample is mapped by 3-D coordinates, enabling tumors to be visualized in 360 degrees and providing unprecedented insight into their spatial organization and patterning. For each sample, we conduct assay for transposase-accessible chromatin using sequencing (ATAC-Seq), which provides information on the genomic locations of open chromatin, DNA-binding proteins, and individual nucleosomes at nucleotide resolution. We additionally conduct whole-exome sequencing and RNA sequencing for each spatially mapped sample. Integrative analysis of these datasets reveals distinct patterns of chromatin accessibility within glioblastoma tumors, as well as their associations with genetically defined clonal expansions. Our analysis further reveals how differences in chromatin accessibility within tumors reflect underlying transcription factor activity at gene regulatory elements, including both promoters and enhancers, and drive expression of particular gene sets, including neuronal and immune programs. Collectively, this work provides the most comprehensive characterization of epigenomic ITH to date, establishing its importance for driving tumor evolution and therapy resistance in glioblastoma. As a resource for further investigation, we have provided our datasets on an interactive data sharing platform, The 3D Glioma Atlas, that enables 360-degree visualization of both genomic and epigenomic ITH.


2020 ◽  
Vol 94 ◽  
Author(s):  
D. Babaran ◽  
M.T. Arts ◽  
R.J. Botelho ◽  
S.A. Locke ◽  
J. Koprivnikar

Abstract: The free-living infectious stages of macroparasites, specifically, the cercariae of trematodes (flatworms), are likely to be significant (albeit underappreciated) vectors of nutritionally important polyunsaturated fatty acids (PUFA) to consumers within aquatic food webs, and other macroparasites could serve similar roles. In the context of de novo omega-3 (n-3) PUFA biosynthesis, it was thought that most animals lack the fatty acid (FA) desaturase enzymes that convert stearic acid (18:0) into α-linolenic acid (ALA; 18:3n-3), the main FA precursor for n-3 long-chain PUFA. Recently, novel sequences of these enzymes were recovered from 80 species from six invertebrate phyla, with experimental confirmation of gene function in five phyla. Given this wide distribution, and the unusual attributes of flatworm genomes, we conducted an additional search for genes for de novo n-3 PUFA in the phylum Platyhelminthes. Searches with experimentally confirmed sequences from Rotifera recovered nine relevant FA desaturase sequences from eight species in four genera in the two exclusively endoparasite classes (Trematoda and Cestoda). These results could indicate adaptations of these particular parasite species, or may reflect the uneven taxonomic coverage of sequence databases. Although additional genomic data and, particularly, experimental study of gene functionality are important future validation steps, our results indicate endoparasitic platyhelminths may have enzymes for de novo n-3 PUFA biosynthesis, thereby contributing to global PUFA production, but also representing a potential target for clinical antihelmintic applications.


2021 ◽  
Author(s):  
Eleonora Forte ◽  
Fatma Ayaloglu Butun ◽  
Christian Marinaccio ◽  
Matthew J. Schipma ◽  
Andrea Piunti ◽  
...  

HCMV establishes latency in myeloid cells. Using the Kasumi-3 latency model, we previously showed that lytic gene expression is activated prior to establishment of latency in these cells. The early events in infection may have a critical role in shaping establishment of latency. Here, we have used an integrative multi-omics approach to investigate dynamic changes in host and HCMV gene expression and epigenomes at early times post infection. Our results show dynamic changes in viral gene expression and viral chromatin. Analyses of Pol II, H3K27Ac and H3K27me3 occupancy of the viral genome showed that 1) Pol II occupancy was highest at the MIEP at 4 hours post infection (hpi), although it was observed throughout the genome; 2) at 24 hpi, H3K27Ac was localized to the major immediate early promoter/enhancer and to a possible second enhancer in the origin of replication OriLyt; and 3) viral chromatin was broadly accessible at 24 hpi. In addition, although HCMV infection activated expression of some host genes, we observed an overall loss of de novo transcription. This was associated with loss of promoter-proximal Pol II and H3K27Ac, but not with changes in chromatin accessibility or a switch in modification of H3K27. Importance: HCMV is an important human pathogen in immunocompromised hosts and developing fetuses. Current anti-viral therapies are limited by toxicity and emergence of resistant strains. Our studies highlight emerging concepts that challenge current paradigms of regulation of HCMV gene expression in myeloid cells. In addition, our studies show that HCMV has a profound effect on de novo transcription and the cellular epigenome. These results may have implications for mechanisms of viral pathogenesis.


2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Karen L. Leung ◽  
Smriti Sanchita ◽  
Catherine T. Pham ◽  
Brett A. Davis ◽  
Mariam Okhovat ◽  
...  

Background: Normal-weight polycystic ovary syndrome (PCOS) women exhibit adipose resistance in vivo accompanied by enhanced subcutaneous (SC) abdominal adipose stem cell (ASC) development to adipocytes with accelerated lipid accumulation per cell in vitro. The present study examines chromatin accessibility, RNA expression and fatty acid (FA) synthesis during SC abdominal ASC differentiation into adipocytes in vitro of normal-weight PCOS versus age- and body mass index-matched normoandrogenic ovulatory (control) women to study epigenetic/genetic characteristics as well as functional alterations of PCOS and control ASCs during adipogenesis. Results: SC abdominal ASCs from PCOS women versus controls exhibited dynamic chromatin accessibility during adipogenesis, from significantly less chromatin accessibility at day 0 to greater chromatin accessibility by day 12, with enrichment of binding motifs for transcription factors (TFs) of the AP-1 subfamily at days 0, 3, and 12. In PCOS versus control cells, expression of genes governing adipocyte differentiation (PPARγ, CEBPα, AGPAT2) and function (ADIPOQ, FABP4, LPL, PLIN1, SLC2A4) was increased two- to sixfold at days 3, 7, and 12, while that involving Wnt signaling (FZD1, SFRP1, and WNT10B) was decreased. Differential gene expression in PCOS cells at these time points involved triacylglycerol synthesis, lipid oxidation, free fatty acid beta-oxidation, and oxidative phosphorylation of the TCA cycle, with TGFB1 as a significant upstream regulator. There was a broad correspondence between increased chromatin accessibility and increased RNA expression of those 12 genes involved in adipocyte differentiation and function, Wnt signaling, as well as genes involved in the triacylglycerol synthesis functional group at day 12 of adipogenesis. Total content and de novo synthesis of myristic (C14:0), palmitic (C16:0), palmitoleic (C16:1), and oleic (C18:1) acid increased from day 7 to day 12 in all cells, with total content and de novo synthesis of FAs significantly greater in PCOS than control cells at day 12. Conclusions: In normal-weight PCOS women, dynamic chromatin remodeling of SC abdominal ASCs during adipogenesis may enhance adipogenic gene expression as a programmed mechanism to promote greater fat storage.

