Predicting chromatin interactions between open chromatin regions from DNA sequences

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is very limited. Various computational methods have been developed to predict chromatin interactions. Most of these methods rely on large collections of ChIP-Seq/RNA-Seq/DNase-Seq datasets and predict only enhancer-promoter interactions. Some of the ‘state-of-the-art’ methods have poor experimental designs, leading to exaggerated performance and misleading conclusions. Here we developed a computational method, Chromatin Interaction Neural Network (CHINN), to predict chromatin interactions between open chromatin regions by using only DNA sequences of the interacting open chromatin regions. CHINN is able to predict CTCF- and RNA polymerase II-associated chromatin interactions between open chromatin regions. CHINN also shows good across-sample performance and captures various sequence features that are predictive of chromatin interactions. We applied CHINN to 84 chronic lymphocytic leukemia (CLL) samples and detected systematic differences in the chromatin interactome between IGVH-mutated and IGVH-unmutated CLL samples.
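
The abstract above describes prediction from the DNA sequences of the two interacting regions alone. As a minimal sketch of what a sequence-only input could look like, the snippet below one-hot encodes a pair of candidate regions. The encoding scheme and the function names `one_hot` and `encode_pair` are assumptions made for illustration; the abstract does not describe the authors' actual preprocessing.

```python
# Hypothetical illustration only: not the authors' preprocessing code.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows ordered A, C, G, T.

    Bases outside ACGT (for example N) become all-zero rows.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def encode_pair(region_a, region_b):
    """Encode the sequences of two candidate interacting regions."""
    return one_hot(region_a), one_hot(region_b)


left, right = encode_pair("ACGTN", "ttga")
```

A downstream model would take both encoded regions as input and output an interaction score; that part is deliberately left out because its architecture is not given here.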

Chromatin interaction neural network (ChINN): a machine learning-based method for predicting chromatin interactions from DNA sequences

Genome Biology ◽

10.1186/s13059-021-02453-5 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Fan Cao ◽

Yu Zhang ◽

Yichao Cai ◽

Sambhavi Animesh ◽

Ying Zhang ◽

...

Keyword(s):

Neural Network ◽

RNA Polymerase II ◽

DNA Sequences ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Interaction Prediction ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is limited. We develop a computational method, chromatin interaction neural network (ChINN), to predict chromatin interactions between open chromatin regions using only DNA sequences. ChINN predicts CTCF- and RNA polymerase II-associated and Hi-C chromatin interactions. ChINN shows good across-sample performances and captures various sequence features for chromatin interaction prediction. We apply ChINN to 6 chronic lymphocytic leukemia (CLL) patient samples and a published cohort of 84 CLL open chromatin samples. Our results demonstrate extensive heterogeneity in chromatin interactions among CLL patient samples.

Chromatin Interaction Neural Network (ChINN): A machine learning-based method for predicting chromatin interactions from DNA sequences

10.1101/2020.12.30.424817 ◽

2020 ◽

Author(s):

Fan Cao ◽

Yu Zhang ◽

Yichao Cai ◽

Sambhavi Animesh ◽

Ying Zhang ◽

...

Keyword(s):

Neural Network ◽

RNA Polymerase II ◽

DNA Sequences ◽

Lymphocytic Leukemia ◽

Computational Method ◽

Chromatin Interaction ◽

Open Chromatin ◽

Clinical Patient ◽

Chromatin Interactions ◽

Genome Wide

Abstract: Chromatin interactions play important roles in regulating gene expression. However, the availability of genome-wide chromatin interaction data is limited. Various computational methods have been developed to predict chromatin interactions. Most of these methods rely on large collections of ChIP-Seq/RNA-Seq/DNase-Seq datasets and predict only enhancer-promoter interactions. Some of the ‘state-of-the-art’ methods have poor experimental designs, leading to exaggerated performance and misleading conclusions. Here we developed a computational method, Chromatin Interaction Neural Network (ChINN), to predict chromatin interactions between open chromatin regions by using only DNA sequences of the interacting open chromatin regions. ChINN is able to predict CTCF-, RNA polymerase II- and HiC-associated chromatin interactions between open chromatin regions. ChINN also shows good across-sample performance and captures various sequence features that are predictive of chromatin interactions. To apply our results to clinical patient data, we applied ChINN to predict chromatin interactions in 6 chronic lymphocytic leukemia (CLL) patient samples and a previously published cohort of open chromatin data from 84 CLL samples. Our results demonstrated extensive heterogeneity in chromatin interactions in patient samples, and one of the sources of this heterogeneity was the different subtypes of CLL.

An integrative approach for fine-mapping chromatin interactions

Bioinformatics ◽

10.1093/bioinformatics/btz843 ◽

2019 ◽

Vol 36 (6) ◽

pp. 1704-1711

Author(s):

Artur Jaroszewicz ◽

Jason Ernst

Keyword(s):

Gene Regulation ◽

High Resolution ◽

Biological Significance ◽

Computational Method ◽

Supplementary Information ◽

Integrative Approach ◽

Genome Architecture ◽

Open Chromatin ◽

Chromatin Interactions ◽

Genome Wide

Abstract. Motivation: Chromatin interactions play an important role in genome architecture and gene regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g. 5-25 kb), which is substantially coarser than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions. Results: To predict the sources of Hi-C-identified interactions at a high resolution (e.g. 100 bp), we developed a computational method that integrates data from DNase-seq and ChIP-seq of TFs and histone marks. Our method, χ-CNN, uses this data to first train a convolutional neural network (CNN) to discriminate between called Hi-C interactions and non-interactions. χ-CNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show these predictions recover original Hi-C peaks after extending them to be coarser. We also show χ-CNN predictions enrich for evolutionarily conserved bases, eQTLs and CTCF motifs, supporting their biological significance. χ-CNN provides an approach for analyzing important aspects of genome architecture and gene regulation at a higher resolution than previously possible. Availability and implementation: χ-CNN software is available on GitHub (https://github.com/ernstlab/X-CNN). Supplementary information: Supplementary data are available at Bioinformatics online.

An Integrative Approach for Fine-Mapping Chromatin Interactions

10.1101/605576 ◽

2019 ◽

Author(s):

Artur Jaroszewicz ◽

Jason Ernst

Keyword(s):

High Resolution ◽

Binding Sites ◽

Biological Significance ◽

Computational Method ◽

Integrative Approach ◽

Genome Architecture ◽

Open Chromatin ◽

Chromatin Interactions ◽

Genome Wide ◽

Evolutionarily Conserved

Abstract: Chromatin interactions play an important role in genome architecture and regulation. The Hi-C assay generates such interaction maps genome-wide, but at relatively low resolutions (e.g., 5-25 kb), which is substantially larger than the resolution of transcription factor binding sites or open chromatin sites that are potential sources of such interactions. To predict the sources of Hi-C identified interactions at a high resolution (e.g., 100 bp), we developed a computational method that integrates ChIP-seq data of transcription factors and histone marks and DNase-seq data. Our method, χ-SCNN, uses this data to first train a Siamese Convolutional Neural Network (SCNN) to discriminate between called Hi-C interactions and non-interactions. χ-SCNN then predicts the high-resolution source of each Hi-C interaction using a feature attribution method. We show these predictions recover original Hi-C peaks after extending them to be coarser. We also show χ-SCNN predictions enrich for evolutionarily conserved bases, eQTLs, and CTCF motifs, supporting their biological significance. χ-SCNN provides an approach for analyzing important aspects of genome architecture and regulation at a higher resolution than previously possible. χ-SCNN software is available on GitHub (https://github.com/ernstlab/X-SCNN).

Distinct regulation of hippocampal neuroplasticity and ciliary genes by corticosteroid receptors

Nature Communications ◽

10.1038/s41467-021-24967-z ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Karen R. Mifsud ◽

Clare L. M. Kennedy ◽

Silvia Salatino ◽

Eshita Sharma ◽

Emily M. Price ◽

...

Keyword(s):

DNA Sequences ◽

Glucocorticoid Receptors ◽

Acute Stress ◽

Circadian Variation ◽

RNA Seq ◽

Physiological Regulation ◽

Behavioural Adaptation ◽

Neuronal Progenitor ◽

Genome Wide ◽

Transcriptional Changes

Abstract: Glucocorticoid hormones (GCs), acting through hippocampal mineralocorticoid receptors (MRs) and glucocorticoid receptors (GRs), are critical to physiological regulation and behavioural adaptation. We conducted genome-wide MR and GR ChIP-seq and Ribo-Zero RNA-seq studies on rat hippocampus to elucidate MR- and GR-regulated genes under circadian variation or acute stress. In a subset of genes, these physiological conditions resulted in enhanced MR and/or GR binding to DNA sequences and associated transcriptional changes. Binding of MR at a substantial number of sites, however, remained unchanged. MR and GR binding occur at overlapping as well as distinct loci. Moreover, although the GC response element (GRE) was the predominant motif, the transcription factor recognition site composition within MR and GR binding peaks shows marked differences. Pathway analysis uncovered that MR and GR regulate a substantial number of genes involved in synaptic/neuro-plasticity, cell morphology and development, behavior, and neuropsychiatric disorders. We find that MR, not GR, is the predominant receptor binding to >50 ciliary genes; and that MR function is linked to neuronal differentiation and ciliogenesis in human fetal neuronal progenitor cells. These results show that hippocampal MRs and GRs constitutively and dynamically regulate genomic activities underpinning neuronal plasticity and behavioral adaptation to changing environments.

HiCRes: a computational method to estimate and predict the resolution of HiC libraries

10.1101/2020.09.22.307967 ◽

2020 ◽

Author(s):

Claire Marchal ◽

Nivedita Singh ◽

Ximena Corso-Díaz ◽

Anand Swaroop

Keyword(s):

Expression Patterns ◽

Three Dimensional ◽

Computational Method ◽

Mathematical Concepts ◽

Regulate Gene Expression ◽

Chromatin Interactions ◽

Genome Wide ◽

A Cell ◽

Cell Type Specific ◽

Human And Mouse

Abstract: Three-dimensional (3D) conformation of the chromatin is crucial to stringently regulate gene expression patterns and DNA replication in a cell-type specific manner. HiC is a key technique for measuring 3D chromatin interactions genome wide. Estimating and predicting the resolution of a library is an essential step in any HiC experimental design. Here, we present the mathematical concepts to estimate the resolution of a library and predict whether deeper sequencing would enhance the resolution. We have developed HiCRes, a docker pipeline, by applying these concepts to human and mouse HiC libraries.
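
The abstract above does not spell out how resolution is defined. As a rough sketch only, one commonly used criterion takes the resolution of a Hi-C map to be the smallest bin size at which at least 80% of bins contain at least 1,000 contacts. The code below applies that criterion to a toy vector of per-bin contact counts. The thresholds, the function names, and the rebinning strategy are assumptions for illustration and are not taken from the HiCRes pipeline.

```python
# Hypothetical illustration of a coverage-based resolution criterion.
# The thresholds (1000 contacts, 80% of bins) are assumed defaults.


def fraction_covered(bin_counts, min_contacts=1000):
    """Fraction of bins holding at least `min_contacts` contacts."""
    if not bin_counts:
        return 0.0
    covered = sum(1 for count in bin_counts if count >= min_contacts)
    return covered / len(bin_counts)


def rebin(counts, factor):
    """Merge every `factor` consecutive bins into one coarser bin."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


def estimate_resolution(base_counts, base_bin_size, factors,
                        min_contacts=1000, min_fraction=0.8):
    """Return the smallest tested bin size meeting the criterion, or None."""
    for factor in sorted(factors):
        merged = rebin(base_counts, factor)
        if fraction_covered(merged, min_contacts) >= min_fraction:
            return base_bin_size * factor
    return None


# Toy per-bin contact counts at 1 kb bins.
counts = [600, 500, 700, 400, 900, 300, 100, 50]
resolution = estimate_resolution(counts, 1000, [1, 2, 4])
```

In this toy case no 1 kb bin reaches 1,000 contacts, 2 kb bins reach the threshold in 3 of 4 bins (75%), and 4 kb bins reach it in all bins, so `resolution` is 4000.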

Genome-wide profiling of histone 3 lysine 36 trimethylation in clear cell renal cell carcinoma.

Journal of Clinical Oncology ◽

10.1200/jco.2014.32.4_suppl.464 ◽

2014 ◽

Vol 32 (4_suppl) ◽

pp. 464-464

Author(s):

Thai Huu Ho ◽

Jeong-Heon Lee ◽

Rafael Nunez Nateras ◽

Erik P. Castle ◽

Melissa L. Stanton ◽

...

Keyword(s):

Cell Lines ◽

DNA Sequences ◽

Open Chromatin ◽

Sequencing Analysis ◽

Gene Desert ◽

Cell Renal Cell Carcinoma ◽

ChIP Sequencing ◽

Genome Wide ◽

A Genome ◽

Genomic Regions

Background: Although the von Hippel-Lindau (VHL) tumor suppressor gene is mutated in 60% of ccRCC, deletion of VHL in mice is insufficient for tumorigenesis. Sequencing of ccRCC tumors identified mutations in SETD2, a histone H3 lysine 36 (H3K36) trimethyltransferase. We hypothesize that loss of SETD2 methyltransferase activity alters the genome-wide pattern of H3K36 trimethylation (H3K36me3) in ccRCC, and contributes to the cancer phenotype. Methods: To generate a genome-wide profile of H3K36me3 in frozen nephrectomy samples and RCC cell lines, we optimized a chromatin immunoprecipitation (ChIP) protocol for the isolation of DNA associated with H3K36me3. H3K36me3 is associated with open chromatin, and an H3K36me3-specific antibody was used for immunoprecipitation of endogenous H3K36me3-bound DNA. ChIP PCR primers were optimized for active genes, such as actin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), and for a “gene desert” on chromosome 12 (negative control). ChIP libraries were then generated from 3 paired uninvolved kidney and RCC samples and 2 RCC cell lines. In order to identify H3K36me3 upregulated regions in uninvolved kidney and RCC, reads from the ChIP sequencing were mapped to the human genome using Burrows-Wheeler Aligner and SICER algorithms. Results: Using ChIP PCR, we found that active genomic regions were enriched 15-30 fold over the negative controls, indicating that the quality and yield of immunoprecipitated DNA/chromatin complexes from frozen tissue was sufficient for ChIP sequencing. A preliminary ChIP sequencing analysis of RCC cell lines and frozen ccRCC tissue indicates that H3K36me3 enriched DNA sequences were mapped to exons (31.3%) compared to introns (13.5%, p<0.001), consistent with the role of H3K36me3 in transcription. Conclusions: Genomic regions enriched for H3K36me3 binding were identified from patient-derived tissue and RCC cell lines. Current efforts are focused on comparing the H3K36me3 profiles between matched tumor and uninvolved kidney ChIP libraries to generate a genome-wide map of dysregulated H3K36me3 modifications.
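
The abstract reports 15-30 fold enrichment of active regions over the negative control but does not say how the fold values were computed. As a hedged sketch, if the ChIP PCR readout were quantitative PCR cycle thresholds (Ct), a common simplification is to compute fold enrichment as 2 raised to the Ct difference between the negative-control and target amplicons, assuming perfect doubling per cycle. The function name and the Ct values below are hypothetical and are not data from the study.

```python
# Hypothetical illustration: assumes qPCR Ct readouts and 100% amplification
# efficiency (one doubling per cycle). Not the authors' stated method.


def fold_enrichment(ct_target, ct_negative):
    """Fold enrichment of a target amplicon over a negative-control amplicon."""
    return 2.0 ** (ct_negative - ct_target)


# Made-up Ct values: the target amplifies 4 cycles earlier than the control.
example_fold = fold_enrichment(ct_target=24.0, ct_negative=28.0)
```

Under these assumptions a 4-cycle difference corresponds to a 16-fold enrichment, so the reported 15-30 fold range would correspond to roughly 3.9 to 4.9 cycles.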

RUNX1-mediated alphaherpesvirus-host trans-species chromatin interaction promotes viral transcription

Science Advances ◽

10.1126/sciadv.abf8962 ◽

2021 ◽

Vol 7 (26) ◽

pp. eabf8962

Author(s):

Ke Xiao ◽

Dan Xiong ◽

Gong Chen ◽

Jinsong Yu ◽

Yue Li ◽

...

Keyword(s):

Pseudorabies Virus ◽

Host Cells ◽

Chromatin Interaction ◽

Open Chromatin ◽

Chromosome Conformation ◽

DNA Viruses ◽

Genome Wide ◽

A Genome ◽

Genome Transcription ◽

Active Transcription

Like most DNA viruses, herpesviruses precisely deliver their genomes into the sophisticatedly organized nuclei of the infected host cells to initiate subsequent transcription and replication. However, it remains elusive how the viral genome specifically interacts with the host genome and hijacks host transcription machinery. Using pseudorabies virus (PRV) as model virus, we performed chromosome conformation capture assays to demonstrate a genome-wide specific trans-species chromatin interaction between the virus and host. Our data show that the PRV genome is delivered by the host DNA binding protein RUNX1 into the open chromatin and active transcription zone. This facilitates virus hijacking host RNAPII to efficiently transcribe viral genes, which is significantly inhibited by either a RUNX1 inhibitor or RNA interference. Together, these findings provide insights into the chromatin interaction between viral and host genomes and identify new areas of research to advance the understanding of herpesvirus genome transcription.

DiNeR: a Differential Graphical Model for analysis of co-regulation Network Rewiring

10.1101/2020.05.29.124164 ◽

2020 ◽

Author(s):

Jing Zhang ◽

Jason Liu ◽

Donghoon Lee ◽

Shaoke Lou ◽

Zhanlin Chen ◽

...

Keyword(s):

RNA Polymerase II ◽

Regulatory Networks ◽

Graphical Model ◽

Computational Method ◽

Genome Wide ◽

GM12878 Cell ◽

Differential Network ◽

Regulation Network ◽

Gene Expression Alterations ◽

Network Rewiring

Abstract. Background: During transcription, numerous transcription factors (TFs) bind to targets in a highly coordinated manner to control gene expression. Alterations in groups of TF-binding profiles (i.e. “co-binding changes”) can affect the co-regulating associations between TFs (i.e. “rewiring the co-regulator network”). This, in turn, can potentially drive downstream expression changes, phenotypic variation, and even disease. However, quantification of co-regulatory network rewiring has not been comprehensively studied. Methods: To address this, we propose DiNeR, a computational method to directly construct a differential TF co-regulation network from paired disease-to-normal ChIP-seq data. Specifically, DiNeR uses a graphical model to capture the gained and lost edges in the co-regulation network. Then, it adopts a stability-based, sparsity-tuning criterion (sub-sampling the complete binding profiles to remove spurious edges) to report only significant co-regulation alterations. Finally, DiNeR highlights hubs in the resultant differential network as key TFs associated with disease. Results: We assembled genome-wide binding profiles of 104 TFs in the K562 and GM12878 cell lines, which loosely model the transition between normal and cancerous states in chronic myeloid leukemia (CML). In total, we identified 351 significantly altered TF co-regulation pairs. In particular, we found that the co-binding of the tumor suppressor BRCA1 and RNA polymerase II, a well-known transcriptional pair in healthy cells, was disrupted in tumors. Thus, DiNeR successfully extracted hub regulators and discovered well-known risk genes. Conclusions: Our method DiNeR makes it possible to quantify changes in co-regulatory networks and identify alterations to TF co-binding patterns, highlighting key disease regulators.

ChIAMM: A Mixture Model for Statistical Analysis of Long-Range Chromatin Interactions From ChIA-PET Experiments

Frontiers in Genetics ◽

10.3389/fgene.2020.616160 ◽

2020 ◽

Vol 11 ◽

Author(s):

Yibeltal Arega ◽

Hao Jiang ◽

Shuangqi Wang ◽

Jingwen Zhang ◽

Xiaohui Niu ◽

...

Keyword(s):

Mixture Model ◽

Interaction Analysis ◽

GC Content ◽

Specific Protein ◽

Chromatin Interaction ◽

New Approach ◽

Chromatin Interactions ◽

Genome Wide ◽

Systematic Biases ◽

Local Enrichment

Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is an important experimental method for detecting specific protein-mediated chromatin loops genome-wide at high resolution. Here, we proposed a new statistical approach with a mixture model, chromatin interaction analysis using mixture model (ChIAMM), to detect significant chromatin interactions from ChIA-PET data. The statistical model is cast into a Bayesian framework to account for more systematic biases: the genomic distance, local enrichment, mappability, and GC content. Using different ChIA-PET datasets, we evaluated the performance of ChIAMM and compared it with existing methods, including ChIA-PET Tool, ChiaSig, Mango, ChIA-PET2, and ChIAPoP. The results showed that the new approach performed better than most of the top existing methods in detecting significant chromatin interactions in ChIA-PET experiments.
