ChIA-PIPE: A fully automated pipeline for ChIA-PET data analysis and visualization

2018 ◽  
Author(s):  
Daniel Capurso ◽  
Jiahui Wang ◽  
Simon Zhongyuan Tian ◽  
Liuyang Cai ◽  
Sandeep Namburi ◽  
...  

Abstract: ChIA-PET enables the genome-wide discovery of chromatin interactions involving specific protein factors, with base-pair resolution. Interpreting ChIA-PET data depends on having a robust analytic pipeline. Here, we introduce ChIA-PIPE, a fully automated pipeline for ChIA-PET data processing, quality assessment, analysis, and visualization. ChIA-PIPE performs linker filtering, read mapping, peak calling, loop calling, and chromatin-contact-domain calling, and can resolve allele-specific peaks and loops. ChIA-PIPE also automates quality-control assessment for each dataset. Furthermore, ChIA-PIPE generates input files for visualizing 2D contact maps with Juicebox and HiGlass, and provides a new dockerized visualization tool for high-resolution, browser-based exploration of peaks and loops. With minimal adjustment, ChIA-PIPE can also be adapted for the analysis of other related chromatin-mapping data.

2020 ◽  
Vol 6 (28) ◽  
pp. eaay2078 ◽  
Author(s):  
Byoungkoo Lee ◽  
Jiahui Wang ◽  
Liuyang Cai ◽  
Minji Kim ◽  
Sandeep Namburi ◽  
...  

ChIA-PET (chromatin interaction analysis with paired-end tags) enables genome-wide discovery of chromatin interactions involving specific protein factors, with base pair resolution. Interpretation of ChIA-PET data requires a robust analytic pipeline. Here, we introduce ChIA-PIPE, a fully automated pipeline for ChIA-PET data processing, quality assessment, visualization, and analysis. ChIA-PIPE performs linker filtering, read mapping, peak calling, and loop calling and automates quality control assessment for each dataset. To enable visualization, ChIA-PIPE generates input files for two-dimensional contact map viewing with Juicebox and HiGlass and provides a new dockerized visualization tool for high-resolution, browser-based exploration of peaks and loops. To enable structural interpretation, ChIA-PIPE calls chromatin contact domains, resolves allele-specific peaks and loops, and annotates enhancer-promoter loops. ChIA-PIPE also supports the analysis of other related chromatin-mapping data types.


2020 ◽  
Vol 11 ◽  
Author(s):  
Yibeltal Arega ◽  
Hao Jiang ◽  
Shuangqi Wang ◽  
Jingwen Zhang ◽  
Xiaohui Niu ◽  
...  

Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) is an important experimental method for detecting specific protein-mediated chromatin loops genome-wide at high resolution. Here, we proposed a new statistical approach with a mixture model, chromatin interaction analysis using mixture model (ChIAMM), to detect significant chromatin interactions from ChIA-PET data. The statistical model is cast into a Bayesian framework to account for additional systematic biases: genomic distance, local enrichment, mappability, and GC content. Using different ChIA-PET datasets, we evaluated the performance of ChIAMM and compared it with existing methods, including ChIA-PET Tool, ChiaSig, Mango, ChIA-PET2, and ChIAPoP. The results showed that the new approach performed better than most of the leading existing methods in detecting significant chromatin interactions in ChIA-PET experiments.


2020 ◽  
Vol 48 (21) ◽  
pp. e123-e123
Author(s):  
Tiantian Ye ◽  
Wenxiu Ma

Abstract: The recently developed Hi-C technique has been widely applied to map genome-wide chromatin interactions. However, current methods for analyzing diploid Hi-C data cannot fully distinguish between homologous chromosomes. Consequently, the existing diploid Hi-C analyses are based on sparse and inaccurate allele-specific contact matrices, which might lead to incorrect modeling of diploid genome architecture. Here we present ASHIC, a hierarchical Bayesian framework to model allele-specific chromatin organizations in diploid genomes. We developed two models under the Bayesian framework: the Poisson-multinomial (ASHIC-PM) model and the zero-inflated Poisson-multinomial (ASHIC-ZIPM) model. The proposed ASHIC methods impute allele-specific contact maps from diploid Hi-C data and simultaneously infer allelic 3D structures. Through simulation studies, we demonstrated that ASHIC methods outperformed existing approaches, especially under low coverage and low SNP density conditions. Additionally, in the analyses of diploid Hi-C datasets in mouse and human, our ASHIC-ZIPM method produced fine-resolution diploid chromatin maps and 3D structures and provided insights into the allelic chromatin organizations and functions. To summarize, our work provides a statistically rigorous framework for investigating fine-scale allele-specific chromatin conformations. The ASHIC software is publicly available at https://github.com/wmalab/ASHIC.
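To illustrate the "zero-inflated Poisson" idea named in the ASHIC abstract above, here is a minimal sketch of the generic zero-inflated Poisson (ZIP) probability mass function. It assumes the textbook ZIP definition only. The function name `zip_pmf` and the parameters `lam` and `pi` are chosen for this example, and the sketch does not reproduce ASHIC's multinomial component or its inference procedure.

```python
import math


def zip_pmf(k, lam, pi):
    """Probability of observing count k under a zero-inflated Poisson.

    lam: Poisson rate (assumed parameter name, lam > 0)
    pi:  probability of a structural ("extra") zero, 0 <= pi <= 1
    """
    if k < 0:
        return 0.0
    # Ordinary Poisson probability of k events with rate lam.
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


if __name__ == "__main__":
    # Sparse matrices have more zeros than a plain Poisson predicts.
    print(zip_pmf(0, 2.0, 0.0))  # plain Poisson probability of zero
    print(zip_pmf(0, 2.0, 0.3))  # inflated probability of zero
```

Setting `pi` to 0 recovers the ordinary Poisson distribution, which is one way to see how a zero-inflated variant extends a Poisson-based model for sparse contact counts.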


2020 ◽  
Author(s):  
Huiling Liu ◽  
Wenxiu Ma

Abstract: Recent advances in Hi-C techniques have allowed us to map genome-wide chromatin interactions and uncover higher-order chromatin structures, thereby shedding light on the principles of genome architecture and functions. However, statistical methods for detecting changes in large-scale chromatin organization, such as topologically associating domains (TADs), are still lacking. We proposed a new statistical method, DiffGR, for detecting differentially interacting genomic regions at the TAD level between Hi-C contact maps. We utilized the stratum-adjusted correlation coefficient (SCC) to measure similarity of local TAD regions. We then developed a nonparametric approach to identify statistically significant changes of genomic interacting regions. Through simulation studies, we demonstrated that DiffGR can robustly and effectively discover differential genomic regions under various conditions. Furthermore, we successfully revealed cell type-specific changes in genomic interacting regions using real Hi-C datasets. DiffGR is publicly available at https://github.com/wmalab/DiffGR.
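The stratum-adjusted correlation coefficient mentioned in the DiffGR abstract above can be sketched roughly as follows. This is a simplified illustration, not DiffGR's implementation: it groups contacts by genomic distance (one stratum per matrix diagonal), computes a Pearson correlation within each stratum, and combines them with weights proportional to the stratum size times the product of the two standard deviations. Smoothing and variance stabilization are omitted, and the function name `stratum_weighted_correlation` and the `max_distance` parameter are assumptions made for this example.

```python
import math


def stratum_weighted_correlation(a, b, max_distance):
    """Simplified stratum-weighted correlation of two contact matrices.

    a, b: square contact matrices (lists of lists) of equal size.
    max_distance: largest diagonal offset (in bins) to include.
    """
    n = len(a)
    weighted_sum = 0.0
    weight_total = 0.0
    for d in range(1, max_distance + 1):
        # Stratum d: all contacts between bins exactly d apart.
        xs = [a[i][i + d] for i in range(n - d)]
        ys = [b[i][i + d] for i in range(n - d)]
        m = len(xs)
        if m < 2:
            continue
        mean_x = sum(xs) / m
        mean_y = sum(ys) / m
        var_x = sum((x - mean_x) ** 2 for x in xs) / m
        var_y = sum((y - mean_y) ** 2 for y in ys) / m
        if var_x == 0 or var_y == 0:
            continue  # correlation is undefined for a constant stratum
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / m
        sd_product = math.sqrt(var_x * var_y)
        rho = cov / sd_product
        weight = m * sd_product
        weighted_sum += weight * rho
        weight_total += weight
    if weight_total == 0:
        return float("nan")
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Toy 5x5 matrix; only the upper diagonals are read.
    toy = [[(i * j + i + j) % 7 for j in range(5)] for i in range(5)]
    print(stratum_weighted_correlation(toy, toy, 3))
```

Stratifying by distance keeps the strong distance-dependent decay of Hi-C contact frequency from dominating the similarity score, which is the motivation for using a stratum-based measure rather than a single global correlation.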


2008 ◽  
Vol 5 (4) ◽  
pp. 307-309 ◽  
Author(s):  
Nathaniel D Maynard ◽  
Jing Chen ◽  
Rhona K Stuart ◽  
Jian-Bing Fan ◽  
Bing Ren

2018 ◽  
Author(s):  
Ei-Wen Yang ◽  
Jae Hoon Bahn ◽  
Esther Yun-Hua Hsiao ◽  
Boon Xin Tan ◽  
Yiwei Sun ◽  
...  

Abstract: Allele-specific protein-RNA binding is an essential aspect that may reveal functional genetic variants influencing RNA processing and gene expression phenotypes. Recently, genome-wide detection of in vivo binding sites of RNA binding proteins (RBPs) has been greatly facilitated by the enhanced UV crosslinking and immunoprecipitation (eCLIP) protocol. Hundreds of eCLIP-Seq data sets were generated from HepG2 and K562 cells during the ENCODE3 phase. These data afford a valuable opportunity to examine allele-specific binding (ASB) of RBPs. To this end, we developed a new computational algorithm, called BEAPR (Binding Estimation of Allele-specific Protein-RNA interaction). In identifying statistically significant ASB sites, BEAPR takes into account UV cross-linking induced sequence propensity and technical variations between replicated experiments. Using simulated data and actual eCLIP-Seq data, we show that BEAPR largely outperforms two commonly used methods, the chi-squared test and Fisher's exact test. Importantly, BEAPR overcomes the inherent over-dispersion problem of the other methods. Complemented by experimental validations, we demonstrate that ASB events are significantly associated with genetic regulation of splicing and mRNA abundance, supporting the usage of this method to pinpoint functional genetic variants in post-transcriptional gene regulation. Many variants with ASB patterns of RBPs were found to be genetic variants with cancer or other disease relevance. About 38% of ASB variants were in linkage disequilibrium with single nucleotide polymorphisms from genome-wide association studies. Overall, our results suggest that BEAPR is an effective method to reveal ASB patterns in eCLIP and can inform functional interpretation of disease-related genetic variants.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 1295 ◽  
Author(s):  
Ruben Esse

In recent years, epigenetic research has enjoyed explosive growth as high-throughput sequencing technologies become more accessible and affordable. However, this advancement has not been matched with similar progress in data analysis capabilities from the perspective of experimental biologists not versed in bioinformatic languages. For instance, chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) is at present widely used to identify genomic loci of transcription factor binding and histone modifications. Basic ChIP-seq data analysis, including read mapping and peak calling, can be accomplished through several well-established tools, but more sophisticated analyses aimed at comparing data derived from different conditions or experimental designs constitute a significant bottleneck. We reason that the implementation of a single comprehensive ChIP-seq analysis pipeline could be beneficial for many experimental (wet lab) researchers who would like to generate genomic data. Here we present ChIPdig, a stand-alone application with adjustable parameters designed to allow researchers to perform several analyses, namely read mapping to a reference genome, peak calling, annotation of regions based on reference coordinates (e.g. transcription start and termination sites, exons, introns, and 5' and 3' untranslated regions), and generation of heatmaps and metaplots for visualizing coverage. Importantly, ChIPdig accepts multiple ChIP-seq datasets as input, allowing genome-wide differential enrichment analysis in regions of interest to be performed. ChIPdig is written in R and enables access to several existing and highly utilized packages through a simple user interface powered by the Shiny package. Here, we illustrate the utility and user-friendly features of ChIPdig by analyzing H3K36me3 and H3K4me3 ChIP-seq profiles generated by the modENCODE project as an example.
ChIPdig offers a comprehensive and user-friendly pipeline for analysis of multiple sets of ChIP-seq data by both experimental and computational researchers. It is open source and available at https://github.com/rmesse/ChIPdig.


2019 ◽  
Author(s):  
Yizhou Zhu ◽  
Yousin Suh

Abstract: The resolution limit of chromatin conformation capture methodologies (3Cs) has restrained their application in detection of fine-level chromatin structure mediated by cis-regulatory elements (CREs). Here we report two 3C-derived methods, Tri-4C and Tri-HiC, which utilize multi-restriction enzyme digestions for ultrafine mapping of targeted and genome-wide chromatin interaction, respectively, at up to one hundred base-pair resolution. Tri-4C identified CRE loop interaction networks and quantitatively revealed their alterations underlying dynamic gene control. Tri-HiC uncovered global fine-gauge regulatory interaction networks, identifying > 20-fold more enhancer:promoter (E:P) loops than in situ HiC. In addition to vastly improved identification of subkilobase-sized E:P loops, Tri-HiC also uncovered interaction stripes and contact domain insulation from promoters and enhancers, revealing loop extrusion behaviors resembling those of topologically associating domain (TAD) boundaries. Tri-4C and Tri-HiC provide robust approaches to achieve the high-resolution interactome maps required for characterizing fine-gauge regulatory chromatin interactions in analysis of development, homeostasis and disease.


2017 ◽  
Author(s):  
Ruben Esse ◽  
Alla Grishok

Abstract: Background: In recent years, epigenetic research has enjoyed explosive growth as high-throughput sequencing technologies become more accessible and affordable. However, this advancement has not been matched with similar progress in data analysis capabilities from the perspective of experimental biologists not versed in bioinformatic languages. For instance, chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq) is at present widely used to identify genomic loci of transcription factor binding and histone modifications. Basic ChIP-seq data analysis, including read mapping and peak calling, can be accomplished through several well-established tools, but more sophisticated analyses aimed at comparing data derived from different conditions or experimental designs constitute a significant bottleneck. We reason that the implementation of a single comprehensive ChIP-seq analysis pipeline could be beneficial for many experimental (wet lab) researchers who would like to generate genomic data. Results: Here we present ChIPdig, a stand-alone application with adjustable parameters designed to allow researchers to perform several analyses, namely read mapping to a reference genome, peak calling, annotation of regions based on reference coordinates (e.g. transcription start and termination sites, exons, introns, 5′ UTRs and 3′ UTRs), and generation of heatmaps and metaplots for visualizing coverage. Importantly, ChIPdig accepts multiple ChIP-seq datasets as input, allowing genome-wide differential enrichment analysis in regions of interest to be performed. ChIPdig is written in R and enables access to several existing and highly utilized packages through a simple user interface powered by the Shiny package.
Here, we illustrate the utility and user-friendly features of ChIPdig by analyzing H3K36me3 and H3K4me3 ChIP-seq profiles generated by the modENCODE project as an example. Conclusions: ChIPdig offers a comprehensive and user-friendly pipeline for analysis of multiple sets of ChIP-seq data by both experimental and computational researchers. It is open source and available at https://github.com/rmesse/ChIPdig.

